I’ve been experimenting with Dispatch, and workloops in particular. I gather that they’re similar to serial queues, except that they reorder work items by QoS. I suspect there’s more to workloops than meets the eye, though; calling dispatch_set_target_queue on them has no effect, in spite of the <dispatch/workloop.h> saying that workloops “can be passed to all APIs accepting a dispatch queue, except for functions from the dispatch_sync() family”.
Workloops keep showing up in odd places like Metal and Network.framework backtraces, and <dispatch/workloop.h> includes functionality for tying workloops to os_workgroups (?!).
What exactly is a workloop beyond just a serial queue with priority ordering, and why can’t I set the target queue of one?
Post
Replies
Boosts
Views
Activity
Suppose I want to draw a red rectangle onto my render target using a compute shader.
id<MTLComputeCommandEncoder> encoder = [commandBuffer computeCommandEncoder];
[encoder setComputePipelineState:pipelineState];
simd_ushort2 position = simd_make_ushort2(100, 100);
simd_ushort2 size = simd_make_ushort2(50, 50);
[encoder setBytes:&position length:sizeof(position) atIndex:0];
[encoder setTexture:drawable.texture atIndex:0];
[encoder dispatchThreads:MTLSizeMake(size.x, size.y, 1)
threadsPerThreadgroup:MTLSizeMake(32, 32, 1)];
[encoder endEncoding];
#include <metal_stdlib>
using namespace metal;
kernel void
Compute(ushort2 position_in_grid [[thread_position_in_grid]],
constant ushort2 &position,
texture2d<half, access::write> texture)
{
texture.write(half4(1, 0, 0, 1), position_in_grid + position);
}
This works just fine:
Now, say for whatever reason I want to start using imageblocks in my compute kernel. First, I set the imageblock size on the CPU side:
id<MTLComputeCommandEncoder> encoder = [commandBuffer computeCommandEncoder];
[encoder setComputePipelineState:pipelineState];
MTLSize threadgroupSize = MTLSizeMake(32, 32, 1);
[encoder setImageblockWidth:threadgroupSize.width
height:threadgroupSize.height];
simd_ushort2 position = simd_make_ushort2(100, 100);
simd_ushort2 size = simd_make_ushort2(50, 50);
[encoder setBytes:&position length:sizeof(position) atIndex:0];
[encoder setTexture:drawable.texture atIndex:0];
MTLSize gridSize = MTLSizeMake(size.x, size.y, 1);
[encoder dispatchThreads:gridSize threadsPerThreadgroup:threadgroupSize];
And then I update the compute kernel to simply declare the imageblock – note I never actually read from or write to it:
#include <metal_stdlib>
using namespace metal;
struct Foo
{
int foo;
};
kernel void
Compute(ushort2 position_in_grid [[thread_position_in_grid]],
constant ushort2 &position,
texture2d<half, access::write> texture,
imageblock<Foo> imageblock)
{
texture.write(half4(1, 0, 0, 1), position_in_grid + position);
}
And now out of nowhere Metal’s shader validation starts complaining about mismatched texture usage flags:
2024-06-22 00:57:15.663132+1000 TextureUsage[80558:4539093] [GPUDebug] Texture usage flags mismatch executing kernel function "Compute" encoder: "1", dispatch: 0
2024-06-22 00:57:15.672004+1000 TextureUsage[80558:4539093] [GPUDebug] Texture usage flags mismatch executing kernel function "Compute" encoder: "1", dispatch: 0
2024-06-22 00:57:15.682422+1000 TextureUsage[80558:4539093] [GPUDebug] Texture usage flags mismatch executing kernel function "Compute" encoder: "1", dispatch: 0
2024-06-22 00:57:15.687587+1000 TextureUsage[80558:4539093] [GPUDebug] Texture usage flags mismatch executing kernel function "Compute" encoder: "1", dispatch: 0
2024-06-22 00:57:15.698106+1000 TextureUsage[80558:4539093] [GPUDebug] Texture usage flags mismatch executing kernel function "Compute" encoder: "1", dispatch: 0
The texture I’m writing to comes from a CAMetalDrawable whose associated CAMetalLayer has framebufferOnly set to NO. What am I missing?