Since this project is a work in progress, the image pipeline currently runs exactly once: there is no loop. During that single run, the MTLCaptureManager captures the execution of the command buffer so it can be analyzed. Within the image processing pipeline, this is the only spot where GPU-CPU synchronization occurs via the shared event. The shared event, like the other resources in the pipeline, is created before the command buffer, and all resources are tracked by Metal (hazardTrackingMode = .tracked), though I hope to switch to heaps later for efficiency.
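For reference, a minimal sketch of that capture setup (runImagePipeline is a hypothetical stand-in for the single pipeline execution shown below; this is illustrative, not the project's exact code):

let captureDescriptor = MTLCaptureDescriptor()
captureDescriptor.captureObject = commandQueue
try MTLCaptureManager.shared().startCapture(with: captureDescriptor)
runImagePipeline() // hypothetical: the single execution outlined below
MTLCaptureManager.shared().stopCapture()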
Here is a brief overview of how the code is organized:
preloadResources()

// 1. Let Core Image render the CGImage into the Metal texture
let commandBufferDescriptor = MTLCommandBufferDescriptor()
commandBufferDescriptor.errorOptions = .encoderExecutionStatus // capture encoder errors
let ciCommandBuffer = commandQueue.makeCommandBuffer(descriptor: commandBufferDescriptor)!
let ciSourceImage = CIImage(cgImage: sourceImage)
ciContext.render(ciSourceImage,
                 to: sourceImageTexture,
                 commandBuffer: ciCommandBuffer,
                 bounds: sourceImageTexture.bounds2D, // bounds2D: custom convenience property
                 colorSpace: CGColorSpaceCreateDeviceRGB())
ciCommandBuffer.commit()

// 2. Do the rest of the image processing
let commandBuffer = commandQueue.makeCommandBuffer(descriptor: commandBufferDescriptor)!
try imageProcessorA.encode(commandBuffer: commandBuffer,
                           sourceTexture: sourceImageTexture,
                           destinationTexture: sourceImageIntermediateTexture)
try imageProcessorA.encode(commandBuffer: commandBuffer,
                           sourceTexture: sourceImageIntermediateTexture,
                           destinationTexture: destinationImageTexture)
commandBuffer.commit()
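Because the descriptor enables encoderExecutionStatus, the command buffer's error can be inspected once execution finishes. A minimal sketch of what that check might look like (not the project's actual code):

commandBuffer.waitUntilCompleted()
if let error = commandBuffer.error as NSError?,
   let infos = error.userInfo[MTLCommandBufferEncoderInfoErrorKey] as? [MTLCommandBufferEncoderInfo] {
    for info in infos {
        // errorState reports whether each encoder completed, faulted, etc.
        print("encoder \(info.label): \(info.errorState)")
    }
}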
imageProcessorA contains kernelA and kernelB and performs the synchronization as described above.
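For context, a rough sketch of what that encode method might look like, using the encodeCPUExecution helper shown further down. kernelAPipeline, kernelBPipeline, intermediateTexture, sharedEvent, listener, and the dispatch sizes are illustrative names, not the project's actual implementation:

func encode(commandBuffer: MTLCommandBuffer,
            sourceTexture: MTLTexture,
            destinationTexture: MTLTexture) throws {
    // First pass: kernelA writes an intermediate result.
    let encoderA = commandBuffer.makeComputeCommandEncoder()!
    encoderA.setComputePipelineState(kernelAPipeline)
    encoderA.setTexture(sourceTexture, index: 0)
    encoderA.setTexture(intermediateTexture, index: 1)
    encoderA.dispatchThreadgroups(threadgroupCount, threadsPerThreadgroup: threadsPerGroup)
    encoderA.endEncoding()

    // The GPU signals, the CPU block runs, and the GPU waits before kernelB.
    commandBuffer.encodeCPUExecution(for: sharedEvent, listener: listener) {
        // CPU-side computation between the two kernels
    }

    // Second pass: kernelB consumes the CPU result.
    let encoderB = commandBuffer.makeComputeCommandEncoder()!
    encoderB.setComputePipelineState(kernelBPipeline)
    encoderB.setTexture(intermediateTexture, index: 0)
    encoderB.setTexture(destinationTexture, index: 1)
    encoderB.dispatchThreadgroups(threadgroupCount, threadsPerThreadgroup: threadsPerGroup)
    encoderB.endEncoding()
}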
I suppose I could schedule a technical review session with an engineer to provide more details about the project if more context is needed to resolve the problem.
extension MTLCommandBuffer {
    // Interleaves a block of CPU work between GPU passes encoded on this buffer.
    func encodeCPUExecution(for sharedEvent: MTLSharedEvent, listener: MTLSharedEventListener, work: @escaping () -> Void) {
        let value = sharedEvent.signaledValue
        // When the GPU signals value + 1, run the CPU work, then release the GPU.
        sharedEvent.notify(listener, atValue: value + 1) { event, _ in
            work()
            event.signaledValue = value + 2
        }
        encodeSignalEvent(sharedEvent, value: value + 1)  // GPU -> CPU: input is ready
        encodeWaitForEvent(sharedEvent, value: value + 2) // GPU stalls until the CPU finishes
    }
}
This is the code for encodeCPUExecution; my mistake for not making that clear enough. In fact, the GPU does wait on value + 2 as you described, yet the behavior still occurs. The issue is that the computation is well suited to CPU execution (it can take advantage of dynamic programming to run in O(n) time) and poorly suited to GPU execution, though I suppose a single GPU thread could write out the result much as the CPU does (which is probably even more performant).
I would still like to figure out why this behavior exists in the first place, even if the computation is pushed to a single thread on the GPU.
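For reference, a minimal sketch of that single-GPU-thread alternative, assuming a hypothetical sequentialKernelPipeline compiled from a kernel that performs the whole sequential pass in one thread (resultBuffer is likewise illustrative):

let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(sequentialKernelPipeline) // hypothetical pipeline state
encoder.setTexture(sourceImageIntermediateTexture, index: 0)
encoder.setBuffer(resultBuffer, offset: 0, index: 0) // hypothetical result buffer
// Dispatch a 1x1x1 grid: the entire sequential computation runs on one GPU
// thread, avoiding the shared-event round trip to the CPU entirely.
encoder.dispatchThreadgroups(MTLSize(width: 1, height: 1, depth: 1),
                             threadsPerThreadgroup: MTLSize(width: 1, height: 1, depth: 1))
encoder.endEncoding()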
I found the tech talk "Discover advances in A15 Bionic", which describes one use case for quadgroups and quadgroup functions around the 21:00 mark, where they're used to reduce texture reads. If anyone has any other use cases, let us know.
Two reasons:
1. Sheer curiosity.
2. I was worried that I could run out of tile memory if ray_data payloads were stored there. I was hoping to implement something like what Rich Forster mentions in the associated talk "Get to know Metal function pointers" (the relevant section starts around 18:00). There, he talks about divergence caused by different threads invoking different functions. The solution (described around 19:00) is to use threadgroup memory to pass around the relevant data, which I thought could be constrained by the size of the payload.
Of course, I suppose this would have been mentioned somewhere if it were something to worry about, but it's interesting nonetheless.
P.S. I haven't written the ray tracing kernel yet, nor the intersection/visible function(s), but it was something I considered while designing my program.
I see. For some reason I had incorrectly assumed that the value would be reset after the buffer finished executing (so that it was monotonically increasing only within the "scope" of a single buffer encoding). This works as expected.
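To spell out the point (a minimal sketch, assuming a device):

let event = device.makeSharedEvent()! // signaledValue == 0 at creation
// After the first encodeCPUExecution round trip completes, signaledValue == 2.
// Completing the command buffer does NOT reset it, so the next encode reads the
// current value (2) and signals 3 / waits on 4. The value is monotonic for the
// lifetime of the event, not per command buffer.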
So unfortunately I will not be able to submit a debug request since I updated to Big Sur before seeing this post. However, I am happy to say that the debugger is working flawlessly in Xcode 12.3 with Big Sur 11.1.
macOS 10.15 is the deployment target
My sessions also crash occasionally. Hopefully the next version fixes this; the shader debugger is a critical tool, and it's nearly impossible to debug shaders without it.