Does Metal on iOS do async compute by default?

Hi,

Let's assume I commit 2 CommandBuffers to the same CommandQueue:

  • The 1st buffer contains a render pass
  • The 2nd buffer contains a compute pass.

I used enqueue() to make sure the buffers would run in that order.

Will the GPU wait for the render pass (1) to finish before running the compute pass (2)? Or will it run them concurrently?

I am asking because while porting our game to iOS we ran into flickering artefacts. We took a capture and it seems that 2 passes with dependencies are running concurrently (see screenshot), even though they belong to separate command buffers.

Do I need to add an MTLEvent to prevent this from happening? I thought there was no need for dependency tracking between command buffers running on the same queue.

Thank you for your help.

After some investigation, it looks like this overlap in the graph is a bug in the Metal profiler.

Let Texture_A be the output of the render pass (1) and the input of the compute pass (2). Playing with the content of Texture_A confirms that the compute pass runs after the render pass.

Clearing Texture_A early in the frame has no effect on the compute pass. Clearing Texture_A between the render pass and the compute pass shows that the compute pass input has changed.

I'll report this to Apple.

MTLEvent is what you need. You have to prevent concurrent execution of commands when you have two separate command buffers. Even with one, the compute and render passes can overlap, so then fences/barriers are needed.
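As a rough illustration of the MTLEvent approach, here is a minimal sketch, not a definitive implementation. The helper name `commitRenderThenCompute` and the two `encode…` closures are hypothetical placeholders for your own pass encoding; the event itself would come from `device.makeEvent()`. It assumes each closure ends its encoder before returning, and that `signalValue` increases every time you reuse the event (for example, once per frame).

```swift
import Metal

// Hypothetical helper: orders two command buffers on the same queue with an MTLEvent.
// encodeRender / encodeCompute stand in for the app's own encoding code and must
// call endEncoding() on any encoder they create before returning.
func commitRenderThenCompute(queue: MTLCommandQueue,
                             event: MTLEvent,
                             signalValue: UInt64,
                             encodeRender: (MTLCommandBuffer) -> Void,
                             encodeCompute: (MTLCommandBuffer) -> Void) {
    guard let renderCB = queue.makeCommandBuffer(),
          let computeCB = queue.makeCommandBuffer() else { return }

    // Command buffer 1: render pass, then signal the event once its work is done.
    encodeRender(renderCB)
    renderCB.encodeSignalEvent(event, value: signalValue)

    // Command buffer 2: wait for that signal before any compute work starts.
    computeCB.encodeWaitForEvent(event, value: signalValue)
    encodeCompute(computeCB)

    // Commit in submission order on the same queue.
    renderCB.commit()
    computeCB.commit()
}
```

The wait is encoded at the start of the second command buffer, so everything encoded after it in that buffer is held back until the first buffer reaches its signal.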

Thanks Alecazam. One important thing I didn't mention is that our game uses "untracked" resources.

In the same thread mentioned earlier, the Apple engineer says:

You must use a fence to synchronize untracked resources across command buffers from the same queue. So if command buffer 1 writes to an untracked texture and you later execute command buffer 2, which reads from it, you need a fence.

That is a little confusing. Apple's documentation states that the scope of a fence is limited to a single command buffer:

An MTLFence synchronizes access to one or more resources across different passes within a command buffer. Use fences to specify any inter-pass resource dependencies within the same command buffer.

I made a request to Apple to add a note in the documentation about that. Adding a fence fixed all the flickering on our side.
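For anyone landing here with the same problem, this is roughly the shape of the fix, as a minimal sketch rather than our actual code. The function name, the parameter names, the texture index, and the 8x8 threadgroup size are all assumptions for illustration; `textureA` stands for the untracked texture written by the render pass and read by the compute pass, and the fence would come from `device.makeFence()`.

```swift
import Metal

// Hypothetical sketch: a fence updated by the render pass in command buffer 1
// and waited on by the compute pass in command buffer 2 (same queue).
func encodeRenderThenComputeWithFence(queue: MTLCommandQueue,
                                      fence: MTLFence,
                                      renderPass: MTLRenderPassDescriptor,
                                      computePipeline: MTLComputePipelineState,
                                      textureA: MTLTexture) {
    guard let renderCB = queue.makeCommandBuffer(),
          let computeCB = queue.makeCommandBuffer() else { return }

    // Command buffer 1: render pass that writes the untracked texture.
    guard let render = renderCB.makeRenderCommandEncoder(descriptor: renderPass) else { return }
    // ... draw calls that write textureA go here ...
    render.updateFence(fence, after: .fragment)   // updated once fragment work is done
    render.endEncoding()

    // Command buffer 2: compute pass that reads the texture.
    guard let compute = computeCB.makeComputeCommandEncoder() else { return }
    compute.waitForFence(fence)                   // hold the pass until the fence is updated
    compute.setComputePipelineState(computePipeline)
    compute.setTexture(textureA, index: 0)        // index 0 is an assumption
    // Assumed 8x8 threadgroups; the kernel is expected to bounds-check.
    let threadsPerGroup = MTLSize(width: 8, height: 8, depth: 1)
    let groups = MTLSize(width: (textureA.width + 7) / 8,
                         height: (textureA.height + 7) / 8,
                         depth: 1)
    compute.dispatchThreadgroups(groups, threadsPerThreadgroup: threadsPerGroup)
    compute.endEncoding()

    renderCB.commit()
    computeCB.commit()
}
```

The update has to be encoded in the pass that writes the texture and the wait in the pass that reads it, with the writing command buffer committed first.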

My understanding is that MTLFence doesn't work across command buffers. Fences are there so you don't get a race on a resource that is changed within a command buffer, and when you switch to untracked resources, Metal stops injecting them for you. You said you had two command buffers. Use MTLEvent for synchronization across command buffers, and MTLSharedEvent for synchronization across command queues, though few people do that.
