Understanding Xcode GPU Capture "Group by API Call" timings

I see a lot of timings like this in Xcode GPU Frame Capture.

Specifically, I want to call attention to the fact that the parts of the command buffer time do not add up to the total that is displayed. For GPU frame time, Xcode adds up all the command buffer times. But if I add up only the shader times, I get a much lower figure (1.5-2 ms lower). Where is the GPU time going if it isn't going to the shaders?

Notice how the total time is listed as 0.211 ms, but the sum of the parts is only 0.047 ms.
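
To put a number on the gap, here is a minimal sketch in Python. It is plain arithmetic on the two figures above, not anything Xcode or Metal provides, and the function name `unaccounted_time` is made up for illustration. The individual dispatch times are not listed here, so their sum is passed as a single entry.

```python
def unaccounted_time(total_ms, part_times_ms):
    """Return (gap in ms, gap as a fraction of the reported total)."""
    accounted = sum(part_times_ms)
    gap = total_ms - accounted
    # Guard against a zero total so the ratio is always defined.
    fraction = gap / total_ms if total_ms > 0 else 0.0
    return gap, fraction

# 0.211 ms: total shown for the encoder in the capture.
# 0.047 ms: sum of the dispatch (shader) times listed under it.
gap_ms, fraction = unaccounted_time(0.211, [0.047])
print(f"unaccounted: {gap_ms:.3f} ms ({fraction:.0%} of the encoder total)")
# prints "unaccounted: 0.164 ms (78% of the encoder total)"
```

So roughly three quarters of the reported encoder time is not attributed to any dispatch in this example.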

I just want to better understand what is going on.

Thanks.

Hi jwilde,

Thank you for bringing this to our attention. Command encoders may have some additional driver overhead that could cause the sum of the dispatch times to be smaller than the total time of the encoder, but in this case the difference is pretty significant. Can you provide the version numbers of Xcode/iOS/macOS and the device you are using, so we can investigate? And if possible, please use the Feedback Assistant to report this issue along with a gputrace file that reproduces it.

Thank you!

Alright, thanks, I appreciate any help I can get! I did some profiling on an iPhone XS Max running iOS 14.4.2 and an iPhone X running iOS 14.4.2. I am running Xcode version 13.0 (13A233).

On the iPhone XS Max, the command encoder timings match the summed shader times very well (maximum difference < 0.5 microseconds). On the iPhone X, they are very far off (the example image in my original post is from the iPhone X).

And actually, the timings on my iPhone 7 match very well too. So it seems that the iPhone X is the outlier...

Glad to hear that the timings do make sense for the XS Max and iPhone 7. We will try to investigate why the iPhone X numbers are so far off. If possible, we would really appreciate it if you could file this issue using the Feedback Assistant, together with a GPU trace that exhibits the issue. That will greatly help our investigation. Thank you!
