I spend the majority of my time these days working on and optimizing shaders on iOS.
Xcode GPU Profiler has been very helpful for the most part, but one problem that still plagues me is that profiling is very inconsistent. When I first start up an app I generally get a pretty good boost in GPU performance, then it slows down after 30 seconds or so. I imagine this is due to increased workload at the beginning.
This means that I will get lower GPU timings for shaders during the first ~30 seconds of the app, but if I let it sit for longer before I profile, I get about 10% worse performance.
I believe viewing the GPU clock speed would help me better profile this. However, I have not found any way to view this metric.
Is there any way to view current GPU clock speed on iOS devices? Even if just through private methods for debugging purposes.
Thanks!
Hi jwilde,
I can acknowledge that it can be tricky to compare timings of two different runs of a workload when trying to profile shader optimizations, as the performance state of the device can vary for a number of reasons. Unfortunately, there is no way to query the GPU clock speed (or, more accurately, the performance state of the device, as the GPU clock speed won't give you the complete picture).
In order to compare apples to apples when timing two different runs, the latest Xcode developer tools allow you to select the performance state (maximum, medium or minimum) of the device. By locking the performance state to one of these three options, you will be able to obtain consistent results, allowing you to make fair comparisons. This is explained in this presentation.
In particular, a workflow that I can recommend for optimizing shaders is to use the shader editing feature of Xcode's GPU debugger. You start off by taking a GPU capture of your workload and profiling it using the stopwatch button at maximum or medium performance state (as explained in the presentation) to get baseline timings. After this, you make alterations to the shader that needs optimization, use the replace function, and hit the stopwatch button again to profile the changes. This allows very quick iteration, as you don't need to recompile or restart your application for each run.
If for any reason this method of profiling doesn't align with your workflow, you can always use Feedback Assistant to explain your situation and workflow and request a feature that would provide more insight into the performance state of the device.
Hope this answers your question!