There is an ongoing discussion about the validity of GPU compute performance estimates offered by popular benchmarking tools such as Geekbench 5. It has been observed that Apple GPUs have a relatively slow frequency ramp-up and do not reach their peak performance if the submitted kernels have a runtime under a few seconds. I understand that these GPUs are designed for throughput rather than latency, but sometimes one does work with “small” work packages (such as processing a single image). Is there an official way to tell the system that it should use peak performance for such work? For example, some sort of hint along the lines of “I will now submit some GPU work and I want you to power up all the relevant subsystems”, instead of relying on the OS to lazily adjust the performance profile?
Ensuring peak M1 GPU performance for short running kernels
Hi jcookie,
There is currently no way to hint the GPU to ramp up before executing work. Last year, however, we introduced the GPU Performance State inducer in our GPU tooling. The WWDC session "Discover Metal debugging, profiling, and asset creation tools" explains how to use these tools.