Metal compute performance using Xcode 9 betas

Did anybody else compare the performance of compute kernels on Mac when building with Xcode 9 vs. Xcode 8? I'm a bit shocked right now to see that two different kernels show less than half the throughput when built with Xcode 9b6 compared to Xcode 8.3.3. I've tried varying the optimization options, which seem to be new in Xcode 9, but they have no influence on the result. And yes, I know: I guess it's time for another Radar.

Replies

Is that in release mode?


On Intel GPUs I am also noticing this, but only on High Sierra. In fact, my kernels become so slow that they time out (without any further error).

Yes, this is with release builds, testing on High Sierra. It does not seem to be very GPU-dependent: I see the same tendency on Intel HD Graphics 630, Radeon Pro 555 and the external GPU kit with the Radeon RX 580.

I am also seeing this with OpenCL.


Now, perhaps some things have changed in High Sierra. For example, while compute kernels are running you can still switch to another app or desktop, so the GPU is shared. Perhaps it is my imagination, but I have the feeling that before High Sierra the system paused the compute kernels and resumed them when you switched back to your app, while in High Sierra they seem to run concurrently with other work. I am not sure; it just feels that way. So it could be that each kernel runs just as fast, but fewer GPU threads are available to your app because the GPU is shared with more other work.

If it is really the kernel itself that is (at least) twice as slow, then it would have to be a compiler bug of some sort.
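One way to tell those two explanations apart is to time a single dispatch in isolation, with nothing else competing for the GPU. The following is only a minimal Swift sketch, not code from anyone in this thread: the `spin` kernel, `spinKernelSource` and `averageDispatchSeconds` are made-up names, and the kernel is compiled from source at runtime purely to keep the example self-contained. It measures wall-clock time around `commit()` and `waitUntilCompleted()`, so the number includes scheduling overhead as well as pure kernel time. It also checks the command buffer status, which is where a timed-out kernel would show up.

```swift
import Foundation
import Metal

// Hypothetical stand-in kernel; substitute the kernel under investigation.
let spinKernelSource = """
#include <metal_stdlib>
using namespace metal;
kernel void spin(device float *data [[buffer(0)]],
                 uint id [[thread_position_in_grid]]) {
    float x = data[id];
    for (int i = 0; i < 1000; ++i) { x = x * 1.0001f + 0.5f; }
    data[id] = x;
}
"""

/// Average wall-clock seconds per dispatch, or nil if Metal is unavailable
/// or a command buffer ends in the error state (for example after a timeout).
func averageDispatchSeconds(elementCount: Int, runs: Int) -> Double? {
    guard runs > 0,
          let device = MTLCreateSystemDefaultDevice(),
          let queue = device.makeCommandQueue(),
          let library = try? device.makeLibrary(source: spinKernelSource, options: nil),
          let function = library.makeFunction(name: "spin"),
          let pipeline = try? device.makeComputePipelineState(function: function)
    else { return nil }

    // Round the grid up to whole threadgroups and size the buffer to match,
    // so no thread writes past the end of the buffer.
    let width = pipeline.threadExecutionWidth
    let groupCount = (elementCount + width - 1) / width
    let paddedCount = groupCount * width
    guard let buffer = device.makeBuffer(length: paddedCount * MemoryLayout<Float>.stride,
                                         options: .storageModeShared)
    else { return nil }

    var total = 0.0
    for _ in 0..<runs {
        guard let commandBuffer = queue.makeCommandBuffer(),
              let encoder = commandBuffer.makeComputeCommandEncoder()
        else { return nil }
        encoder.setComputePipelineState(pipeline)
        encoder.setBuffer(buffer, offset: 0, index: 0)
        encoder.dispatchThreadgroups(MTLSize(width: groupCount, height: 1, depth: 1),
                                     threadsPerThreadgroup: MTLSize(width: width, height: 1, depth: 1))
        encoder.endEncoding()

        let start = ProcessInfo.processInfo.systemUptime
        commandBuffer.commit()
        commandBuffer.waitUntilCompleted()
        total += ProcessInfo.processInfo.systemUptime - start

        if commandBuffer.status == .error {
            print("command buffer failed:", String(describing: commandBuffer.error))
            return nil
        }
    }
    return total / Double(runs)
}

if let seconds = averageDispatchSeconds(elementCount: 1 << 20, runs: 10) {
    print("average seconds per dispatch:", seconds)
} else {
    print("Metal unavailable or the command buffer failed")
}
```

Running the same binary on Sierra and High Sierra with the machine otherwise idle should show whether the per-dispatch time itself changes, or only the behaviour under contention.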

That could very well be a factor, but it is still a bit weird that apps built with Xcode 8 do not suffer. Having tested further on Sierra, I can say that an app built with Xcode 9 performs the Metal compute pretty much the same as one built with Xcode 8. But when running the exact same apps on High Sierra, performance is halved for the one built with Xcode 9. I should perhaps also add that the drastic difference applies to only one of the two kernels I'm testing; the other is less affected.

Xcode 9.0 GM does not solve it for me. Are your two kernels that are halved in speed also the most computationally intensive?

For me, short-running kernels are fine; it is the longer ones that now run _very_ slowly (on High Sierra).

I guess Xcode compiles Metal and OpenCL to intermediate code, and the operating system drivers convert that to the final code that runs on the GPU.
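To illustrate that split: as far as I understand it, the library Xcode builds holds an intermediate representation, and the driver produces the GPU-specific code when the pipeline state is created. The sketch below builds a pipeline for the same kernel name either from the Xcode-built default library or from source compiled at runtime, so the Xcode toolchain can be taken out of the picture when comparing. `KernelOrigin`, `makePipeline`, `addOne` and `addOneSource` are hypothetical names chosen for this example.

```swift
import Foundation
import Metal

// Hypothetical trivial kernel, used only to exercise both build routes.
let addOneSource = """
#include <metal_stdlib>
using namespace metal;
kernel void addOne(device float *data [[buffer(0)]],
                   uint id [[thread_position_in_grid]]) {
    data[id] += 1.0f;
}
"""

enum KernelOrigin {
    case xcodeBuilt             // default library compiled by Xcode at build time
    case runtimeSource(String)  // Metal source compiled on this machine at runtime
}

/// Builds a compute pipeline for `name`, or returns nil if any stage fails.
func makePipeline(device: MTLDevice, name: String,
                  origin: KernelOrigin) -> MTLComputePipelineState? {
    let library: MTLLibrary?
    switch origin {
    case .xcodeBuilt:
        // Stage 1 already happened inside Xcode; just load the result.
        library = device.makeDefaultLibrary()
    case .runtimeSource(let source):
        // Stage 1 happens here, using the compiler shipped with the OS.
        library = try? device.makeLibrary(source: source, options: nil)
    }
    guard let function = library?.makeFunction(name: name) else { return nil }
    // Stage 2: the driver turns the intermediate code into GPU-specific code.
    return try? device.makeComputePipelineState(function: function)
}

if let device = MTLCreateSystemDefaultDevice() {
    let fromSource = makePipeline(device: device, name: "addOne",
                                  origin: .runtimeSource(addOneSource))
    let fromXcode = makePipeline(device: device, name: "addOne", origin: .xcodeBuilt)
    print("runtime-compiled pipeline available:", fromSource != nil)
    print("Xcode-built pipeline available:", fromXcode != nil)
}
```

When run outside an app bundle there is no default library, so the `.xcodeBuilt` route simply returns nil. Inside an app, if both pipelines are equally slow on High Sierra, that would point at the driver stage rather than at the Xcode version.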

That the Xcode version makes a difference for you is a bit strange. I have the opposite: the Xcode 9 release build runs fine on El Capitan and Sierra, but very slowly on High Sierra.

The kernel that is most affected is the most complex one. I previously tried splitting it into two passes, which made it perform well on High Sierra but worse on Sierra 😠 From your description you do seem to have the same symptoms as I do: the version built with Xcode 9 is the one that performs badly on High Sierra but well on Sierra. But do you see any difference at all between builds made with Xcode 8 and 9?

Unfortunately I cannot test that. I moved my code base to Swift 4.0 and cannot (easily) revert to Xcode 8.3 (I tried to manually install Swift 4.0, but then I get a 'swift does not support the SDK 'MacOSX10.12.sdk'' error in Xcode 8).


Just tested Xcode 9.0 GM and High Sierra 10.13 GM: not solved. Still the same issue, sigh.

I've submitted rdar://34679825 including a sample project