How many of your issues have you raised directly w/DTS?
Are you referring to code-level support? We have used code-level support to get help with issues in our own code and it has been helpful, but we haven't used it to report Metal driver bugs. That would get pretty expensive, to be honest.
My mistake, sorry, I thought your main problem was risk to your users.
As a routine, feedback/bug reports only evolve into dialog if the engrs. need more info, I think. Otherwise, status changes, when they occur, are normally communicated. Thanks for helping test.
It's no fun at all testing when reporting a kernel panic driver bug on recent hardware gets no attention at all for months. Delivering a professional product based on Metal is a complete nightmare.
Mac OS seems to be the "Ugly Duckling" at Apple (remember the Hans Christian Andersen fairy tale).
I have logged almost 50 bugs in the last 10 years -- almost none of them have ever been closed.
To make it worse, you occasionally get feedback on some bugs, so you know somebody at Apple is looking at them.
But you are never given any feedback about "if" or "when" the bugs will be fixed.
Since I have learned not to expect Apple to fix these bugs, it's time to look for workarounds.
The three GPU vendors on MacOS, AMD, Intel, and Nvidia, all behave differently and have different sets of bugs.
Intel's drivers for OpenCL/OpenGL and Metal all work well, but its runtime compiler has big problems with large, complicated shaders.
For my app, AMD's Metal shader compiler has recently had serious problems, but the AMD OpenCL compiler/drivers work very well.
(My rendered output looks different when rendered using Metal versus OpenCL. AMD has told me they think the cause is bugs in the shader compiler's optimizer. This affects all apps -- perhaps you are having the same problem.)
Also I believe that the GPU supplier vendors build and maintain the Metal drivers for Apple.
With Apple not being responsive about Metal developer support, you may want to approach the GPU suppliers, AMD, Intel, Nvidia.
Another important factor is that your kernels are not necessarily executed in the order they are submitted. This is true especially if you are queuing kernels from multiple threads. The order of thread execution in general is not predictable.
So you need to add explicit synchronization to your kernel work flows to ensure the execution order is what you expect.
Modern GPUs can process multiple shader instances in parallel, so this can lead to big surprises if you don't do synchronization.
But improper synchronization can theoretically lead to deadlock, which would most likely cause a GPU panic or even a system kernel panic.
Check your logs for signs of GPU reset from an internal GPU panic.
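To illustrate the ordering point above, here is a minimal CPU-side sketch in Python. It is only an analogy and does not use the Metal, OpenCL, or CUDA APIs: the hypothetical run_pipeline function and its stage helper stand in for kernels queued from multiple threads, and threading.Event stands in for whatever synchronization primitive your GPU API provides. The threads are started in reverse order on purpose, and the explicit waits still force the stages to run in the intended order.

```python
import threading


def run_pipeline(num_stages=4):
    """Run stand-in 'kernels' on separate threads, forced into stage order."""
    order = []  # records the order in which the stages actually ran
    # One event per stage; a stage sets its event when it has finished.
    done = [threading.Event() for _ in range(num_stages)]

    def stage(i):
        # Explicit synchronization: wait for the previous stage to finish.
        if i > 0:
            done[i - 1].wait()
        order.append(i)  # stand-in for the real kernel work
        done[i].set()    # signal completion to the next stage

    # Start the threads in reverse order to show that submission order
    # alone does not decide execution order.
    threads = [
        threading.Thread(target=stage, args=(i,))
        for i in reversed(range(num_stages))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


if __name__ == "__main__":
    print(run_pipeline())
```

If a stage waited on an event that no other stage ever sets, it would block forever, which is the CPU-side equivalent of the deadlock mentioned above.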
Hi, my app, Fractal Architect, has been a showcase for the immense power of Metal/OpenCL/CUDA GPU compute shaders
for the last 10 years on MacOS and the app's render engine is in use on Windows as well (OpenCL/CUDA). The app is now in Beta on iOS/iPadOS with Metal as well.
So I know exactly what you are going through.
Apple's infamous "Wall of Secrecy" means that I have never had an internal contact inside of Apple.
I have been directly contacted by AMD, but their hands are tied with their working relationship with Apple.
Oddly enough, my testing partner for Fractal Architect is Lennart Ostman from Harnosand, Sweden. It is a small, small world.
I am in the USA. I see that your company is based out of Sweden.
The pattern on Mac OS is that Metal/OpenCL is stable for 2-3 years, followed by 6-9 months of driver ****. This pattern has repeated several times.
Metal is far better supported on iOS/iPadOS. My app's large and very complex compute shaders ported easily to iOS 12/13.
I was really surprised by how robust Metal has been on iOS.
Catalina has frankly been extremely buggy. But for my app, Metal on Catalina has not been problematic. I did have to work around a couple of serious Catalina OpenCL/Metal bugs.
My app also uses classic vertex and fragment shaders on both Metal and OpenGL. I am using them for both 2D and 3D model visualization.
We should make contact off this forum. We might be able to help each other out.
Indeed, we are based in Sweden!
Before we switched to Metal we had a fair number of issues with OpenGL as well, but I don't think we were ever able to kernel panic the entire OS; it usually just resulted in glitches / undefined behaviour.
I'll drop you a mail off-forum!
To the OP, can you post the IDs of the showstopper issues?
Mathias here, graphics programmer at Capture.
FB7432403 is our most important showstopper right now, internal errors on newer Intel GPUs, and we don’t have any workarounds for this.
FB7466370 - Driver hangs/crashes, or even kernel panics on AMD hardware. (I have a sketchy workaround, but we keep triggering it when adding new features.)
FB6101284, internal error on GeForce GPUs identified by a customer. (Haven’t heard anything since I reported it in May.)
Also, FB6344520 is not a complete showstopper but will prevent us from launching a new feature on AMD hardware in the near future, unless I can find workarounds for it. (And while checking whether this had been fixed yet, I found a similar issue on Intel GPUs that I need to report.)