Loading precompiled shader fails on 27'' retina iMac

I am trying to use Metal to speed up the display of parts of my app. It works great on the 15'' Touch Bar MacBook Pro, the first-generation retina MacBook Pro, and the current MacBook. However, when I run the exact same code on a 27'' retina iMac (late 2014), the call to newComputePipelineStateWithDescriptor fails like so:


Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED

Error when compiling simulation pipeline state: Error Domain=CompilerError Code=1 "Compiler encountered an internal error" UserInfo={NSLocalizedDescription=Compiler encountered an internal error}


How could this even happen, given that the shader is bundled in precompiled form? I assume that newComputePipelineStateWithDescriptor just translates whatever bytecode the metallib file contains into actual machine code for the current GPU. If this fails, does it mean that the translator contains a bug? I've tried playing around a bit on the iMac, and it is not clear at all what causes the problem. My shader file is not very long (about 150 lines of code), but the problem goes away if I make it shorter (except that then, of course, the shader no longer does what it's supposed to do). It doesn't seem to be any specific piece of the code that causes the problem: I can make it go away by commenting out any one of several disjoint sections of code. So it almost looks like code length / complexity is the problem, but given that my compiled shader is only 17 kB in size, this also seems rather implausible...


Given the "internal error" message, it does sound like a bug, but then I would have expected it to go away by making some changes to the code. My shader function contained a rather large conditional, so I broke it up into two distinct functions which are then chosen by the CPU depending on context. Unfortunately, this didn't solve the problem either, so I am a bit at a loss...

Replies

OK, so this is definitely a rather nasty bug in Metal. Eventually, I narrowed it down to the following.

The code snippet

float boundary()
{
        constant float* data = (gridPosition.y == 1?right:left);
        int offset = parameters.oversampling * gridPosition.x;
        return data[offset + (gridPosition.x > 0?-1:0)];
}

works perfectly fine. However, if I replace it in my Metal code with the equivalent code snippet

float boundary()
{
        int offset = parameters.oversampling * gridPosition.x;     
        constant float* data = (gridPosition.y == 1?right:left) + offset;
        return data[gridPosition.x > 0?-1:0];
}

then it causes MTLCompilerService to crash when loading the Metal library. This is pretty scary, especially given that it works fine on most machines...
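For what it's worth, the claim that the two variants of boundary() are equivalent can be checked on the CPU. The following is a minimal sketch in plain C++ rather than Metal, so it says nothing about the compiler crash itself. Only the names parameters, gridPosition, left and right come from the snippets above; the struct layouts, the sample data, and the helper names boundaryIndexed, boundaryPointer and variantsAgree are assumptions made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins for the shader's inputs. In the real shader these
// are presumably globals or kernel arguments; their actual types are unknown.
struct Parameters   { int oversampling; };
struct GridPosition { int x; int y; };

// Made-up sample data, just large enough for the checks below.
constexpr int kSamples = 8;
const float kLeft[kSamples]  = {0, 1, 2, 3, 4, 5, 6, 7};
const float kRight[kSamples] = {10, 11, 12, 13, 14, 15, 16, 17};

// First variant: pick the buffer, then index with offset plus adjustment.
float boundaryIndexed(const float* left, const float* right,
                      Parameters parameters, GridPosition gridPosition)
{
    const float* data = (gridPosition.y == 1 ? right : left);
    int offset = parameters.oversampling * gridPosition.x;
    return data[offset + (gridPosition.x > 0 ? -1 : 0)];
}

// Second variant: advance the pointer by offset first, then index with the
// adjustment only. This is the form that crashed MTLCompilerService.
float boundaryPointer(const float* left, const float* right,
                      Parameters parameters, GridPosition gridPosition)
{
    int offset = parameters.oversampling * gridPosition.x;
    const float* data = (gridPosition.y == 1 ? right : left) + offset;
    return data[gridPosition.x > 0 ? -1 : 0];
}

// Compares both variants for y in {0, 1} and x in [0, maxX].
bool variantsAgree(int oversampling, int maxX)
{
    // Keep every access inside the sample arrays.
    assert(oversampling >= 1 && maxX >= 0 && oversampling * maxX < kSamples);
    for (int y = 0; y <= 1; ++y) {
        for (int x = 0; x <= maxX; ++x) {
            Parameters p{oversampling};
            GridPosition g{x, y};
            if (boundaryIndexed(kLeft, kRight, p, g) !=
                boundaryPointer(kLeft, kRight, p, g)) {
                return false;
            }
        }
    }
    return true;
}
```

With this sample data, variantsAgree(2, 3) returns true, which is consistent with the crash being a compiler problem rather than a difference in what the two forms compute.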

I encountered this problem on the latest 2017 MacBook Pro model. I filed radar 32962102.

I am not as lucky as you are: I cannot narrow it down to a few lines of code. In my case, it is the combination of some if-else branches and some simple arithmetic that causes the error. It's extremely surprising that such a serious issue escaped Apple's testing.

Yes, this looks like a bug in the driver's compiler. When you get XPC_ERROR_CONNECTION_INTERRUPTED, it indicates that the compiler (which runs in a separate process) has crashed.