MTLTexture getBytes() silently fails on NVIDIA?

Me again 😉,


While the Intel driver for OS X seems fairly usable, I'm having big issues with the NVIDIA drivers. This time I cannot read any color texture (not even a simple RGBA8) that has been used as a render target in Metal back to main memory; I just get a blank image. This all works fine on the iPad and with the Intel driver on OS X. Is there something I need to ensure to get it working with NVIDIA as well?
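Roughly, the read-back looks like this (a minimal sketch; texture, width, and height stand in for my actual setup):

    // Read an RGBA8 render target back to main memory.
    let bytesPerRow = width * 4   // 4 bytes per RGBA8 pixel
    var rawData = [UInt8](count: bytesPerRow * height, repeatedValue: 0)
    texture.getBytes(&rawData,
        bytesPerRow: bytesPerRow,
        fromRegion: MTLRegionMake2D(0, 0, width, height),
        mipmapLevel: 0)
    // On the iPad and with the Intel driver, rawData contains the rendered
    // pixels; with the NVIDIA driver it comes back all zeros.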


Many thanks!

Accepted Reply

Do you also make sure to wait until the command buffer has completed before reading the texture back? (e.g. using -[MTLCommandBuffer addCompletedHandler:] with a block which reads the texture's bytes, and/or using -[MTLCommandBuffer waitUntilCompleted]).
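For example (a sketch, assuming texture is your render target, commandBuffer is the buffer that rendered into it, and readbackTexture is a hypothetical helper that calls getBytes):

    // Either: read asynchronously once the GPU has finished this buffer.
    commandBuffer.addCompletedHandler { _ in
        // The texture's contents are valid here; safe to call getBytes(...).
        self.readbackTexture(texture)   // hypothetical helper
    }
    commandBuffer.commit()

    // Or: block the CPU until the GPU is done, then read synchronously.
    // commandBuffer.commit()
    // commandBuffer.waitUntilCompleted()
    // ... getBytes(...) here ...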

Replies

I have the same problem. When I started to port my image-processing framework to Metal, I ran into the same nonsense.

I think you can try the following trick, which I used to work around the issue:


1. Create an MTLBuffer (the default options give a CPU-accessible buffer):

let imageBuffer = device.newBufferWithLength(imageByteCount, options: MTLResourceOptions.CPUCacheModeDefaultCache)

2. Create a blit command encoder:

let blitEncoder = commandBuffer.blitCommandEncoder()

3. Copy the texture into the shared buffer, then commit and wait for completion:

blitEncoder.copyFromTexture(texture,
                    sourceSlice: 0,
                    sourceLevel: 0,
                    sourceOrigin: MTLOrigin(x: 0, y: 0, z: 0),
                    sourceSize: MTLSize(width: width, height: height, depth: 1),
                    toBuffer: imageBuffer,
                    destinationOffset: 0,
                    destinationBytesPerRow: bytesPerRow,
                    destinationBytesPerImage: 0)

blitEncoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

4. Do something with the buffer contents:

var rawData = [UInt8](count: width * height * components, repeatedValue: 0)
if texture.pixelFormat == .RGBA16Unorm {
    // 16 bits per channel: take the high byte of each component.
    for var i = 0; i < rawData.count; i++ {
        var pixel = UInt16()
        let address = UnsafePointer<UInt16>(imageBuffer.contents()) + i
        memcpy(&pixel, address, sizeof(UInt16))
        rawData[i] = UInt8(pixel >> 8)
    }
}
else {
    // 8 bits per channel: the buffer layout already matches.
    memcpy(&rawData, imageBuffer.contents(), imageBuffer.length)
}

// Note: CGDataProviderCreateWithData does not copy the bytes, so rawData
// must stay alive as long as the provider is in use.
let cgprovider = CGDataProviderCreateWithData(nil, &rawData, imageByteCount, nil)



I suppose it's slower than a working getBytes() version would be, but it works. I hope this solution helps with your task.

Many thanks for your suggestion. Unfortunately, it also only works for me with the Intel driver; the NVIDIA driver doesn't give me any valid data in the buffer. Did you test it on OS X with different chips, like on a MacBook Pro?

Do you ever call -[MTLBlitCommandEncoder synchronizeTexture:slice:level:] (or -[MTLBlitCommandEncoder synchronizeResource:]) after rendering to the texture? Assuming your textures are created with the default storage mode, MTLStorageModeManaged.


https://developer.apple.com/library/mac/documentation/Metal/Reference/MTLBlitCommandEncoder_Ref/#//apple_ref/occ/intfm/MTLBlitCommandEncoder/synchronizeTexture:slice:level:
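In code, that's an extra blit pass before the read-back (a sketch; texture is your managed render target and commandBuffer the buffer that rendered into it):

    let blitEncoder = commandBuffer.blitCommandEncoder()
    blitEncoder.synchronizeTexture(texture, slice: 0, level: 0)
    // or, for all slices and levels: blitEncoder.synchronizeResource(texture)
    blitEncoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    // The managed texture's CPU-side copy is now up to date for getBytes(...).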

Yes, I did.

I tested with Intel/NVIDIA (MacBook Pro mid-2012), AMD Radeon (MacBook Pro 15'' 2015), and Intel Iris (MacBook Pro 13'' 2015) chips, and this solution works well.


@slime's answer above sounds very reasonable. It may be that we were inattentive in studying the Apple manuals 🙂

slime, thanks a lot! Your solution is right and works faster. I've just tested it with my libjpeg-turbo texture-writer implementation.

    // Synchronize the managed texture so its CPU-side copy is up to date,
    // then wait for the GPU to finish before reading any bytes back.
    id<MTLCommandQueue> queue             = [texture.device newCommandQueue];
    id<MTLCommandBuffer> commandBuffer    = [queue commandBuffer];
    id<MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
    [blitEncoder synchronizeTexture:texture slice:0 level:0];
    [blitEncoder endEncoding];

    [commandBuffer commit];
    [commandBuffer waitUntilCompleted];

    // One scanline's worth of pixel data; row_stride, cinfo, counts, tmp and
    // row_pointer come from the surrounding libjpeg-turbo writer code.
    void *image_buffer = malloc(row_stride);

    while (cinfo.next_scanline < cinfo.image_height) {

        MTLRegion region = MTLRegionMake2D(0, cinfo.next_scanline, cinfo.image_width, 1);

        [texture getBytes:image_buffer
              bytesPerRow:cinfo.image_width * 4 * componentSize
               fromRegion:region
              mipmapLevel:0];

        if (texture.pixelFormat == MTLPixelFormatRGBA16Unorm) {
            // Convert 16-bit channels down to the 8 bits libjpeg expects.
            uint16_t *s = image_buffer;
            for (int i = 0; i < counts; i++) {
                tmp[i] = (s[i] >> 8) & 0xff;
            }
            row_pointer[0] = tmp;
        }
        else {
            row_pointer[0] = image_buffer;
        }
        (void) jpeg_write_scanlines(&cinfo, row_pointer, 1);
    }

    free(image_buffer);
    if (tmp != NULL) free(tmp);

@fritzt, I've just tested @slime's solution on an NVIDIA GeForce GT 650M (1024 MB), and it works perfectly. I think it will solve your issue better than what I first offered.

Well, then I guess "just" the NVIDIA driver is utterly broken.

Sounded too good, but unfortunately it doesn't help with reading back the texture on NVIDIA. Good that it works for others; I'm giving up on Metal for OS X for now. If the NVIDIA drivers aren't working (I can easily crash the OS too), it will be a nightmare to support an app. I'm just very disappointed that Apple is so silent on their own forum; not the best way to foster trust in a new technology... Anyway, thanks for your help.

Do you also make sure to wait until the command buffer has completed before reading the texture back? (e.g. using -[MTLCommandBuffer addCompletedHandler:] with a block which reads the texture's bytes, and/or using -[MTLCommandBuffer waitUntilCompleted]).

Fritzt, if you are still experiencing issues, would you mind filing a radar with your example attached? Thanks!

Yes of course, I'm trying to read back the texture after calling [MTLCommandBuffer waitUntilCompleted].

While getting everything ready for the bug report, I extracted the code and also cleaned up the underlying drawing code (although it's a computation filter, I use the rendering pipeline for performance reasons). After writing a unit test for it, I was surprised to find the NVIDIA driver suddenly working. Slime is right that the -[MTLBlitCommandEncoder synchronizeResource:] call is required; I had added it before with no success. Although I could always see the results in the Metal Debugger, I could not read them back in code. Now the code not only works on the iPad and with the OS X Intel driver, but also on NVIDIA. I'm a bit puzzled now, but more than happy that things look much more promising. Thanks to all who motivated me to go on!
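For reference, the sequence that now works for me on all three drivers looks roughly like this (a sketch with placeholder names, not my actual filter code):

    // 1. Encode the render pass that draws into `texture`
    //    (a managed render target).
    //    ... render encoding elided ...

    // 2. Blit-synchronize the managed texture so its CPU copy gets updated.
    let blitEncoder = commandBuffer.blitCommandEncoder()
    blitEncoder.synchronizeResource(texture)
    blitEncoder.endEncoding()

    // 3. Commit and block until the GPU has finished.
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()

    // 4. Only now read the pixels back.
    let bytesPerRow = texture.width * 4   // RGBA8
    var rawData = [UInt8](count: bytesPerRow * texture.height, repeatedValue: 0)
    texture.getBytes(&rawData,
        bytesPerRow: bytesPerRow,
        fromRegion: MTLRegionMake2D(0, 0, texture.width, texture.height),
        mipmapLevel: 0)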