Set camera feed as texture input for CustomMaterial

Hello,

In this project https://developer.apple.com/documentation/arkit/content_anchors/tracking_and_visualizing_faces there is sample code that shows how to map the camera feed onto an object with SceneKit and a shader modifier.

I would like to know if there is an easy way to achieve the same thing with a CustomMaterial in RealityKit 2.

Specifically, I'm interested in the best way to pass the background of the RealityKit environment as a texture into the custom shader.

In SceneKit this was really easy as one could just do the following:

material.diffuse.contents = sceneView.scene.background.contents

As the texture input for CustomMaterial requires a TextureResource, I would probably need a way to create a CGImage from the background or camera feed on the fly.

What I've tried so far is accessing the captured image from the camera feed and creating a CGImage from the pixel buffer like so:

guard
    let frame = arView.session.currentFrame,
    let cameraFeedTexture = CGImage.create(pixelBuffer: frame.capturedImage),
    let textureResource = try? TextureResource.generate(
        from: cameraFeedTexture,
        withName: "cameraFeedTexture",
        options: .init(semantic: .color)
    )
else {
    return
}

// assign the texture to the material's custom slot
customMaterial.custom.texture = .init(textureResource)

// Note: VTCreateCGImageFromCVPixelBuffer requires `import VideoToolbox`.
extension CGImage {
  /// Creates a CGImage from a CVPixelBuffer via VideoToolbox.
  public static func create(pixelBuffer: CVPixelBuffer) -> CGImage? {
    var cgImage: CGImage?

    VTCreateCGImageFromCVPixelBuffer(pixelBuffer, options: nil, imageOut: &cgImage)
    return cgImage
  }
}
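For reference, I regenerate the texture roughly once per frame. A simplified sketch of that update, driven here via ARSessionDelegate (the class and property names are placeholders, not my exact setup):

import ARKit
import RealityKit

final class CameraFeedTextureUpdater: NSObject, ARSessionDelegate {
    weak var modelEntity: ModelEntity?
    var customMaterial: CustomMaterial?

    // Called by ARKit once per camera frame.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard
            var material = customMaterial,
            let cameraFeedTexture = CGImage.create(pixelBuffer: frame.capturedImage),
            let textureResource = try? TextureResource.generate(
                from: cameraFeedTexture,
                withName: "cameraFeedTexture",
                options: .init(semantic: .color)
            )
        else { return }

        material.custom.texture = .init(textureResource)
        customMaterial = material

        // CustomMaterial is a value type, so the updated copy has to be
        // written back to the entity's model component.
        modelEntity?.model?.materials = [material]
    }
}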

This seems wasteful though and is also quite slow. Is there any other way to accomplish this efficiently or would I need to go the post processing route?

In the sample code, the view's displayTransform is also passed in as an SCNMatrix4. CustomMaterial's custom.value only accepts a SIMD4<Float>, though. Is there another way to pass in the matrix?

Another idea I had was to create a CustomMaterial from an OcclusionMaterial, which already seems to contain information about the camera feed, but so far I've had no luck with that.

Thanks for the support!

The way you described is likely the best option right now. You could also look at using the new DrawableQueue API, or using the post processing APIs.
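For reference, the post processing route hands you the rendered frame as Metal textures via ARView.renderCallbacks.postProcess. A bare-bones pass-through, just to show the shape of the API (not a complete solution), would look roughly like this:

arView.renderCallbacks.postProcess = { context in
    // context.sourceColorTexture is the frame RealityKit rendered (camera
    // background included); context.targetColorTexture is what reaches the screen.
    // Once a postProcess callback is set, you must write to the target yourself.
    guard let blitEncoder = context.commandBuffer.makeBlitCommandEncoder() else { return }
    blitEncoder.copy(from: context.sourceColorTexture, to: context.targetColorTexture)
    blitEncoder.endEncoding()
}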

As for passing a float4x4 into your custom shaders, custom.value only supports a float4. The only options right now are:

1. Encoding the matrix in a texture.

2. Breaking up the matrix and storing the data in baseColor.tint, metallic.scale, etc.

I would suggest trying the first option. Also, feel free to file these feature requests on Feedback Assistant.

Hi, so I dug a little bit into DrawableQueue and got it working. It seems really powerful and there is apparently no measurable performance hit. However, I've got one little issue: the rendered texture of my drawable looks a little too bright or oversaturated. I assume this is some kind of color-mapping issue?

My setup looks like the following:

First I set up my DrawableQueue:

let descriptor = TextureResource.DrawableQueue.Descriptor(
    pixelFormat: .rgba8Unorm,
    width: 1440,
    height: 1440,
    usage: .unknown,
    mipmapsMode: .none
)

…

let queue = try TextureResource.DrawableQueue(descriptor)
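Then I attach it to the texture resource that the CustomMaterial samples. Roughly like this (the tiny placeholder image is only there so a valid TextureResource exists before the first drawable is presented):

// Placeholder pixels so a valid TextureResource exists up front;
// the drawable queue then becomes the texture's content source.
let size = 64
let context = CGContext(
    data: nil, width: size, height: size,
    bitsPerComponent: 8, bytesPerRow: size * 4,
    space: CGColorSpaceCreateDeviceRGB(),
    bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
)!
let placeholderImage = context.makeImage()!

let textureResource = try TextureResource.generate(
    from: placeholderImage,
    withName: "cameraFeedTexture",
    options: .init(semantic: .color)
)
textureResource.replace(withDrawables: queue)

customMaterial.custom.texture = .init(textureResource)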

Next I set up the MTLRenderPipelineDescriptor and the MTLRenderPipelineState:

        let pipelineDescriptor = MTLRenderPipelineDescriptor()
        pipelineDescriptor.sampleCount = 1
        pipelineDescriptor.colorAttachments[0].pixelFormat = .rgba8Unorm
        pipelineDescriptor.depthAttachmentPixelFormat = .invalid

…
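(The omitted part is just the usual pipeline-state boilerplate, roughly like the sketch below; the shader function names are borrowed from the ARKit-with-Metal template purely as placeholders.)

// Build the pipeline state and command queue (sketch; function names are placeholders).
let library = device.makeDefaultLibrary()!
pipelineDescriptor.vertexFunction = library.makeFunction(name: "capturedImageVertexTransform")
pipelineDescriptor.fragmentFunction = library.makeFunction(name: "capturedImageFragmentShader")
pipelineDescriptor.vertexDescriptor = imagePlaneVertexDescriptor // matches the quad's vertex layout

renderPipelineState = try? device.makeRenderPipelineState(descriptor: pipelineDescriptor)
commandQueue = device.makeCommandQueue()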

Then, at each frame, I convert the current frame's pixel buffer to an MTLTexture like in the ARKit-with-Metal Xcode sample:

        guard
            let drawable = try? drawableQueue.nextDrawable(),
            let commandBuffer = commandQueue?.makeCommandBuffer(),
            let renderPipelineState = renderPipelineState,
            let frame = arView?.session.currentFrame
        else {
            return
        }

        // update vertex coordinates with display transform
        updateImagePlane(frame: frame)

        let pixelBuffer = frame.capturedImage

        // convert captured image into metal textures
        guard
            CVPixelBufferGetPlaneCount(pixelBuffer) >= 2,
            let capturedImageTextureY = createTexture(
                fromPixelBuffer: pixelBuffer,
                pixelFormat: .r8Unorm,
                planeIndex: 0
            ),
            let capturedImageTextureCbCr = createTexture(
                fromPixelBuffer: pixelBuffer,
                pixelFormat: .rg8Unorm,
                planeIndex: 1
            )
        else {
            return
        }

        let renderPassDescriptor = MTLRenderPassDescriptor()
        renderPassDescriptor.colorAttachments[0].texture = drawable.texture
        renderPassDescriptor.colorAttachments[0].loadAction = .load
        renderPassDescriptor.colorAttachments[0].storeAction = .store
        renderPassDescriptor.renderTargetWidth = textureResource.width
        renderPassDescriptor.renderTargetHeight = textureResource.height

        guard let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: renderPassDescriptor) else {
            return
        }

        renderEncoder.pushDebugGroup("DrawCapturedImage")
        renderEncoder.setCullMode(.none)
        renderEncoder.setRenderPipelineState(renderPipelineState)
        renderEncoder.setVertexBuffer(imagePlaneVertexBuffer, offset: 0, index: 0)
        renderEncoder.setFragmentTexture(capturedImageTextureY, index: 1)
        renderEncoder.setFragmentTexture(capturedImageTextureCbCr, index: 2)
        renderEncoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
        renderEncoder.endEncoding()
        
        commandBuffer.present(drawable)
        commandBuffer.commit()
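The createTexture helper is essentially the CVMetalTextureCache-based one from the ARKit-with-Metal template (the cache is created once at setup); sketched out:

// Created once during setup:
// var capturedImageTextureCache: CVMetalTextureCache!
// CVMetalTextureCacheCreate(nil, nil, device, nil, &capturedImageTextureCache)

func createTexture(
    fromPixelBuffer pixelBuffer: CVPixelBuffer,
    pixelFormat: MTLPixelFormat,
    planeIndex: Int
) -> MTLTexture? {
    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex)

    var cvTexture: CVMetalTexture?
    let status = CVMetalTextureCacheCreateTextureFromImage(
        nil, capturedImageTextureCache, pixelBuffer, nil,
        pixelFormat, width, height, planeIndex, &cvTexture
    )

    guard status == kCVReturnSuccess, let cvTexture = cvTexture else { return nil }
    // The CVMetalTexture should stay alive until the command buffer has
    // finished reading from the returned MTLTexture.
    return CVMetalTextureGetTexture(cvTexture)
}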

In the fragment shader for my quad-mapped texture I perform the ycbcrToRGBTransform.

Then finally, in my CustomMaterial surface shader, I just sample the texture and display it:

#include <metal_stdlib>
#include <RealityKit/RealityKit.h>

using namespace metal;

// Sampler definition shown here for completeness.
constexpr sampler samplerBilinear(coord::normalized,
                                  address::clamp_to_edge,
                                  filter::linear);

[[visible]]
void cameraMappingSurfaceShader(realitykit::surface_parameters params) {
    auto surface = params.surface();
    float2 uv = params.geometry().uv0();

    // Flip uvs vertically.
    uv.y = 1.0 - uv.y;

    // Sample the camera feed from the custom texture slot and output it unlit via emissive.
    half4 color = params.textures().custom().sample(samplerBilinear, uv);
    surface.set_emissive_color(color.rgb);
}

Almost everything looks fine; it's just a slight difference in brightness. Do I maybe need to work with a different pixel format? As a test I also loaded a simple image as a texture resource and then replaced it, via the DrawableQueue and a Metal texture, with the same image. This gave me similar results (too bright).

The encoding of the display transform matrix will be the next step, but for now I'd like to get this working properly.

Thanks for any help!

Glad the API is working well for you. Yes, my initial guess is that you are using an incorrect pixel format/color space. Can you try using the sRGB pixel formats (e.g. rgba8Unorm_srgb)? In the meantime, I'll see if I can find out what pixel format the camera feed is in.

Alright, so I tried adjusting the pixel format from rgba8Unorm to rgba8Unorm_srgb, but that didn't make much of a difference. From what I've read, it seems the issue is related to gamma correction, since RealityKit (like SceneKit) renders in linear color space?

To work around this I tried converting the color in the fragment shader like so:

/* This conversion method is copied from section 7.7.7 of the Metal Language Spec (https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf) */
static float srgbToLinear(float c) {
    if (c <= 0.04045)
        return c / 12.92;
    else
        return powr((c + 0.055) / 1.055, 2.4);
}

[[visible]]
void cameraMappingSurfaceShader(realitykit::surface_parameters params) {

    auto surface = params.surface();

    float2 uv = params.geometry().uv0();
    half4 color = params.textures().custom().sample(samplerBilinear, uv);
    half3 finalColor = color.rgb;

    finalColor.r = srgbToLinear(finalColor.r);
    finalColor.g = srgbToLinear(finalColor.g);
    finalColor.b = srgbToLinear(finalColor.b);

    surface.set_emissive_color(finalColor);
}

The result looks a lot better and pretty close, but is still slightly darker than the background camera feed of the ARView.

Just as a test, I tried adjusting the exposure a little and got quite close with this setting:

arView.environment.background = .cameraFeed(exposureCompensation: -0.35)

But that is of course a workaround I'd like to avoid.

Attached is an image of how it currently looks.

Also, could you give me a hint on how I'd encode the matrix into a texture? Could I write it into a CGImage and pass that in as the texture resource?

I inspected the display transform and it seems there are only a couple of relevant parameters, so I've tried the following:


// this uses a simd_float4x4 matrix built from ARFrame.displayTransform(…
let encodedDisplayTransform: SIMD4<Float> = .init(
    x: displayTransform.columns.0.x,
    y: displayTransform.columns.0.y,
    z: displayTransform.columns.3.x,
    w: displayTransform.columns.3.y
)

customDrawableMaterial.custom.value = encodedDisplayTransform

// put the remaining values into unused material parameters
customDrawableMaterial.metallic.scale = displayTransform.columns.1.x
customDrawableMaterial.roughness.scale = displayTransform.columns.1.y

and then I reconstruct the matrix within the geometry modifier.
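(For context: displayTransform(for:viewportSize:) actually returns a CGAffineTransform. An embedding into simd_float4x4 consistent with the components read above would look like this — illustrative, not necessarily my exact helper:)

import CoreGraphics
import simd

// Embed the 2D affine display transform in a 4x4 matrix: a, b, c, d land in
// columns 0 and 1, and tx, ty land in column 3 (everything else is identity).
func makeDisplayTransformMatrix(from t: CGAffineTransform) -> simd_float4x4 {
    simd_float4x4(columns: (
        SIMD4<Float>(Float(t.a),  Float(t.b),  0, 0),
        SIMD4<Float>(Float(t.c),  Float(t.d),  0, 0),
        SIMD4<Float>(0,           0,           1, 0),
        SIMD4<Float>(Float(t.tx), Float(t.ty), 0, 1)
    ))
}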

This looks pretty great! I am told that the ARKit frame is in the rec709 color space, but I am going to send your question to our graphics team to see if they have any other suggestions for this.

Yeah, it's getting there, but I can't seem to figure out what the last missing step is. The conversion from YCbCr values to sRGB is performed as described here: https://developer.apple.com/documentation/arkit/arframe/2867984-capturedimage

So I guess there is one last conversion missing. The srgbToLinear method described above brings it close but darkens the image a bit too much.

Since the docs already state that the conversion matrix converts into sRGB, would I even need to care about Rec. 709 at all?

Okay, thank you. It would be great if they have another suggestion or tip.

Hi! I reached out to our graphics team. Our guess is that we are applying tone mapping to your custom material. However, we do not apply tone mapping to the passthrough texture, hence the slight difference. You can try disabling HDR with https://developer.apple.com/documentation/realitykit/arview/renderoptions/3282004-disablehdr, though we are unsure if that will help. We don't have any other way to disable tone mapping right now, but you are welcome to file a feature request on Feedback Assistant.
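For reference, that render option is just:

// May or may not remove the tone-mapping difference, but worth a try.
arView.renderOptions.insert(.disableHDR)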

@CodeName I've just published a little sample project showing how to use DrawableQueue to implement transparent animated GIFs in RealityKit 2. Let me know if it's helpful. :)

https://github.com/arthurschiller/realitykit-drawable-queue
