Matching Virtual Object Depth with ARFrame Estimated Depth Data

I am trying to do a hit test of sorts between a person in my ARFrame and a RealityKit Entity. So far I have been able to take the position value of my entity and project it to a CGPoint, which I can match up with the ARFrame's segmentationBuffer to determine whether a person intersects with that entity. Now I want to find out whether that person is at the same depth as that entity. How do I relate the SIMD3 position value for the entity, which I believe is in meters, to the estimatedDepthData values?
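
For reference, a minimal sketch of the projection step described above, assuming a RealityKit ARView named arView; the names are illustrative, not the poster's actual code:

Code Block
import ARKit
import RealityKit
import UIKit

// Sketch: project an entity's world-space position (in meters) to a point in
// the view's coordinate space. Assumes an ARView called `arView`.
func screenPoint(for entity: Entity, in arView: ARView) -> CGPoint? {
    // position(relativeTo: nil) is the entity's position in world space.
    let worldPosition = entity.position(relativeTo: nil)
    // ARView.project(_:) returns nil when the point cannot be projected,
    // for example when it is behind the camera.
    return arView.project(worldPosition)
}
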
Answered by DTS Engineer in 619385022

The depth data in the estimatedDepthData pixel buffer is estimated linear depth, in meters, from the camera's point of view.

So, if you have a pixel where your entity intersects with the segmentationBuffer, you can unproject that position into world space using the estimated linear depth, which you may be able to use as a sort of rough hit test.

This sample contains an unprojection method which may be useful for reference: https://developer.apple.com/documentation/arkit/visualizing_a_point_cloud_using_scene_depth
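
If it helps, here is a rough CPU-side sketch of that unprojection, along the lines of what the linked sample does on the GPU. It assumes the pixel coordinate is in the captured image's coordinate space and the depth is in meters; the axis conventions may need adjusting for a particular setup:

Code Block
import ARKit
import simd

// Sketch: unproject a captured-image pixel with an estimated depth (meters)
// into world space. `pixel` is in captured-image coordinates, so coordinates
// taken from the lower-resolution depth buffer need to be scaled up first.
func worldPoint(pixel: simd_float2, depthMeters: Float, camera: ARCamera) -> simd_float3 {
    let intrinsics = camera.intrinsics      // 3x3 intrinsics for the captured image
    let cameraToWorld = camera.transform    // camera pose in world space

    // Back-project through the inverse intrinsics and scale by depth.
    let localPoint = simd_inverse(intrinsics) * simd_float3(pixel.x, pixel.y, 1) * depthMeters

    // Image space has +y down and +z forward; ARKit camera space has +y up
    // and looks down -z, hence the sign flips.
    let cameraSpacePoint = simd_float4(localPoint.x, -localPoint.y, -localPoint.z, 1)
    let world = cameraToWorld * cameraSpacePoint
    return simd_float3(world.x, world.y, world.z)
}
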
Thanks for the suggestion. Since posting this I have indeed been able to get the beginnings of a hit test going with the segmentationBuffer, but then when I try to use the estimatedDepthData, I run into trouble extracting values.

Here's some of my code:
Code Block
let segmentationCols = CVPixelBufferGetWidth(segmentationBuffer)
let segmentationRows = CVPixelBufferGetHeight(segmentationBuffer)
let colPosition = screenPosition.x / UIScreen.main.bounds.width * CGFloat(segmentationCols)
let rowPosition = screenPosition.y / UIScreen.main.bounds.height * CGFloat(segmentationRows)
CVPixelBufferLockBaseAddress(segmentationBuffer, .readOnly)
guard let baseAddress = CVPixelBufferGetBaseAddress(segmentationBuffer) else { return }
let bytesPerRow = CVPixelBufferGetBytesPerRow(segmentationBuffer)
let buffer = baseAddress.assumingMemoryBound(to: UInt8.self)
let index = Int(colPosition) + Int(rowPosition) * bytesPerRow
let b = buffer[index]
if let segment = ARFrame.SegmentationClass(rawValue: b), segment == .person, let depthBuffer = frame.estimatedDepthData {
print("Person!")
CVPixelBufferLockBaseAddress(depthBuffer, .readOnly)
guard let depthAddress = CVPixelBufferGetBaseAddress(depthBuffer) else { return }
let depthBytesPerRow = CVPixelBufferGetBytesPerRow(depthBuffer)
let depthBoundBuffer = depthAddress.assumingMemoryBound(to: Float32.self)
let depthIndex = Int(colPosition) * Int(rowPosition)
let depth_b = depthBoundBuffer[depthIndex]
print(depth_b)
CVPixelBufferUnlockBaseAddress(depthBuffer, .readOnly)
}
CVPixelBufferUnlockBaseAddress( segmentationBuffer, .readOnly )


I strongly suspect that my problems are in lines 19 and 20 of my code above, but I can't figure out the right values to find the point I want in the estimatedDepthData.

It looks like the error is in line 20:

Code Block
let depthIndex = Int(colPosition) * Int(rowPosition)


You should try:

Code Block
let depthIndex = Int(colPosition) + Int(rowPosition) * width // Where width is CVPixelBufferGetWidth(pixelBuffer)

Hey, thanks for the suggestion. That is actually what I had initially, similar to line 10, but it wasn't working, so I started messing around with other values to see if I could get something to work. Neither one works, though. Any other ideas? Most examples I've come across are Metal implementations and don't have corresponding code for what I'm trying to do.
Accepted Answer
It's difficult to say where you've gone wrong. The following method will extract the value at the provided image coordinate from the depth texture:

Code Block
extension CVPixelBuffer {
    func value(column: Int, row: Int) -> Float? {
        // Only 32-bit float depth buffers are handled here.
        guard CVPixelBufferGetPixelFormatType(self) == kCVPixelFormatType_DepthFloat32 else { return nil }
        CVPixelBufferLockBaseAddress(self, .readOnly)
        if let baseAddress = CVPixelBufferGetBaseAddress(self) {
            let width = CVPixelBufferGetWidth(self)
            // Row-major layout: whole rows first, then the column within the row.
            let index = column + row * width
            let offset = index * MemoryLayout<Float>.stride
            let value = baseAddress.load(fromByteOffset: offset, as: Float.self)
            CVPixelBufferUnlockBaseAddress(self, .readOnly)
            return value
        }
        CVPixelBufferUnlockBaseAddress(self, .readOnly)
        return nil
    }
}


I recommend that you start here, make sure you can get valid values, and then work forward from there to see where your issue is. It is likely an error in converting between coordinate spaces somewhere.
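
A quick sketch of how the extension might be called from an ARFrame, with illustrative names (screenPosition is assumed to be a point in view coordinates); the naive scaling here ignores the aspect-ratio issue mentioned further down the thread:

Code Block
import ARKit
import UIKit

// Sketch: look up the estimated depth under a screen point using the extension
// above. `screenPosition` is assumed to be in the view's coordinate space.
func estimatedDepth(at screenPosition: CGPoint, in frame: ARFrame) -> Float? {
    guard let depthBuffer = frame.estimatedDepthData else { return nil }
    // Naive scaling from view coordinates to buffer coordinates.
    let column = Int(screenPosition.x / UIScreen.main.bounds.width * CGFloat(CVPixelBufferGetWidth(depthBuffer)))
    let row = Int(screenPosition.y / UIScreen.main.bounds.height * CGFloat(CVPixelBufferGetHeight(depthBuffer)))
    return depthBuffer.value(column: column, row: row)
}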

That extension was super helpful and solved my problems, so thank you so much! Comparing the extension to my code, I think the key problem was in fact what you highlighted earlier: I needed to account for the pixel buffer width. In my previous implementation I had been accounting only for the bytes per row, which is what I thought you were saying too, but in fact you need to account for both.

Thanks again!
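
For anyone comparing the two approaches, here is a rough sketch of the index arithmetic for a 32-bit float depth buffer. It computes a byte offset from bytesPerRow, which also covers the case where rows are padded beyond width * 4 bytes (the function name is illustrative):

Code Block
import CoreVideo

// Sketch: read one Float32 depth value, computing the byte offset from
// bytesPerRow (rows may be padded) plus the column within the row.
func depthValue(in pixelBuffer: CVPixelBuffer, column: Int, row: Int) -> Float? {
    guard CVPixelBufferGetPixelFormatType(pixelBuffer) == kCVPixelFormatType_DepthFloat32 else { return nil }
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }
    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let offset = row * bytesPerRow + column * MemoryLayout<Float32>.stride
    return base.load(fromByteOffset: offset, as: Float32.self)
}
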
One tricky bit I have discovered is that when working with an iPhone, the screen aspect ratio does not match the aspect ratio of the depth buffer, so translating between a screen position and a buffer coordinate requires disregarding some of the buffer width on each side.
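
One way to handle that mapping, rather than discarding buffer columns by hand, might be ARFrame's displayTransform(for:viewportSize:), which relates normalized image coordinates to normalized view coordinates. A rough sketch, assuming a full-screen portrait view (the names are illustrative):

Code Block
import ARKit
import UIKit

// Sketch: convert a point in view coordinates to a (column, row) in a pixel
// buffer that shares the captured image's aspect ratio, using the inverse of
// the frame's display transform. Assumes portrait orientation.
func bufferCoordinate(for viewPoint: CGPoint,
                      in frame: ARFrame,
                      buffer: CVPixelBuffer,
                      viewportSize: CGSize) -> (column: Int, row: Int) {
    // displayTransform maps normalized image coordinates to normalized view
    // coordinates; invert it to go from the view back to the image.
    let toImage = frame.displayTransform(for: .portrait, viewportSize: viewportSize).inverted()
    let normalizedViewPoint = CGPoint(x: viewPoint.x / viewportSize.width,
                                      y: viewPoint.y / viewportSize.height)
    let normalizedImagePoint = normalizedViewPoint.applying(toImage)
    let column = Int(normalizedImagePoint.x * CGFloat(CVPixelBufferGetWidth(buffer)))
    let row = Int(normalizedImagePoint.y * CGFloat(CVPixelBufferGetHeight(buffer)))
    return (column, row)
}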