I've tried Blender as well and got the same result as in Maya.
The AR Quick Look method won't give me an accurate result. The goal here is to evaluate how well the new photogrammetry API in RealityKit can actually recreate a mesh at real-world scale without any markers.
Apple claims it uses the depth map from iOS devices to "capture" the real-world scale, and we want to find out whether it really works and how precise it can be!
The fact that a few simple tests, opening the model in two different 3D packages, don't bring it in at the correct real-world scale is not a good start.
But it could be just a simple missing transformation that needs to be applied to the depth maps captured by the CaptureSample app so that the scale comes out correct.
I'm going to build the original CaptureSample code (this result came from a version that uses the front TrueDepth camera instead of the back camera used in the original code) and redo the photos and photogrammetry to see whether the meshes come out scaled correctly.
If the meshes are correctly scaled using the original code, at least I'll know that the capture using the TrueDepth camera is what is getting it wrong!
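To take the importer's unit settings out of the picture (Maya, for example, defaults to centimeters), the size can be read straight from the file. This is only a minimal sketch, assuming the mesh was written out as an OBJ; the names MeshExtent and objExtent are hypothetical, not part of any Apple sample.

```swift
import Foundation

/// Axis-aligned extent of a mesh, in the file's own units.
struct MeshExtent {
    var lower: [Double]
    var upper: [Double]
    /// Width, height and depth of the bounding box.
    var size: [Double] { zip(upper, lower).map { $0 - $1 } }
}

/// Scans the "v x y z" vertex lines of OBJ text and returns their bounding extent.
/// Returns nil when the text contains no vertex lines.
func objExtent(_ text: String) -> MeshExtent? {
    var lower = [Double](repeating: .infinity, count: 3)
    var upper = [Double](repeating: -.infinity, count: 3)
    var found = false
    for line in text.split(whereSeparator: \.isNewline) {
        let parts = line.split(whereSeparator: { $0 == " " || $0 == "\t" })
        // Only plain vertex records count; "vn", "vt", "f" and comments are skipped.
        guard parts.count >= 4, parts[0] == "v" else { continue }
        let xyz = parts[1...3].compactMap { Double($0) }
        guard xyz.count == 3 else { continue }
        for i in 0..<3 {
            lower[i] = Swift.min(lower[i], xyz[i])
            upper[i] = Swift.max(upper[i], xyz[i])
        }
        found = true
    }
    return found ? MeshExtent(lower: lower, upper: upper) : nil
}
```

Loading the file with String(contentsOfFile:) and printing objExtent(text)?.size gives three numbers to compare against a tape measure, before any 3D package has had a chance to rescale anything.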
I got a brand new iPad Pro with LiDAR as well, and I'm also not getting a depth map when using .builtInDualWideCamera.
It does work fine with .builtInTrueDepthCamera, and I managed to scan an object with it, but the photogrammetry command-line sample on macOS outputs the mesh with the wrong scale, much smaller than real-world scale.
I actually bought the new iPad Pro specifically to use with the new photogrammetry API in RealityKit, and it seems quite absurd that a brand new device, with a dual camera on the back and LiDAR, doesn't produce a depth/disparity map at all!
Shouldn't we get an answer from Apple about this? Do Apple people look at this forum at all?
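One way to narrow this down might be to ask AVFoundation which camera devices advertise any depth-capable format at all. The following is a rough iOS-only sketch, not something taken from the CaptureSample code; the function name printDepthSupport is hypothetical, and the list of device types is just the ones mentioned in this thread plus .builtInDualCamera.

```swift
import AVFoundation

/// Prints, for each discovered camera, how many of its video formats
/// list at least one supported depth data format.
func printDepthSupport() {
    let discovery = AVCaptureDevice.DiscoverySession(
        deviceTypes: [.builtInDualWideCamera, .builtInDualCamera, .builtInTrueDepthCamera],
        mediaType: .video,
        position: .unspecified)

    for device in discovery.devices {
        // A format can deliver depth only if supportedDepthDataFormats is non-empty.
        let depthCapable = device.formats.filter { !$0.supportedDepthDataFormats.isEmpty }
        print("\(device.localizedName): \(depthCapable.count) of \(device.formats.count) formats support depth")
    }
}
```

If a device reports zero depth-capable formats, no photo output setting is likely to produce a depth map from it, which would at least separate a configuration problem from a hardware or OS limitation.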
Are you using depthData and/or masks in your photogrammetrySample?
I'm asking because I'm getting errors when adding depth and/or mask to it.
How many photos are you using? I get that error when there aren't many photos, or when the photos don't overlap enough.
After
"Data ingestion is complete. Beginning processing..."
I get a few of these:
[espresso] ANE Batch: 1 of the async requests being waited for returned errors
[espresso] ANE Batch: Async request 1 returned error: code=5 err=Error Domain=com.apple.appleneuralengine Code=5 "processRequest:model:qos:qIndex:modelStringID:options:error:: 0xd: Program Inference overflow" UserInfo={NSLocalizedDescription=processRequest:model:qos:qIndex:modelStringID:options:error:: 0xd: Program Inference overflow}
[espresso] [Espresso::overflow_error] :0
messages, but then it goes without any messages for a few seconds.
After about 0.30 of progress, I get this:
ERROR cv3dapi.pg: Internal codes (3): 2504 3501 4011
and I get the usual
[Photogrammetry] No SfM map found in native output!
[Photogrammetry] Reconstruction failure for modelFile bla bla bla
I find it very frustrating that there seems to be no documentation whatsoever for those error messages! I wish Apple would document their APIs better.
Anyhow, I'm using this snippet of code that I found in a GitHub gist to convert NSImage objects to CVPixelBuffer, both for the color image and for the depth image:
import AppKit
import CoreVideo

extension NSImage {
    // Renders the image into a 32-bit ARGB pixel buffer (used for the color image).
    // Note: `size` is in points, which matches the pixel dimensions only for 72-dpi images.
    func pixelBuffer() -> CVPixelBuffer? {
        let width = self.size.width
        let height = self.size.height
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(width),
                                         Int(height),
                                         kCVPixelFormatType_32ARGB,
                                         attrs,
                                         &pixelBuffer)
        guard let resultPixelBuffer = pixelBuffer, status == kCVReturnSuccess else {
            return nil
        }
        CVPixelBufferLockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        // Unlock on every exit path, including the early return below.
        defer { CVPixelBufferUnlockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0)) }
        let pixelData = CVPixelBufferGetBaseAddress(resultPixelBuffer)
        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        guard let context = CGContext(data: pixelData,
                                      width: Int(width),
                                      height: Int(height),
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(resultPixelBuffer),
                                      space: rgbColorSpace,
                                      bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else { return nil }
        // context.translateBy(x: 0, y: height)
        // context.scaleBy(x: 1.0, y: -1.0)
        // Draw the NSImage into the buffer through a temporary graphics context.
        let graphicsContext = NSGraphicsContext(cgContext: context, flipped: false)
        NSGraphicsContext.saveGraphicsState()
        NSGraphicsContext.current = graphicsContext
        draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        NSGraphicsContext.restoreGraphicsState()
        return resultPixelBuffer
    }

    // Renders the image into a single-channel 32-bit float pixel buffer (used for the depth image).
    func depthPixelBuffer() -> CVPixelBuffer? {
        let width = self.size.width
        let height = self.size.height
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(width),
                                         Int(height),
                                         kCVPixelFormatType_DepthFloat32,
                                         attrs,
                                         &pixelBuffer)
        guard let resultPixelBuffer = pixelBuffer, status == kCVReturnSuccess else {
            return nil
        }
        CVPixelBufferLockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        // Unlock on every exit path, including the early return below.
        defer { CVPixelBufferUnlockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0)) }
        let pixelData = CVPixelBufferGetBaseAddress(resultPixelBuffer)
        // let linearGraySpace = CGColorSpace(name: CGColorSpace.linearGray)
        let linearGraySpace = CGColorSpaceCreateDeviceGray()
        // One 32-bit float gray component per pixel, no alpha.
        guard let context = CGContext(data: pixelData,
                                      width: Int(width),
                                      height: Int(height),
                                      bitsPerComponent: 32,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(resultPixelBuffer),
                                      space: linearGraySpace,
                                      bitmapInfo: CGImageAlphaInfo.none.rawValue | CGBitmapInfo.floatComponents.rawValue)
        else {
            return nil
        }
        let graphicsContext = NSGraphicsContext(cgContext: context, flipped: false)
        NSGraphicsContext.saveGraphicsState()
        NSGraphicsContext.current = graphicsContext
        draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        NSGraphicsContext.restoreGraphicsState()
        return resultPixelBuffer
    }
}
The original depth is the TIF image (greyscale, floating point) saved out by the TakingPicturesFor3DObjectCapture sample, together with the gravity.txt for each picture, which I'm also reading back into the PhotogrammetrySample.
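For the gravity.txt part, reading the vector back could look roughly like this. It is only a sketch: I'm assuming the file holds three numbers separated by commas and/or whitespace, which may not match what your capture app wrote, and parseGravity is a hypothetical helper name.

```swift
import Foundation

/// Parses a gravity vector from text holding exactly three numbers
/// separated by commas and/or whitespace (an assumed file layout).
/// Returns nil if there are not exactly three parsable numbers.
func parseGravity(_ text: String) -> (x: Double, y: Double, z: Double)? {
    let separators = CharacterSet(charactersIn: ",").union(.whitespacesAndNewlines)
    let tokens = text.components(separatedBy: separators).filter { !$0.isEmpty }
    let values = tokens.compactMap { Double($0) }
    // Reject the file if any token failed to parse or the count is off.
    guard tokens.count == 3, values.count == 3 else { return nil }
    return (values[0], values[1], values[2])
}
```

The returned tuple would then be converted to whatever type the sample's gravity property expects before being assigned.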
Turns out the error was caused by the set of photos I was using. Using the same set of .HEIC files with the original HelloPhotogrammetry command-line tool gave me the same ERROR cv3dapi.pg: Internal codes (3): 2504 3501 4011 error!
Another set of .HEIC + depth.TIF + gravity.txt files that worked with the original HelloPhotogrammetry also worked with my version using PhotogrammetrySample, with the added feature of using the floating-point depth.TIF file as depth instead of the depth map embedded in the .HEIC, which is what the original HelloPhotogrammetry source code uses.
I was able to edit the .HEIC files and re-run the mesh generation successfully. With the original source code, editing the .HEIC files and saving them again would create new HEIC files without the embedded depth map, and the photogrammetry would fail to scale the mesh correctly.
I'm going to try the mask next.
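A quick way to confirm whether an edited .HEIC has lost its embedded depth might be to ask ImageIO for the auxiliary depth or disparity data. This is a minimal sketch for macOS; hasEmbeddedDepth is a hypothetical helper, not part of HelloPhotogrammetry.

```swift
import Foundation
import ImageIO

/// Returns true if the first image in the file carries auxiliary
/// depth or disparity data, false otherwise (including unreadable files).
func hasEmbeddedDepth(at url: URL) -> Bool {
    guard let source = CGImageSourceCreateWithURL(url as CFURL, nil) else {
        return false
    }
    let types = [kCGImageAuxiliaryDataTypeDepth, kCGImageAuxiliaryDataTypeDisparity]
    // The call returns nil when the requested auxiliary data is absent.
    return types.contains { CGImageSourceCopyAuxiliaryDataInfoAtIndex(source, 0, $0) != nil }
}
```

Running it on an original capture and on its re-saved copy should show whether the editing tool dropped the depth map on export.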
Take a look at this post:
https://developer.apple.com/forums/thread/697968
I pasted my code there for reading images and converting them to color/depth/disparity CVPixelBuffers.
I'm still trying to figure out the mask component, though. I'm now getting the "sample is not supported" error when I add the mask... There must be some setup that needs to happen on the mask CVPixelBuffer (like for disparity/depth), but there's no documentation whatsoever.
Yes indeed. I added the EXIF metadata from the original HEIC photos, so the edited images have the same camera information as the originals.
However, because I'm editing the original HEIC photos in a Natron scene graph, the edited ones come out already properly rotated, so I have to edit the metadata and exchange width<=>height to match the size of the edited files.
I'm using this to read the metadata:
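import Foundation
import ImageIO

// Reads the image properties of `file` and swaps the pixel dimensions so the
// metadata matches the already-rotated edited images.
// Returns an empty dictionary if the file or its properties can't be read.
func readEXIF(file: String) -> [String: Any]? {
    print(file)
    guard let imageSource = CGImageSourceCreateWithURL(URL(fileURLWithPath: file) as CFURL, nil),
          let imageProperties = CGImageSourceCopyPropertiesAtIndex(imageSource, 0, nil),
          var dict = imageProperties as? [String: Any] else {
        return [:]
    }
    // Exchange width and height.
    let w = dict["PixelWidth"]
    let h = dict["PixelHeight"]
    dict["PixelWidth"] = h
    dict["PixelHeight"] = w
    // Drop the top-level "Depth" entry. (As far as I know, this ImageIO key
    // holds the bit depth per component rather than the depth map itself.)
    dict["Depth"] = nil
    //dict["{Exif}"]["PixelXDimension"] = h
    //dict["{Exif}"]["PixelYDimension"] = w
    //print(dict)
    return dict
}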
Without the width<=>height exchange, I would get an Invalid Sample! id=N reason="The sample is not supported." message.
I also deleted the "Depth" data in the original metadata, since the edited HEIC doesn't have it.
The output meshes with and without metadata, both using the depth data, differ slightly in size. I'm guessing the difference is caused by the lens distortion being removed from the image when the metadata is present... but I'm not sure.
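The unconditional width<=>height exchange works for my set because every edited image comes out rotated. A slightly more general variant could swap only when the original's EXIF orientation implies a 90-degree rotation (values 5 through 8). This is just a sketch under that assumption: uprightSize is a hypothetical name, and resetting "Orientation" to 1 is my guess at what an already-rotated image should carry, not something I've verified against the photogrammetry session.

```swift
import Foundation

/// Returns a copy of the metadata whose pixel size describes the upright image.
/// Width and height are swapped only for EXIF orientations 5...8,
/// which are the ones that include a 90-degree rotation.
func uprightSize(metadata: [String: Any]) -> [String: Any] {
    var out = metadata
    // A missing orientation is treated as 1 (already upright).
    let orientation = (metadata["Orientation"] as? Int) ?? 1
    if (5...8).contains(orientation) {
        out["PixelWidth"] = metadata["PixelHeight"]
        out["PixelHeight"] = metadata["PixelWidth"]
    }
    // Assumption: the edited pixels are already rotated, so mark them upright.
    out["Orientation"] = 1
    return out
}
```

With this, landscape shots that were never rotated keep their dimensions, while portrait shots get the same exchange as in readEXIF.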