Hello again! Thank you for the suggestion! I downloaded your app back when we were talking about this in 2019 and it was not working. I definitely checked the microphone settings for your app and mine, and that was not the issue.

I researched all of the info I could find online, got someone on codementor.io to try and help, tried everything I could imagine, was getting some weird errors and got stuck, so I just submitted a TSI to Apple in September 2019. After some back and forth and sending them my Xcode project, one of the engineers told me this:

“it looks like the problem is with GKVoiceChat and not the audio side. I actually spent most of the afternoon on this Friday, but wasn’t able to come up with a clean summary… In any case, unfortunately, it looks like the underlying problem here is a problem on our side… I’m still seeing failures from the audio internals of GKVoiceChat. If you haven’t already, please file a bug at: <https://feedbackassistant.apple.com> …and send me the number once it’s filed. Once the bug is filed I’ll reach out to the GameCenter team and see if they can shed any light on what might be going on and if you have any options for getting things working.”

So I filed the Feedback Assistant report as I was asked to do, and over a month later I got this from the same engineer:

“I don’t have any new information I can share. I’ve been told that it’s a bug but can’t provide any information beyond that... Unfortunately, that’s really all I can say.”

And so I started looking into paid options like Agora.io and making a FaceTime audio call from my app… until today.

Today I simply ran the *same* app that I had made before (which was not working in September 2019), and without changing *any* code whatsoever, it works just fine now. The audio comes through on both devices.

So there you go. I hope that helps. I am super excited to get to use this now!! I hope you guys are too.
Okay, after finding this question and trying what it said, I made some progress. However, I am attempting to use arView.session.currentFrame.smoothedSceneDepth and not arView.session.currentFrame.estimatedDepthData.
Here is the updated extension:
extension CVPixelBuffer {

    func value(from point: CGPoint) -> Float? {
        let width = CVPixelBufferGetWidth(self)
        let height = CVPixelBufferGetHeight(self)
        let normalizedYPosition = ((point.y / UIScreen.main.bounds.height) * 1.3).clamped(0, 1.0)
        let colPosition = Int(normalizedYPosition * CGFloat(height))
        let rowPosition = Int((1 - (point.x / UIScreen.main.bounds.width)) * CGFloat(width) * 0.8)
        return value(column: colPosition, row: rowPosition)
    }

    func value(column: Int, row: Int) -> Float? {
        guard CVPixelBufferGetPixelFormatType(self) == kCVPixelFormatType_DepthFloat32 else { return nil }
        CVPixelBufferLockBaseAddress(self, .readOnly)
        if let baseAddress = CVPixelBufferGetBaseAddress(self) {
            let width = CVPixelBufferGetWidth(self)
            let index = column + (row * width)
            let offset = index * MemoryLayout<Float>.stride
            let value = baseAddress.load(fromByteOffset: offset, as: Float.self)
            CVPixelBufferUnlockBaseAddress(self, .readOnly)
            return value
        }
        CVPixelBufferUnlockBaseAddress(self, .readOnly)
        return nil
    }
}
Note that point.y is associated with column position and point.x is associated with row position, so the buffer appears to be rotated relative to the view.
I suspect there is some conversion between coordinate spaces that I need to be doing that I am unaware of.
To get this close to working, I had to multiply the normalized Y position by 1.3 and the X position by 0.8, as well as invert the X axis by subtracting from 1.
The app still consistently crashes on this line:
let value = baseAddress.load(fromByteOffset: offset, as: Float.self)
With some help I was able to figure out what coordinates and conversion I needed to use.
The Vision result comes in Vision coordinates: normalized, (0,0) Bottom-Left, (1,1) Top-Right.
AVFoundation coordinates are (0,0) Top-Left, (1,1) Bottom-Right.
To convert from Vision coordinates to AVFoundation coordinates, you must flip the Y-axis like so:
public extension CGPoint {
    func convertVisionToAVFoundation() -> CGPoint {
        return CGPoint(x: self.x, y: 1 - self.y)
    }
}
This AVFoundation coordinate is what needs to be used as input for indexing the depth buffer, like so:
public extension CVPixelBuffer {

    /// The input point must be in normalized AVFoundation coordinates, i.e. (0,0) is the Top-Left and (1,1) is the Bottom-Right.
    func value(from point: CGPoint) -> Float? {
        let width = CVPixelBufferGetWidth(self)
        let height = CVPixelBufferGetHeight(self)
        // Clamp to the last valid index so a point at exactly 1.0 does not index past the buffer.
        let colPosition = min(Int(point.x * CGFloat(width)), width - 1)
        let rowPosition = min(Int(point.y * CGFloat(height)), height - 1)
        return value(column: colPosition, row: rowPosition)
    }

    func value(column: Int, row: Int) -> Float? {
        guard CVPixelBufferGetPixelFormatType(self) == kCVPixelFormatType_DepthFloat32 else { return nil }
        CVPixelBufferLockBaseAddress(self, .readOnly)
        if let baseAddress = CVPixelBufferGetBaseAddress(self) {
            // Use bytesPerRow for the row offset in case the buffer's rows are padded.
            let bytesPerRow = CVPixelBufferGetBytesPerRow(self)
            let offset = (row * bytesPerRow) + (column * MemoryLayout<Float>.stride)
            let value = baseAddress.load(fromByteOffset: offset, as: Float.self)
            CVPixelBufferUnlockBaseAddress(self, .readOnly)
            return value
        }
        CVPixelBufferUnlockBaseAddress(self, .readOnly)
        return nil
    }
}
This is all that is needed to get depth for a given position from a Vision request.
Here is my body tracking Swift package, which has a 3D hand tracking example that uses this:
https://github.com/Reality-Dev/BodyTracking
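To make that flow concrete, here is a minimal sketch of how the pieces can fit together for a hand-pose request. The function name, the choice of joint, and the confidence threshold are just illustrative, and it assumes the extensions above are in scope:

import ARKit
import Vision

/// Illustrative only: read the scene depth at the thumb tip of the first detected hand in the frame.
func depthAtThumbTip(in frame: ARFrame) -> Float? {
    // Run the hand-pose request on the raw camera image.
    let request = VNDetectHumanHandPoseRequest()
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, options: [:])
    try? handler.perform([request])

    guard
        let observation = request.results?.first,
        let thumbTip = try? observation.recognizedPoint(.thumbTip),
        thumbTip.confidence > 0.5,
        // Requires the .smoothedSceneDepth / .sceneDepth frame semantics (see the configuration below).
        let depthMap = (frame.smoothedSceneDepth ?? frame.sceneDepth)?.depthMap
    else { return nil }

    // Vision coordinates -> AVFoundation coordinates -> depth buffer lookup.
    let avFoundationPoint = thumbTip.location.convertVisionToAVFoundation()
    return depthMap.value(from: avFoundationPoint)
}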
However, if you would like to find the position on screen for use with something such as UIKit or ARView.ray(through:), further transformation is required.
The Vision request was performed on arView.session.currentFrame.capturedImage.
arView.session.currentFrame is an ARFrame.
From the documentation on ARFrame.displayTransform(for:viewportSize:):
Normalized image coordinates range from (0,0) in the upper left corner of the image to (1,1) in the lower right corner. This method creates an affine transform representing the rotation and aspect-fit crop operations necessary to adapt the camera image to the specified orientation and to the aspect ratio of the specified viewport. The affine transform does not scale to the viewport's pixel size. The capturedImage pixel buffer is the original image captured by the device camera, and thus not adjusted for device orientation or view aspect ratio.
So the image being rendered on screen is a cropped version of the frame that the camera captures, and a transformation is needed to go from AVFoundation coordinates to display (UIKit) coordinates.
Converting from AVFoundation coordinates to display (UIKit) coordinates:
public extension ARView {
    func convertAVFoundationToScreenSpace(_ point: CGPoint) -> CGPoint? {
        // Convert from normalized AVFoundation coordinates (0,0 top-left, 1,1 bottom-right)
        // to screen-space coordinates.
        guard
            let arFrame = session.currentFrame,
            let interfaceOrientation = window?.windowScene?.interfaceOrientation
        else { return nil }
        let transform = arFrame.displayTransform(for: interfaceOrientation, viewportSize: frame.size)
        let normalizedCenter = point.applying(transform)
        let center = normalizedCenter.applying(CGAffineTransform.identity.scaledBy(x: frame.width, y: frame.height))
        return center
    }
}
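As a small illustration (the view and point names here are placeholders of my own, not from the project), this conversion can be used to pin a UIKit view over a detected joint:

import UIKit
import RealityKit

/// Illustrative only: center a UIKit marker view over a joint given in normalized AVFoundation coordinates.
func placeMarker(_ markerView: UIView, over jointAVPoint: CGPoint, in arView: ARView) {
    guard let screenPoint = arView.convertAVFoundationToScreenSpace(jointAVPoint) else { return }
    markerView.center = screenPoint
}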
To go the opposite direction, from UIKit display coordinates to AVFoundation coordinates:
public extension ARView {
    func convertScreenSpaceToAVFoundation(_ point: CGPoint) -> CGPoint? {
        // Convert to normalized pixel coordinates (0,0 top-left, 1,1 bottom-right)
        // from screen-space UIKit coordinates.
        guard
            let arFrame = session.currentFrame,
            let interfaceOrientation = window?.windowScene?.interfaceOrientation
        else { return nil }
        let inverseScaleTransform = CGAffineTransform.identity.scaledBy(x: frame.width, y: frame.height).inverted()
        let invertedDisplayTransform = arFrame.displayTransform(for: interfaceOrientation, viewportSize: frame.size).inverted()
        let unScaledPoint = point.applying(inverseScaleTransform)
        let normalizedCenter = unScaledPoint.applying(invertedDisplayTransform)
        return normalizedCenter
    }
}
To get a world-space coordinate from a UIKit screen coordinate and a corresponding depth value:
/// Get the world-space position from a UIKit screen point and a depth value.
/// - Parameters:
///   - screenPosition: A CGPoint representing a point on screen in UIKit coordinates.
///   - depth: The depth at this coordinate, in meters.
/// - Returns: The position in world space of this coordinate at this depth.
private func worldPosition(screenPosition: CGPoint, depth: Float) -> simd_float3? {
    guard
        let rayResult = arView.ray(through: screenPosition)
    else { return nil }
    // rayResult.direction is a normalized (1 meter long) vector pointing in the correct direction,
    // and we want to go the length of depth along this vector.
    let worldOffset = rayResult.direction * depth
    let worldPosition = rayResult.origin + worldOffset
    return worldPosition
}
To set the position of an entity in world space for a given point on screen:
guard
    let currentFrame = arView.session.currentFrame,
    let sceneDepth = (currentFrame.smoothedSceneDepth ?? currentFrame.sceneDepth)?.depthMap,
    let avFoundationPosition = arView.convertScreenSpaceToAVFoundation(uiKitPosition),
    let depthAtPoint = sceneDepth.value(from: avFoundationPosition),
    let worldPosition = worldPosition(screenPosition: uiKitPosition, depth: depthAtPoint)
else { return }

trackedEntity.setPosition(worldPosition, relativeTo: nil)
And don't forget to set the proper frameSemantics on your ARConfiguration:
func runNewConfig() {
    // Create a session configuration.
    let configuration = ARWorldTrackingConfiguration()

    // Goes with (currentFrame.smoothedSceneDepth ?? currentFrame.sceneDepth)?.depthMap
    let frameSemantics: ARConfiguration.FrameSemantics = [.smoothedSceneDepth, .sceneDepth]

    // Goes with currentFrame.estimatedDepthData
    //let frameSemantics: ARConfiguration.FrameSemantics = .personSegmentationWithDepth

    if ARWorldTrackingConfiguration.supportsFrameSemantics(frameSemantics) {
        configuration.frameSemantics.insert(frameSemantics)
    }

    // Run the view's session.
    session.run(configuration)
}
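Where the per-frame work lives is up to you; one option (an assumption on my part, not something from the original thread) is to adopt ARSessionDelegate and run the steps above each time a new frame arrives:

import ARKit
import RealityKit

/// Illustrative only: a delegate object that receives every new ARFrame.
final class FrameDriver: NSObject, ARSessionDelegate {

    weak var arView: ARView?

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // 1. Run the Vision request on frame.capturedImage.
        // 2. Convert the result with convertVisionToAVFoundation().
        // 3. Read the depth from (frame.smoothedSceneDepth ?? frame.sceneDepth)?.depthMap.
        // 4. Convert to screen space and call worldPosition(screenPosition:depth:) to place the entity.
    }
}

Assign it with arView.session.delegate = frameDriver, and keep a strong reference to the driver, since ARSession's delegate is weak.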
@aharriscrowne Please see my solution here:
https://developer.apple.com/forums/thread/705216?answerId=712036022#712036022
Thanks for your reply.
OcclusionMaterial might be the way to go, but the edge cases will have to be addressed.
As far as how to create one: OcclusionMaterial is provided by RealityKit and is available on visionOS, so all you have to do is initialize one like this:
OcclusionMaterial()
And pass it to the array of materials on a ModelComponent.
let myEntity = Entity()
let myModelComponent = ModelComponent(mesh: myMesh, materials: [OcclusionMaterial()])
myEntity.components.set(myModelComponent)
OR
let myModelEntity = ModelEntity(mesh: myMesh, materials: [OcclusionMaterial()])
There is a good example on Medium.com titled "Creating an iOS AR Portal App using Occlusion Materials" but it uses the older version of RealityKit for handheld devices.
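For a visionOS-flavored starting point (my own minimal sketch, not from that article), a box with OcclusionMaterial placed in a RealityView will hide any virtual content rendered behind it:

import SwiftUI
import RealityKit

struct OcclusionExampleView: View {
    var body: some View {
        RealityView { content in
            // Anything rendered behind this box (from the viewer's perspective) gets occluded.
            let occluder = ModelEntity(
                mesh: .generateBox(size: 0.3),
                materials: [OcclusionMaterial()]
            )
            occluder.position = [0, 1.2, -1] // roughly eye height, about 1 meter in front
            content.add(occluder)
        }
    }
}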