Background: I am prototyping with RealityKit on iOS 14.1 on the latest 11-inch iPad Pro. My goal is to track a hand. With skeleton (body) tracking, the skeleton scale did not seem to be adjusted correctly, so the hand position was roughly 15 cm off in some of my samples. So I am experimenting with using Vision to identify the hand and then project it back into 3D space.
1> Run hand-pose detection on ARFrame.capturedImage
let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .up, options: [:])
let handPoseRequest = VNDetectHumanHandPoseRequest()
....
try handler.perform([handPoseRequest])
2> Convert the point to a 3D world transform (this is where the problem is).
fileprivate func convertVNPointTo3D(_ point: VNRecognizedPoint,
                                    _ session: ARSession,
                                    _ frame: ARFrame,
                                    _ viewSize: CGSize) -> Transform? {
    // Scale the recognized point into view coordinates.
    let pointX = (point.x / Double(frame.camera.imageResolution.width)) * Double(viewSize.width)
    let pointY = (point.y / Double(frame.camera.imageResolution.height)) * Double(viewSize.height)
    // Raycast from that screen point against an estimated plane.
    let query = frame.raycastQuery(from: CGPoint(x: pointX, y: pointY),
                                   allowing: .estimatedPlane,
                                   alignment: .any)
    let results = session.raycast(query)
    if let first = results.first {
        return Transform(matrix: first.worldTransform)
    } else {
        return nil
    }
}
I wonder if I am doing the right conversion. The issue is that the ARSession.raycast documentation - https://developer.apple.com/documentation/arkit/arsession/3132065-raycast - describes converting a UI screen point to a 3D point, but I am not sure how ARFrame.capturedImage gets fitted onto the UI screen.
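For reference, here is the alternative mapping I am considering, based on ARFrame.displayTransform(for:viewportSize:). This is only a sketch; I have not verified the Y-flip or the orientation handling, and the helper name and the orientation parameter are mine:

import ARKit
import Vision
import UIKit

// Sketch only: map a Vision point (normalized, origin at the lower-left of the image)
// into view coordinates, which could then be passed to raycastQuery(from:allowing:alignment:).
fileprivate func convertVNPointToViewPoint(_ point: VNRecognizedPoint,
                                           _ frame: ARFrame,
                                           _ viewSize: CGSize,
                                           _ orientation: UIInterfaceOrientation) -> CGPoint {
    // Flip Y so the origin is at the upper-left, matching normalized image coordinates.
    let normalizedImagePoint = CGPoint(x: point.location.x, y: 1 - point.location.y)
    // displayTransform maps normalized image coordinates to normalized view coordinates
    // for the given interface orientation and viewport size.
    let displayTransform = frame.displayTransform(for: orientation, viewportSize: viewSize)
    let normalizedViewPoint = normalizedImagePoint.applying(displayTransform)
    // Scale up to actual view points.
    return CGPoint(x: normalizedViewPoint.x * viewSize.width,
                   y: normalizedViewPoint.y * viewSize.height)
}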
Thanks
For an image, after getting a VNHumanHandPoseObservation from the Vision API call, I would like to know whether it is a left hand or a right hand. How can I do that? Ideally, I would also like to know whether the fingers are extended, and the direction of the palm.
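For the finger-extension part, this is the rough heuristic I have in mind (unverified, and the function name is mine): treat a finger as extended when its tip is farther from the wrist than its PIP joint, in normalized image space.

import Vision
import CoreGraphics

// Unverified heuristic: an extended index finger should have its tip farther
// from the wrist than its PIP joint.
func isIndexFingerExtended(_ observation: VNHumanHandPoseObservation) -> Bool? {
    guard let indexPoints = try? observation.recognizedPoints(.indexFinger),
          let allPoints = try? observation.recognizedPoints(.all),
          let wrist = allPoints[.wrist],
          let tip = indexPoints[.indexTip],
          let pip = indexPoints[.indexPIP],
          wrist.confidence > 0.3, tip.confidence > 0.3, pip.confidence > 0.3
    else { return nil }

    func distance(_ a: VNRecognizedPoint, _ b: VNRecognizedPoint) -> CGFloat {
        hypot(a.location.x - b.location.x, a.location.y - b.location.y)
    }
    return distance(tip, wrist) > distance(pip, wrist)
}

I still have no idea how to tell the left hand from the right hand, though. Thanks!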
I wrote/adapted some code to show the major body joint locations. The goal was to track the hand location, but the hand locations seem off by quite a bit. Please let me know what I could change to make it better.
My initial impression is that the skeleton is not adjusted to the person's body type - in my case, I am shorter than the standard skeleton model.
Device: iPad Pro (11-inch) (2nd generation)
Software Version: 14.0 (18A373)
Xcode Version: 12.0 (12A7209)
iOS Deployment Target: 14.0
I was standing without obstruction and held still for a couple of seconds when taking the screenshot. I could not attach the screenshot, but by my visual estimation the hand joint locations were about 10-15 cm off.
Here is how I coded it -
1> Create an entity for each joint of interest.
2> Add them all to a "VisualSkeleton" (derived from Entity) object.
3> Create an AnchorEntity and add this Entity to it.
4> Refresh each ModelEntity's location based on the corresponding joint's location.
Configuring ...
// Run a body-tracking configuration.
let configuration = ARBodyTrackingConfiguration()
configuration.automaticImageScaleEstimationEnabled = true
configuration.automaticSkeletonScaleEstimationEnabled = true
arView.session.run(configuration)
Calculating joint positions
func update(with bodyAnchor: ARBodyAnchor) {
    let rootPosition = simd_make_float3(bodyAnchor.transform.columns.3)
    let skeleton = bodyAnchor.skeleton
    //rootAnchor.position = rootPosition
    //rootAnchor.orientation = Transform(matrix: bodyAnchor.transform).rotation
    for (jointName, jointEntity) in joints {
        if let jointTransform = skeleton.modelTransform(for: ARSkeleton.JointName(rawValue: jointName)) {
            // The joint transform is relative to the skeleton's root joint (the hip).
            let jointOffset = simd_make_float3(jointTransform.columns.3)
            jointEntity.position = rootPosition + jointOffset
            jointEntity.orientation = Transform(matrix: jointTransform).rotation
        }
    }
    if self.parent == nil {
        rootAnchor.addChild(self)
    }
}
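One variation I am considering (I am not sure whether the matrix math is right) is to compose the body anchor's full world transform with each joint's model transform, instead of only adding the translation - roughly like this hypothetical helper:

import ARKit
import RealityKit

// Unverified sketch: derive a joint's world-space transform by composing the
// body anchor's world transform with the joint's model-space transform.
func worldTransform(forJoint jointName: String, of bodyAnchor: ARBodyAnchor) -> Transform? {
    guard let jointTransform = bodyAnchor.skeleton.modelTransform(
            for: ARSkeleton.JointName(rawValue: jointName)) else { return nil }
    return Transform(matrix: simd_mul(bodyAnchor.transform, jointTransform))
}

I am not sure that alone would explain a 10-15 cm offset, though.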
I will be happy to share more code if needed. Thank you so much!
In the skeleton-tracking documentation - https://developer.apple.com/documentation/arkit/validating_a_model_for_motion_capture (the "Review Left-Hand Joints" section) - there are 25 joints on each hand. I wonder if there are any documents describing the details of these joints. For example, I would like to know the direction of the palm, or the overall pose of the hand. Is there an easy way to find those out?
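In the meantime, the only thing I could think of was to dump the joint names and hierarchy from the default body skeleton definition to explore what is there (untested sketch):

import ARKit

// Print every joint name in the default 3D body skeleton, along with its parent,
// to see which hand joints exist and how they are connected.
let definition = ARSkeletonDefinition.defaultBody3D
for (index, name) in definition.jointNames.enumerated() {
    let parentIndex = definition.parentIndices[index]
    let parentName = parentIndex >= 0 ? definition.jointNames[parentIndex] : "none"
    print(index, name, "-> parent:", parentName)
}

Thank you.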
I was trying to put an anchor at a person's left hand. I am stuck! Please help. Here is what I was able to do so far:
1> Body tracking: got the body anchor. Works fine.
2> Got the left-hand transform relative to the body hip. Works fine.
3> Compute the hand transform relative to the world - this is where I have questions.
For step 3, since I am not familiar with 3D computation, I was looking for an API to do it. I finally found the following:
https://developer.apple.com/documentation/realitykit/entity/3244058-convert
func convert(normal: SIMD3<Float>, from referenceEntity: Entity?) -> SIMD3<Float>
This API seems to fit my requirements exactly. However, using it seems really strange: I would need to create two entities - an empty entity in world space, and another entity anchored at the person's hip (for the from parameter).
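The alternative I am considering, instead of the two-entity approach, is to do the matrix multiplication directly and place an AnchorEntity at the result. This is only a sketch and I am not sure it is correct; the function name is mine:

import ARKit
import RealityKit

// Unverified sketch: world transform of the left hand = body anchor's world
// transform composed with the left-hand joint's transform relative to the hip.
func addAnchor(atLeftHandOf bodyAnchor: ARBodyAnchor, in arView: ARView) {
    guard let handTransform = bodyAnchor.skeleton.modelTransform(for: .leftHand) else { return }
    let handWorldMatrix = simd_mul(bodyAnchor.transform, handTransform)
    let handAnchor = AnchorEntity(world: handWorldMatrix)
    arView.scene.addAnchor(handAnchor)
}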
Could someone help me on this?
Thank you!
I am interested in recording a 3D scene (potentially with a person in it). From that recording, I would like to replay the scene and apply AR objects to it. I wonder how I could do that.
Thanks!
I am targeting releasing my app when ARKit 4 is released. I wonder if/how I can use ARView and RealityKit in SwiftUI. Are we still embedding a UIView into SwiftUI?
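For reference, this is the kind of wrapper I assume is still needed (via UIViewRepresentable; the struct name is mine):

import SwiftUI
import RealityKit

// Minimal wrapper assumed for embedding an ARView in a SwiftUI view hierarchy.
struct ARViewContainer: UIViewRepresentable {
    func makeUIView(context: Context) -> ARView {
        ARView(frame: .zero)
    }

    func updateUIView(_ uiView: ARView, context: Context) {}
}

// Usage inside a SwiftUI view's body:
// ARViewContainer().edgesIgnoringSafeArea(.all)

Thanks!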