Your target problem is at the core of spatial computing: Apple Vision Pro seamlessly blends digital content with your physical space.
https://www.apple.com/apple-vision-pro/
Both the known/controllable digital content and the unknown/uncontrollable physical object surfaces have shape, size, and 6DoF pose.
To seamlessly blend digital content with a physical object surface, the shape, size, and 6DoF pose of that surface must be accurately recognized and measured in real time.
Virtual ads inside Chungmuro station Line 3 - iPhone 12 Pro
https://youtu.be/BmKNmZCiMkw
FindSurface 1 - Apple Vision Pro
https://youtu.be/p5msrVsEpa0
The first result of the visionOS app with FindSurface framework.
The shape, size and 6DoF pose of the object in the line of sight are recognized and measured in real time by the FindSurface framework. Inlier vertex points are colored pink.
The project folder including the source code of the app together with the FindSurface framework will be available in the CurvSurf repository on GitHub at the end of June 2024.
FindSurface 1 - Apple Vision Pro
https://youtu.be/p5msrVsEpa0
As you mentioned, visionOS ARKit provides developers with only a restricted set of processed sensor data:
WorldAnchor
DeviceAnchor
MeshAnchor
PlaneAnchor
HandAnchor
ImageAnchor.
visionOS app developers must work within these constraints.
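For reference, a minimal sketch (function name and structure are just for illustration) of how these anchor types map onto their DataProviders under a single ARKitSession, assuming the app has an ImmersiveSpace (Full Space) open:

```swift
import ARKit  // ARKit for visionOS

// Minimal sketch: the anchor types above are delivered by DataProviders run on an ARKitSession.
// Assumes the app has opened an ImmersiveSpace (Full Space); otherwise no data is delivered.
// ImageTrackingProvider (ImageAnchor) is omitted because it needs reference images.
func startProviders() async throws {
    let session = ARKitSession()

    let worldTracking = WorldTrackingProvider()                 // DeviceAnchor, WorldAnchor
    let sceneReconstruction = SceneReconstructionProvider()     // MeshAnchor
    let planeDetection = PlaneDetectionProvider(alignments: [.horizontal, .vertical]) // PlaneAnchor
    let handTracking = HandTrackingProvider()                   // HandAnchor

    // World sensing and hand tracking require the user's permission.
    _ = await session.requestAuthorization(for: [.worldSensing, .handTracking])

    try await session.run([worldTracking, sceneReconstruction, planeDetection, handTracking])
}
```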
About two weeks ago, we got a Vision Pro. What we are using are:
DeviceAnchor
MeshAnchor.
DeviceAnchor is generated by using image and IMU sensor data streams.
MeshAnchors are generated by using LiDAR and image sensor data streams.
The FindSurface framework recognizes and measures in real time the shapes, sizes, and 6DoF poses of object surfaces by processing the vertex points of MeshAnchors.
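As a rough sketch of that processing path, the code below gathers MeshAnchor vertices into a world-space point cloud; the commented findSurface(...) call is only a placeholder for the fitting step, not the actual FindSurface API:

```swift
import ARKit
import Metal
import simd

// Sketch: gather MeshAnchor vertices into a world-space point cloud.
// The accumulated points are the kind of input a surface-fitting library such as
// FindSurface would consume; the commented call below is a placeholder, not a real API.
func collectMeshPoints(from sceneReconstruction: SceneReconstructionProvider) async {
    var pointCloud: [SIMD3<Float>] = []

    for await update in sceneReconstruction.anchorUpdates {
        guard update.event != .removed else { continue }   // a real app would drop removed anchors' points
        let anchor = update.anchor
        let vertices = anchor.geometry.vertices            // GeometrySource (.float3), backed by an MTLBuffer

        let base = vertices.buffer.contents().advanced(by: vertices.offset)
        for i in 0..<vertices.count {
            let v = base.advanced(by: i * vertices.stride)
                        .assumingMemoryBound(to: (Float, Float, Float).self).pointee
            // Transform the vertex from anchor-local space into world space.
            let world = anchor.originFromAnchorTransform * SIMD4<Float>(v.0, v.1, v.2, 1)
            pointCloud.append(SIMD3<Float>(world.x, world.y, world.z))
        }

        // Placeholder for the fitting step, e.g.:
        // let result = findSurface(points: pointCloud, seed: gazePoint)   // hypothetical API
    }
}
```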
The overall accuracy of the FindSurface framework and of the image, IMU, and LiDAR sensor data has been confirmed internally. The 6DoF poses of the Vision Pro and of the object surfaces are accurate to the sub-centimeter level.
The problem, however, is that MeshAnchors cut convex object surfaces and fill concave ones. Consequently, FindSurface estimates the radii of spherical/cylindrical/conical/toroidal objects as smaller than those of the actual object surfaces.
The project folder including the app source code together with the FindSurface framework will be available on GitHub at the end of June 2024.
For non-dark, non-reflective object surfaces, ARKit ARDepthData delivers a depthMap with a broadly 'high' confidenceMap for target points as close as about 20 cm. Additionally, measurement accuracy is a statistical quantity that should be handled mathematically by algorithms: a single measurement point may be inaccurate, but the average of 10 measurements is statistically more stable.
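A minimal sketch of such averaging, assuming frameSemantics includes .sceneDepth and the chosen depth-map pixel (row, col) is valid; the 10-sample count mirrors the example above:

```swift
import ARKit
import CoreVideo

// Sketch: average the LiDAR depth at one depth-map pixel over several frames,
// keeping only high-confidence samples. Assumes the session runs an
// ARWorldTrackingConfiguration with frameSemantics containing .sceneDepth,
// and that (row, col) lies inside the depth map.
final class DepthAverager {
    private var samples: [Float] = []
    private let maxSamples = 10   // "average of 10 measurements"

    /// Returns the average depth (meters) once `maxSamples` high-confidence samples are collected.
    func addSample(from frame: ARFrame, row: Int, col: Int) -> Float? {
        guard let depthData = frame.sceneDepth,
              let confidenceMap = depthData.confidenceMap else { return nil }
        let depthMap = depthData.depthMap

        CVPixelBufferLockBaseAddress(depthMap, .readOnly)
        CVPixelBufferLockBaseAddress(confidenceMap, .readOnly)
        defer {
            CVPixelBufferUnlockBaseAddress(depthMap, .readOnly)
            CVPixelBufferUnlockBaseAddress(confidenceMap, .readOnly)
        }

        let depthRow = CVPixelBufferGetBaseAddress(depthMap)!
            .advanced(by: row * CVPixelBufferGetBytesPerRow(depthMap))
            .assumingMemoryBound(to: Float32.self)
        let confRow = CVPixelBufferGetBaseAddress(confidenceMap)!
            .advanced(by: row * CVPixelBufferGetBytesPerRow(confidenceMap))
            .assumingMemoryBound(to: UInt8.self)

        // Keep only samples the system marks as high confidence.
        guard confRow[col] == UInt8(ARConfidenceLevel.high.rawValue) else { return nil }

        samples.append(depthRow[col])              // depth in meters
        guard samples.count >= maxSamples else { return nil }
        return samples.reduce(0, +) / Float(samples.count)
    }
}
```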
The video below demonstrates real-time plane detection and measurement from ARKit ARDepthData.
Finding Planes in depthMap - Apple iPad Pro LiDAR
https://youtu.be/8ixtwnfsuzA
As an ARKit developer, I was able to confirm that the overall accuracy of cameras, LiDAR and IMU is good enough for AR applications. Of course, our FindSurface framework for determining the shape, size and 6DoF of objects is also accurate enough for AR applications. If any of the cameras (data capture and content display), LiDAR (3D measurement), IMU (device tracking) and FindSurface (object detection and measurement) are not accurate enough, the digital content cannot be seamlessly blended with the physical space (see the videos below).
FindSurface for Spatial Computing
https://youtu.be/bcIXoTEeKek
Virtual ads inside Chungmuro station Line 3 - iPhone 12 Pro
https://youtu.be/BmKNmZCiMkw
It would be great if Apple made the hovering API available to developers: a way to identify the digital content in the scene that is closest to the device and nearest to the device's line of sight.
To help protect people’s privacy, visionOS ARKit data is available only when your app presents a Full Space and other apps are hidden.
https://developer.apple.com/documentation/visionos/setting-up-access-to-arkit-data
Your device (iPhone, iPad, Vision Pro), hands, and fingers are visible to others, so they raise fewer privacy issues. But the direction of your gaze is virtually invisible to others, potentially leading to privacy issues.
To protect people's privacy, data are selectively shared with developers.
Data provided to iOS/iPadOS developers but not to visionOS developers:
ARPointCloud
ARDepthData
Body data
Geo-reference data
Image sensor data stream
...
Data provided by visionOS to developers via DataProvider:
https://developer.apple.com/documentation/arkit/dataprovider
MeshAnchor
PlaneAnchor
DeviceAnchor
WorldAnchor
HandAnchor
ImageAnchor.
Data NOT provided to developers by either visionOS or iOS/iPadOS:
LiDAR range point data
Eye tracking data.
Under the given conditions, we would like to draw a ray pointing at digital content in space. Eye-tracking data would of course be the first choice, but it is not accessible to developers.
The alternatives to eye-tracking data are DeviceAnchor and HandAnchor.
visionOS internally includes a hovering framework for matching the ray from DeviceAnchor or HandAnchor (and from eye tracking) with digital content in space.
To have such an unpleasant experience, it is enough to ride in an elevator (or any other moving vehicle). To solve the problem, the device's motion tracking would have to rely solely on image sensor data, without help from the IMU (acceleration) hardware sensor data.
High-level applications require object information of shape, size, and 6DoF pose:
ARPointCloud: https://youtu.be/h4YVh2-3p9s
ARDepthData: https://youtu.be/zc6GQOtgS7M
ARMeshAnchor: https://youtu.be/sMRfX334blI
App developers are responsible for deciding how to use the shape, size, and 6DoF pose of an object.
There are three sources of measurement points (point cloud) in iOS/iPadOS ARKit:
ARPointCloud: highly noisy and sparse (debug info for device tracking; scattered as needle shapes in space but mapped to stable screen points).
ARDepthData: relatively accurate and dense. Provided by processing LiDAR and image sensor data.
ARMeshAnchor: provided by processing ARDepthData. Vertices of meshes are practically points.
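For orientation, a short sketch of where each of the three sources surfaces in the iOS/iPadOS ARKit API (class and property names other than ARKit's own are illustrative):

```swift
import ARKit
import CoreVideo
import simd

// Sketch: where each of the three point sources surfaces in the iOS/iPadOS ARKit API.
final class PointSourceObserver: NSObject, ARSessionDelegate {

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // 1. ARPointCloud: sparse, noisy feature points used for device tracking.
        if let featurePoints = frame.rawFeaturePoints {
            let sparsePoints: [SIMD3<Float>] = featurePoints.points
            _ = sparsePoints
        }
        // 2. ARDepthData: dense depth map from LiDAR + image processing
        //    (requires configuration.frameSemantics containing .sceneDepth).
        if let depth = frame.sceneDepth {
            let depthMap: CVPixelBuffer = depth.depthMap   // Float32, meters
            _ = depthMap
        }
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        // 3. ARMeshAnchor: mesh built by processing the depth data
        //    (requires configuration.sceneReconstruction = .mesh).
        for case let meshAnchor as ARMeshAnchor in anchors {
            let vertices: ARGeometrySource = meshAnchor.geometry.vertices
            _ = vertices   // .buffer / .count / .stride expose the raw vertex points
        }
    }
}
```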
visionOS ARKit provides only MeshAnchor.
The sensor data are processed internally by the system. The processing results are provided via DataProvider.
https://developer.apple.com/documentation/arkit/dataprovider
We have two alternatives to eye tracking in visionOS (a sketch of the first follows below):
Use the device's 6DoF information (DeviceAnchor). The screen center is the default gaze point.
Use hand tracking (HandAnchor). The direction of your arm or index finger provides a ray.
ARKitSession - DataProvider - Anchor.
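A minimal sketch of the first alternative along that chain: query the DeviceAnchor from a running WorldTrackingProvider and treat the device's local -Z axis through the display center as the gaze ray; intersecting the ray with your content (e.g. via a scene raycast) is left to the app:

```swift
import ARKit
import QuartzCore
import simd

// Sketch of alternative 1: a "gaze" ray derived from the DeviceAnchor pose.
// The device looks along its local -Z axis; the ray origin is the device position.
// Assumes worldTracking is a WorldTrackingProvider already running in an ARKitSession.
func deviceRay(from worldTracking: WorldTrackingProvider)
    -> (origin: SIMD3<Float>, direction: SIMD3<Float>)? {
    guard let deviceAnchor = worldTracking.queryDeviceAnchor(atTimestamp: CACurrentMediaTime())
    else { return nil }

    let t = deviceAnchor.originFromAnchorTransform
    let origin = SIMD3<Float>(t.columns.3.x, t.columns.3.y, t.columns.3.z)
    // The third column of the rotation part is the device's +Z axis; negate it for "forward".
    let forward = -SIMD3<Float>(t.columns.2.x, t.columns.2.y, t.columns.2.z)
    return (origin, simd_normalize(forward))
}
```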
I hope this is helpful.
The mathematical definitions are:
A plane has a point and a normal. Its extent is infinite.
A line has a point and a direction. Its extent is infinite.
But in reality:
A real plane or a real line has length/width/height.
Faces, normals, and vertices correspond to planes, directions, and points, respectively.
How to handle them mathematically is an art.
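As a trivial illustration of the unbounded mathematical primitives (the type names are made up for the example): a point plus a unit normal or direction, with point-distance functions. Bounded extent is exactly what real data adds on top.

```swift
import simd

// Unbounded mathematical primitives: a point plus a unit normal / direction.
struct MathPlane {
    var point: SIMD3<Float>
    var normal: SIMD3<Float>      // assumed unit length

    /// Signed distance from `p` to the infinite plane.
    func distance(to p: SIMD3<Float>) -> Float {
        simd_dot(p - point, normal)
    }
}

struct MathLine {
    var point: SIMD3<Float>
    var direction: SIMD3<Float>   // assumed unit length

    /// Distance from `p` to the infinite line.
    func distance(to p: SIMD3<Float>) -> Float {
        simd_length(simd_cross(p - point, direction))
    }
}
```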
Please try the FindSurface Web Demo:
https://developers.curvsurf.com/WebDemo/
Planes, spheres, cylinders, cones, and tori can be accurately detected, recognized, and measured in real-time from 3D point cloud data.
Before attempting the difficult task of mapping a digital map onto a curved surface, I recommend that you successfully perform the relatively easy task of mapping a map onto a plane (ARKit PlaneAnchor).
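A minimal RealityKit sketch of that easier task: anchor a flat quad (a stand-in for the map; swap in your own textured material) on a detected horizontal plane.

```swift
import ARKit
import RealityKit
import UIKit

// Sketch: place a flat 30 cm x 30 cm quad (a stand-in for the "map") on a detected
// horizontal plane. Swap the SimpleMaterial for a textured material carrying the map image.
func placeMapQuad(in arView: ARView) {
    let config = ARWorldTrackingConfiguration()
    config.planeDetection = [.horizontal]
    arView.session.run(config)

    // RealityKit plane anchoring is driven by ARKit's ARPlaneAnchor under the hood.
    let anchor = AnchorEntity(.plane(.horizontal,
                                     classification: .any,
                                     minimumBounds: SIMD2<Float>(0.2, 0.2)))

    let quad = ModelEntity(mesh: .generatePlane(width: 0.3, depth: 0.3),
                           materials: [SimpleMaterial(color: .systemBlue, isMetallic: false)])
    anchor.addChild(quad)
    arView.scene.addAnchor(anchor)
}
```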
Even if we can develop 3D technology, dissemination is a different matter altogether. Several ethical considerations arise when adopting 3D technology. Apple and other players may have already developed, or may currently be developing, various 3D software technologies for Vision Pro (visionOS). However, such 3D technology may not be made available to app developers, and ultimately not to consumers, even though it has been or will have been developed.
PlaneAnchor, MeshAnchor, face detection, body contour, body motion, video stream, 3D LiDAR stream, ... are available in iOS/iPadOS. However, of these, only PlaneAnchor and MeshAnchor are currently available in visionOS.
There is a big gap between iOS/iPadOS and visionOS in how the protection of private data is weighed.
It is Apple's dilemma.
visionOS's ARKit offers MeshAnchor.
You can then simply gather the MeshAnchor vertices into a point cloud. Next, you need an algorithmic means of calculating the information required by your application: the shape, size, position, and orientation of objects in the point cloud.
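For illustration only (this is not FindSurface), a naive linear least-squares sphere fit shows what such a calculation looks like; it assumes the points already belong to a single sphere and does no segmentation or outlier rejection, which is precisely where a dedicated library earns its keep.

```swift
import simd

// Illustration only (NOT FindSurface): a naive linear least-squares sphere fit.
// Each point contributes one equation  2a*x + 2b*y + 2c*z + d = x^2 + y^2 + z^2,
// with center = (a, b, c) and radius = sqrt(d + a^2 + b^2 + c^2).
// Assumes the points already belong to a single sphere; no segmentation or outlier handling.
func fitSphere(to points: [SIMD3<Float>]) -> (center: SIMD3<Float>, radius: Float)? {
    guard points.count >= 4 else { return nil }

    var ata = simd_double4x4()                    // accumulated AᵀA (default init is the zero matrix)
    var atb = SIMD4<Double>(repeating: 0)         // accumulated Aᵀb

    for p in points {
        let x = Double(p.x), y = Double(p.y), z = Double(p.z)
        let row = SIMD4<Double>(2 * x, 2 * y, 2 * z, 1)
        let rhs = x * x + y * y + z * z
        // Outer product row * rowᵀ, added column by column (the matrix is symmetric).
        ata = ata + simd_double4x4(columns: (row * row.x, row * row.y, row * row.z, row * row.w))
        atb += row * rhs
    }

    // ata can be singular for degenerate input (e.g. nearly coplanar points);
    // a robust implementation would check conditioning before inverting.
    let s = ata.inverse * atb
    let center = SIMD3<Double>(s.x, s.y, s.z)
    let radiusSquared = s.w + simd_length_squared(center)
    guard radiusSquared > 0 else { return nil }

    return (SIMD3<Float>(Float(center.x), Float(center.y), Float(center.z)),
            Float(radiusSquared.squareRoot()))
}
```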
We have developed such a software library. We are willing to make this available to visionOS developers through Apple if Apple agrees.
Currently, the geometries that can be extracted and measured from point clouds in real time with the highest accuracy according to ISO 10360-6 are planes, spheres, cylinders, cones and rings.
Now we can overlay any virtual information on and around the surfaces of object planes, spheres, cylinders, cones and rings that exist in the real world.
Heat Pump System - Apple iPad Pro LiDAR
https://youtu.be/9QkSPkLIfWU