In my visionOS app I am using plane detection, and I want to create planes that have physics so that my RealityKit entities rest on the detected real-world planes.
I would like to know whether the code below, which I found in the samples, is the most efficient way of doing this.
func processPlaneDetectionUpdates() async {
    for await anchorUpdate in planeTracking.anchorUpdates {
        let anchor = anchorUpdate.anchor
        if anchorUpdate.event == .removed {
            planeAnchors.removeValue(forKey: anchor.id)
            if let entity = planeEntities.removeValue(forKey: anchor.id) {
                entity.removeFromParent()
            }
            // Move on to the next update; `return` here would end the whole loop.
            continue
        }
        planeAnchors[anchor.id] = anchor
        let entity = Entity()
        entity.name = "Plane \(anchor.id)"
        entity.setTransformMatrix(anchor.originFromAnchorTransform, relativeTo: nil)
        // Generate a mesh for the plane (for occlusion).
        var meshResource: MeshResource? = nil
        do {
            let contents = MeshResource.Contents(planeGeometry: anchor.geometry)
            meshResource = try MeshResource.generate(from: contents)
        } catch {
            print("Failed to create a mesh resource for a plane anchor: \(error).")
            continue
        }
        var material = UnlitMaterial(color: .red)
        material.blending = .transparent(opacity: .init(floatLiteral: 0))
        if let meshResource {
            // Make this plane occlude virtual objects behind it.
            entity.components.set(ModelComponent(mesh: meshResource, materials: [material]))
        }
        // Generate a collision shape for the plane (for object placement and physics).
        var shape: ShapeResource? = nil
        do {
            let vertices = anchor.geometry.meshVertices.asSIMD3(ofType: Float.self)
            shape = try await ShapeResource.generateStaticMesh(positions: vertices,
                                                               faceIndices: anchor.geometry.meshFaces.asUInt16Array())
        } catch {
            print("Failed to create a static mesh for a plane anchor: \(error).")
            continue
        }
        if let shape {
            entity.components.set(CollisionComponent(shapes: [shape], isStatic: true))
            let physics = PhysicsBodyComponent(mode: .static)
            entity.components.set(physics)
        }
        // Swap in the new entity before removing the one from the previous update.
        let existingEntity = planeEntities[anchor.id]
        planeEntities[anchor.id] = entity
        contentEntity.addChild(entity)
        existingEntity?.removeFromParent()
    }
}
extension MeshResource.Contents {
    init(planeGeometry: PlaneAnchor.Geometry) {
        self.init()
        self.instances = [MeshResource.Instance(id: "main", model: "model")]
        var part = MeshResource.Part(id: "part", materialIndex: 0)
        part.positions = MeshBuffers.Positions(planeGeometry.meshVertices.asSIMD3(ofType: Float.self))
        part.triangleIndices = MeshBuffer(planeGeometry.meshFaces.asUInt32Array())
        self.models = [MeshResource.Model(id: "model", parts: [part])]
    }
}

extension GeometrySource {
    func asArray<T>(ofType: T.Type) -> [T] {
        assert(MemoryLayout<T>.stride == stride, "Invalid stride \(MemoryLayout<T>.stride); expected \(stride)")
        return (0..<count).map {
            buffer.contents().advanced(by: offset + stride * Int($0)).assumingMemoryBound(to: T.self).pointee
        }
    }

    func asSIMD3<T>(ofType: T.Type) -> [SIMD3<T>] {
        asArray(ofType: (T, T, T).self).map { .init($0.0, $0.1, $0.2) }
    }

    subscript(_ index: Int32) -> (Float, Float, Float) {
        precondition(format == .float3, "This subscript operator can only be used on GeometrySource instances with format .float3")
        return buffer.contents().advanced(by: offset + (stride * Int(index))).assumingMemoryBound(to: (Float, Float, Float).self).pointee
    }
}

extension GeometryElement {
    subscript(_ index: Int) -> [Int32] {
        precondition(bytesPerIndex == MemoryLayout<Int32>.size,
                     """
                     This subscript operator can only be used on GeometryElement instances with bytesPerIndex == \(MemoryLayout<Int32>.size).
                     This GeometryElement has bytesPerIndex == \(bytesPerIndex)
                     """
        )
        var data = [Int32]()
        data.reserveCapacity(primitive.indexCount)
        for indexOffset in 0 ..< primitive.indexCount {
            data.append(buffer
                .contents()
                .advanced(by: (Int(index) * primitive.indexCount + indexOffset) * MemoryLayout<Int32>.size)
                .assumingMemoryBound(to: Int32.self).pointee)
        }
        return data
    }

    func asInt32Array() -> [Int32] {
        var data = [Int32]()
        let totalNumberOfInt32 = count * primitive.indexCount
        data.reserveCapacity(totalNumberOfInt32)
        for indexOffset in 0 ..< totalNumberOfInt32 {
            data.append(buffer.contents().advanced(by: indexOffset * MemoryLayout<Int32>.size).assumingMemoryBound(to: Int32.self).pointee)
        }
        return data
    }

    func asUInt16Array() -> [UInt16] {
        asInt32Array().map { UInt16($0) }
    }

    public func asUInt32Array() -> [UInt32] {
        asInt32Array().map { UInt32($0) }
    }
}
I was also curious to know if I can do this without ARKit using SpatialTrackingSession. My understanding is that using SpatialTrackingSession in RealityKit I can only get the transforms of the AnchorEntities but it won't have geometry information to create the collision shapes.
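One small detail in the helpers above: asUInt16Array() converts with UInt16($0), which traps at runtime if an index is negative or larger than 65,535. Plane meshes are normally small, so this is unlikely to matter in practice, but a checked conversion is cheap. A minimal sketch in plain Swift; checkedUInt16Indices is a hypothetical helper name and is not part of the sample:

```swift
/// Converts Int32 face indices to UInt16, returning nil instead of trapping
/// when any index does not fit. (Hypothetical helper, not from the sample.)
func checkedUInt16Indices(_ indices: [Int32]) -> [UInt16]? {
    var result: [UInt16] = []
    result.reserveCapacity(indices.count)
    for index in indices {
        // UInt16(exactly:) yields nil for negative or too-large values.
        guard let narrowed = UInt16(exactly: index) else { return nil }
        result.append(narrowed)
    }
    return result
}

// Two triangles sharing an edge convert cleanly; an oversized index does not.
let small = checkedUInt16Indices([0, 1, 2, 2, 3, 0])
let tooLarge = checkedUInt16Indices([0, 1, 70_000])
```

If the conversion returns nil, the caller could skip the collision shape for that anchor rather than crash.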
ARKit
Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Posts under ARKit tag
We are developing a visionOS app. We applied for the Enterprise APIs for visionOS, including Main Camera Access for Vision Pro, and received the "Enterprise.license" file in the email Apple sent us. Using our developer account, we imported the license file into Xcode:
However, in Xcode we cannot find the entitlement for the Enterprise API:
If we add com.apple.developer.arkit.main-camera-access.allow to the project's entitlements file manually, Xcode reports an error:
We also found that the app itself does not have the "Additional Capabilities" section that includes the Enterprise API:
What should we do to get the entitlement for the Enterprise API so that we can use it?
I stumbled across the function setWorldOrigin(relativeTransform:) on ARSession, which is documented here:
https://developer.apple.com/documentation/arkit/arsession/2942278-setworldorigin
I made a custom ARSession subclass in which I override this function, then print and modify the relativeTransform parameter. The output shows that the function is called with an updated relativeTransform value, but it seems to have no effect on, for example, the world origin when starting or continuing a scan, the tiny puppet house in RoomPlan, or any tracking position that I get from ARKit.
Does anybody have experience with this method, or know which parts are influenced by setWorldOrigin()?
We tried out our Unity-based AR app for the very first time under iOS 18 and noticed an immediate, repeatable crash.
When run in Xcode 16, we get this error message:
Assert: /Library/Caches/com.apple.xbs/Sources/AppleCV3D/library/VIO/CAPI/src/SlamAnchor.cpp:37 : HasValidPose()
Assert: /Library/Caches/com.apple.xbs/Sources/AppleCV3D/library/VIO/CAPI/src/SlamAnchor.cpp:37 : HasValidPose()
That's a blocker for us.
We're using Unity 2022.3.27f1.
Dear all,
We are building an XR application demonstrating our research on open-vocabulary 3D instance segmentation for assistive technology. We intend to bring it to visionOS using the new Enterprise APIs. Our method was trained on datasets resembling ScanNet, which contain the following:
localized (1) RGB camera frames (2) with depth (3) and camera intrinsics (4)
point cloud (5)
I understand that we can query (1), (2), and (4) from the CameraFrameProvider. As for (3) and (5), it is unclear to me if or how we can obtain that data.
In handheld ARKit, this example project demos how the depthMap can be used to simulate raw point clouds. However, this property doesn't seem to be available in visionOS.
Is there some way for us to obtain depth data associated with camera frames?
"Faking" depth data from the SceneReconstructionProvider-generated meshes is too coarse for our method. I hope I'm just missing some detail and there's some way to configure CameraFrameProvider to also deliver depth and/or point clouds.
Thanks for any help or pointer in the right direction!
~ Alex
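For readers unfamiliar with the depth-map approach mentioned above: where a per-pixel depth map and the camera intrinsics are both available, each pixel can be back-projected into a 3D point with the pinhole camera model. A minimal sketch in plain Swift; the Intrinsics struct, the function names, and the sample values are illustrative assumptions, and this does not show any visionOS API for obtaining depth:

```swift
/// Pinhole camera intrinsics in pixels (illustrative type, not an ARKit API).
struct Intrinsics {
    var fx: Float, fy: Float   // focal lengths
    var cx: Float, cy: Float   // principal point
}

/// Back-projects pixel (u, v) with depth `z` (meters along the optical axis)
/// using x = (u - cx) * z / fx and y = (v - cy) * z / fy.
func unproject(u: Float, v: Float, depth z: Float, intrinsics k: Intrinsics) -> SIMD3<Float> {
    SIMD3<Float>((u - k.cx) * z / k.fx, (v - k.cy) * z / k.fy, z)
}

/// Converts a row-major depth image into a point cloud, skipping invalid depths.
func pointCloud(depth: [Float], width: Int, height: Int, intrinsics: Intrinsics) -> [SIMD3<Float>] {
    precondition(depth.count == width * height, "Depth buffer size mismatch")
    var points: [SIMD3<Float>] = []
    for v in 0..<height {
        for u in 0..<width {
            let z = depth[v * width + u]
            guard z.isFinite, z > 0 else { continue }
            points.append(unproject(u: Float(u), v: Float(v), depth: z, intrinsics: intrinsics))
        }
    }
    return points
}

// A 2x2 depth image with one invalid (zero) sample; values are made up.
let k = Intrinsics(fx: 2, fy: 2, cx: 1, cy: 1)
let cloud = pointCloud(depth: [2, 0, 2, 2], width: 2, height: 2, intrinsics: k)
```

The points come out in an image-style convention (x right, y down, z forward). A real pipeline would still need an axis flip and the camera pose to move them into world space.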
I'm experiencing an issue with QuickLook in iOS 18 where .reality files with audio playback are affected. When I open a .reality file that includes audio, the audio track plays twice: once from the moment the file is opened, and again from the start of the animation. This results in a duplicate audio playback.
I've tested this issue on multiple devices running iOS 16, 17, and 18, and the problem only occurs on iOS 18. I've tried restarting the devices and checking for any software updates, but the issue persists.
Steps to reproduce:
Open a .reality file with audio playback in QuickLook on an iOS 18 device.
Observe the audio playback.
Expected result:
The audio track should play only once, from the start of the animation.
Actual result:
The audio track plays twice, once from the moment the file is opened and again from the start of the animation.
Device and iOS version:
I've tested this issue on iPhone 12 Pro, iPhone 13 Pro running iOS 18, iPhone 13 running iOS 16 and iPhone 11 Pro running iOS 17.
Hello,
I want to know who is responsible for integrating new ARKit updates into Unreal Engine.
We want to use the AR functions of the Apple Vision Pro (AVP) with Unreal, but the ARKit version it uses is too old.
I’m developing a visionOS app using EnterpriseKit, and I need access to the main camera for QR code detection. I’m using the ARKit CameraFrameProvider and ARKitSession to capture frames, but I’m encountering this error when trying to start the camera stream:
ar_camera_frame_provider_t: Failed to start camera stream with error: <ar_error_t Error Domain=com.apple.arkit Code=100 "App not authorized.">
Context:
A visionOS app using EnterpriseKit for camera access and QR code scanning.
My Info.plist includes necessary permissions like NSCameraUsageDescription and NSWorldSensingUsageDescription.
I’ve added the com.apple.developer.arkit.main-camera-access.allow entitlement as per the official documentation here.
My app is allowed camera access as shown in the logs (Authorization status: [cameraAccess: allowed]), but the camera stream still fails to start with the “App not authorized” error.
I followed Apple’s WWDC 2024 sample code for accessing the main camera in visionOS from this session.
Sample of My Code:
import ARKit
import Combine  // ObservableObject is declared in Combine
import Vision

class QRCodeScanner: ObservableObject {
    private var arKitSession = ARKitSession()
    private var cameraFrameProvider = CameraFrameProvider()
    private var pixelBuffer: CVPixelBuffer?

    init() {
        Task {
            await requestCameraAccess()
        }
    }

    private func requestCameraAccess() async {
        await arKitSession.queryAuthorization(for: [.cameraAccess])
        do {
            try await arKitSession.run([cameraFrameProvider])
        } catch {
            print("Failed to start ARKit session: \(error)")
            return
        }
        let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions: [.left])
        // Use the first format if there is one; indexing formats[0] would crash on an empty list.
        guard let format = formats.first,
              let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: format) else { return }
        Task {
            for await cameraFrame in cameraFrameUpdates {
                guard let mainCameraSample = cameraFrame.sample(for: .left) else { continue }
                self.pixelBuffer = mainCameraSample.pixelBuffer
                // QR Code detection code here
            }
        }
    }
}
Things I’ve Tried:
Verified entitlements in both Info.plist and .entitlements files. I have added the com.apple.developer.arkit.main-camera-access.allow entitlement.
Confirmed camera permissions in the privacy settings.
Followed the official documentation and WWDC 2024 sample code.
Checked my provisioning profile to ensure it supports ARKit camera access.
Request:
Has anyone encountered this “App not authorized” error when accessing the main camera via ARKit in visionOS using EnterpriseKit? Are there additional entitlements or provisioning profile configurations I might be missing? Any help would be greatly appreciated! I haven't seen any official examples using new API for main camera access and no open source examples either.
If I import a USDZ model with blendMode set to alpha, occlusion does not work on iPhone with iOS 18. How should transparent materials and occlusion be used properly in the new RealityKit? Additionally, new artifacts have appeared when transparent objects overlap each other: the transparent surfaces do not blend, and parts of the model are simply not rendered.
Hello everyone, I'm a Computer Science student. My supervisor has given me some topics for my final year project, and one of them involves using Vision Pro for facial recognition—specifically, identifying a designated face to display specific information.
As a developer, my understanding of Vision Pro is quite limited. I've done some research online and found that Unity and Xcode are used as development tools. Traditionally, facial recognition is done using OpenCV.
However, I've come across articles stating that Apple does not allow facial recognition on Vision Pro for security reasons. I'd like to ask if that's true. Also, with visionOS 2 featuring object tracking and image tracking, could these methods potentially replace facial recognition?
Hi everyone, I want to add a new joint in addition to the joints that are provided by ARKit. For example, I would extract the positions of the wrist and elbow, then add a new joint between them in the middle of the arm. I can't find good documentation that explains ARKit well. If there is any other information that I can use, please share it with me. Thanks.
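As far as I know, the joint set in ARKit's skeleton is fixed, so an extra joint has to be derived from existing ones. A minimal sketch of the interpolation step in plain Swift, assuming the two joint positions have already been read in the same coordinate space; interpolatedJoint and the sample coordinates are hypothetical:

```swift
/// Returns a point on the segment from `start` to `end`.
/// t = 0 gives `start`, t = 1 gives `end`, t = 0.5 gives the midpoint.
func interpolatedJoint(from start: SIMD3<Float>, to end: SIMD3<Float>, t: Float = 0.5) -> SIMD3<Float> {
    start + (end - start) * t
}

// Example positions in meters (made-up values).
let elbow = SIMD3<Float>(0.0, 1.0, 0.0)
let wrist = SIMD3<Float>(0.5, 1.0, -0.5)
let midForearm = interpolatedJoint(from: elbow, to: wrist)
```

The same function can place a joint anywhere along the forearm by changing t, for example t: 0.25 for a point closer to the elbow.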
In lots of houses there are different levels that are still on the same floor. What I mean is that there are things like stairs at the entrance that only have a few steps, so both levels would basically count as the same story.
RoomPlan already does a nice job recognizing them during the scanning but after the StructureBuilder or the optimization step it is not really satisfying.
Has anyone managed to handle those cases? Or do you have to scan a specific way to capture such small differences within a level?
TLDR: Timeline does not play animation when Repeat Forever is checked.
Hi! I have created a timeline for my model that plays a built-in emphasize animation. Then I added a behavior to my model and set OnAddedToScene with an action to run that timeline. It works perfectly well on my device. But I want the timeline to loop. I realized that there's no loop option in the timeline, but I noticed that I can loop it if I insert it into another timeline (the loop checkbox shows up). So I did that and had my model's behavior run the outer timeline. But then the model doesn't play the animation as intended.
Note: I am not making a Vision Pro app, but an iOS app leveraging ARKit and RealityKit
Environment: iPhone 13 Pro Max with iOS 18.0
Code:
struct ARViewContainer: UIViewRepresentable {
    func makeUIView(context: Context) -> ARView {
        let arView = ARView(frame: .zero)
        arView.session.run()
        Task {
            do {
                let anchor = AnchorEntity(plane: .horizontal)
                let emojiScene = try await Entity(named: "SunglassesScene", in: bubbleAR
                anchor.addChild(emojiScene)
                arView.scene.addAnchor(anchor)
            } catch {
                print("Failed to load models: \(error)")
            }
        }
        return arView
    }
}
Thank you!
In my app, I have an ARView that has cameraMode set to nonAR.
I occasionally hide the ARView when it is not needed and reveal it again later.
While the ARView is hidden, I'd like to pause the animation to save iPhone battery life. I'd also like to do this when I know that animation in my scene has paused and the contents of the view, although still visible, are static.
This was possible using SceneKit, but I can't seem to find an equivalent way to do it using RealityKit.
At least as of iOS 18, a hidden ARView with an empty scene appears to use approximately 30% of the CPU.
How can I pause ARView so that it won't use the battery unnecessarily?
Thank you for considering this question.
I would like to implement the following but I am not sure if this is a supported use case based on the current documentation:
Run one ARKitSession with a WorldTrackingProvider in Swift for mixed immersion Metal rendering (to get the device anchor for the layer renderer drawable & view matrix)
Run another ARKitSession with a WorldTrackingProvider and a CameraFrameProvider in a different library (that is part of the same app) using the ARKit C API and using the transforms from the anchors in that session to render objects in the Swift application part.
In general, is this a supported use case or is it necessary to have one shared ARKitSession?
Assuming this is supported, will the (device) anchors from both WorldTrackingProviders reference the same world coordinate system?
Are there any performance downsides to having multiple ARKitSessions?
Thanks
Devices running iOS 18 using RealityKit do not seem to receive lighting supplied via ARKit Environment Texturing (https://developer.apple.com/documentation/arkit/arworldtrackingconfiguration/2977509-environmenttexturing).
Instead just a default IBL is used by RealityKit.
This happens with RealityView as well as ARView.
It also happens when I explicitly opt-in to environment texturing:
let worldTrackingConfig = ARWorldTrackingConfiguration()
worldTrackingConfig.environmentTexturing = .automatic
arView.session.run(worldTrackingConfig)
Even the Xcode AR Template has this issue.
I'm attaching a screenshot of the sample app running on iOS 18 where it's broken and from iOS 17 where it works as expected.
I hope this can get resolved quickly since I see it as a major regression.
Feedback ID: FB15091335
UPDATE:
It works on my older iPhone XS (iOS 18 22A5282m)
Broken on iPad Pro (11-inch) (3rd generation) (iPadOS 18.0 (22A5350a))
Maybe it's related to LiDAR?
Thank you!
iOS 17 (works):
iOS 18 (broken):
Hello
We are exploring the iOS 17 RoomPlan updates that allow for a custom ARSession to be passed into the RoomCaptureSession via the new initializer.
let roomCaptureSession = RoomCaptureSession(arSession: myARSession)
Currently we use our ARSession to extract sceneDepth from the ARFrames via the delegate callback. This works prior to activation of the RoomCaptureSession via session.run(configuration).
However, when we do call run on the RoomCaptureSession, sceneDepth is no longer present on the incoming ARFrames.
Are these mutually exclusive? Should we expect ARFrame depth data to be present when a RoomCaptureSession is running with the shared ARSession?
The RoomPlan API makes it possible to serialize and de-serialize CapturedRoom objects. This opens up the possibility to modify a CapturedRoom (e.g. deleting surfaces/objects) in a de-serialized state and serialize it as a new CapturedRoom. All modified attributes are loaded accordingly, so far so good.
My problem starts with the StructureBuilder and its merge function capturedStructure().
This function ignores any modifications to attributes of a CapturedRoom. The only data that is considered is encoded in the CoreModel attribute (which is not mentioned in the official documentation).
If someone has more information or a working solution about how to modify CapturedRooms please let me know.
Additionally, if there is documentation for the CoreModel attribute anywhere, please post a link here.
Hello,
Has anyone had success with implementing object tracking in Unity or adding native tracking capability to the visionOS project built from Unity?
I am working on an application for Vision Pro mainly in Unity using PolySpatial. The application requires me to track objects and make decisions based on the tracked object's location. I was able to create an object tracking application in native Swift, but have not yet been able to combine it with my Unity project. Each separate project (the main Unity app using PolySpatial and the native Swift app) builds successfully and can be deployed onto Vision Pro.
I know that PolySpatial and ARFoundation do not support ARKit's object tracking feature for Vision Pro as of today, and that they only support image tracking inside Unity. For that reason I have been exploring different ways of creating a bridge for two-way interaction between the native tracking functionality and the other functionality in Unity.
Below are the methods I have tried so far, without success:
Package the tracking functionality as a Swift plugin, access it in Unity, and then build for Vision Pro: I can create packages and access them for simple exposed variables and methods, but not for outputs and methods from ARKit, which throw dependency errors while trying to make the Swift package.
Build the project from Unity for Vision Pro, expose a boolean to start/stop tracking that can be read by the native code, and then carry the tracking classes into the built project. In this approach I keep getting an error that says _TrackingStateChanged cannot be found. TrackingStateChanged is the native function that receives the bool toggled by the Unity button press:
using System.Runtime.InteropServices;

public class UnityBridge
{
    [DllImport("__Internal")]
    private static extern void TrackingStateChanged(bool isTracking);

    public static void NotifyTrackingState()
    {
        // Call the Swift method
        TrackingStateChanged(TrackingStartManager.IsTrackingActive());
    }
}
This seems to be translated to C++ code in the IL2CPP output from Unity, and even though I made sure that all necessary packages were added to the target, I keep receiving this error from the UnityFramework plugin:
Undefined symbol: _TrackingStateChanged
I have considered extending the current Image Tracking approach in ARFoundation to include object tracking, but that seems to be too complicated for my use case and time frame for now.
The final resort will be to forego Unity implementation and do everything in native code. However, I really want to be able to use Unity's conveniences and I have very limited experience with Swift development.
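An "Undefined symbol: _TrackingStateChanged" link error usually means the linker found no C-visible function with exactly that name. A Swift function is not exported under its plain name by default, so one thing worth checking is whether the Swift side declares a C entry point. A minimal sketch, assuming the Swift file is compiled into the same target that UnityFramework links against; the TrackingBridge type is hypothetical, and @_cdecl is an underscored Swift attribute rather than an officially stable one:

```swift
/// Holds the state pushed from Unity (hypothetical type for illustration).
final class TrackingBridge {
    static var isTracking = false
}

/// Exposes an unmangled C symbol named `TrackingStateChanged`, the name the
/// C# declaration `[DllImport("__Internal")] TrackingStateChanged` refers to.
@_cdecl("TrackingStateChanged")
public func trackingStateChanged(_ isTracking: Bool) {
    TrackingBridge.isTracking = isTracking
    // Start or stop the native object-tracking code here.
}

// Calling it from Swift behaves like the call arriving from Unity.
trackingStateChanged(true)
```

Whether this resolves the error depends on how the project is set up; the symbol also has to end up in a binary that the UnityFramework target actually links.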
We've been using our app for the past year, and a user reported today that after three minutes, their phone starts getting hot and the screen dims. They are running iOS 17.6.1 on an iPhone 14 max. No one else is seeing an issue, but with the posts online about 17.6.1 battery drain, I wonder if our AR app is somehow more sensitive to the issue.