We've been using our app for the past year, and a user reported today that after three minutes of use, their phone starts getting hot and the screen dims. He is running iOS 17.6.1 on an iPhone 14 max. No one else is seeing the issue, but given the posts online about battery drain on 17.6.1, I wonder if our AR app is somehow more sensitive to it.
Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Posts under ARKit tag
Hello everyone,
I'm developing an app that allows users to share and enjoy experiences together while they are in the same physical location. Despite trying several approaches, I haven't been able to achieve the desired functionality. If anyone has insights on how to make this possible or is interested in joining the project, I would greatly appreciate your help!
Hi everyone,
I'm developing an ARKit app using RealityKit and encountering an issue where a video displayed on a 3D plane shows up as a pink screen instead of the actual video content.
Here's a simplified version of my setup:
    func createVideoScreen(video: AVPlayerItem, canvasWidth: Float, canvasHeight: Float, aspectRatio: Float, fitsWidth: Bool = true) -> ModelEntity {
        let width = (fitsWidth) ? canvasWidth : canvasHeight * aspectRatio
        let height = (fitsWidth) ? canvasWidth * (1/aspectRatio) : canvasHeight
        let screenPlane = MeshResource.generatePlane(width: width, depth: height)
        let videoMaterial: Material = createVideoMaterial(videoItem: video)
        let videoScreenModel = ModelEntity(mesh: screenPlane, materials: [videoMaterial])
        return videoScreenModel
    }

    func createVideoMaterial(videoItem: AVPlayerItem) -> VideoMaterial {
        let player = AVPlayer(playerItem: videoItem)
        let videoMaterial = VideoMaterial(avPlayer: player)
        return videoMaterial
    }
Despite following the standard process, the video plane renders pink. Has anyone encountered this before, or does anyone know what might be causing it?
Thanks in advance!
This effect was shown in https://developer.apple.com/wwdc24/10153 (demonstrated at 28:00): you can add coordinates by looking at a spot on the ground and clicking. However, I don't understand the presenter's explanation very well. Could someone give me a basic solution? I would be very grateful!
Coordinate conversion was covered in https://developer.apple.com/wwdc24/10153 (demonstrated at 22:00), where an entity jumps out of a volume into the surrounding space. However, I don't understand the presenter's explanation very well. Could someone give me a basic solution? I would be very grateful!
In this code:
It contains a physical collision reaction between virtual objects and the real world, which is implemented by creating a mesh with physics components. However, I don't understand the documentation very well. Can anyone give me a solution? Thank you!
In RealityView, I want to move entity A to the position of entity B, but I can't determine the coordinates of entity B (for example, when entity B is tracking the hand). What's the solution?
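A minimal sketch of one possible approach, assuming both entities live in the same RealityKit scene: read entity B's world-space position with `position(relativeTo: nil)` and write it to entity A with `setPosition(_:relativeTo:)`. The helper name `moveToWorldPosition` is hypothetical. One caveat, which is an assumption about the setup described above: if entity B is a hand-tracked `AnchorEntity` on visionOS, its transform may not be readable by the app unless tracking data access has been set up (for example through a `SpatialTrackingSession` on visionOS 2), in which case the position read here may not reflect the hand.

```swift
import RealityKit

/// Hypothetical helper: moves `mover` to the world-space position of `target`.
/// Passing `nil` as the reference entity means "world space" in RealityKit.
func moveToWorldPosition(_ mover: Entity, of target: Entity) {
    // Read the target's position in world space, regardless of its parent.
    let worldPosition = target.position(relativeTo: nil)
    // Write that position to the mover, also expressed in world space.
    mover.setPosition(worldPosition, relativeTo: nil)
}
```

Working in world space avoids having to know which parent each entity is attached to.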
Lights in RealityView only affect virtual objects. I would like the light to be cast onto the real world as well. Is there an API that can do this?
    arScnView = ARSCNView(frame: CGRect.zero, options: nil)
    arScnView.delegate = self
    arScnView.automaticallyUpdatesLighting = true
    arScnView.allowsCameraControl = true
    arSession = arScnView.session
    arSession.delegate = self
    config = ARWorldTrackingConfiguration()
    config.sceneReconstruction = .meshWithClassification
    config.environmentTexturing = .automatic

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        anchors.forEach({ anchor in
            if let meshAnchor = anchor as? ARMeshAnchor {
                let node = meshAnchor.toSCNNode()
            }
            if let environmentProbeAnchor = anchor as? AREnvironmentProbeAnchor {
                // Can I retrieve the texture map corresponding to ARMeshAnchor from Environment Probe Anchor?
                // Or how can I retrieve the texture map corresponding to ARMeshAnchor?
            }
        })
    }
How can I scan a 3D scene and save it as USDZ?
I want to achieve the following scenario?
From my early testing it seems like the object tracking works best for static objects. For example, if I am holding something in my hand the object tracker is slow to update.
Is there anything that can be modified to decrease the tracking latency?
I noticed that the Enterprise API has some override features. Is this something that can only be done using the Enterprise API?
The structure builder provides walls and floors for each captured story, but not a ceiling. In my case the scanned geometry needs to be closed, for example so that objects can be placed on the ceiling, so it is important to have an estimated ceiling for the different rooms within a story.
Is there any information on whether Apple has something like this on the roadmap? I think it could open up opportunities, especially for industrial applications of the API.
If somebody has more insights on this topic, please share :)
The following RealityView ModelEntity animated text works in visionOS 1.0. In visionOS 2.0, when running the same piece of code, the model entity move duration does not seem to work. Are there changes to the way it works that I am missing? Thank you in advance.
    RealityView { content in
        let textEntity = generateMovingText()
        _ = try? await arkitSession.run([worldTrackingProvider])
    } update: { content in
        guard let entity = content.entities.first(where: { $0.name == .textEntityName}) else { return }
        if let pose = worldTrackingProvider.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) {
            entity.position = .init(
                x: pose.originFromAnchorTransform.columns.3.x,
                y: pose.originFromAnchorTransform.columns.3.y,
                z: pose.originFromAnchorTransform.columns.3.z
            )
        }
        if let modelEntity = entity as? ModelEntity {
            let rotation = Transform(rotation: simd_quatf(angle: -.pi / 6, axis: [1, 0, 0])) // Adjust angle as needed
            modelEntity.transform = Transform(matrix: rotation.matrix * modelEntity.transform.matrix)
            let animationDuration: Float = 60.0 // Adjust the duration as needed
            let moveUp = Transform(scale: .one, translation: [0, 2, 0])
            modelEntity.move(to: moveUp, relativeTo: modelEntity, duration: TimeInterval(animationDuration), timingFunction: .linear)
        }
    }
The source is available at the following:
I’m developing an app for Vision Pro and have encountered an issue related to the UI layout and model display. Here's a summary of the problem:
I created an anchor window to display text and models in the hand menu UI.
While testing on my Vision Pro, everything works as expected; the text and models do not overlap and appear correctly.
However, after pushing the changes to GitHub and having my client test it, the text and models are overlapping.
I’m using Reality Composer Pro to load models and set them in the hand menu UI.
All pins are attached to attachmentHandManu, and attachmentHandManu is set to track the hand and show the elements in the hand menu.
I ensure that the attachmentHandManu tracks the hand properly and displays the UI components correctly in my local tests.
What could be causing the text and models to overlap in the client’s environment but not in mine? Are there any specific settings or configurations I should verify to ensure consistent behavior across different environments? Additionally, what troubleshooting steps can I take to resolve this issue?
Hello, I have received an Enterprise.license file from Apple and I am trying to implement main camera access for Vision Pro by following https://developer.apple.com/videos/play/wwdc2024/10139/. Here is my camera function:
    func takePicture() async {
        let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions:[.left])
        let cameraFrameProvider = CameraFrameProvider()
        var arKitSession = ARKitSession()
        var pixelBuffer: CVPixelBuffer?
        await arKitSession.queryAuthorization(for: [.cameraAccess])
        do {
            try await arKitSession.run([cameraFrameProvider])
        } catch {
            return
        }
        guard let cameraFrameUpdates =
            cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else {
            return
        }
        for await cameraFrame in cameraFrameUpdates {
            guard let mainCameraSample = cameraFrame.sample(for: .left) else {
                continue
            }
            pixelBuffer = mainCameraSample.pixelBuffer
            let image = UIImage(ciImage: CIImage(cvPixelBuffer: pixelBuffer!))
            UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)
        }
    }
My problem is that the debugger stops at this line:
    guard let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else { return }
Why does this happen, and what else do I need to do?
Hi everyone,
I'm working on an AR application where I need to accurately locate the center of the pupil and measure anatomical distances between the pupil and eyelids. I’ve been using ARKit’s face tracking, but I’m having trouble pinpointing the exact center of the pupil.
My Questions:
Locating Pupil Center in ARKit: Is there a reliable way to detect the exact center of the pupil using ARKit? If so, how can I achieve this?
Framework Recommendation: Given the need for fine detail in measurements, would ARKit be sufficient, or would it be better to use the Vision framework for more accurate 2D facial landmark detection? Alternatively, would a hybrid approach, combining Vision for precision and ARKit for 3D tracking, be more effective?
What I've Tried:
Using ARKit’s ARFaceAnchor to detect face landmarks, but the results for the pupil position seem imprecise for my needs.
Considering Vision for 2D detection, but concerned about integrating it into a 3D AR experience.
Any insights, code snippets, or guidance would be greatly appreciated!
Thanks in advance!
I'm not able to get any 3D object to appear in an ARView.
    struct ARViewContainer: UIViewRepresentable {
        var trackingState: ARCamera.TrackingState? = nil

        func makeUIView(context: Context) -> ARView {
            // Create the view.
            let view = ARView(frame: .zero)
            // Set the coordinator as the session delegate.
            view.session.delegate = context.coordinator
            let anchor = AnchorEntity(plane: .horizontal)
            let box = ModelEntity(mesh: MeshResource.generateBox(size: 0.3), materials: [SimpleMaterial(color: .red, isMetallic: true)])
            box.generateCollisionShapes(recursive: true)
            // Return the view.
            return view
        }

        final class Coordinator: NSObject, ARSessionDelegate {
            var parent: ARViewContainer

            init(_ parent: ARViewContainer) {
                self.parent = parent
            }

            func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
                print("Camera tracking state: \(camera.trackingState)")
                parent.trackingState = camera.trackingState
            }
        }

        func makeCoordinator() -> Coordinator {
            Coordinator(self)
        }

        func updateUIView(_ uiView: ARView, context: Context) { }
    }
The view loads correctly, but nothing appears. I also tried creating the 3D object in updateUIView:
    func updateUIView(_ uiView: ARView, context: Context) {
        let anchor = AnchorEntity(plane: .horizontal)
        let box = ModelEntity(mesh: MeshResource.generateBox(size: 0.3), materials: [SimpleMaterial(color: .red, isMetallic: true)])
        box.generateCollisionShapes(recursive: true)
        print("Added into the view")
    }
The print statement runs, but there is still no object in the ARView. Is this a bug, or what am I missing?
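One thing worth checking, going only by the snippets as posted: the box is created but never attached to the anchor, and the anchor is never added to the view's scene. If those calls are also missing from the real project, that alone would explain an empty view. Below is a minimal sketch of the missing wiring; the helper name `makeBoxAnchor` is hypothetical. Note also that content anchored to a horizontal plane should only show up once ARKit has actually detected such a plane.

```swift
import RealityKit
import UIKit

/// Hypothetical helper: builds a horizontal-plane anchor with a red box as its child.
func makeBoxAnchor() -> AnchorEntity {
    let anchor = AnchorEntity(plane: .horizontal)
    let box = ModelEntity(
        mesh: MeshResource.generateBox(size: 0.3),
        materials: [SimpleMaterial(color: .red, isMetallic: true)]
    )
    box.generateCollisionShapes(recursive: true)
    // Attach the box to the anchor; an entity with no parent is not part of any scene.
    anchor.addChild(box)
    return anchor
}

// Usage inside makeUIView(context:), before `return view`:
// view.scene.addAnchor(makeBoxAnchor())
```

Adding the anchor once in `makeUIView` is usually preferable to doing it in `updateUIView`, which SwiftUI can call many times.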
Can we get the raw sensor data from the Apple Vision Pro?
Is there a way to clone an entity in RealityView as many times as I want? If there is, please let me know. Thank you!
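A minimal sketch of one way to do this, assuming the goal is simply N independent copies of an existing entity: RealityKit's `clone(recursive:)` copies an entity (and, with `recursive: true`, its children). The helper name `makeCopies` and the `spacing` parameter are hypothetical and only there to keep the copies from overlapping.

```swift
import RealityKit

/// Hypothetical helper: returns `count` clones of `template`,
/// offset along the x axis so they do not sit on top of each other.
func makeCopies(of template: Entity, count: Int, spacing: Float = 0.2) -> [Entity] {
    (0..<count).map { index in
        // Deep copy: children and components are duplicated as well.
        let copy = template.clone(recursive: true)
        copy.position.x += Float(index) * spacing
        return copy
    }
}

// Usage inside a RealityView make closure (names are illustrative):
// for copy in makeCopies(of: original, count: 5) { content.add(copy) }
```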
Steps to Reproduce:
Create a SwiftUI view that initializes an ARKit session and a camera frame provider.
Attempt to run the ARKit session and retrieve camera frames.
Extract the intrinsics and extrinsics matrices from the camera frame’s sample data.
Attempt to project a 3D point from the world space onto the 2D screen using the retrieved camera parameters.
Encounter issues due to lack of detailed documentation on the correct usage and structure of the intrinsics and extrinsics matrices.
    struct CodeLevelSupportView: View {
        private var vm = CodeLevelSupportViewModel()
        var body: some View {
            RealityView { realityViewContent in }
            .onAppear {
                vm.receiveCamera()
            }
        }
    }

    class CodeLevelSupportViewModel {
        let cameraSession = CameraFrameProvider()
        let arSession = ARKitSession()

        init() {
            Task {
                await arSession.requestAuthorization(for: [.cameraAccess])
            }
        }

        func receiveCamera() {
            Task {
                do {
                    try await arSession.run([cameraSession])
                    guard let sequence = cameraSession.cameraFrameUpdates(for: .supportedVideoFormats(for: .main, cameraPositions: [.left])[0]) else {
                        print("failed to get cameraAccess authorization")
                        return
                    }
                    for try await frame in sequence {
                        guard let sample = frame.sample(for: .left) else {
                            print("failed to get camera sample")
                            return
                        }
                        let leftEyeScreenImage:CVPixelBuffer = sample.pixelBuffer
                        let leftEyeViewportWidth:Int = CVPixelBufferGetWidth(leftEyeScreenImage)
                        let leftEyeViewportHeight:Int = CVPixelBufferGetHeight(leftEyeScreenImage)
                        let intrinsics = sample.parameters.intrinsics
                        let extrinsics = sample.parameters.extrinsics
                        let oneMeterInFront:SIMD3<Float> = .init(x: 0, y: 0, z: -1)
                        projectWorldLocationToLeftEyeScreen(worldLocation: oneMeterInFront, intrinsics: intrinsics, extrinsics: extrinsics, viewportSize: (leftEyeViewportWidth,leftEyeViewportHeight))
                    }
                } catch {
                }
            }
        }

        // Once implemented, this function should return a CGPoint? giving the position of worldLocation in the LeftEyeViewport. If worldLocation is not visible in the LeftEyeViewport (out of bounds), it should return nil.
        func projectWorldLocationToLeftEyeScreen(worldLocation:SIMD3<Float>,intrinsics:simd_float3x3,extrinsics:simd_float4x4,viewportSize:(width:Int,height:Int)) {
            // The API documentation does not describe the structure of intrinsics and extrinsics, which makes this function hard to implement.
        }
    }
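For the projection step, here is a hedged sketch of a standard pinhole projection. It is not a confirmed description of the visionOS camera model. It assumes that (1) `intrinsics` uses the common column-major layout with focal lengths at `columns.0.x` and `columns.1.y` and the principal point at `columns.2.x` and `columns.2.y`, (2) the 4x4 matrix passed in maps world coordinates into camera coordinates (if `extrinsics` goes the other way, pass its inverse), and (3) the camera looks along +z with image y pointing down. Any of these may differ on device, so check the result against a known point. The function name `projectPinhole` and the parameter `worldToCamera` are hypothetical.

```swift
import simd
import CoreGraphics

/// Hypothetical pinhole projection under the assumptions stated above.
/// Returns nil when the point is behind the camera or outside the viewport.
func projectPinhole(worldLocation: SIMD3<Float>,
                    intrinsics: simd_float3x3,
                    worldToCamera: simd_float4x4,
                    viewportSize: (width: Int, height: Int)) -> CGPoint? {
    // Transform the world point into camera space (homogeneous coordinates).
    let p = worldToCamera * SIMD4<Float>(worldLocation, 1)
    // ASSUMPTION: +z is in front of the camera; reject points at or behind it.
    guard p.z > 0 else { return nil }

    // ASSUMPTION: column-major K = [fx 0 0], [0 fy 0], [cx cy 1].
    let fx = intrinsics.columns.0.x
    let fy = intrinsics.columns.1.y
    let cx = intrinsics.columns.2.x
    let cy = intrinsics.columns.2.y

    // Perspective divide, then scale by focal length and shift by principal point.
    let u = fx * p.x / p.z + cx
    let v = fy * p.y / p.z + cy

    // Reject points that fall outside the image bounds.
    guard u >= 0, u < Float(viewportSize.width),
          v >= 0, v < Float(viewportSize.height) else { return nil }
    return CGPoint(x: CGFloat(u), y: CGFloat(v))
}
```

If the device camera space turns out to use -z as the forward direction, the sign of the depth test and of the divide would need to be flipped accordingly.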
I’m playing around with making a fully immersive, multiplayer air-to-air dogfighting game, but I’m having trouble figuring out how to attach a camera to an entity.
I have a plane that’s controlled with a gamepad, and I want the camera’s position to be pinned to that entity as it moves about space, while maintaining the user's ability to look around.
Is this possible?
From my understanding, the current state of SceneKit, ARKit, and RealityKit is a bit confusing as to what can and cannot be done.

Full control of the camera
Not sure if it can use RealityKit's ECS system.
2D window; missing full immersion.

Full control of the camera*, but only on non-Vision Pro devices, since visionOS doesn't have an ARView.
Has RealityKit's ECS system
2D window; missing full immersion.

Camera is pinned to the device's position and orientation
Has RealityKit's ECS system
Allows full immersion