visionOS 3D tap location offset by ~0.35m?

I have a simple visionOS app that uses a RealityView to map floors and ceilings using PlaneDetectionProvider and PlaneAnchors.

I can look at a location on the floor or ceiling, tap, and place an object at that location (I am currently placing a small cube with X-Y-Z axes sticking out at the location).
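
For context, the plane mapping is set up roughly like the sketch below (simplified, not my exact code; the PlaneMapper type and run(root:) method are just illustrative). Each detected plane gets a collision shape and an InputTargetComponent so the spatial tap gesture can land on the floor or ceiling. Note that plane detection requires world-sensing authorization (NSWorldSensingUsageDescription).

import ARKit
import RealityKit

@MainActor
final class PlaneMapper {
    private let session = ARKitSession()
    private let planeData = PlaneDetectionProvider(alignments: [.horizontal])
    private var planeEntities: [UUID: Entity] = [:]

    /// Run plane detection and mirror each detected plane as a tappable entity under `root`.
    func run(root: Entity) async throws {
        try await session.run([planeData])
        for await update in planeData.anchorUpdates {
            let anchor = update.anchor
            switch update.event {
            case .added, .updated:
                let entity: Entity
                if let existing = planeEntities[anchor.id] {
                    entity = existing
                } else {
                    entity = Entity()
                    planeEntities[anchor.id] = entity
                    root.addChild(entity)
                }
                // Match the plane's pose and approximate its extent with a thin
                // collision box so taps on the floor/ceiling hit this entity.
                entity.transform = Transform(matrix: anchor.originFromAnchorTransform)
                let extent = anchor.geometry.extent
                let shape = ShapeResource.generateBox(width: extent.width,
                                                      height: 0.001,
                                                      depth: extent.height)
                entity.components.set(CollisionComponent(shapes: [shape], isStatic: true))
                entity.components.set(InputTargetComponent())
            case .removed:
                planeEntities[anchor.id]?.removeFromParent()
                planeEntities[anchor.id] = nil
            }
        }
    }
}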

The tap locations are consistently off by about 0.35m along the horizontal plane from where I was looking (they are never off vertically).

Has anyone else run into the issue of a spatial tap gesture resulting in a location offset from where they are looking?

And if I move to different locations, the offset stays the same in real space, so it doesn't appear to be tied to the orientation of the Apple Vision Pro (e.g. it isn't always a little to the left of where I was looking, relative to the headset).

Attached is an image showing this. I focused on the corner of the carpet (yellow circle), tapped my fingers to trigger a tap gesture in RealityView, extracted the location, and placed a purple cube at that location.

I stood in 4 different locations (where the orange squares are), looked at the corner of the rug (yellow circle), and tapped. All 4 purple cubes were placed at about the same location, roughly 0.35m away from where I was looking.

Here is how I captured the tap gesture and extracted the 3D location:

var myTapGesture: some Gesture {
    SpatialTapGesture()
        .targetedToAnyEntity()
        .onEnded { event in
            let location3D = event.convert(event.location3D, from: .global, to: .scene)
            let entity = event.entity
            model.handleTap(location: location3D, entity: entity)
        }
}

Here is how I set the position of the purple cube:

func handleTap(location: SIMD3<Float>, entity: Entity) {
    let positionEntity = Entity()
    positionEntity.setPosition(location, relativeTo: nil)
    ...
}
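
The elided part just creates a small visual marker and adds it to the scene, roughly like this (a simplified sketch, not my exact code; the X-Y-Z axis geometry is omitted and realityViewContent is assumed to be a stored reference to the RealityView's content):

func handleTap(location: SIMD3<Float>, entity: Entity) {
    // Small purple cube used as a visual marker at the tapped location.
    let marker = ModelEntity(
        mesh: .generateBox(size: 0.05),
        materials: [SimpleMaterial(color: .purple, isMetallic: false)]
    )
    realityViewContent?.add(marker)
    // relativeTo: nil positions the marker in world space, i.e. the space
    // the converted tap location is expected to be in.
    marker.setPosition(location, relativeTo: nil)
}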

Do you have more SwiftUI views inside your ImmersiveSpace (aside from the RealityView)? That could offset the entire RealityView.

No SwiftUI views inside the ImmersiveSpace. I only have Entity and ModelEntity instances created manually with RealityKit.

I'll try some additional experiments. I think I will programmatically place a small object 1.5 meters in front of me (but not make it tappable), look at it, and tap (I assume the gaze will pass through it and hit the floor below), and then compare the event's location3D with the object's known position.

I made the code as simple as possible: a single 5cm target square placed at location (0, 0, -1.5).

I then tapped on the target while standing in several locations, and the reported tap location was consistently off along the Z axis by about 0.4m.

Here are several of the reported tap locations (note the Z values, which should be near -1.5):

tap location: SIMD3<Float>(0.0067811073, 0.019996116, -1.1157947), name: target

tap location: SIMD3<Float>(-0.00097223074, 0.019996116, -1.1036792), name: target

tap location: SIMD3<Float>(0.0008024718, 0.019995179, -1.1074299), name: target

tap location: SIMD3<Float>(-0.009804221, 0.019996116, -1.0694565), name: target

tap location: SIMD3<Float>(-0.0037206858, 0.019995492, -1.0778457), name: target

tap location: SIMD3<Float>(-0.009298846, 0.019996116, -1.0772702), name: target

Here is the code to set up the RealityView:

import SwiftUI
import RealityKit
import RealityKitContent

struct ImmersiveView: View {
    @StateObject var model = MyModel()
    
    /// Spatial tap gesture that tells the model the tap location.
    var myTapGesture: some Gesture {
        SpatialTapGesture()
            .targetedToAnyEntity()
            .onEnded { event in
                let location3D = event.convert(event.location3D, from: .global, to: .scene)
                let entity = event.entity
                model.handleTap(location: location3D, entity: entity)
            }
    }
    
    var body: some View {
        RealityView { content in
            model.setupContentEntity(content: content)
        }
        .gesture(myTapGesture)
    }
}

Here is the model code:

import Foundation
import SwiftUI
import RealityKit
import RealityKitContent
import ARKit
import os.log

@MainActor class MyModel: ObservableObject {
    private var realityViewContent: RealityViewContent?
    
    /// Capture RealityViewContent and create target
    ///
    /// - Parameter content: container for all RealityView content
    func setupContentEntity(content: RealityViewContent) {
        self.realityViewContent = content
        placeTargetObject()
    }
    
    /// Place a small red target at position 0, 0, -1.5
    ///
    /// I will look at this position and tap my fingers. The tap location
    /// should be near the same position (0, 0, -1.5)
    func placeTargetObject() {
        guard let realityViewContent else { return }
        let width: Float = 0.05
        let height: Float = 0.02
        let x: Float = 0
        let y: Float = 0
        let z: Float = -1.5
        
        // Create red target square
        let material = SimpleMaterial(color: .red, isMetallic: false)
        let mesh = MeshResource.generateBox(width: width, height: height, depth: width)
        let target = ModelEntity(mesh: mesh, materials: [material])
        
        // Add collision and target component to make it tappable
        let shapeBox = ShapeResource.generateBox(width: width, height: height, depth: width)
        let collision = CollisionComponent(shapes: [shapeBox], isStatic: true)
        target.collision = collision
        target.components.set(InputTargetComponent())
        
        // Set name, position, and add it to scene
        target.name = "target"
        target.setPosition(SIMD3<Float>(x, y + height/2, z), relativeTo: nil)
        realityViewContent.add(target)
    }
    
    /// Respond to the user tapping on an object by printing name of entity and tap location
    ///
    /// - Parameters:
    ///   - location: location of tap gesture
    ///   - entity: entity that was tapped
    func handleTap(location: SIMD3<Float>, entity: Entity) {
        os_log("tap location: \(location), name: \(entity.name, privacy: .public)")
    }
}

Example of the small red target:
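
To make the discrepancy explicit, a variation of handleTap along these lines (illustrative only; the expected position is hard-coded to the top center of the target placed in placeTargetObject()) logs the offset directly:

/// Variation of handleTap that also logs the offset from the target's known
/// top-center position (the 2cm-tall target's top face sits at y = 0.02).
func handleTap(location: SIMD3<Float>, entity: Entity) {
    let expected = SIMD3<Float>(0, 0.02, -1.5)
    let offset = location - expected
    os_log("tap location: \(location), offset from target: \(offset), name: \(entity.name, privacy: .public)")
}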

Accepted Answer (from Vision Pro Engineer)

Can you try with let location3D = event.convert(event.location3D, from: .local, to: .scene)? SpatialTapGesture is initialised by default with the local coordinate space: https://developer.apple.com/documentation/swiftui/spatialtapgesture/init(count:coordinatespace:)-75s7q
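
For clarity, here is the same gesture with that single change applied (only the from: argument differs from the code above):

var myTapGesture: some Gesture {
    SpatialTapGesture()
        .targetedToAnyEntity()
        .onEnded { event in
            // Convert from the gesture's default .local coordinate space into scene space.
            let location3D = event.convert(event.location3D, from: .local, to: .scene)
            model.handleTap(location: location3D, entity: event.entity)
        }
}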

I have the same problem. Can I have your code?
