Post not yet marked as solved
Hi! With the following code I am able to receive the location of taps in my RealityView.
How do I find out which of my entities was tapped (in order to trigger an animation or movement)? Unfortunately, I was not able to find anything comparable to ARView's entity(at:). Am I missing something, or is this not possible in the current visionOS beta?
struct ImmersiveView: View {
    var tap: some Gesture {
        SpatialTapGesture()
            .onEnded { event in
                print("Tapped at \(event.location)")
            }
    }

    var body: some View {
        RealityView { content in
            let anchor = AnchorEntity(.plane(.horizontal, classification: .table, minimumBounds: [0.3, 0.3]))
            // adding some entities here...
            content.add(anchor)
        }
        .gesture(tap.targetedToAnyEntity())
    }
}
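For what it's worth, a minimal sketch (untested on device): if the .targetedToAnyEntity() modifier is applied to the gesture itself, the value delivered to onEnded should be an EntityTargetValue, which carries the hit entity directly:

```swift
// Sketch: with .targetedToAnyEntity(), the gesture value is an
// EntityTargetValue exposing the tapped entity.
var tap: some Gesture {
    SpatialTapGesture()
        .targetedToAnyEntity()
        .onEnded { value in
            // value.entity is the Entity that was hit;
            // start an animation or movement on it here.
            print("Tapped entity: \(value.entity.name)")
        }
}
```

With this variant, the view modifier in body would become .gesture(tap) rather than .gesture(tap.targetedToAnyEntity()).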
Post not yet marked as solved
World Anchor from SpatialTapGesture ??
At 19:56 in the video, it's mentioned that we can use a SpatialTapGesture to "identify a position in the world" to make a world anchor.
Which API calls are utilized to make this happen?
World anchors are created from 4x4 matrices, and a SpatialTapGesture doesn't seem to generate one of those.
Any ideas?
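One possible approach, sketched below under assumptions: the tap's 3-D location can be converted into RealityKit scene space and used as the translation of a 4x4 matrix (identity rotation) for WorldAnchor(originFromAnchorTransform:). The view name and the pre-existing `worldTracking` provider are hypothetical:

```swift
import SwiftUI
import RealityKit
import ARKit

// Sketch (untested). Assumes a WorldTrackingProvider named `worldTracking`
// is already running in an ARKitSession elsewhere in the app.
struct TapToAnchorView: View {
    let worldTracking: WorldTrackingProvider

    var body: some View {
        RealityView { content in
            // ... add entities here ...
        }
        .gesture(
            SpatialTapGesture()
                .targetedToAnyEntity()
                .onEnded { value in
                    // Convert the tap's 3-D location into scene space.
                    let position = value.convert(value.location3D, from: .local, to: .scene)
                    // Translation-only transform; rotation left as identity.
                    let matrix = Transform(translation: position).matrix
                    let anchor = WorldAnchor(originFromAnchorTransform: matrix)
                    Task { try? await worldTracking.addAnchor(anchor) }
                }
        )
    }
}
```

Note the orientation question is left open here; a tap only identifies a position, so the anchor's rotation would need to come from elsewhere (e.g. facing the user or aligned to a surface).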
Post not yet marked as solved
In ARKit for iPad, I could 1) build a mesh on top of the real world and 2) request a people occlusion map for use in my application, so people could move behind or in front of virtual content via compositing. However, in visionOS there is no ARFrame image to pass to the function that would generate the occlusion data. Is it possible to do people occlusion in visionOS? If so, how is it done: through a data provider, or automatically when passthrough is enabled? If it's not possible, is this something that might have a solution in future updates as the platform develops? Being able to combine virtual content and the real world, with people able to interact with the content convincingly, is a really important aspect of AR, so it would make sense for this to be possible.
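For reference, the iPad-era setup the post describes looks roughly like this (this is the real ARKit-for-iOS API; whether an equivalent exists on visionOS is exactly the open question):

```swift
import ARKit

// iOS/iPadOS ARKit: request person segmentation with depth so ARKit
// composites people in front of or behind virtual content.
let configuration = ARWorldTrackingConfiguration()
if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
    configuration.frameSemantics.insert(.personSegmentationWithDepth)
}
// arView.session.run(configuration)
```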
Post not yet marked as solved
ARKit is not building "polygonEdge" items with correct Edge enumeration values. It appears to find an edge belonging to the set Top/Bottom/Left/Right, but leaves garbage in the field that should identify where the edge belongs, which causes confusion later during processing. I attached the data for the polygon edges as displayed in the debugger panel. When the debugger encounters an edge type it does not recognize, it dumps the raw value of the field; if the value is valid, the enumeration label is printed instead. That is why I believe garbage is being passed along: the edge classification processing bails out early, leaving the edge's attributes without a known classification.
This is from a bug report about the Xcode debugger pane. When running a debug session, the data for the polygonEdges field is displayed, and for entries containing unknown enumeration values only a raw value appears where the label should be. With "raw data" selected, one would presumably see the hex representing the value behind the labeled enumerated field of the structure, but that is not what shows. Attached is a somewhat lengthy sample of the data for the field. The fact that the values are invalid will be taken up with those responsible for the ARKit implementation in another venue.
It appears that formatting preferences really don't mean anything and are ignored.
([RoomPlan.CapturedRoom.Surface.Edge]) polygonEdges = 1456 values {
[0] = (0xa9)
[1] = (0xc5)
[2] = (0xa2)
[3] = (0xe1)
[4] = right
[5] = top
[6] = top
[7] = top
[8] = (0xb2)
[9] = (0x10)
[10] = top
[11] = top
[12] = top
[13] = top
[14] = top
[15] = top
[16] = (0x40)
[17] = (0x1a)
[18] = (0x58)
[19] = (0x4)
[20] = bottom
[21] = top
[22] = top
[23] = top
[24] = (0xa9)
[25] = (0xc5)
[26] = (0xa2)
[27] = (0xe1)
[28] = right
[29] = top
[30] = top
[31] = top
[32] = (0xcb)
[33] = (0x19)
[34] = top
[35] = top
[36] = top
[37] = top
[38] = top
[39] = top
[40] = (0x40)
[41] = (0x1a)
[42] = (0x58)
[43] = (0x4)
[44] = bottom
[45] = top
[46] = top
[47] = top
[48] = (0xa9)
[49] = (0xc5)
[50] = (0xa2)
[51] = (0xe1)
[52] = right
[53] = top
[54] = top
[55] = top
[56] = (0x2f)
[57] = (0x1c)
[58] = top
[59] = top
[60] = top
[61] = top
[62] = top
[63] = top
[64] = (0x40)
[65] = (0x1a)
[66] = (0x58)
[67] = (0x4)
[68] = bottom
[69] = top
[70] = top
[71] = top
[72] = (0xa9)
[73] = (0xc5)
[74] = (0xa2)
[75] = (0xe1)
[76] = right
[77] = top
[78] = top
[79] = top
[80] = (0xd0)
[81] = (0xc)
[82] = top
[83] = top
[84] = top
[85] = top
[86] = top
[87] = top
[88] = (0x40)
[89] = (0x1a)
[90] = (0x58)
[91] = (0x4)
[92] = bottom
[93] = top
[94] = top
[95] = top
[96] = (0xa9)
[97] = (0xc5)
[98] = (0xa2)
[99] = (0xe1)
[100] = right
[101] = top
[102] = top
[103] = top
[104] = (0x92)
[105] = (0xa)
[106] = top
[107] = top
[108] = top
[109] = top
[110] = top
[111] = top
[112] = (0x40)
[113] = (0x1a)
[114] = (0x58)
[115] = (0x4)
[116] = bottom
[117] = top
[118] = top
[119] = top
[120] = (0xa9)
[121] = (0xc5)
[122] = (0xa2)
[123] = (0xe1)
[124] = right
[125] = top
[126] = top
[127] = top
[128] = (0x8)
[129] = (0x1d)
[130] = top
[131] = top
[132] = top
[133] = top
[134] = top
[135] = top
[136] = (0x40)
[137] = (0x1a)
[138] = (0x58)
[139] = (0x4)
[140] = bottom
[141] = top
[142] = top
[143] = top
[144] = (0xa9)
[145] = (0xc5)
[146] = (0xa2)
[147] = (0xe1)
[148] = right
Post not yet marked as solved
Hi there!
From the documentation and sample code (https://developer.apple.com/documentation/avfoundation/additional_data_capture/capturing_depth_using_the_lidar_camera), my understanding is that AVFoundation provides more manual control as well as 2x higher depth-image resolution than ARKit.
However, upon reading the https://developer.apple.com/augmented-reality/arkit/ website as well as the WWDC video (https://developer.apple.com/videos/play/wwdc2022/10126/), it looks like ARKit 6 now also supports 4K video capture (RGB) while scene understanding is running under the hood. I was wondering if anyone knows whether depth-image resolution is still a limitation of ARKit versus AVFoundation.
I'm trying to build a capture app that relies on high-quality depth/LiDAR data. What would you suggest, and are there any other considerations I should keep in mind?
Thank you!
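To make the AVFoundation side of the comparison concrete, here is a sketch along the lines of the linked sample (assumptions: rear LiDAR device available; error handling elided):

```swift
import AVFoundation

// Sketch: select the LiDAR depth camera and attach a depth-data output.
let session = AVCaptureSession()
if let device = AVCaptureDevice.default(.builtInLiDARDepthCamera,
                                        for: .video,
                                        position: .back),
   let input = try? AVCaptureDeviceInput(device: device),
   session.canAddInput(input) {
    session.addInput(input)
}

let depthOutput = AVCaptureDepthDataOutput()
depthOutput.isFilteringEnabled = true // smoothed depth; set false for raw samples
if session.canAddOutput(depthOutput) {
    session.addOutput(depthOutput)
}
// Assign an AVCaptureDepthDataOutputDelegate to receive AVDepthData frames,
// then call session.startRunning().
```

The trade-off in the question remains: this path gives manual control and higher-resolution AVDepthData, but without ARKit's scene understanding running alongside.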
Post not yet marked as solved
I would like to know whether it will be possible to access high-resolution textures coming from ARKit scene reconstruction, or to access camera frames directly. In session 10091 it appears that ARFrame (with its camera data) is no longer available on visionOS.
The use cases I have in mind are along the lines of:
Having a paper card with a QR code on a physical table and use pixel data to recognize the code and place a corresponding virtual object on top
Having physical board game components recognized and used as inputs: for example you control white chess pieces physically while your opponent's black pieces are projected virtually onto your table
Having a user draw a crude map on physical paper and being able to use this as an image to be processed/recognized
These examples all have in common that the physical objects serve directly as inputs to the application without having to manipulate a virtual representation.
Ideally, in a privacy-preserving way, it would be possible to ask ARKit to provide texture information for a specially-defined volume in physical space, or at least for a given recognized surface (e.g. a table or a wall).