Hi, I am trying to train an object tracking model on MacBook Air with a M2 silicon. The .usdz file used to train is captured by iPhone 14 pro. However the training process takes too much time. it stay at 0.0% for about an hour and a half. I wonder if there is any other methods to generate reference objects.
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Post
Replies
Boosts
Views
Activity
Hi,
With this code
on iPad Pro (11-inch) (4th generation), when capturing in area mode and pressing the "Next" Button, an overlay with a white dot in the center appears. Everything else is blurred out and the app is stuck.
It does seem to work well on iPhones though.
Thanks!
Trouble with Core ML Object Tracking for Spherical Objects Using WWDC Sample Code and Object Capture
Hi everyone,
I'm working with Core ML for object tracking and have successfully trained a couple of models. However, when I try to use the reference object file in the object tracking sample code from the WWDC video, it doesn't work.
I'm training the model on a single-color plastic spherical object. Could this be the reason for the issue? I also attempted to use USDZ 3D assets that resemble the real object. Do these need to be trained with the Object Capture app to work properly?
Speaking of the Object Capture app, my experience hasn't been great. When I scanned my spherical object, the result was far from a sphere—it looked more like a mushy dough.
Any insights or suggestions would be greatly appreciated!
In our VisionOS project we want to apply wall panel on walls in a room , we are occluding furniture kept in front of walls by creating a mesh of occlusion material on everything except walls but this way is not able to occlude objects perfectly , edges are not smooth , panel coming over tv, table etc.
Is there any other way to achieve this.
Hello,
I want to capture video from Vision Pro in the Vision OS app. I am referring to the (https://developer.apple.com/videos/play/wwdc2024/10139/) Apple video and their code. step like below
import ARKit
com.apple.developer.arkit.main-camera-access.allow = true in info.plist
Do below code
func loadCameraFeed() async {
// Main Camera Feed Access Example
let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions:[.left])
let cameraFrameProvider = CameraFrameProvider()
var arKitSession = ARKitSession()
var pixelBuffer: CVPixelBuffer?
await arKitSession.queryAuthorization(for: [.cameraAccess])
do {
try await arKitSession.run([cameraFrameProvider])
} catch {
return
}
guard let cameraFrameUpdates =
cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else {
return
}
print(cameraFrameUpdates)
for await cameraFrame in cameraFrameUpdates {
print(cameraFrame)
guard let mainCameraSample = cameraFrame.sample(for: .left) else {
continue
}
pixelBuffer = mainCameraSample.pixelBuffer
}
}
I want to convert "pixelBuffer" into video streaming and show it in a frame like iOS.
Please guide me on how to achieve my next step. I am blank after this code.
Hello, I am trying to make an app that involves room scanning and then placing of imaginary objects in the room. I had two questions about the specifics behind this.
Is it possible for Roomplan to include the ceiling when scanning the room?
Is it possible to place objects in AR while Room plan is running, or is it necessary to wait until after the scan is done?
At the end of the WWDC 2023 sessoion https://developer.apple.com/wwdc23/10111 the session talks about implementing Virtual Hands. However the Code shown in the Video is not correct anymore with the latest version of VisionOS and also the example “SpaceGloves“ Entity referenced in the Video is not documented and there is no example project resource to reference either.
I have been trying for the last two weeks to implement the same basic example shown in this video, including skinning and rigging my own hand meshes But have had significant issues doing so and found no working code/usdz examples to reference across different internet resource.
Is it possible for a working project/code example including USDZ files for the Hand Meshes to be provided in order to test a working example of this as a starting point for building my own virtual hands?
Thanks for your help.
I have indoor map and I want to draw path between two route location ex. from A to B I want to draw ARKit based Arrow path in ios Application. Currently I am using ARAnchor to achieve this but challenges is if A to B is 10 meter and I am adding Nodes on each one meter so instead of 10 different nodes i am getting single Arrow nodes showing all 10 in it. I am using below code.
// Below Code from where I am calling addArpath function
if let lat1 = mCurrentPosition?.latitude, let long1 = mCurrentPosition?.longitude {
let latEnd = steplocation.latitude
let longEnd = steplocation.longitude
// if let lastLat = arrpath.last?.latitude,let lastLong = arrpath.last?.longitude,let lastAltitude = arrpath.last?.altitude{
let userLocation = CLLocation(latitude: lat1, longitude: long1)
let endLocation = CLLocation(coordinate: CLLocationCoordinate2DMake(CLLocationDegrees(latEnd), CLLocationDegrees(longEnd)), altitude: CLLocationDistance(steplocation.altitude), horizontalAccuracy: CLLocationAccuracy(5), verticalAccuracy: CLLocationAccuracy(0), course: CLLocationDirection(-1), speed: CLLocationSpeed(5), timestamp: Date())
let heading = getHeadingForDirectionFromCoordinate(from: userLocation, to: endLocation)
let lon1 = degreesToRadians(long1) //DegreesToRadians(long1)
let lon2 = degreesToRadians(longEnd); //DegreesToRadians(longEnd);
let lat2 = degreesToRadians(latEnd);
let dLon = lon2 - lon1
let y = sin(dLon) * cos(lat2);
yVal = yVal + y
// let distanceToendpoint = calculateDistance(lat: endLocation.coordinate.latitude, long: endLocation.coordinate.longitude)
let distvalue = Int(distance) + Int(pathlength)
distance += CGFloat(distvalue)
for i in stride(from: 0, to: distance, by:1) {
print("current loop iteration is:" ,i)
let headingValue = heading - self.heading
zValue = zValue + headingValue
distanceVal = CGFloat(i) + distanceVal
zGlobal = zValue
// Calling addARPathtoLocation
addARPathtoLocation(stepLocation:endLocation,zvalue: zValue, yvalue: yVal, distance:Float(i), direction: manuoverType)
}
// }
}
// MARK: - ARSessionDelegate
func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
guard !(anchor is ARPlaneAnchor) else { return }
let sphereNode = generateArrowNodes(anchor: anchor)
DispatchQueue.main.async {
node.addChildNode(sphereNode)
}
}
//create ARAnchor to add to nodes
func generateArrowNodes(anchor: ARAnchor) -> SCNNode {
let imageMaterial = SCNMaterial()
imageMaterial.isDoubleSided = true
imageMaterial.diffuse.contents = UIImage(named: "blueArrow")
let plane = SCNPlane(width:0.5, height:0.5)
plane.materials = [imageMaterial]
plane.firstMaterial?.isDoubleSided = true
let blueNode = SCNNode(geometry: plane)
blueNode.name = "blueNode"
blueNode.position = SCNVector3(x:Float(zGlobal), y:0, z:Float(distanceVal))
blueNode.eulerAngles.x = -.pi / 2
blueNode.eulerAngles.y -= Float(CGFloat(CGFloat.pi/4*6))
return blueNode
}
func addARPathtoLocation(stepLocation: CLLocation, zvalue: CGFloat, yvalue: CGFloat, distance:Float, direction:VMEManeuverType) {
// give you the depth of anything ARKit has detected
guard let query = sceneView.raycastQuery(from: sceneView.center , allowing: .estimatedPlane, alignment: .any) else {
return
}
let results = sceneView.session.raycast(query)
guard let hitResult = results.first else {
print("No surface found")
return
}
// Add ARAnchor to Scene
let anchor = ARAnchor(transform: hitResult.worldTransform)
sceneView.session.add(anchor: anchor)
}
func radiansToDegrees(_ radians: Double) -> Double {
return (radians) * (180.0 / Double.pi)
}
func degreesToRadians(_ degrees: Double) -> Double {
return (degrees) * (Double.pi / 180.0)
}
func getHeadingForDirectionFromCoordinate(from: CLLocation, to: CLLocation) -> Double {
let fLat = degreesToRadians(from.coordinate.latitude)
let fLng = degreesToRadians(from.coordinate.longitude)
let tLat = degreesToRadians(to.coordinate.latitude)
let tLng = degreesToRadians(to.coordinate.longitude)
let deltaL = tLng - fLng
let x = sin(deltaL) * cos(tLng) //cos(tLat) * sin(deltaL)
let y = cos(fLat) * sin(tLat) - sin(fLat) * cos(tLat) * cos(deltaL)
let bearing = atan2(x,y)
let bearingInDegrees = bearing.toDegrees
print("Bearing Degrees :",bearingInDegrees) // sanity check
// let degree = radiansToDegrees(atan2(sin(tLng-fLng)*cos(tLat), cos(fLat)*sin(tLat)-sin(fLat)*cos(tLat)*cos(tLng-fLng)))
if bearingInDegrees >= 0 {
return bearingInDegrees
} else {
return bearingInDegrees + 360
}
}
FYI.
The source code of the FindSurface demo app for Apple Vision Pro (visionOS) is available now.
The Swift package of FindSurface™ library is required to build the source code into the demo app.
https://github.com/CurvSurf/FindSurface-visionOS
After starting the app, the floating panels (below) will appear on your right side, and you will see wireframe meshes that approximately describe your environments. Performing a spatial tap (pinching with your thumb and index finger) with staring at a location on the meshes will invoke FindSurface, with an indicator (blue disk) appearing on the surface you've gazed.
Voice commands:
“Tap” – Spatial tap (gazing & pinching). Invoke FindSurface.
“Tap plane” – Plane selection.
“Tap sphere” or “Tap ball” – Sphere selection.
“Tap cylinder” – Cylinder selection.
“Tap cone” – Cone selection.
“Tap torus” or “Tap donut” – Torus selection.
“Tap accuracy” or “Tap measurement accuracy” – Accuracy selection.
“Tap mean distance”, “Tap average distance”, or “Tap distance” – Avg. Distance selection.
“Tap touch radius” or “Tap seed radius” – Touch Radius selection.
“Tap Inlier” – “Show inlier points” toggle.
“Tap outline” – “Show geometry outline” toggle.
“Tap clear” – “Clear Scene” click.
I'm seeking insight on why the new VisionOS Barcode Scanning API is categorized as an Enterprise API and restricted only for proprietary and in-house apps.
I understand Apple's focus on privacy and I can see how this restriction could make sense for other Enterprise APIs like main camera access and passthrough screen capture.
Why is barcode scanning restricted from open apps? What makes barcode scanning more of a risk to privacy versus the unrestricted APIs for object tracking, image tracking, or hand tracking?
I'm working on training an object tracking model in CreateML for visionOS that has fan blades on it and looking to try to train while ignoring a section of the geometry. When I train currently, it can detect the object if the fan blades are in the same orientation as when scanned but if they move it struggles.
I see there is an "objects to avoid" data source that can be added but upon reading the description, I don't think it does what I'm needing but I could be wrong. Is there anyway to have the training ignore a part of the geometry that has a significant effect on the silhouette of the object?
I am using ArKit to create an augmented reality application in Unity. Following the addition of an object reference object Because it tracks the object in front of it slowly and inaccurately, the application does not update the screen quickly.
How can I track objects more quickly?
We are building an app that uses ARKit occasionally, but not always.
We would like to test the non-ARKit parts in the simulator, since it offers more debugging features (e.g. SwiftUI previews or the Thread Sanitizer).
However, we can't even build the app for the simulator, since the simulator SDK does not know about certain classes (e.g. "AnchorEntity"). This also means that none of the SwiftUI previews work, even if the views are not using ARKit.
What is the best approach to test such an app in the simulator, without using any ARKit features?
Hello there,
I am currently experimenting with the body tracking sample from the AR Foundation Example Project.
It works fine, but when there are multiple persons in front of the camera, the mesh jumps randomly from tracked skeleton to tracked skeleton.
So I am looking for a way to define a specific area in front of the camera or to implement some other marker (maybe a body pose) to start and stop tracking.
I am guessing Pose-Tracking could work. Whenever a body stands in a T-Pose, the Mesh gets applied to that body and then the script stops looking for new skeletons until the original skeleton is lost.
Does somebody know which code to look at to achieve this?
Is it possible to both capture the images required for ObjectCapture and the scan data required to create an ARObjectAnchor (and be able to align the two to each other)?
Perhaps an extension of this WWDC 2020 example that also integrates usdz object capture (instead of just import external one)?
https://developer.apple.com/documentation/arkit/arkit_in_ios/content_anchors/scanning_and_detecting_3d_objects?changes=_2
Is it possible to create a roomplan with the texture of a room plan's wall - or some way to combine ObjectCapture results with RoomPlan results?
Hello,
Recently, I’ve been studying point cloud development. As a beginner in this field, I’m seeking some guidance on how to approach this. I want to obtain point cloud data and be able to display and view it as shown in the picture below. I used the "Displaying a Point Cloud Using Scene Depth" code as my starting point and then attempted to add a button to save the data in the Renderer private variable particlesBuffer: MetalBuffer. The particlesBuffer array contains data structured as follows:
struct ParticleUniforms {
simd_float3 position;
simd_float3 color;
float confidence;
};
My understanding is that this data represents the point cloud. If I am wrong, please let me know. Next, I wrote my own code to display this data using SceneKit by creating small spheres based on the position and color values. However, in practice, this method only allows me to display about 30,000 spheres before it becomes very laggy. I believe my implementation might be incorrect because I noticed that using the 3D scanner App to display point clouds achieves a much better performance.
Could you please advise me on how to achieve an effect like the one shown in the image below? Thank you.
Hi,
My goal is to obtain the device location (6 DoF) of the Apple Vision Pro and I find a function that might satisfy my need:
final func queryDeviceAnchor(atTimestamp timestamp: TimeInterval) -> DeviceAnchor?
which returns a device anchor (containing the position and orientation of the headset).
However, I couldn't find any document specify where does the device anchor exactly locate on the headset.
Does it locate at the midpoint between the user's eyes? Does it locate at the centroid of the six world facing tracking cameras?
It would be really helpful if someone can provide a local transformation matrix (similar to a camera extrinsic) from a visible rigid component (say the digital crown, top button, or the laser scanner) to the device anchor.
Thanks.
In RealityKit using visionOS, I scan the room and use the resulting mesh to create occlusion and physical boundaries. That works well and iI can place cubes (with physics on) onto that too.
However, I also want to update the mesh with versions from new scans and that make all my cubes jump.
Is there a way to prevent this? I get that the inaccuracies will produce slightly different mesh and I don’t want to anchor the objects so my guess is I need to somehow determine a fixed floor height and alter the scanned meshes so they adhere that fixed height.
Any thoughts or ideas appreciated
/Andreas
I want to align and fuse the video streams from the main camera and my external camera in visionOS 2.0, ensuring that the fused image remains directly in front of the field of view as the head moves, similar to a normal passthrough mode video image. I have already achieved and verified the static image alignment and fusion on the Vision Pro using screenshots from the main camera and the external video stream. However, I don't know how to perform real-time fusion with the main camera images. Could you please advise on how I can achieve this?