Vision

Volumetric Window Size

Hi, I tried to change the default size of a volumetric window, but it looks like this window has a maximum width. Is that true?
    WindowGroup(id: "id") {
        ItemToShow()
    }
    .windowStyle(.volumetric)
    .defaultSize(width: 100, height: 0.8, depth: 0.3, in: .meters)
Here I set the width to 100 meters, but it still looks like about 2 meters.

0

652

Feb ’24

I want to develop an app for Vestibular Rehabilitation and Dizziness

Hello everyone, I want to develop an app for Vision Pro that aims to help people with vertigo and dizziness problems. The problem is that I cannot afford a Vision Pro. If I develop with a standard VR headset that holds an iPhone, would that cause issues on a real Vision Pro?

Graphics & Games General Vision AR / VR Reality Composer Pro visionOS

0

569

Feb ’24

Vision Pro & Vision SDK

I'm exploring my Vision Pro and finding it unclear whether I can even achieve things like body pose detection. https://developer.apple.com/videos/play/wwdc2023/111241/ It's clear that I can apply it to self-provided images, but what about the data coming from the visionOS SDKs? All I can find is this mesh data from ARKit, https://developer.apple.com/documentation/arkit/arkit_in_visionos - am I missing something, or do we not yet have good APIs for this? Appreciate any guidance! Thanks.

Machine Learning & AI General Vision Machine Learning Core ML visionOS

2

0

1.3k

Feb ’24

Fatal error: Your app was given a scene with session role UISceneSessionRole

When I try to run my app with .windowStyle(.volumetric) on visionOS, I get this error: Fatal error: Your app was given a scene with session role UISceneSessionRole(_rawValue: UIWindowSceneSessionRoleApplication) but no scenes declared in your App body match this role.

Spatial Computing ARKit ARKit Vision AR / VR

2

1

1k

Feb ’24

CoreML Image Classification Model - What Preprocessing Is Required For Static Images

I have trained a model to classify some symbols using Create ML. In my app I am using VNImageRequestHandler and VNCoreMLRequest to classify image data.
If I use a CVPixelBuffer obtained from an AVCaptureSession, the classifier runs as I would expect. If I point it at the symbols it works fairly accurately, so I know the model is trained reasonably well and works in my app.
If I try to use a cgImage obtained by cropping a section out of a larger image (from the gallery), the classifier does not work. It always seems to return the same result (although the confidence is not exactly 1.0 and varies for each image, it is within several decimal places of it, e.g. 0.9999).
If I pause the app when I have the cropped image, use the debugger to obtain the cropped image (via the little eye icon, then open in Preview), and drop the image into the Preview section of the MLModel file or into Create ML, the model classifies the image correctly.
If I scale the cropped image to the same size as the camera frames, and convert the cgImage to a CVPixelBuffer with the same size and colour space as the camera (1504, 1128, kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange), then I get some difference in output. It's not accurate, but it returns different results if I specify the 'centerCrop' or 'scaleFit' options. So I know that 'something' is happening, but it's not the correct thing.
I was under the impression that passing a cgImage to the VNImageRequestHandler would perform the necessary conversions, but experimentation shows this is not the case. However, when using the preview tool on the model or in Create ML, this conversion is obviously being done behind the scenes, because the cropped part is being detected. What am I doing wrong?
tl;dr:
- My model works, as backed up by using video input directly and also by dropping cropped images into the preview sections.
- Passing the cropped images directly to the VNImageRequestHandler does not work.
- Modifying the cropped images can produce different results, but I cannot see what I should be doing to get reliable results.
I'd like my app to behave the same way the preview behaves: I give it a cropped part of an image, it does some processing, it goes to the classifier, and it returns the same result as in Create ML.

Media Technologies General Image I/O Vision Create ML

2

0

917

Mar ’24

Is the Apple Neural Scene Analyzer (ANSA) backbone available to devs

Hello, my understanding of the paper below is that iOS ships with a MobileNetV3-based ML model backbone, which then uses different heads for specific tasks in iOS. I understand that this backbone is accessible for various uses through the Vision framework, but I was wondering if it is also accessible for on-device fine-tuning for other purposes. Just as an example, if I want a model to detect some unique object in a photo, can I use the built-in backbone, or do I have to include my own in the app? Thanks very much for any advice, and apologies if I didn't understand something correctly. Source: https://machinelearning.apple.com/research/on-device-scene-analysis

Machine Learning & AI General ML Compute Vision Core ML

1

0

836

Feb ’24

When I enter immersive view, the window keeps getting pushed back.

I'm using RealityKit to give an immersive view of 360 pictures. However, I'm seeing a problem where the window disappears when I enter immersive mode and returns when I rotate my head. Interestingly, adding ".glassBackground()" to the back of the window cures the issue, but I would prefer not to use it as the UI's backdrop. How can I deal with this? Here is a link to the GIF: https://firebasestorage.googleapis.com/v0/b/affirmation-604e2.appspot.com/o/Simulator%20Screen%20Recording%20-%20Apple%20Vision%20Pro%20-%202024-01-30%20at%2011.33.39.gif?alt=media&token=3fab9019-4902-4564-9312-30d49b15ea48

App & System Services Core OS Vision VisionKit visionOS Reality Composer Pro

0

653

Jan ’24

Swift Student Challenge Vision

Hi Developers, I want to create a Vision app in Swift Playgrounds on iPad. However, Vision does not function properly in Swift Playgrounds on iPad or in Xcode Playgrounds. The Vision code only works in a normal Xcode project. So can I submit my Swift Student Challenge 2024 application as a normal Xcode project rather than an Xcode Playgrounds or Swift Playgrounds file? Thanks :)

Developer Tools & Services Swift Playgrounds Swift Playgrounds Swift Student Challenge Vision Machine Learning

7

0

1.3k

Feb ’24

Is this type of offset even possible with visionOS?

Hey, I'm working on a UI that a designer created. But he added an object behind the glass, with an offset, pretty much like the cloud in this video: https://dribbble.com/shots/23039991-Weather-Widget-Apple-Vision-Pro-visionOS I tried a couple of methods, but I always ended up clipping my object. So, here's the question: Is there a way to have some object behind the glass panel, but with a slight offset on the x and y?

App & System Services Core OS Vision visionOS Reality Composer Pro

1

0

479

Jan ’24

vision pro keyboard

I think there is a problem with the keyboard of Vision Pro. I don't think it's difficult to enter another language. If you're not going to make it, shouldn't Apple provide an extended custom keyboard? Sometimes it's frustrating to see things that are intentionally restricted. If you have any information about Vision Pro's keyboard or want to discuss it, let's talk about your thoughts together! I don't have any information yet.

App & System Services Core OS Vision visionOS

2

0

751

Feb ’24

Is it possible to create a compass in visionOS?

I am attempting to create a simple compass for Apple Vision Pro. The method I am familiar with involves using: locationManager.startUpdatingHeading() locationManager(_ manager: CLLocationManager, didUpdateHeading newHeading: CLHeading) However, this does not function on visionOS as 'CLHeading is unavailable in visionOS'. Is there any way to develop this simple compass on visionOS?

App & System Services Core OS Vision visionOS Reality Composer Pro

1

0

563

Jan ’24

Vision Framework - Text Recognition - Cannot recognize some umlaut diacritics

I am using the Vision framework to recognize text in my app. However, some diacritics are recognized incorrectly, for example: Grudziński (the incorrect result is: Grudzinski). I already changed the language to DE (because my app needs to support DE text) and tried to use VNRecognizeTextRequest.customWords with usesLanguageCorrection, but the result is still incorrect. Does Apple provide any APIs to solve this problem? This issue also happens when I open the Gallery on my phone, copy text from images, and paste it somewhere else.

App & System Services Core OS iOS Vision

0

390

Jan ’24

VNDetectHumanBodyPose3DRequest Gives Inconsistent Results

I'm currently building an iOS app that requires the ability to detect a person's height from a live video stream. The new VNDetectHumanBodyPose3DRequest is exactly what I need, but the observations I'm getting back are very inconsistent and unreliable. When I say inconsistent, I mean the values never seem to settle and they can fluctuate anywhere from 5'4" to 10'1" (I'm about 6'0"). In terms of unreliable, I have once seen a value that closely matches my height, but I rarely see any values that are close enough (within an inch) to the ground truth.
In terms of my code, I'm not doing anything fancy. I'm first opening a LiDAR stream on my iPhone 14 Pro:
    guard let videoDevice = AVCaptureDevice.default(.builtInLiDARDepthCamera, for: .video, position: .back) else { return }
    guard let videoDeviceInput = try? AVCaptureDeviceInput(device: videoDevice) else { return }
    guard captureSession.canAddInput(videoDeviceInput) else { return }
    captureSession.addInput(videoDeviceInput)
I'm then creating an output synchronizer so I can get both image and depth data at the same time:
    videoDataOutput = AVCaptureVideoDataOutput()
    captureSession.addOutput(videoDataOutput)
    depthDataOutput = AVCaptureDepthDataOutput()
    depthDataOutput.isFilteringEnabled = true
    captureSession.addOutput(depthDataOutput)
    outputVideoSync = AVCaptureDataOutputSynchronizer(dataOutputs: [depthDataOutput, videoDataOutput])
Finally, my delegate function that handles the synchronizer is roughly:
    fileprivate func perform3DPoseRequest(cmSampleBuffer: CMSampleBuffer, depthData: AVDepthData) {
        let imageRequestHandler = VNImageRequestHandler(cmSampleBuffer: cmSampleBuffer, depthData: depthData, orientation: .up)
        let request = VNDetectHumanBodyPose3DRequest()
        do {
            // Perform the body pose request.
            try imageRequestHandler.perform([request])
            if let observation = request.results?.first {
                if (observation.heightEstimation == .measured) {
                    print("Body height (ft) \(formatter.string(fromMeters: Double(observation.bodyHeight))) (m): \(observation.bodyHeight)")
                    ...
I'd appreciate any help determining how to get accurate results from the observation's bodyHeight. Thanks!

Machine Learning & AI General Vision

0

350

Jan ’24

How to get corresponding pixel from Object Detection

Hey guys! I'm building an app which detects cars via Vision and then retrieves the distance to the detected car from a synchronized depthDataMap. However, I'm having trouble finding the corresponding pixel in that depthDataMap. The CGRect of the ObjectObservation ranges from 0 - 300 (x) and 0 - 600 (y), but the width x height of the depthDataMap is only 320 x 180, so I can't get the right corresponding pixel. Any idea on how to solve this? Kind regards
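The mismatch described above can be treated as a change of coordinate space: normalize the point in the detection rectangle's space, then scale it to the depth map's dimensions. The following is only a minimal sketch under stated assumptions: both grids are assumed to share the same orientation and field of view (no rotation, no aspect-fill cropping), and `depthPixel` and its parameter names are hypothetical helpers, not part of Vision or AVFoundation.

```swift
/// Hypothetical helper: maps a point given in a source grid
/// (e.g. the 300 x 600 space of the detection rect) to the matching
/// pixel index in a depth map of a different resolution.
/// Assumes both grids cover the same area with the same orientation.
func depthPixel(x: Double, y: Double,
                sourceWidth: Double, sourceHeight: Double,
                depthWidth: Int, depthHeight: Int) -> (column: Int, row: Int) {
    // Normalize the point to the 0...1 range.
    let normalizedX = x / sourceWidth
    let normalizedY = y / sourceHeight
    // Scale to the depth map and clamp to valid pixel indices.
    let column = min(max(Int(normalizedX * Double(depthWidth)), 0), depthWidth - 1)
    let row = min(max(Int(normalizedY * Double(depthHeight)), 0), depthHeight - 1)
    return (column, row)
}

// Center of a detection space of 300 x 600, mapped into a 320 x 180 depth map.
let center = depthPixel(x: 150, y: 300,
                        sourceWidth: 300, sourceHeight: 600,
                        depthWidth: 320, depthHeight: 180)
print(center)
```

If the camera frame and the on-screen view differ in orientation or cropping, that transform would have to be undone before the normalization step; the sketch deliberately leaves that out.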

Programming Languages Swift Swift Vision AVFoundation

1

0

563

Feb ’24

A place that sells Vision Pro

I'm going to the U.S. to buy a Vision Pro. Does anyone have any information about where it is sold? Will it be sold in Hawaii by any chance? For now, I'm thinking about New York.

Machine Learning & AI General Vision

0

426

Jan ’24

Expected Global Sales Start Date for Vision Pro

Currently in South Korea, due to my personal experiences with what seems like warranty but isn't, and the operation of a ruthless Genius Bar, I feel compelled to purchase the officially released Vision Pro. I'd like to discuss with other developers here about their thoughts on the release schedule. The product launched in the USA in February, but I'm curious about the months following for the secondary and tertiary launch countries. Naturally, we'll know the launch is imminent when local staff are summoned to the headquarters for training. However, the urgency for localized services, development, and personal purchase is growing on my mind.

Machine Learning & AI General Vision

0

415

Dec ’23

Issues with VNDetectHumanBodyPose3DRequest iOS 17/Xcode 15

I seem to be having some trouble running the example app from the WWDC 2023 session on 3D Body Pose Detection (this one). I'm getting an error about the request revision being incompatible. I tried searching the API documentation for any configuration that has been changed or introduced, but to no avail. I also couldn't find much online about it. Is this a known issue? Or am I doing something wrong?
Error in question:
    Unable to perform the request: Error Domain=com.apple.Vision Code=16 "VNDetectHumanBodyPose3DRequest does not support VNDetectHumanBodyPose3DRequestRevision1" UserInfo={NSLocalizedDescription=VNDetectHumanBodyPose3DRequest does not support VNDetectHumanBodyPose3DRequestRevision1}.
Code snippet:
    guard let assetURL = fileURL else { return }
    let request = VNDetectHumanBodyPose3DRequest()
    self.fileURL = assetURL
    let requestHandler = VNImageRequestHandler(url: assetURL)
    do {
        try requestHandler.perform([request])
        if let returnedObservation = request.results?.first {
            Task { @MainActor in
                self.humanObservation = returnedObservation
            }
        }
    } catch {
        print("Unable to perform the request: \(error).")
    }
Thank you for any and all advice!

Machine Learning & AI General Vision

1

0

394

Dec ’23

failed to find newest available Simulator runtime

Env:
    Intel Core i7
    macOS: 14.0
    Xcode 15 Beta 8
    Simulator: visionOS 1.0 beta 3 (21N5233e)
    Simulator: iOS 17.0.1, iOS 17.0 beta 8
Steps: Create a new Vision demo project in Xcode; it fails to build.
    [macosx] error: Failed to find newest available Simulator runtime
    Command RealityAssetsCompile failed with a nonzero exit code

Developer Tools & Services Xcode Xcode Vision VisionKit visionOS

1

0

777

Dec ’23

VNRecognizedText returns wrong bounding box

I am trying to parse text from an image, split it into words and store the words in a String array. Additionally, I want to store the bounding box of each recognized word. My code works, but for some reason the bounding boxes of words that are separated not by a space but by an apostrophe come out wrong. Here is the complete code of my VNRecognizeTextRequest handler:
    let request = VNRecognizeTextRequest { request, error in
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
        // split recognized text into words and store each word with corresponding observation
        let wordObservations = observations.flatMap { observation in
            observation.topCandidates(1).first?.string.unicodeScalars
                .split(whereSeparator: { CharacterSet.letters.inverted.contains($0) })
                .map { (observation, $0) } ?? []
        }
        // store recognized words as strings
        recognizedWords = wordObservations.map { (observation, word) in String(word) }
        // calculate bounding box for each word
        recognizedWordRects = wordObservations.map { (observation, word) in
            guard let candidate = observation.topCandidates(1).first else { return .zero }
            let stringRange = word.startIndex..<word.endIndex
            guard let rect = try? candidate.boundingBox(for: stringRange)?.boundingBox else { return .zero }
            let bottomLeftOriginRect = VNImageRectForNormalizedRect(rect, Int(captureRect.width), Int(captureRect.height))
            // adjust coordinate system to start in top left corner
            let topLeftOriginRect = CGRect(origin: CGPoint(x: bottomLeftOriginRect.minX, y: captureRect.height - bottomLeftOriginRect.height - bottomLeftOriginRect.minY), size: bottomLeftOriginRect.size)
            print("BoundingBox for word '\(String(word))': \(topLeftOriginRect)")
            return topLeftOriginRect
        }
    }
And here's an example of what's happening. When I'm processing the following image: the code above produces the following output:
    BoundingBox for word 'In': (23.00069557577264, 5.718113962610181, 45.89460636656961, 32.78087073878238)
    BoundingBox for word 'un': (71.19064286904202, 6.289275587192936, 189.16024359557852, 34.392966621800475)
    BoundingBox for word 'intervista': (71.19064286904202, 6.289275587192936, 189.16024359557852, 34.392966621800475)
    BoundingBox for word 'del': (262.64622870703477, 8.558512219726875, 54.733978711037985, 32.79967358237818)
Notice how the bounding boxes of the words 'un' and 'intervista' are exactly the same. This happens consistently for words that are separated by an apostrophe. Why is that? Thank you for any help, Elias
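To narrow the problem down, the splitting step from the post can be checked in isolation, without Vision. The sketch below uses a hand-typed sample line (an assumption standing in for the recognized string, with a straight apostrophe) and prints the scalar offsets of each word; `wordRanges` is a hypothetical name for illustration.

```swift
import Foundation

// Hand-typed stand-in for the recognized line (assumption, not Vision output).
let line = "In un'intervista del"

// Same splitting rule as in the post: break on every non-letter scalar.
let words = line.unicodeScalars.split(whereSeparator: {
    CharacterSet.letters.inverted.contains($0)
})

// Record each word together with its scalar offsets in the original line.
let scalars = line.unicodeScalars
let wordRanges = words.map { word -> (word: String, start: Int, end: Int) in
    let start = scalars.distance(from: scalars.startIndex, to: word.startIndex)
    let end = scalars.distance(from: scalars.startIndex, to: word.endIndex)
    return (String(word), start, end)
}

for entry in wordRanges {
    print("\(entry.word): \(entry.start)..<\(entry.end)")
}
```

With this sample, 'un' and 'intervista' get different, non-overlapping ranges, which suggests the split itself hands distinct ranges to boundingBox(for:). One possible reading, not confirmed by anything in the post, is that the identical rectangles come from the granularity at which the recognizer reports boxes rather than from the string handling.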

Machine Learning & AI General Vision

0

342

Dec ’23

Explore 3D body pose and person segmentation in Vision

What is the accuracy and resolution of the angles measured using Vision?

Machine Learning & AI General Vision

0

294

Dec ’23

Post

Replies

Boosts

Views

Activity

Posts under Vision tag