Apply computer vision algorithms to perform a variety of tasks on input images and video using Vision.

Posts under Vision tag

105 Posts
Post | Replies | Boosts | Views | Activity
The Vision request does not work in the simulator with the error "Could not create inference context"
When I use VNGenerateForegroundInstanceMaskRequest to generate a mask in the simulator from SwiftUI, I get the error "Could not create inference context". I then added code to force Vision to run on the CPU:

    let request = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(ciImage: inputImage)
    #if targetEnvironment(simulator)
    if #available(iOS 18.0, *) {
        let allDevices = MLComputeDevice.allComputeDevices
        for device in allDevices {
            if device.description.contains("MLCPUComputeDevice") {
                request.setComputeDevice(.some(device), for: .main)
                break
            }
        }
    } else {
        // Fallback on earlier versions
        request.usesCPUOnly = true
    }
    #endif
    do {
        try handler.perform([request])
        if let result = request.results?.first {
            let mask = try result.generateScaledMaskForImage(forInstances: result.allInstances, from: handler)
            return CIImage(cvPixelBuffer: mask)
        }
    } catch {
        print(error)
    }

Even though I force the simulator to run the request on the CPU, it still fails with the same error: "Could not create inference context".
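Not part of the original post: a hedged variation of the CPU-selection step that pattern-matches the MLComputeDevice enum cases instead of inspecting the device's description string. It assumes the iOS 17+ compute-device API that the post already uses, and it does not address the underlying simulator error.

    import CoreML
    import Vision

    // Sketch: prefer the CPU compute device for the main compute stage.
    func preferCPU(for request: VNRequest) {
        if #available(iOS 17.0, *) {
            // Pick the first CPU device by matching the enum case directly.
            if let cpuDevice = MLComputeDevice.allComputeDevices.first(where: {
                if case .cpu = $0 { return true } else { return false }
            }) {
                request.setComputeDevice(cpuDevice, for: .main)
            }
        } else {
            request.usesCPUOnly = true // Deprecated, but the only option before iOS 17.
        }
    }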
2
1
786
Sep ’24
Detect animal poses in Vision: detected joints and connections are drawn correctly only on iPhone and only when not ignoring the safe area
Hi, I'm trying to personalize the "Detect animal poses in Vision" example (WWDC 23). After some tests I saw that the landmark and connection drawings only work if I do not ignore the safe area; if I ignore it (removing the toggle) or run the app on iPad, the drawings are no longer positioned correctly. In the example, GeometryReader is used to detect the size of the view:

    ...
    ZStack {
        GeometryReader { geo in
            AnimalSkeletonView(animalJoint: animalJoint, size: geo.size)
        }
    }.frame(maxWidth: .infinity)
    ...

    struct AnimalSkeletonView: View {
        // Get the animal joint locations.
        @StateObject var animalJoint = AnimalPoseDetector()
        var size: CGSize

        var body: some View {
            DisplayView(animalJoint: animalJoint)
            if animalJoint.animalBodyParts.isEmpty == false {
                // Draw the skeleton of the animal.
                // Iterate over all recognized points and connect the joints.
                ZStack {
                    ZStack {
                        // left head
                        if let nose = animalJoint.animalBodyParts[.nose] {
                            if let leftEye = animalJoint.animalBodyParts[.leftEye] {
                                Line(points: [nose.location, leftEye.location], size: size)
                                    .stroke(lineWidth: 5.0)
                                    .fill(Color.orange)
                            }
                        }
                        ...
                    }
                }
            }
        }
    }

    // Create a transform that converts the pose's normalized point.
    struct Line: Shape {
        var points: [CGPoint]
        var size: CGSize

        func path(in rect: CGRect) -> Path {
            let pointTransform: CGAffineTransform = .identity
                .translatedBy(x: 0.0, y: -1.0)
                .concatenating(.identity.scaledBy(x: 1.0, y: -1.0))
                .concatenating(.identity.scaledBy(x: size.width, y: size.height))
            var path = Path()
            path.move(to: points[0])
            for point in points {
                path.addLine(to: point)
            }
            return path.applying(pointTransform)
        }
    }

Looking online I saw that it was recommended to change the property cameraView.previewLayer.videoGravity from:

    cameraView.previewLayer.videoGravity = .resizeAspectFill

to:

    cameraView.previewLayer.videoGravity = .resizeAspect

but it doesn't work for me. Could you help me understand where I'm going wrong? Thanks!
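Not an answer from the thread: a hedged sketch of one way to map Vision's normalized, lower-left-origin points into view coordinates while accounting for letterboxing. It assumes an aspect-fit preview and a known capture resolution; both are assumptions, not details taken from the post.

    import AVFoundation
    import CoreGraphics

    /// Maps a Vision point (normalized, origin at lower left) to a point in `viewSize`,
    /// assuming the video is displayed aspect-fit inside the view.
    func viewPoint(forNormalized point: CGPoint,
                   captureSize: CGSize,   // e.g. CGSize(width: 1920, height: 1080) – assumption
                   viewSize: CGSize) -> CGPoint {
        // Rect the video actually occupies inside the view when aspect-fit.
        let videoRect = AVMakeRect(aspectRatio: captureSize,
                                   insideRect: CGRect(origin: .zero, size: viewSize))
        // Flip the y axis (Vision: origin bottom-left; SwiftUI: origin top-left),
        // then scale into the video rect and offset by its origin.
        let x = videoRect.origin.x + point.x * videoRect.width
        let y = videoRect.origin.y + (1.0 - point.y) * videoRect.height
        return CGPoint(x: x, y: y)
    }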
1
0
605
Sep ’24
Symbol Not Found Error in VNFaceLandmarkRegion2D with MacCatalyst on macOS 14.6.1 (Xcode 16)
We have updated our cross-platform applications to support iOS 18 and are in the final stages of releasing versions built with MacCatalyst. After merging the MacCatalyst changes with those for iOS 18, we are now required to build the app using Xcode 16. However, since transitioning to Xcode 16, the app builds successfully but crashes immediately on startup with the following error:

    dyld[45279]: Symbol not found: _$sSo22VNFaceLandmarkRegion2DC6VisionE16normalizedPointsSaySo7CGPointVGvg
    Referenced from: <211097A0-6612-3A9A-80B5-AE12915EBA2A> /Users/***/Library/Developer/Xcode/DerivedData/DM_iOS_Apps-gzpzdsacfldxxwclyngreqkbhtey/Build/Products/Debug-maccatalyst/MyApp.app/Contents/Frameworks/Filters_MyApp.framework/Versions/A/Filters_MyApp
    Expected in: <50DB755E-C83C-3FC7-A0BB-9C4DF9FEA374> /System/Library/Frameworks/Vision.framework/Versions/A/Vision

This crash occurs only when building the app with Xcode 16 for MacCatalyst on macOS 14.6.1. On iOS and on macOS 15 it functions as expected, and it also worked prior to the iOS 18 changes (which are independent of the Vision framework code) when building with Xcode 15.

Here are the environment details where the error occurs:
Xcode version: Xcode 16.0 (16A242d)
macOS version: macOS Sonoma 14.6.1

And the setup where it works:
Xcode version: Xcode 16.0 (16A242d)
macOS version: macOS Sequoia 15.0

Additionally, attempting to implement a workaround using pointsInImage(imageSize:) resulted in a similar issue, where the symbol for this method is also missing. Is this a known issue? Are there any workarounds or fixes available? We have already submitted this issue as feedback (FB15164375), along with a demo project to illustrate the problem.
2
0
665
Oct ’24
Vision framework not working on Apple Vision Pro
com.apple.Vision Code=9 "Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/anodv4_drop6_fp16.H14G.espresso.hwx"

This code raises the error:

    func imageToHeadBox(image: CVPixelBuffer) async throws -> [CGRect] {
        let request: DetectFaceRectanglesRequest = DetectFaceRectanglesRequest()
        let faceResult: [FaceObservation] = try await request.perform(on: image)
        let faceBoxs: [CGRect] = faceResult.map { face in
            let faceBoundingBox: CGRect = face.boundingBox.cgRect
            return faceBoundingBox
        }
        return faceBoxs
    }
1
0
797
Sep ’24
Difficulty Locating Center of Pupil Using ARKit – Vision vs. ARKit for Fine Detail?
Hi everyone, I'm working on an AR application where I need to accurately locate the center of the pupil and measure anatomical distances between the pupil and eyelids. I've been using ARKit's face tracking, but I'm having trouble pinpointing the exact center of the pupil.

My questions:
- Locating the pupil center in ARKit: Is there a reliable way to detect the exact center of the pupil using ARKit? If so, how can I achieve this?
- Framework recommendation: Given the need for fine detail in measurements, would ARKit be sufficient, or would it be better to use the Vision framework for more accurate 2D facial landmark detection? Alternatively, would a hybrid approach, combining Vision for precision and ARKit for 3D tracking, be more effective?

What I've tried:
- Using ARKit's ARFaceAnchor to detect face landmarks, but the results for the pupil position seem imprecise for my needs.
- Considering Vision for 2D detection, but I'm concerned about integrating it into a 3D AR experience.

Any insights, code snippets, or guidance would be greatly appreciated! Thanks in advance!
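Not an answer from the thread, just a hedged sketch of the Vision side of the hybrid approach the poster mentions: VNDetectFaceLandmarksRequest exposes leftPupil/rightPupil landmark regions whose points could serve as a 2D pupil estimate. Mapping that 2D point back into ARKit's 3D space is left out and would still need to be solved.

    import Vision
    import CoreGraphics

    /// Sketch: returns the pupil centers in image coordinates for the first detected face.
    func pupilCenters(in pixelBuffer: CVPixelBuffer,
                      imageSize: CGSize) throws -> (left: CGPoint?, right: CGPoint?) {
        let request = VNDetectFaceLandmarksRequest()
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try handler.perform([request])

        guard let face = request.results?.first else { return (nil, nil) }
        // Each pupil region usually contains a single point.
        let left = face.landmarks?.leftPupil?.pointsInImage(imageSize: imageSize).first
        let right = face.landmarks?.rightPupil?.pointsInImage(imageSize: imageSize).first
        return (left, right)
    }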
0
1
412
Aug ’24
visionOS Enterprise API: failing to get a cameraFrame from cameraFrameUpdates
I am developing an app for visionOS and need the main camera access provided by the Enterprise API. I have applied for an enterprise license and added the main camera access capability and the license file in Xcode. In my code, I use await arKitSession.queryAuthorization(for: [.cameraAccess]) to request user permission for camera access. After obtaining permission, I use arKitSession to run the cameraFrameProvider. However, when running

    for await cameraFrame in cameraFrameUpdates {
        print("hello")
        guard let mainCameraSample = cameraFrame.sample(for: .left) else {
            continue
        }
        pixelBuffer = mainCameraSample.pixelBuffer
    }

I am unable to receive any frames from the camera; even the print("hello") inside the loop never executes. The app does not crash or throw any errors. Here is my full code:

    import SwiftUI
    import ARKit

    struct cameraTestView: View {
        @State var pixelBuffer: CVPixelBuffer?

        var body: some View {
            VStack {
                Button(action: {
                    Task {
                        await loadCameraFeed()
                    }
                }) {
                    Text("test")
                }
                if let pixelBuffer = pixelBuffer {
                    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
                    let context = CIContext(options: nil)
                    if let cgImage = context.createCGImage(ciImage, from: ciImage.extent) {
                        Image(uiImage: UIImage(cgImage: cgImage))
                    }
                } else {
                    Image("exampleCase")
                        .resizable()
                        .scaledToFill()
                        .frame(width: 400, height: 400)
                }
            }
        }

        func loadCameraFeed() async {
            // Main Camera Feed Access Example
            let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions: [.left])
            let cameraFrameProvider = CameraFrameProvider()
            let arKitSession = ARKitSession()

            // main camera feed access example
            var cameraAuthorization = await arKitSession.queryAuthorization(for: [.cameraAccess])
            guard cameraAuthorization == [ARKitSession.AuthorizationType.cameraAccess: ARKitSession.AuthorizationStatus.allowed] else {
                return
            }
            do {
                try await arKitSession.run([cameraFrameProvider])
            } catch {
                return
            }

            let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: formats[0])
            if cameraFrameUpdates != nil {
                print("identify cameraFrameUpdates")
            } else {
                print("fail to get cameraFrameUpdates")
                return
            }

            for await cameraFrame in cameraFrameUpdates! {
                print("hello")
                guard let mainCameraSample = cameraFrame.sample(for: .left) else {
                    continue
                }
                pixelBuffer = mainCameraSample.pixelBuffer
            }
        }
    }

    #Preview(windowStyle: .automatic) {
        cameraTestView()
    }

When I tap the button, the console prints "identify cameraFrameUpdates", so it seems to get stuck waiting for a cameraFrame from cameraFrameUpdates. This occurs on visionOS 2.0 beta (just updated) with Xcode 16 beta 6 (just updated). Does anyone have a workaround for this? I would be grateful if anyone can help.
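Not from the thread: one diagnostic worth trying, assuming ARKitSession's events sequence behaves as documented, is to watch the session's events in a separate task so that authorization changes or data-provider state changes around the CameraFrameProvider become visible in the console.

    import ARKit

    /// Sketch: log every ARKitSession event while the camera loop is running.
    /// Intended to be launched in its own Task alongside loadCameraFeed().
    func monitorSessionEvents(_ session: ARKitSession) async {
        for await event in session.events {
            // Prints authorization and data-provider state changes, including errors.
            print("ARKitSession event: \(event)")
        }
    }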
2
1
705
Aug ’24
ModelContainer working but ModelContext not finding items with SwiftData
I am trying to count a database table from inside some of my classes, as shown below. My problem is that count1 is working, but count2 is not working.

    class AppState {
        private(set) var context: ModelContext?
        ....

        func setModelContext(_ context: ModelContext) {
            self.context = context
        }

        @MainActor
        func count() async {
            let container1 = try ModelContainer(for: Item.self)
            let descriptor = FetchDescriptor<Item>()
            let count1 = try container1.mainContext.fetchCount(descriptor)
            let count2 = try context.fetchCount(descriptor)
            print("WORKING COUNT: \(count1)")
            print("NOTWORKING COUNT: \(count2) -> always 0")
        }

I am passing the context like this:

    ...
    @main
    @MainActor
    struct myApp: App {
        @State private var appState = AppState()
        @Environment(\.modelContext) private var modelContext

        WindowGroup {
            ItemView(appState: appState)
                .task {
                    appState.setModelContext(modelContext)
                }
        }
        .windowStyle(.plain)
        .windowResizability(.contentSize)
        .modelContainer(for: [Item.self, Category.self]) { result in
        ...
    }

Can I get some guidance on why this is happening? Which one is better to use? If I should use count2, how can I fix it? Is this the correct way to search inside an application using SwiftData? I don't want to search from the view with @Query because this operation is going to happen in the background of the app.
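Not from the thread: one commonly suggested pattern for background work in SwiftData is to hand the class the ModelContainer rather than a ModelContext, and create a fresh context from it where needed. A hedged sketch; the Item type is reused from the post, and whether this resolves the count2 issue is an assumption.

    import SwiftData

    final class CountingService {
        private let container: ModelContainer

        init(container: ModelContainer) {
            self.container = container
        }

        // Runs off the main actor; builds its own context from the shared container.
        func backgroundCount() throws -> Int {
            let context = ModelContext(container)
            let descriptor = FetchDescriptor<Item>() // Item is the model type from the post.
            return try context.fetchCount(descriptor)
        }
    }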
1
0
510
Aug ’24
Vision framework does not work with visionOS 2.0
I am trying the Vision framework on Vision Pro, and it fails only on visionOS 2.0. When I perform requests, they do not work and the error below is thrown. The same code works on visionOS 1.2 and on iOS 18.0 beta. I also tried the new beta API (e.g. GenerateForegroundInstanceMaskRequest), and it fails with the same error. Do you have any idea? Is any permission required to use the Vision framework on visionOS 2.0?

This is what I have tried:

With visionOS 2.0 beta 4:
- GenerateForegroundInstanceMaskRequest (does not work, error 1)
- VNGenerateForegroundInstanceMaskRequest (does not work, error 1)
- VNRecognizeTextRequest (does not work, error 2)

With visionOS 1.2:
- VNRecognizeTextRequest (works)

With iOS 18 beta:
- GenerateForegroundInstanceMaskRequest (works)

My development environment:

Env 1
Vision Pro: visionOS 2.0 beta 4
Xcode: 16.0 beta 4, 16.0 beta 2
macOS: 14.5 (23F79)

Env 2
Vision Pro: visionOS 1.2
Xcode: 15.4
macOS: 14.5 (23F79)

Error 1
Error Domain=com.apple.Vision Code=9 "Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/subject_lifting_gen1_rev5_gv8dsz6vxu_multihead_int8.espresso.net Error= (DESIGN)" UserInfo={NSLocalizedDescription=Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/subject_lifting_gen1_rev5_gv8dsz6vxu_multihead_int8.espresso.net Error= (DESIGN)}

Error 2
Error Domain=com.apple.Vision Code=11 "VNRecognizeTextRequest produced an internal error" UserInfo={NSLocalizedDescription=VNRecognizeTextRequest produced an internal error, NSUnderlyingError=0x3001f6850 {Error Domain=CRImageReaderErrorDomain Code=-5 "Unknown error" UserInfo={NSLocalizedDescription=Unknown error}}}
8
0
1.1k
Sep ’24
VisionOS animation on USDZ
Hello all, I'm developing an application for visionOS and I'm trying to implement two different animations.

First animation: Initially, I have a map that should not be visible. I would like to create an animation effect where it appears as if a drop of water falls in the center of the map and the expanding waves gradually reveal the entire map. Is there a way to do this directly in SwiftUI, or do I need an animation on my USDZ?

Second animation: I want an effect similar to a cinema screen opening from the center, gradually revealing a video that was initially hidden. Is there a way to do this directly in SwiftUI?

Can someone help me with this topic? Thanks ;)
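Not from the thread: for the first effect, one SwiftUI-only approach (assuming the map is an ordinary SwiftUI view rather than geometry inside the USDZ) is to mask the map with a circle whose scale animates outward, which gives a center-out reveal even if it lacks the water-ripple detail.

    import SwiftUI

    struct RippleRevealView: View {
        @State private var revealed = false

        var body: some View {
            // "MapImage" is a placeholder asset name, not something from the post.
            Image("MapImage")
                .resizable()
                .scaledToFit()
                .mask {
                    Circle()
                        // Grow from a point at the center to well past the view bounds.
                        .scaleEffect(revealed ? 4.0 : 0.001)
                }
                .animation(.easeOut(duration: 2.0), value: revealed)
                .onAppear { revealed = true }
        }
    }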
2
1
506
Jul ’24
Help Needed: Error Codes in VCPHumanPoseImageRequest.mm[85] and NSArrayM insertObject
Hey all 👋🏼 We're currently working on a video processing project using the Vision framework (face, body and hand pose detection), and We've encountered a couple of errors that I need help with. We are on Xcode 16 Beta 3, testing on an iPhone 14 Pro running iOS 18 beta. The error messages are as follows: [LOG_ERROR] /Library/Caches/com.apple.xbs/Sources/MediaAnalysis/VideoProcessing/VCPHumanPoseImageRequest.mm[85]: code 18,446,744,073,709,551,598 encountered an unexpected condition: *** -[__NSArrayM insertObject:atIndex:]: object cannot be nil What we've tried: Debugging: I’ve tried stepping through the code, but the errors occur before I can gather any meaningful insights. Searching Documentation: Looked through Apple’s developer documentation and forums but couldn’t find anything related to these specific error codes. Nil Check: Added checks to ensure objects are not nil before inserting them into arrays, but the error persists. Here are my questions: Has anyone encountered similar errors with the Vision framework, specifically related to VCPHumanPoseImageRequest and NSArray operations? Is there any known issue or bug in the version of the framework I might be using? Could it also be related to the beta? Are there any additional debug steps or logging mechanisms I can implement to narrow down the cause? Any suggestions on how to handle nil objects more effectively in this context? I would greatly appreciate any insights or suggestions you might have. Thank you in advance for your assistance! Thanks all!
3
0
825
Jul ’24
Can’t Figure Out How to Get My Earth Entity to Rotate on its Axis
I can't figure out how to get my Earth entity to rotate on its axis. This is a follow-up to a previous Apple Developer forum post. How would I have the Earth (parent) entity rotate counterclockwise underneath the orbiting starship child? I tried adding the following code block to the RealityView, but it is not working:

    if let rotatingEarth = starshipEntity.findEntity(named: "Earth") {
        rotatingEarth.transform.rotation = simd_quatf.init(angle: 360, axis: SIMD3(x: 0, y: 1, z: 0))
        if let animation = try? AnimationResource.generate(with: rotatingEarth as! AnimationDefinition) {
            rotatingEarth.playAnimation(animation)
        }
    }

Any advice on getting the Earth to rotate? I tried reviewing the Hello World WWDC23 project code, but I was unable to understand its complexity and how that sample project got the Earth to rotate. I want to do this for visionOS 1.2. I realize there are new animation and other capabilities coming in visionOS 2.0, but I want to address this issue in the currently released visionOS version.
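Not from the thread: one low-tech way to spin an entity on visionOS 1.x without the animation API is to drive the rotation from a timer and reassign transform.rotation each tick. A hedged sketch; the scene and entity names ("Scene", "Earth"), the load call, and the 60 Hz timer are assumptions, and a RealityKit System would be the more idiomatic long-term approach.

    import SwiftUI
    import RealityKit
    import Combine

    struct RotatingEarthView: View {
        // Hypothetical: assumes an entity named "Earth" exists in the loaded scene.
        @State private var earth: Entity?
        @State private var angle: Float = 0

        private let timer = Timer.publish(every: 1.0 / 60.0, on: .main, in: .common).autoconnect()

        var body: some View {
            RealityView { content in
                // Placeholder load; the real app would add its own scene here.
                if let scene = try? await Entity(named: "Scene", in: nil) {
                    content.add(scene)
                    earth = scene.findEntity(named: "Earth")
                }
            }
            .onReceive(timer) { _ in
                // Advance roughly 20 degrees per second, counterclockwise about +Y.
                angle += (.pi / 9) / 60
                earth?.transform.rotation = simd_quatf(angle: angle, axis: [0, 1, 0])
            }
        }
    }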
5
0
1k
Jul ’24
Misaligned depth and RGB images from TrueDepth VGA streaming
I'm currently streaming synchronised video and depth data from my iPhone 13 using AVFoundation, with the video set to AVCaptureSession.Preset.vga640x480. When looking at the corresponding images (with depth values mapped to a grey colour map; both map and image are 640x480), it appears the two feeds have different fields of view, with the depth feed zoomed in and angled upwards, and the colour feed more zoomed out. I've looked at the intrinsics from both the depth map and my colour sample buffer, and they are identical. Does anyone know why this might be? My setup code is below (shortened):

    import AVFoundation
    import CoreVideo

    class VideoCaptureManager {
        private enum SessionSetupResult {
            case success
            case notAuthorized
            case configurationFailed
        }

        private enum ConfigurationError: Error {
            case cannotAddInput
            case cannotAddOutput
            case defaultDeviceNotExist
        }

        private let videoDeviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInTrueDepthCamera],
                                                                                   mediaType: .video,
                                                                                   position: .front)
        private let session = AVCaptureSession()
        public let videoOutput = AVCaptureVideoDataOutput()
        public let depthDataOutput = AVCaptureDepthDataOutput()
        private var outputSynchronizer: AVCaptureDataOutputSynchronizer?
        private var videoDeviceInput: AVCaptureDeviceInput!

        private let sessionQueue = DispatchQueue(label: "session.queue")
        private let videoOutputQueue = DispatchQueue(label: "video.output.queue")

        private var setupResult: SessionSetupResult = .success

        init() {
            sessionQueue.async {
                self.requestCameraAuthorizationIfNeeded()
            }
            sessionQueue.async {
                self.configureSession()
            }
            sessionQueue.async {
                self.startSessionIfPossible()
            }
        }

        private func requestCameraAuthorizationIfNeeded() {
            switch AVCaptureDevice.authorizationStatus(for: .video) {
            case .authorized:
                break
            case .notDetermined:
                sessionQueue.suspend()
                AVCaptureDevice.requestAccess(for: .video, completionHandler: { granted in
                    if !granted {
                        self.setupResult = .notAuthorized
                    }
                    self.sessionQueue.resume()
                })
            default:
                setupResult = .notAuthorized
            }
        }

        private func configureSession() {
            if setupResult != .success {
                return
            }

            let defaultVideoDevice: AVCaptureDevice? = videoDeviceDiscoverySession.devices.first

            guard let videoDevice = defaultVideoDevice else {
                print("Could not find any video device")
                setupResult = .configurationFailed
                return
            }

            do {
                videoDeviceInput = try AVCaptureDeviceInput(device: videoDevice)
            } catch {
                setupResult = .configurationFailed
                return
            }

            session.beginConfiguration()
            session.sessionPreset = AVCaptureSession.Preset.vga640x480

            guard session.canAddInput(videoDeviceInput) else {
                print("Could not add video device input to the session")
                setupResult = .configurationFailed
                session.commitConfiguration()
                return
            }
            session.addInput(videoDeviceInput)

            if session.canAddOutput(videoOutput) {
                session.addOutput(videoOutput)
                if let connection = videoOutput.connection(with: .video) {
                    connection.isCameraIntrinsicMatrixDeliveryEnabled = true
                } else {
                    print("Cannot setup camera intrinsics")
                }
                videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]
            } else {
                print("Could not add video data output to the session")
                setupResult = .configurationFailed
                session.commitConfiguration()
                return
            }

            if session.canAddOutput(depthDataOutput) {
                session.addOutput(depthDataOutput)
                depthDataOutput.isFilteringEnabled = false
                if let connection = depthDataOutput.connection(with: .depthData) {
                    connection.isEnabled = true
                } else {
                    print("No AVCaptureConnection")
                }
            } else {
                print("Could not add depth data output to the session")
                setupResult = .configurationFailed
                session.commitConfiguration()
                return
            }

            let depthFormats = videoDevice.activeFormat.supportedDepthDataFormats
            let filtered = depthFormats.filter({
                CMFormatDescriptionGetMediaSubType($0.formatDescription) == kCVPixelFormatType_DepthFloat16
            })
            let selectedFormat = filtered.max(by: { first, second in
                CMVideoFormatDescriptionGetDimensions(first.formatDescription).width < CMVideoFormatDescriptionGetDimensions(second.formatDescription).width
            })

            do {
                try videoDevice.lockForConfiguration()
                videoDevice.activeDepthDataFormat = selectedFormat
                videoDevice.unlockForConfiguration()
            } catch {
                print("Could not lock device for configuration: \(error)")
                setupResult = .configurationFailed
                session.commitConfiguration()
                return
            }

            session.commitConfiguration()
        }

        private func addVideoDeviceInputToSession() throws {
            do {
                var defaultVideoDevice: AVCaptureDevice?
                defaultVideoDevice = AVCaptureDevice.default(
                    .builtInTrueDepthCamera,
                    for: .depthData,
                    position: .front
                )

                guard let videoDevice = defaultVideoDevice else {
                    print("Default video device is unavailable.")
                    setupResult = .configurationFailed
                    session.commitConfiguration()
                    throw ConfigurationError.defaultDeviceNotExist
                }

                let videoDeviceInput = try AVCaptureDeviceInput(device: videoDevice)

                if session.canAddInput(videoDeviceInput) {
                    session.addInput(videoDeviceInput)
                } else {
                    setupResult = .configurationFailed
                    session.commitConfiguration()
                    throw ConfigurationError.cannotAddInput
                }
            }
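Not from the thread: as a diagnostic, the nominal field of view of the colour format and of the chosen depth format can be compared directly, since AVCaptureDevice.Format exposes videoFieldOfView. A hedged sketch, assuming it is called after configureSession() has selected the depth format.

    import AVFoundation

    /// Prints the horizontal field of view reported for the active video format
    /// and for the currently selected depth data format.
    func logFieldsOfView(for device: AVCaptureDevice) {
        let videoFormat = device.activeFormat
        print("Video format FoV: \(videoFormat.videoFieldOfView) degrees")

        if let depthFormat = device.activeDepthDataFormat {
            let dims = CMVideoFormatDescriptionGetDimensions(depthFormat.formatDescription)
            print("Depth format FoV: \(depthFormat.videoFieldOfView) degrees, \(dims.width)x\(dims.height)")
        } else {
            print("No active depth data format selected")
        }
    }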
0
0
683
Jul ’24
How to support USDZ exported from 3ds Max in a Vision Pro project
We have a lot of resources made with the 3ds Max design software. How can USDZ files exported from 3ds Max be used in a Vision Pro project? The preview in the folder on the Mac is correct; however, the texture cannot be parsed in the Xcode project. Compared with USDZ exported from Blender, the texture data exported by 3ds Max has an additional node graph node. Is there any export plugin support?
0
0
419
Jul ’24