Apply computer vision algorithms to perform a variety of tasks on input images and video using Vision.

Posts under Vision tag

104 Posts
One app, multiple platforms: the concern of a novice developer
Hello. Currently, only the iOS version is on sale on the App Store. The app offers an iCloud-linked, auto-renewable subscription. I want to sell a visionOS version in App Store Connect at the same time, under the same App ID and identifier. To provide the visionOS version early, I simply added visionOS as a destination to the existing app project, but the existing UI-related code and location-related code are not compatible. So, keeping the same identifier and name, I duplicated and optimized only what could be implemented, and it runs without any problems on the actual device. However, when I added the visionOS platform in App Store Connect and tried to upload the visionOS app through an archive, the upload was blocked by an identifier and provisioning error.

While looking for a solution I found App Groups, but that feature seems intended for integrating separate apps into one service, so it did not look suitable for my case. Another option I'm considering: add a new app target to the existing project, manually adjust the platform settings in Xcode -> Build Phases -> Compile Sources, and then archive and upload. Would that succeed? (I haven't been able to try this yet.)

That is my current situation. Please give me some advice on how to implement this. visionOS has a lot of constraints, so a lot of features have to be removed.
0 replies · 0 boosts · 559 views · Mar ’24
VNRecognizeTextRequest on Chinese text does not recognize vertical text or allow individual character recognition
I am using VNRecognizeTextRequest to read Chinese characters. It works fine with text written horizontally, but if even two characters are written vertically, then nothing is recognized. Does anyone know how to get the Vision framework to either handle vertical text or recognize characters individually when working with Chinese?

I am setting VNRequestTextRecognitionLevel to accurate, since setting it to fast does not recognize any Chinese characters at all. I would love to be able to use fast recognition and handle the characters individually, but it just doesn't seem to work with Chinese. And, when using accurate, if I take a picture of any amount of text arranged vertically, nothing is recognized. I can take a picture of one character and it works, but if I add just one more character below it, nothing is recognized. It's bizarre.

I've tried setting usesLanguageCorrection = false and tried using VNRecognizeTextRequestRevision3, ...Revision2 and ...Revision1. Strangely enough, revision 2 seems to recognize some text if it's vertical, but the bounding boxes are off. Or, sometimes the recognized text will be wrong. I tried playing with DataScannerViewController and it's able to recognize characters in vertical text, but I can't figure out how to replicate it with VNRecognizeTextRequest. The problem with using DataScannerViewController is that it treats the whole text block as one item, and it uses the live camera buffer. As soon as I capture a photo, I still have to use VNRecognizeTextRequest.

Below is a code snippet of how I'm using VNRecognizeTextRequest. There's not really much to it, and there aren't many other parameters I can try (plus I've already played around with them). I've also attached a sample image with text laid out vertically.

```swift
func detectText(
    in sourceImage: CGImage,
    oriented orientation: CGImagePropertyOrientation
) async throws -> [VNRecognizedTextObservation] {
    return try await withCheckedThrowingContinuation { continuation in
        let request = VNRecognizeTextRequest { request, error in
            // ...
            continuation.resume(returning: observations)
        }

        request.recognitionLevel = .accurate
        request.recognitionLanguages = ["zh-Hant", "zh-Hans"] // doesn't seem to have any impact
        // request.usesLanguageCorrection = false

        do {
            let requestHandler = VNImageRequestHandler(
                cgImage: sourceImage,
                orientation: orientation
            )
            try requestHandler.perform([request])
        } catch {
            continuation.resume(throwing: error)
        }
    }
}
```
1 reply · 0 boosts · 327 views · Mar ’24
visionOS upload error: Profile doesn't include the com.apple.application-identifier entitlement
I currently sell an iOS version on the App Store, and I'm bringing it to visionOS with the same app ID and identifier. I'm trying to upload the visionOS build using the same identifier, developer ID, and iCloud, but the upload fails with a provisioning profile error. I wasn't sure what the problem was, so I submitted an iOS update early, and that uploaded for review without any problems. Only the standalone visionOS build has the profile problem, so I would appreciate any guidance on how to resolve it. In addition, a lot of the code used in the iOS version is not compatible with visionOS, so we created a new project for the visionOS app.
0 replies · 0 boosts · 534 views · Mar ’24
How to show only spatial video using PHPickerFilter in Swift 5.0
I want to show only spatial videos when opening the photo library in my app. How can I achieve this? One more thing: if the user selects a video from the photo library, how can I identify whether the selected video is a spatial video or not?

```swift
self.presentPicker(filter: .videos)

/// - Tag: PresentPicker
private func presentPicker(filter: PHPickerFilter?) {
    var configuration = PHPickerConfiguration(photoLibrary: .shared())

    // Set the filter type according to the user's selection.
    configuration.filter = filter
    // Set the mode to avoid transcoding, if possible, if your app supports arbitrary image/video encodings.
    configuration.preferredAssetRepresentationMode = .current
    // Set the selection behavior to respect the user's selection order.
    configuration.selection = .ordered
    // Set the selection limit to enable multiselection.
    configuration.selectionLimit = 1

    let picker = PHPickerViewController(configuration: configuration)
    picker.delegate = self
    present(picker, animated: true)
}

func picker(_ picker: PHPickerViewController, didFinishPicking results: [PHPickerResult]) {
    picker.dismiss(animated: true) {
        // do something on dismiss
    }

    guard let provider = results.first?.itemProvider else { return }

    provider.loadFileRepresentation(forTypeIdentifier: "public.movie") { url, error in
        guard error == nil else {
            print(error)
            return
        }

        // receiving the video-local-URL / filepath
        guard let url = url else { return }

        // create a new filename
        let fileName = "\(Int(Date().timeIntervalSince1970)).\(url.pathExtension)"
        // create new URL
        let newUrl = URL(fileURLWithPath: NSTemporaryDirectory() + fileName)
        print(newUrl)
        print("===========")
        // copy item to app storage
        //try? FileManager.default.copyItem(at: url, to: newUrl)
        // self.parent.videoURL = newUrl.absoluteString
    }
}
```
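One possible way to answer the second question, sketched under the assumption that you have the file URL returned by loadFileRepresentation and can target iOS 17 / visionOS 1 (where the containsStereoMultiviewVideo media characteristic is available): inspect the asset's video tracks for the stereo multiview (MV-HEVC) characteristic that spatial videos carry. The isSpatialVideo(at:) helper name below is made up for illustration.

```swift
import AVFoundation

// Hypothetical helper: returns true if the video at `url` contains a stereo
// multiview (MV-HEVC) track, which is how spatial videos are encoded.
// Assumes iOS 17 / visionOS 1 for the containsStereoMultiviewVideo characteristic.
func isSpatialVideo(at url: URL) async throws -> Bool {
    let asset = AVURLAsset(url: url)
    // Load only the tracks that declare the stereo multiview characteristic.
    let stereoTracks = try await asset.loadTracks(
        withMediaCharacteristic: .containsStereoMultiviewVideo
    )
    return !stereoTracks.isEmpty
}
```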
1 reply · 0 boosts · 724 views · Mar ’24
Object Detection using Vision performs differently than in Create ML Preview
Context
I've trained my object detection model with 4k+ images. In the Create ML preview, the prediction for image "A" detects two labels at 100% and the bounding boxes look accurate.

The problem itself
However, inside a Swift Playground, when I perform object detection using the same model and the same image, I don't get the same results.

What I expected
That after performing the request and processing the array of VNRecognizedObjectObservation, I would see the very same results that appear in the Create ML preview.

Notes:
- I import the model into the playground by drag and drop.
- I trained on images in JPEG format.
- The test image is rotated so that it looks vertical, using the macOS Finder rotation tool.
- I've tried passing a different orientation when creating the VNImageRequestHandler, with the same result.

Swift Playground code
This is the code I'm using:

```swift
import UIKit
import Vision

do {
    let model = try MYMODEL_FROMCREATEML(configuration: MLModelConfiguration())
    let mlModel = model.model
    let coreMLModel = try VNCoreMLModel(for: mlModel)

    let request = VNCoreMLRequest(model: coreMLModel) { request, error in
        guard let results = request.results as? [VNRecognizedObjectObservation] else {
            return
        }
        results.forEach { result in
            print(result.labels)
            print(result.boundingBox)
        }
    }

    let image = UIImage(named: "TEST_IMAGE.HEIC")!
    let requestHandler = VNImageRequestHandler(cgImage: image.cgImage!)
    try requestHandler.perform([request])
} catch {
    print(error)
}
```

Additional notes & uncertainties
Not sure if this is relevant, but just in case: I trained the model on pictures I took with my iPhone in 48 MP HEIC format. All photos were in vertical (portrait) orientation. With a Python script I overwrote the EXIF orientation to 1 (normal), so that I could annotate the images with the CVAT tool and then convert them to the Create ML annotation format.

Assumption #1
I've read that object detection in Create ML is based on the YOLOv3 architecture, whose first layer resizes the input image, meaning I don't have to worry about using very large images to train my model. Is this correct?

Assumption #2
Does the same resizing happen when I make a prediction?
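One thing that often explains a Create ML preview vs. Vision mismatch is orientation: a VNImageRequestHandler built from a bare CGImage carries no EXIF orientation, so a portrait HEIC can be fed to the model sideways. Below is a minimal sketch (not a guaranteed fix) that forwards the UIImage orientation and exposes the crop-and-scale option for experimentation. The CGImagePropertyOrientation mapping extension is boilerplate, and runDetection(on:request:) is a hypothetical wrapper around a request like the one above.

```swift
import UIKit
import Vision
import ImageIO

// Map UIImage.Orientation to the EXIF-style CGImagePropertyOrientation that Vision expects.
extension CGImagePropertyOrientation {
    init(_ orientation: UIImage.Orientation) {
        switch orientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up
        }
    }
}

// Hypothetical wrapper around an existing VNCoreMLRequest.
func runDetection(on image: UIImage, request: VNCoreMLRequest) throws {
    guard let cgImage = image.cgImage else { return }
    // Trying different crop-and-scale options (.scaleFill, .scaleFit, .centerCrop)
    // can also change which results come back.
    request.imageCropAndScaleOption = .scaleFill
    // Forward the image's own orientation so Vision sees the pixels upright,
    // the same way the Create ML preview does.
    let handler = VNImageRequestHandler(
        cgImage: cgImage,
        orientation: CGImagePropertyOrientation(image.imageOrientation)
    )
    try handler.perform([request])
}
```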
0 replies · 0 boosts · 747 views · Mar ’24
Unable to scan barcode from an image using Vision
Hello, I have been working for a few days on a scanner that reads a PDF417 barcode from a photo in your photos library, and I have come to a dead end. Every time I run my function on the photo, my array of observations comes back as []. In this example I'm using an automatically generated image, because I think that if it works with this, it will work with a real screenshot. That said, I have already tried all sorts of images that aren't pre-generated, and they also failed to work. Code below.

Calling the function:

```swift
createVisionRequest(image: generatePDF417Barcode(from: "71238-12481248-128035-40239431")!)
```

Creating the barcode:

```swift
static func generatePDF417Barcode(from key: String) -> UIImage? {
    let data = key.data(using: .utf8)!
    let filter = CIFilter.pdf417BarcodeGenerator()
    filter.message = data
    filter.rows = 7
    let transform = CGAffineTransform(scaleX: 3, y: 4)

    if let outputImage = filter.outputImage?.transformed(by: transform) {
        let context = CIContext()
        if let cgImage = context.createCGImage(outputImage, from: outputImage.extent) {
            return UIImage(cgImage: cgImage)
        }
    }
    return nil
}
```

Main function for scanning the barcode:

```swift
static func desynthesizeIDScreenShot(from image: UIImage, completion: @escaping (String?) -> Void) {
    guard let ciImage = CIImage(image: image) else {
        print("Empty image")
        return
    }

    let imageRequestHandler = VNImageRequestHandler(ciImage: ciImage, orientation: .up)

    let request = VNDetectBarcodesRequest { (request, error) in
        guard error == nil else {
            completion(nil)
            return
        }
        guard let observations = request.results as? [VNDetectedObjectObservation] else {
            completion(nil)
            return
        }

        request.revision = VNDetectBarcodesRequestRevision2

        let result = (observations.first as? VNBarcodeObservation)?.payloadStringValue
        print("Observations", observations)

        if let result {
            completion(result)
            print()
            print(result)
        } else {
            print(error?.localizedDescription) // returns nil
            completion(nil)
            print()
            print(result)
            print()
        }
    }

    request.symbologies = [VNBarcodeSymbology.pdf417]
    try? imageRequestHandler.perform([request])
}
```

Thanks!
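For comparison, here is a minimal sketch of the same barcode request with two details changed: the revision and symbologies are configured before perform is called (setting them inside the completion handler happens too late to affect detection), and the results are cast to VNBarcodeObservation, the type VNDetectBarcodesRequest actually produces. This is only a sketch and assumes the barcode is rendered at sufficient resolution; the scanPDF417(in:completion:) name is made up for illustration.

```swift
import UIKit
import Vision

// Sketch only: configure the request fully before performing it,
// and read results as VNBarcodeObservation.
func scanPDF417(in image: UIImage, completion: @escaping (String?) -> Void) {
    guard let cgImage = image.cgImage else {
        completion(nil)
        return
    }

    let request = VNDetectBarcodesRequest { request, error in
        guard error == nil,
              let observations = request.results as? [VNBarcodeObservation] else {
            completion(nil)
            return
        }
        // payloadStringValue holds the decoded PDF417 contents, if a barcode was found.
        completion(observations.first?.payloadStringValue)
    }
    request.symbologies = [.pdf417]
    request.revision = VNDetectBarcodesRequestRevision2 // set before performing, not inside the handler

    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up)
    do {
        try handler.perform([request])
    } catch {
        completion(nil)
    }
}
```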
1 reply · 0 boosts · 439 views · Mar ’24
How to make a SwiftUI window display the frosted glass effect normally in a Metal immersive space
I used Metal and CompositorLayer to render an immersive-space skybox. In this space, the SwiftUI window I created only displays a gray frosted-glass background (it seems to ignore the Metal-rendered skybox and only samples the black background). Why is that? Is there any way to display the normal frosted-glass background? Thank you very much!
1 reply · 0 boosts · 805 views · Mar ’24
SwiftData not working in visionOS
I want to make an iCloud backup using SwiftData in visionOS, so I need to get SwiftData working first, but I get the following error even though I followed the steps below.

I created a model:

```swift
import Foundation
import SwiftData

@Model
class NoteModel {
    @Attribute(.unique) var id: UUID
    var date: Date
    var title: String
    var text: String

    init(id: UUID = UUID(), date: Date, title: String, text: String) {
        self.id = id
        self.date = date
        self.title = title
        self.text = text
    }
}
```

I added the model container:

```swift
WindowGroup(content: {
    NoteView()
})
.modelContainer(for: [NoteModel.self])
```

And I'm doing inserts to test:

```swift
import SwiftUI
import SwiftData

struct NoteView: View {
    @Environment(\.modelContext) private var context

    var body: some View {
        Button(action: {
            // new note
            let note = NoteModel(date: Date(), title: "New Note", text: "")
            context.insert(note)
        }, label: {
            Image(systemName: "note.text.badge.plus")
                .font(.system(size: 24))
                .frame(width: 30, height: 30)
                .padding(12)
                .background(
                    RoundedRectangle(cornerRadius: 50)
                        .foregroundStyle(.black.opacity(0.2))
                )
        })
        .buttonStyle(.plain)
        .hoverEffectDisabled(true)
    }
}

#Preview {
    NoteView().modelContainer(for: [NoteModel.self])
}
```
0 replies · 0 boosts · 514 views · Feb ’24
Error loading ReferenceImage
Currently, I'm trying to test the ImageTrackingProvider with the Apple Vision Pro. I started with some basic code:

```swift
import RealityKit
import ARKit

@MainActor
class ARKitViewModel: ObservableObject {
    private let session = ARKitSession()
    private let imageTracking = ImageTrackingProvider(
        referenceImages: ReferenceImage.loadReferenceImages(inGroupNamed: "AR")
    )

    func runSession() async {
        do {
            try await session.run([imageTracking])
        } catch {
            print(error)
        }
    }

    func processUpdates() async {
        for await _ in imageTracking.anchorUpdates {
            print("test")
        }
    }
}
```

I only have one picture in the AR resource group. I added the physical size, and I have no error messages in the AR group. When I run the application on the Vision Pro, I receive the following error:

ar_image_tracking_provider_t <0x28398f1e0>: Failed to load reference image <ARReferenceImage: 0x28368f120 name="IMG_1640" physicalSize=(1.350, 2.149)> with error: Failed to add reference image.

It finds the image, but there seems to be a problem with loading it. I tried both the JPEG and the PNG format. I do not understand why it fails to load the ReferenceImage. I use Xcode Version 15.3 beta 3.
2 replies · 0 boosts · 969 views · Feb ’24
Inquiry about producing a visionOS app separate from the existing iOS app sold on the App Store
Hello. I'm currently selling an iOS app. I tried to run the iOS version on visionOS to provide the same service as the existing app, but a huge amount of the code turned out to be incompatible. Can I create a visionOS app with the same name, add visionOS on the App Store Connect site, and upload a completely separate app project file? The iOS app and the visionOS app would use the same iCloud, in-app purchases, and so on. However, we plan to keep the projects separate to reduce the download size for users and to optimize each app.
0 replies · 0 boosts · 506 views · Feb ’24
Segmenting a single hand from multiple available hands in hand pose estimation
Hi, I'm working on developing an app. I need my app to work in a crowd with multiple people (who have multiple hands). From what I understand, the Vision framework currently uses a heuristic of "largest hand" to choose the detected hand. This won't work for my application, since the largest hand won't always be the one of interest. In fact, the hand of interest will be the one that is pointing. I know how to train a model using Create ML to identify a hand that is pointing, but the issue is that there is no straightforward way to directly override Vision's built-in heuristic of selecting the largest hand when relying solely on Swift and Create ML.

I would like my pipeline to be (see the sketch at the end of this post):

1. Request hand landmarks.
2. Process the image.
3. Create ML reports which hand is pointing.
4. Use the pointing hand to collect position data for the points of the index finger.

But within the Vision framework, if you set the number of hands to collect data for to 1, it will just choose the largest hand and report position data for that hand only. Of course, the easy workaround is to set it to X hands, but on an iOS device this is computationally intensive (since my app could be handling up to 10 hands at a time). Has anyone come up with a simpler solution to this problem, or is there something within visionOS to do it?
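For reference, a minimal sketch of the workaround described above: ask Vision for several hands via maximumHandCount and do the selection yourself. The isPointingHand closure is a hypothetical stand-in for the Create ML pointing classifier, and the confidence threshold is arbitrary.

```swift
import Vision
import CoreGraphics

// Sketch only: request up to `maxHands` hands and apply custom selection logic,
// instead of relying on the single-hand "largest" heuristic.
func indexTipOfPointingHand(
    in pixelBuffer: CVPixelBuffer,
    maxHands: Int = 4,
    isPointingHand: (VNHumanHandPoseObservation) -> Bool
) throws -> CGPoint? {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = maxHands // trade-off: higher values cost more compute

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
    try handler.perform([request])

    guard let observations = request.results else { return nil }

    // Pick the first hand the classifier says is pointing, then read its index fingertip.
    for observation in observations where isPointingHand(observation) {
        let tip = try observation.recognizedPoint(.indexTip)
        if tip.confidence > 0.3 {
            return tip.location // normalized image coordinates, origin at bottom-left
        }
    }
    return nil
}
```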
1 reply · 0 boosts · 513 views · Feb ’24
Connecting Xcode with my real Vision Pro
Currently, the Apple ID on my actual Vision Pro is different from the one used by Xcode on my Mac mini. I added the Vision Pro's Apple ID to Xcode, but I still can't build to my Vision Pro. Can I sign out of the existing account in Xcode and sign in with the same ID as the Vision Pro? However, the Apple ID created for the Vision Pro does not have an Apple Developer membership, so there is no certificate that allows apps to run on the actual device. How can I set things up so that the Xcode project signed with my developer-membership Apple ID runs on the Vision Pro that uses the other ID? This is my first time doing this.
1 reply · 0 boosts · 4.4k views · Feb ’24