I tested my application on iOS 15, 16, and 17, where VisionKit read values in the horizontal direction. After I updated my device to iOS 18.0 beta, values are read in the vertical direction.
The build was generated with Xcode 13.4.1.
Team, please help me understand why this happens and whether anything needs to change at the code level.
VisionKit
Scan documents with the camera on iPhone and iPad devices using VisionKit.
Posts under VisionKit tag
48 Posts
Essentially, I'm trying to find the most straightforward/simple way to outline an Image with varying contours. The intention is similar to the way iMessage allows you to add an outline to a sticker. The "goal" in the example is simply the input image on top of the outline.
I am trying to use the Vision framework on Apple Vision Pro, but requests fail on visionOS 2.0 only.
When I perform a request, it fails and the error below is caught.
The same code works on visionOS 1.2 and on iOS 18.0 beta.
I also tried the new beta API (for example, GenerateForegroundInstanceMaskRequest), but it fails with the same error.
Do you have any idea? Is a permission required to use the Vision framework on visionOS 2.0?
Here is what I tried:
With visionOS 2.0 beta 4:
GenerateForegroundInstanceMaskRequest (fails with Error1)
VNGenerateForegroundInstanceMaskRequest (fails with Error1)
VNRecognizeTextRequest (fails with Error2)
With visionOS 1.2:
VNRecognizeTextRequest (works)
With iOS 18 beta:
GenerateForegroundInstanceMaskRequest (works)
My development environments:
Env1
Vision Pro: visionOS 2.0 beta 4
Xcode: 16.0 beta 4, 16.0 beta 2
macOS: 14.5 (23F79)
Env2
Vision Pro: visionOS 1.2
Xcode: 15.4
macOS: 14.5 (23F79)
Error1
Error Domain=com.apple.Vision Code=9 "Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/subject_lifting_gen1_rev5_gv8dsz6vxu_multihead_int8.espresso.net Error= (DESIGN)" UserInfo={NSLocalizedDescription=Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/subject_lifting_gen1_rev5_gv8dsz6vxu_multihead_int8.espresso.net Error= (DESIGN)}
Error2
Error Domain=com.apple.Vision Code=11 "VNRecognizeTextRequest produced an internal error" UserInfo={NSLocalizedDescription=VNRecognizeTextRequest produced an internal error, NSUnderlyingError=0x3001f6850 {Error Domain=CRImageReaderErrorDomain Code=-5 "Unknown error" UserInfo={NSLocalizedDescription=Unknown error}}}
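For readers who want to reproduce the report, a minimal sketch of the kind of request listed above might look like the following. It only shows the VNGenerateForegroundInstanceMaskRequest call path and is not a fix for Error1. The function name liftForeground is hypothetical, and the input is assumed to already be a CGImage.

```swift
import Vision
import CoreGraphics
import CoreVideo

/// Hypothetical helper: runs a foreground-instance mask request on `image`
/// and returns the masked subjects as a pixel buffer, or nil if Vision
/// reports no foreground instance. Vision errors are rethrown so the
/// caller can log their domain and code.
@available(iOS 17.0, macOS 14.0, *)
func liftForeground(from image: CGImage) throws -> CVPixelBuffer? {
    let request = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    guard let observation = request.results?.first else { return nil }

    // Keep every detected instance and preserve the original image extent.
    return try observation.generateMaskedImage(
        ofInstances: observation.allInstances,
        from: handler,
        croppedToInstancesExtent: false
    )
}
```

Wrapping the call in do/catch and printing the caught error is one way to capture messages like Error1 and Error2 for a bug report.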
The app crashes on iOS 16.4 when the project uses the ImageAnalysisInteraction API from VisionKit. It crashes before it even starts.
Here is the output:
dyld[3240]: Symbol not found: _$s9VisionKit24ImageAnalysisInteractionC7subject2atAC7SubjectVSgSo7CGPointV_tYaFTu
Referenced from: <BAD7A699-FB4E-3D0E-8CD4-45CC9FC3D5E5> /Users/sereza/Library/Developer/CoreSimulator/Devices/B64EAF39-0DD9-49EC-A3F7-69675C94B8BE/data/Containers/Bundle/Application/F4E30E86-ED4D-4748-AB99-434208D55483/VisionKitChecker.app/VisionKitChecker
Expected in: <F05E3A17-D74A-3EE2-BC8D-DDCC23E48707> /Library/Developer/CoreSimulator/Volumes/iOS_20E247/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS 16.4.simruntime/Contents/Resources/RuntimeRoot/System/Library/Frameworks/VisionKit.framework/VisionKit
Here is enough code to produce this crash. Please note that this code never gets called. It is enough that it exists in the project:
import VisionKit

@MainActor
final class LiftHelper: ObservableObject {
    func doSomething() async throws {
        let interaction = ImageAnalysisInteraction()
        let _ = try await interaction.image(for: [])
    }
}
I would like to know the global path of the Vision Pro file system. For instance, if I put a file called example.pdf inside "On My Apple Vision Pro", what would the global path for that file be? "On My Apple Vision Pro/user_name/example.pdf", "/example.pdf", "/username/example.pdf", or something else? I searched for this but did not find any official source. Thanks in advance!
Does anyone know which control provides the feature where right-clicking a photo automatically recognizes the objects in it and cuts them out?
The Photos app on the phone has a long-press feature that recognizes objects. What is this feature called in Apple development, and which control should I use to get it?
I am trying to use the new VNCalculateImageAestheticsScoresRequest API.
The code compiles and runs, but it delivers the same result for every image.
Environment: Xcode 16 beta 2, Simulator.
Am I missing anything?
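For comparison, a minimal sketch of running this request on a single image could look like the following. The function name aestheticsScore is hypothetical and the input is assumed to be a CGImage. The sketch does not explain the identical results reported above.

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: returns the overall aesthetics score and the
/// utility flag for `image`, or nil if Vision returns no observation.
@available(iOS 18.0, macOS 15.0, *)
func aestheticsScore(for image: CGImage) throws -> (overall: Float, isUtility: Bool)? {
    let request = VNCalculateImageAestheticsScoresRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    guard let observation = request.results?.first else { return nil }
    return (observation.overallScore, observation.isUtility)
}
```

Logging the returned values for a few visibly different images on a physical device, and comparing them with the Simulator output, may help separate a Simulator limitation from a problem in the calling code.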
What is the best way to demonstrate or create 2D video to demonstrate an immersive video app? So far I've shared the AVP to my desktop Mac and screen captured the resulting view. Rather shaky at times. With visionOS 2.0 beta (2) is there a better way?
Thanks, David
I'm looking for a solution to take a picture or point the camera at a piece of clothing and match that image with an image the user has stored in my app.
I'm storing the data in a Core Data database as a Binary Data object. Since the user also takes the pictures they store in the database I think I cannot use pre-trained Core ML models.
I would like the matching to be done on device if possible instead of going to an external service. An external service would probably describe the item based on what the AI sees, but then I could not match the item against the images stored in the app.
Does anyone know if this is possible with frameworks such as Vision or VisionKit?
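One on-device approach that may fit this question is comparing Vision feature prints, sketched below. The function names featurePrint and closestMatch and the (id, image) tuple shape are hypothetical. Decoding the stored Binary Data into CGImage values is assumed to happen elsewhere. Whether the resulting distances separate clothing items well enough would need testing on real photos.

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: computes a Vision feature print for one image.
func featurePrint(for image: CGImage) throws -> VNFeaturePrintObservation? {
    let request = VNGenerateImageFeaturePrintRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    return request.results?.first
}

/// Hypothetical helper: returns the stored item whose feature print is
/// closest to the query image. A smaller distance means more similar.
func closestMatch(
    to query: CGImage,
    among stored: [(id: String, image: CGImage)]
) throws -> (id: String, distance: Float)? {
    guard let queryPrint = try featurePrint(for: query) else { return nil }

    var best: (id: String, distance: Float)?
    for candidate in stored {
        guard let candidatePrint = try featurePrint(for: candidate.image) else { continue }
        var distance = Float(0)
        try queryPrint.computeDistance(&distance, to: candidatePrint)
        if let current = best, current.distance <= distance { continue }
        best = (candidate.id, distance)
    }
    return best
}
```

In this sketch every stored image is re-processed per query. Caching the feature prints alongside the stored images would likely be preferable for a larger wardrobe.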
Hello,
I want to capture video from Apple Vision Pro in a visionOS app. I am following the Apple video (https://developer.apple.com/videos/play/wwdc2024/10139/) and its code. The steps are as follows:
import ARKit
Set com.apple.developer.arkit.main-camera-access.allow = true in Info.plist
Run the code below:
func loadCameraFeed() async {
    // Main Camera Feed Access Example
    let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions: [.left])
    // Avoid indexing into an empty array when no format is supported.
    guard let format = formats.first else {
        return
    }
    let cameraFrameProvider = CameraFrameProvider()
    let arKitSession = ARKitSession()
    var pixelBuffer: CVPixelBuffer?

    await arKitSession.queryAuthorization(for: [.cameraAccess])

    do {
        try await arKitSession.run([cameraFrameProvider])
    } catch {
        return
    }

    guard let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: format) else {
        return
    }
    print(cameraFrameUpdates)

    for await cameraFrame in cameraFrameUpdates {
        print(cameraFrame)
        guard let mainCameraSample = cameraFrame.sample(for: .left) else {
            continue
        }
        // Latest frame from the left main camera.
        pixelBuffer = mainCameraSample.pixelBuffer
    }
}
I want to turn "pixelBuffer" into a video stream and show it in a view, as I would on iOS.
Please guide me on how to achieve this next step. I am stuck after this code.
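One possible next step, sketched under the assumption that a per-frame still image is enough for display, is converting each CVPixelBuffer to a CGImage with Core Image. The function name makeCGImage is hypothetical. This is not taken from the WWDC sample.

```swift
import CoreImage
import CoreVideo
import CoreGraphics

/// Hypothetical helper: converts one camera frame to a CGImage.
/// Pass in a long-lived CIContext, since creating one per frame is costly.
func makeCGImage(from pixelBuffer: CVPixelBuffer, context: CIContext) -> CGImage? {
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    return context.createCGImage(ciImage, from: ciImage.extent)
}
```

Inside the for await loop, the resulting image could be published to the main actor and shown in SwiftUI, for example with Image(decorative:scale:). Recording or streaming the frames as actual video would need additional work that this sketch does not cover.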
During development I ran into a problem: I cannot scan one Code 39 barcode on an iPad using Vision. The sample label I used for testing has multiple Code 39 barcodes on it, and I can scan almost all of them except one specific barcode.
When I use a conventional barcode scanner or free apps, I can scan that barcode with no problem. It fails only when I use Vision.
Has anyone faced a similar situation?
Do you know why a specific barcode cannot be scanned on an iPad with Vision?
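To narrow down a case like this, one option is to restrict the Vision request to the Code 39 symbology variants and inspect what comes back for the problem label. The sketch below assumes the label is available as a CGImage. The function name readCode39Payloads is hypothetical. It does not explain why one barcode fails.

```swift
import Vision
import CoreGraphics

/// Hypothetical helper: returns the payload strings of all Code 39
/// barcodes (plain, checksum, and full-ASCII variants) found in `image`.
func readCode39Payloads(in image: CGImage) throws -> [String] {
    let request = VNDetectBarcodesRequest()
    request.symbologies = [
        .code39,
        .code39Checksum,
        .code39FullASCII,
        .code39FullASCIIChecksum
    ]

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    return (request.results ?? []).compactMap { $0.payloadStringValue }
}
```

Running this on a tightly cropped, high-resolution capture of the failing barcode and comparing it with the full label may show whether size or contrast plays a role.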
Hi,
I have a problem: I cannot scan a specific Code 39 barcode with the Vision framework. We have multiple barcodes on a label, and almost all of the Code 39 barcodes can be scanned, but one specific barcode cannot.
One more piece of information: the barcode that Vision does not recognize can be read by a general barcode scanner.
Has anyone faced a similar situation?
Is there any particular condition (size, intensity, etc.) that makes a barcode hard to scan with Vision?
Regards,
extension Entity {
    func addPanoramicImage(for media: WRMedia) {
        let subscription = TextureResource.loadAsync(named: "image_20240425_201630").sink(
            receiveCompletion: {
                switch $0 {
                case .finished: break
                case .failure(let error): assertionFailure("\(error)")
                }
            },
            receiveValue: { [weak self] texture in
                guard let self = self else { return }
                var material = UnlitMaterial()
                material.color = .init(texture: .init(texture))
                self.components.set(ModelComponent(
                    mesh: .generateSphere(radius: 1E3),
                    materials: [material]
                ))
                self.scale *= .init(x: -1, y: 1, z: 1)
                self.transform.translation += SIMD3(0.0, -1, 0.0)
            }
        )
        components.set(Entity.WRSubscribeComponent(subscription: subscription))
    }
}
Problem:
case .failure(let error): assertionFailure("\(error)")
Thread 1: Fatal error: Error Domain=MTKTextureLoaderErrorDomain Code=0 "Image decoding failed" UserInfo={NSLocalizedDescription=Image decoding failed, MTKTextureLoaderErrorKey=Image decoding failed}
import Combine
import RealityKit
import SwiftUI

extension Entity {
    func addPanoramicImage(for media: WRMedia) {
        // Load the panorama texture asynchronously and apply it to a large inverted sphere.
        let subscription = TextureResource.loadAsync(named: "image_20240425_201630").sink(
            receiveCompletion: {
                switch $0 {
                case .finished: break
                case .failure(let error): assertionFailure("\(error)")
                }
            },
            receiveValue: { [weak self] texture in
                guard let self = self else { return }
                var material = UnlitMaterial()
                material.color = .init(texture: .init(texture))
                self.components.set(ModelComponent(
                    mesh: .generateSphere(radius: 1E3),
                    materials: [material]
                ))
                // Flip the sphere so the texture faces inward.
                self.scale *= .init(x: -1, y: 1, z: 1)
                self.transform.translation += SIMD3(0.0, -1, 0.0)
            }
        )
        // Keep the subscription alive for as long as the entity exists.
        components.set(Entity.WRSubscribeComponent(subscription: subscription))
    }

    func updateRotation(for media: WRMedia) {
        let angle = Angle.degrees(0.0)
        // The rotation axis must be a non-zero unit vector; the y-axis is assumed here.
        let rotation = simd_quatf(angle: Float(angle.radians), axis: SIMD3<Float>(0, 1, 0))
        self.transform.rotation = rotation
    }

    struct WRSubscribeComponent: Component {
        var subscription: AnyCancellable
    }
}
Problem:
case .failure(let error): assertionFailure("\(error)")
Thread 1: Fatal error: Error Domain=MTKTextureLoaderErrorDomain Code=0 "Image decoding failed" UserInfo={NSLocalizedDescription=Image decoding failed, MTKTextureLoaderErrorKey=Image decoding failed}
Hi all Apple devs! I am a young developer who is completely new to programming. I am currently trying to develop an app that uses VisionKit, but I can't for the life of me figure out how to implement its features. I've been stuck on this for several days, so I am now asking all of you experts for help! Your assistance would be immensely appreciated!
I started developing the app using SwiftUI exclusively to future-proof it. From what I understand, VisionKit is more compatible with UIKit? So I rewrote the part of my code that will use VisionKit as a UIKit-based view to simplify integrating VisionKit's features. Might that just have overcomplicated my code? Can VisionKit be implemented easily using only SwiftUI? I noticed that in the demo in the video tutorial the code is in a view controller, not a ContentView. Is this what makes my image unresponsive?
My image is not interactive like the one in her video demo. Where in my code do I go wrong? Help a noob out!
The desired user flow is like this: The user selects an image through the "Open camera" or "Open Camera Roll" buttons. Upon selection, the UIKit-based view opens and displays the selected image. (This is where I want to implement the VisionKit features.) The user interacts with the image by touching it. If the touch lands on a subject, the subject should be lifted out of the rest of the image and assigned to editedImage, which in turn displays only the subject, without the background, in the ContentView. (For now, the image is assigned to editedImage on a long press without any subject lifting, since I can't get VisionKit to work the way I want.)
Anyway, here's a code snippet of my peculiar effort to implement subject lifting and VisionKit in my app:
For example, we use DockKit for birdwatching, so the field distance and direction are unknown.
Distance = ?
Direction = ?
The observation point is, for example, a rock. The task is to recognize the number of birds caught in the frame, add a detection frame, and collect statistics.
Questions:
What is the maximum number of frames processed with custom object recognition?
If that is not enough, can I do the calculations myself and pass them to DockKit for fast movement?
I want to make a camera app for capturing spatial video.
I found some apps that capture spatial video, but I don't know how to open the dual camera.
Please let me know how I can handle this.