Essentially, I'm trying to find the most straightforward way to outline an image with varying contours. The intention is similar to the way iMessage lets you add an outline to a sticker. The "goal" in the example is simply the input image composited on top of the outline.
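One possible approach (a sketch of my own, not an Apple-documented recipe, and the function name is illustrative): lift the subject mask with Vision, dilate it with Core Image to form the outline band, and composite the original image over it.

import Vision
import CoreImage
import CoreImage.CIFilterBuiltins

func makeStickerOutline(from cgImage: CGImage, outlineWidth: Float = 20) throws -> CIImage? {
    // 1. Lift the subject mask with Vision.
    let request = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])
    guard let observation = request.results?.first else { return nil }
    let maskBuffer = try observation.generateScaledMaskForImage(forInstances: observation.allInstances,
                                                                from: handler)
    let mask = CIImage(cvPixelBuffer: maskBuffer)

    // 2. Dilate the mask so it extends past the subject's contour by `outlineWidth` pixels.
    let dilate = CIFilter.morphologyMaximum()
    dilate.inputImage = mask
    dilate.radius = outlineWidth
    guard let grownMask = dilate.outputImage else { return nil }

    // 3. Turn the grown mask into a solid white outline shape.
    let blend = CIFilter.blendWithMask()
    blend.inputImage = CIImage(color: .white).cropped(to: grownMask.extent)
    blend.backgroundImage = CIImage(color: .clear).cropped(to: grownMask.extent)
    blend.maskImage = grownMask
    guard let outline = blend.outputImage else { return nil }

    // 4. Composite the original image on top of its outline.
    return CIImage(cgImage: cgImage).composited(over: outline)
}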
Vision
Apply computer vision algorithms to perform a variety of tasks on input images and video using Vision.
Posts under Vision tag
I'm trying the Vision framework on Vision Pro, but it does not work on visionOS 2.0.
When I perform requests, they fail and the error below is thrown.
The same code works on visionOS 1.2 and iOS 18.0 beta.
I also tried the new beta API, but it does not work and throws the same error,
e.g. GenerateForegroundInstanceMaskRequest.
Do you have any idea? Is any permission required to use the Vision framework on visionOS 2.0?
Here is what I tried:
With visionOS 2.0 beta 4:
GenerateForegroundInstanceMaskRequest (does not work, error 1)
VNGenerateForegroundInstanceMaskRequest (does not work, error 1)
VNRecognizeTextRequest (does not work, error 2)
With visionOS 1.2:
VNRecognizeTextRequest (works)
With iOS 18 beta:
GenerateForegroundInstanceMaskRequest (works)
My development environments:
Env 1
Vision Pro: visionOS 2.0 beta 4
Xcode: 16.0 beta 4, 16.0 beta 2
macOS: 14.5 (23F79)
Env 2
Vision Pro: visionOS 1.2
Xcode: 15.4
macOS: 14.5 (23F79)
Error 1
Error Domain=com.apple.Vision Code=9 "Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/subject_lifting_gen1_rev5_gv8dsz6vxu_multihead_int8.espresso.net Error= (DESIGN)" UserInfo={NSLocalizedDescription=Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/subject_lifting_gen1_rev5_gv8dsz6vxu_multihead_int8.espresso.net Error= (DESIGN)}
Error 2
Error Domain=com.apple.Vision Code=11 "VNRecognizeTextRequest produced an internal error" UserInfo={NSLocalizedDescription=VNRecognizeTextRequest produced an internal error, NSUnderlyingError=0x3001f6850 {Error Domain=CRImageReaderErrorDomain Code=-5 "Unknown error" UserInfo={NSLocalizedDescription=Unknown error}}}
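For context, a minimal sketch of how the two request styles listed above are typically invoked (assuming the new Swift-only Vision API shape shown at WWDC24 alongside the legacy VN* API; result handling omitted):

import Vision

func liftSubject(from cgImage: CGImage) async throws {
    // New API (iOS 18 / visionOS 2.0 betas):
    let request = GenerateForegroundInstanceMaskRequest()
    let result = try await request.perform(on: cgImage)     // throws Error 1 on visionOS 2.0 beta 4

    // Legacy API:
    let legacyRequest = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([legacyRequest])                     // also throws Error 1 on visionOS 2.0 beta 4
    _ = (result, legacyRequest.results)
}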
Hello all,
I'm developing an application for visionOS and I'm trying to implement 2 different animations:
First animation
Initially, I have a map that should not be visible. I would like to create an animation effect where it appears as if a drop of water falls in the center of the map and the expanding waves gradually reveal the entire map.
Is there a way to do it directly in SwiftUI, or do I need an animation in my USDZ?
Second animation
I want an animation effect similar to a cinema screen opening from the center, gradually revealing a video that was initially hidden.
Is there a way to do it directly in SwiftUI?
Can someone help me with this topic?
Thanks ;)
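A minimal SwiftUI sketch of the expanding-ripple reveal (illustrative, not Apple sample code): mask the map with a circle whose scale animates from zero; the same idea with a Rectangle whose scale grows from the centre gives the cinema-screen opening for the video.

import SwiftUI

struct RippleReveal<Content: View>: View {
    @State private var revealed = false
    private let content: Content

    init(@ViewBuilder content: () -> Content) {
        self.content = content()
    }

    var body: some View {
        content
            .mask {
                Circle()
                    .scale(revealed ? 3 : 0.001)   // end scale large enough to cover the whole frame
                    .animation(.easeOut(duration: 2), value: revealed)
            }
            .onAppear { revealed = true }
    }
}

// Usage (MapView is a placeholder for your map or video view):
// RippleReveal { MapView() }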
How should I set up a WindowGroup window to resemble a curved-screen style?
Hey all 👋🏼
We're currently working on a video processing project using the Vision framework (face, body, and hand pose detection), and we've encountered a couple of errors that we need help with. We are on Xcode 16 Beta 3, testing on an iPhone 14 Pro running the iOS 18 beta.
The error messages are as follows:
[LOG_ERROR] /Library/Caches/com.apple.xbs/Sources/MediaAnalysis/VideoProcessing/VCPHumanPoseImageRequest.mm[85]: code 18,446,744,073,709,551,598
encountered an unexpected condition: *** -[__NSArrayM insertObject:atIndex:]: object cannot be nil
What we've tried:
Debugging: I’ve tried stepping through the code, but the errors occur before I can gather any meaningful insights.
Searching Documentation: Looked through Apple’s developer documentation and forums but couldn’t find anything related to these specific error codes.
Nil Check: Added checks to ensure objects are not nil before inserting them into arrays, but the error persists.
Here are my questions:
Has anyone encountered similar errors with the Vision framework, specifically related to VCPHumanPoseImageRequest and NSArray operations?
Is there any known issue or bug in the version of the framework I might be using? Could it also be related to the beta?
Are there any additional debug steps or logging mechanisms I can implement to narrow down the cause?
Any suggestions on how to handle nil objects more effectively in this context?
I would greatly appreciate any insights or suggestions you might have. Thank you in advance for your assistance!
Thanks all!
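For context, a simplified, hypothetical sketch of the kind of per-frame call described above; the LOG_ERROR output appears to come from the MediaAnalysis framework itself rather than from anything this code passes in.

import Vision

func detectPoses(in pixelBuffer: CVPixelBuffer) {
    // Run the three detectors on one frame and surface any thrown error explicitly.
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
    let requests: [VNRequest] = [
        VNDetectFaceLandmarksRequest(),
        VNDetectHumanBodyPoseRequest(),
        VNDetectHumanHandPoseRequest()
    ]
    do {
        try handler.perform(requests)
    } catch {
        print("Vision request failed: \(error)")
    }
}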
I can't figure out how to get my Earth entity to rotate on its axis. This is a follow-up to a previous Apple Developer Forums post.
How would I have the earth (parent) entity rotate CCW underneath the orbiting starship child?
I tried adding the following code block to the RealityView but it is not working:
if let rotatingEarth = starshipEntity.findEntity(named: "Earth") {
    rotatingEarth.transform.rotation = simd_quatf.init(angle: 360, axis: SIMD3(x: 0, y: 1, z: 0))
    if let animation = try? AnimationResource.generate(with: rotatingEarth as! AnimationDefinition) {
        rotatingEarth.playAnimation(animation)
    }
}
Any advice on getting the earth to rotate?
I tried reviewing the Hello World WWDC23 project code, but I was unable to understand the complexity and how that sample project got the earth to rotate.
I want to do this for visionOS 1.2. I realize there are new animation and possibly other capabilities coming in visionOS 2.0, but I want to address this issue in the currently released visionOS version.
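One way to get a continuous spin on visionOS 1.2 is a sketch of the component/system idea the Hello World sample uses, reduced to the minimum (SpinComponent and SpinSystem are my own names; note that simd_quatf angles are in radians, so angle: 360 in the snippet above is not 360 degrees):

import RealityKit

struct SpinComponent: Component {
    var radiansPerSecond: Float = 0.3
}

struct SpinSystem: System {
    static let query = EntityQuery(where: .has(SpinComponent.self))

    init(scene: RealityKit.Scene) {}

    func update(context: SceneUpdateContext) {
        for entity in context.scene.performQuery(Self.query) {
            guard let spin = entity.components[SpinComponent.self] else { continue }
            // A positive angle about +Y is counterclockwise when viewed from above.
            let delta = simd_quatf(angle: spin.radiansPerSecond * Float(context.deltaTime),
                                   axis: [0, 1, 0])
            entity.transform.rotation = delta * entity.transform.rotation
        }
    }
}

// Registration (e.g. in the App initializer) and tagging the Earth entity:
// SpinComponent.registerComponent()
// SpinSystem.registerSystem()
// starshipEntity.findEntity(named: "Earth")?.components.set(SpinComponent())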
Hey, is there a way to create a good ground shadow shader? I'm using a ground with an unlit material and I can't get the ground shadow to work properly. If I use a PBR texture it works better, but I can barely see the shadow and I want more control over its intensity.
I'm currently streaming synchronised video and depth data from my iPhone 13 using AVFoundation, with the video set to AVCaptureSession.Preset.vga640x480. When looking at the corresponding images (with depth values mapped to a grey colour map; both map and image are 640x480), it appears the two feeds have different fields of view: the depth feed is zoomed in and angled upwards, while the colour feed is more zoomed out. I've looked at the intrinsics from both the depth map and my colour sample buffer, and they are identical.
Does anyone know why this might be?
My setup code is below (shortened):
import AVFoundation
import CoreVideo

class VideoCaptureManager {
    private enum SessionSetupResult {
        case success
        case notAuthorized
        case configurationFailed
    }

    private enum ConfigurationError: Error {
        case cannotAddInput
        case cannotAddOutput
        case defaultDeviceNotExist
    }

    private let videoDeviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInTrueDepthCamera],
                                                                               mediaType: .video,
                                                                               position: .front)
    private let session = AVCaptureSession()
    public let videoOutput = AVCaptureVideoDataOutput()
    public let depthDataOutput = AVCaptureDepthDataOutput()
    private var outputSynchronizer: AVCaptureDataOutputSynchronizer?
    private var videoDeviceInput: AVCaptureDeviceInput!
    private let sessionQueue = DispatchQueue(label: "session.queue")
    private let videoOutputQueue = DispatchQueue(label: "video.output.queue")
    private var setupResult: SessionSetupResult = .success

    init() {
        sessionQueue.async {
            self.requestCameraAuthorizationIfNeeded()
        }
        sessionQueue.async {
            self.configureSession()
        }
        sessionQueue.async {
            self.startSessionIfPossible()
        }
    }

    private func requestCameraAuthorizationIfNeeded() {
        switch AVCaptureDevice.authorizationStatus(for: .video) {
        case .authorized:
            break
        case .notDetermined:
            // Suspend the session queue until the user answers the permission prompt.
            sessionQueue.suspend()
            AVCaptureDevice.requestAccess(for: .video, completionHandler: { granted in
                if !granted {
                    self.setupResult = .notAuthorized
                }
                self.sessionQueue.resume()
            })
        default:
            setupResult = .notAuthorized
        }
    }

    private func configureSession() {
        if setupResult != .success {
            return
        }

        let defaultVideoDevice: AVCaptureDevice? = videoDeviceDiscoverySession.devices.first
        guard let videoDevice = defaultVideoDevice else {
            print("Could not find any video device")
            setupResult = .configurationFailed
            return
        }
        do {
            videoDeviceInput = try AVCaptureDeviceInput(device: videoDevice)
        } catch {
            setupResult = .configurationFailed
            return
        }

        session.beginConfiguration()
        session.sessionPreset = AVCaptureSession.Preset.vga640x480

        guard session.canAddInput(videoDeviceInput) else {
            print("Could not add video device input to the session")
            setupResult = .configurationFailed
            session.commitConfiguration()
            return
        }
        session.addInput(videoDeviceInput)

        if session.canAddOutput(videoOutput) {
            session.addOutput(videoOutput)
            // Deliver camera intrinsics with each video frame so they can be compared with the depth intrinsics.
            if let connection = videoOutput.connection(with: .video) {
                connection.isCameraIntrinsicMatrixDeliveryEnabled = true
            } else {
                print("Cannot setup camera intrinsics")
            }
            videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]
        } else {
            print("Could not add video data output to the session")
            setupResult = .configurationFailed
            session.commitConfiguration()
            return
        }

        if session.canAddOutput(depthDataOutput) {
            session.addOutput(depthDataOutput)
            depthDataOutput.isFilteringEnabled = false
            if let connection = depthDataOutput.connection(with: .depthData) {
                connection.isEnabled = true
            } else {
                print("No AVCaptureConnection")
            }
        } else {
            print("Could not add depth data output to the session")
            setupResult = .configurationFailed
            session.commitConfiguration()
            return
        }

        // Pick the widest Float16 depth format supported by the active video format.
        let depthFormats = videoDevice.activeFormat.supportedDepthDataFormats
        let filtered = depthFormats.filter({
            CMFormatDescriptionGetMediaSubType($0.formatDescription) == kCVPixelFormatType_DepthFloat16
        })
        let selectedFormat = filtered.max(by: { first, second in
            CMVideoFormatDescriptionGetDimensions(first.formatDescription).width < CMVideoFormatDescriptionGetDimensions(second.formatDescription).width
        })

        do {
            try videoDevice.lockForConfiguration()
            videoDevice.activeDepthDataFormat = selectedFormat
            videoDevice.unlockForConfiguration()
        } catch {
            print("Could not lock device for configuration: \(error)")
            setupResult = .configurationFailed
            session.commitConfiguration()
            return
        }

        session.commitConfiguration()
    }

    private func addVideoDeviceInputToSession() throws {
        do {
            var defaultVideoDevice: AVCaptureDevice?
            defaultVideoDevice = AVCaptureDevice.default(
                .builtInTrueDepthCamera,
                for: .depthData,
                position: .front
            )
            guard let videoDevice = defaultVideoDevice else {
                print("Default video device is unavailable.")
                setupResult = .configurationFailed
                session.commitConfiguration()
                throw ConfigurationError.defaultDeviceNotExist
            }
            let videoDeviceInput = try AVCaptureDeviceInput(device: videoDevice)
            if session.canAddInput(videoDeviceInput) {
                session.addInput(videoDeviceInput)
            } else {
                setupResult = .configurationFailed
                session.commitConfiguration()
                throw ConfigurationError.cannotAddInput
            }
        }
Hi All,
I am trying to build a new iOS app by following https://developer.apple.com/videos/play/wwdc2024/10163/?time=67
When I try to remove all the legacy VN* calls I get errors. I would appreciate it if someone could help me get up to speed with the new Vision API.
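A hedged migration sketch based on that WWDC24 session (verify the exact names against your SDK): the VNImageRequestHandler + VN*Request pair becomes a single async call on the request.

import Vision

func readText(from imageURL: URL) async throws -> [String] {
    // Before (legacy):
    //   let request = VNRecognizeTextRequest()
    //   let handler = VNImageRequestHandler(url: imageURL)
    //   try handler.perform([request])
    //   let observations = request.results ?? []

    // After (new Swift-only API):
    let request = RecognizeTextRequest()
    let observations = try await request.perform(on: imageURL)
    return observations.compactMap { $0.topCandidates(1).first?.string }
}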
We have a lot of assets made with the 3ds Max design software. How can a USDZ exported from 3ds Max be used in a visionOS project? The preview in the Finder on a Mac is correct; however, the texture cannot be parsed in the Xcode project. Compared with USDZ data exported from Blender, the texture data exported by 3ds Max has an additional node graph node. Is there any export plugin that supports this?
I'm trying to use the new VNCalculateImageAestheticsScoresRequest API.
The code compiles and runs, but it delivers the same result for every image.
Xcode 16 Beta 2, Simulator.
Did I miss anything?
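For reference, a minimal invocation sketch (my own code, not the poster's): the request's results should be VNImageAestheticsScoresObservation values with an overall score per image.

import Vision

func aestheticsScore(for cgImage: CGImage) throws -> Float? {
    let request = VNCalculateImageAestheticsScoresRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])
    // Overall aesthetics score for this image.
    let observation = request.results?.first as? VNImageAestheticsScoresObservation
    return observation?.overallScore
}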
"On the latest iOS 18 beta 2, the OCR API,the Translate App and Live Text performs very poorly in recognizing Japanese."
Vision Pro cannot capture recordings in the 16 kHz to 24 kHz range at a 48 kHz sampling rate. Why? Or can you tell me how to configure it?
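A configuration sketch, assuming the recording is done with AVAudioRecorder (which may not match the actual setup): explicitly request 48 kHz uncompressed Linear PCM so the sample rate and codec are not the limiting factors.

import AVFoundation

func makeRecorder() throws -> AVAudioRecorder {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.record, mode: .default)
    try session.setPreferredSampleRate(48_000)
    try session.setActive(true)

    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatLinearPCM,   // uncompressed Linear PCM
        AVSampleRateKey: 48_000,
        AVNumberOfChannelsKey: 1,
        AVLinearPCMBitDepthKey: 16
    ]
    let url = FileManager.default.temporaryDirectory.appendingPathComponent("capture.wav")
    return try AVAudioRecorder(url: url, settings: settings)
}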
I'm seeking insight on why the new VisionOS Barcode Scanning API is categorized as an Enterprise API and restricted only for proprietary and in-house apps.
I understand Apple's focus on privacy and I can see how this restriction could make sense for other Enterprise APIs like main camera access and passthrough screen capture.
Why is barcode scanning restricted from open apps? What makes barcode scanning more of a risk to privacy versus the unrestricted APIs for object tracking, image tracking, or hand tracking?
Hi everyone,
I'm curious about the capabilities of ARKit's object tracking feature. Specifically, I'd like to know:
Is there a size limit for the objects that can be tracked?
Can ARKit differentiate between two objects with the same shape but different models (e.g., different colors)?
Are objects with single colors and generic shapes (like squares or circles) effectively trackable?
Any insights or examples from your experiences would be greatly appreciated!
Thanks in advance.
I'm looking for a solution to take a picture or point the camera at a piece of clothing and match that image with an image the user has stored in my app.
I'm storing the data in a Core Data database as a Binary Data object. Since the user also takes the pictures they store in the database, I don't think I can use pre-trained Core ML models.
I would like the matching to be done on device if possible, instead of going to an external service. An external service would probably describe the item based on what the AI sees, but then I couldn't match the item with the stored images in the app.
Does anyone know if this is possible with frameworks such as Vision or VisionKit?
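A minimal sketch of on-device similarity matching with Vision feature prints (my own illustration; storedImageData stands in for the Binary Data blobs loaded from Core Data):

import Vision

func featurePrint(for imageData: Data) throws -> VNFeaturePrintObservation? {
    // One feature print per image; computed entirely on device.
    let request = VNGenerateImageFeaturePrintRequest()
    let handler = VNImageRequestHandler(data: imageData)
    try handler.perform([request])
    return request.results?.first
}

func bestMatch(for photoData: Data, in storedImageData: [Data]) throws -> (Int, Float)? {
    guard let queryPrint = try featurePrint(for: photoData) else { return nil }
    var best: (Int, Float)?
    for (index, data) in storedImageData.enumerated() {
        guard let candidatePrint = try featurePrint(for: data) else { continue }
        var distance: Float = 0
        try queryPrint.computeDistance(&distance, to: candidatePrint)  // smaller distance = more similar
        if best == nil || distance < best!.1 {
            best = (index, distance)
        }
    }
    return best
}

Note that comparing feature prints generated by different request revisions can throw, so it is safest to generate all prints with the same request configuration.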
Hello,
I want to capture video from the Vision Pro in a visionOS app. I am referring to this Apple video (https://developer.apple.com/videos/play/wwdc2024/10139/) and its code. The steps are as follows:
import ARKit
Set com.apple.developer.arkit.main-camera-access.allow = true in Info.plist
Run the code below:
func loadCameraFeed() async {
    // Main Camera Feed Access Example
    let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions: [.left])
    let cameraFrameProvider = CameraFrameProvider()
    var arKitSession = ARKitSession()
    var pixelBuffer: CVPixelBuffer?

    await arKitSession.queryAuthorization(for: [.cameraAccess])

    do {
        try await arKitSession.run([cameraFrameProvider])
    } catch {
        return
    }

    guard let cameraFrameUpdates =
        cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else {
        return
    }
    print(cameraFrameUpdates)

    for await cameraFrame in cameraFrameUpdates {
        print(cameraFrame)
        guard let mainCameraSample = cameraFrame.sample(for: .left) else {
            continue
        }
        pixelBuffer = mainCameraSample.pixelBuffer
    }
}
I want to convert the pixelBuffer into a video stream and show it in a frame, as on iOS.
Please guide me on how to achieve the next step; I am stuck after this code.
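One possible next step (a sketch of my own, not from the Apple sample; CameraFeedModel and CameraFeedView are hypothetical names): convert each CVPixelBuffer to a CGImage with Core Image and publish it so a SwiftUI view can display the live feed.

import CoreImage
import SwiftUI

final class CameraFeedModel: ObservableObject {
    @Published var currentFrame: CGImage?
    private let context = CIContext()

    // Call this from the `for await cameraFrame in cameraFrameUpdates` loop above.
    func update(with pixelBuffer: CVPixelBuffer) {
        let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
        guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else { return }
        DispatchQueue.main.async { self.currentFrame = cgImage }
    }
}

struct CameraFeedView: View {
    @ObservedObject var model: CameraFeedModel

    var body: some View {
        if let frame = model.currentFrame {
            Image(decorative: frame, scale: 1.0)
                .resizable()
                .scaledToFit()
        } else {
            Text("Waiting for camera frames…")
        }
    }
}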
I have created an archive for both the iOS and macOS versions of my app by doing the following steps:
Destinations: select Build Any Mac (Mac Catalyst, arm64, x86_64)
Product > Archive
However, when doing the same steps for visionOS I get an error:
Invalid Run Destination
I have selected both destinations: visionOS Simulator and Build Any visionOS Simulator Device (arm64, x86_64).
I am able to run and test the app; now I would like to upload it to App Store Connect for TestFlight and App Store submission.
I have created a Hand Action Classification project following the guidelines, but the Create ML tool reports a very cryptic "Unexpected Error". The feature extraction phase is fine; the error occurs during model training, after the tool reports completion of the first five training iterations, as it moves on to report the next ten.
The project is small, with 3 training classes and 346 items.
I have tried varying the frame rate and action duration, with all augmentations unset, but the error still persists.
Can you please confirm how I may get further error diagnostic information so that I can determine why Create ML is unable to work with my training data?
macOS is Sonoma 14.5 on an iMac (24-inch, M1, 2021). Create ML is version 5.0 (121.4).
Hi everyone,
I've read that USDZ supports LOD, so you can have three meshes with high, medium, and low polygon detail that are shown depending on the distance from the user to the entity...
but I don't know how to use it...
Any experience or, God bless you, a downloadable file with a sample?
Thanks a lot!