I'm developing an iOS app using DockKit to control a motorized stand. I've noticed that as the zoom factor of the AVCaptureDevice increases, the stand's movement becomes increasingly erratic up and down, almost like a pendulum motion. I'm not sure why this is happening or how to fix it.
Here's a simplified version of my tracking logic:
func trackObject(_ boundingBox: CGRect, _ dockAccessory: DockAccessory) async throws {
    guard let device = AVCaptureDevice.default(for: .video),
          let input = try? AVCaptureDeviceInput(device: device) else {
        fatalError("Camera not available")
    }

    let currentZoomFactor = device.videoZoomFactor
    let dimensions = device.activeFormat.formatDescription.dimensions
    let referenceDimensions = CGSize(width: CGFloat(dimensions.width), height: CGFloat(dimensions.height))
    let intrinsics = calculateIntrinsics(for: device, currentZoom: Double(currentZoomFactor))

    let deviceOrientation = UIDevice.current.orientation
    let cameraOrientation: DockAccessory.CameraOrientation = {
        switch deviceOrientation {
        case .landscapeLeft: return .landscapeLeft
        case .landscapeRight: return .landscapeRight
        case .portrait: return .portrait
        case .portraitUpsideDown: return .portraitUpsideDown
        default: return .unknown
        }
    }()

    let cameraInfo = DockAccessory.CameraInformation(
        captureDevice: input.device.deviceType,
        cameraPosition: input.device.position,
        orientation: cameraOrientation,
        cameraIntrinsics: useIntrinsics ? intrinsics : nil,
        referenceDimensions: referenceDimensions
    )

    let observation = DockAccessory.Observation(
        identifier: 0,
        type: .object,
        rect: boundingBox
    )
    let observations = [observation]

    try await dockAccessory.track(observations, cameraInformation: cameraInfo)
}
func calculateIntrinsics(for device: AVCaptureDevice, currentZoom: Double) -> matrix_float3x3 {
    let dimensions = CMVideoFormatDescriptionGetDimensions(device.activeFormat.formatDescription)
    let width = Float(dimensions.width)
    let height = Float(dimensions.height)

    let diagonalPixels = sqrt(width * width + height * height)
    let estimatedFocalLength = diagonalPixels * 0.8

    let fx = Float(estimatedFocalLength) * Float(currentZoom)
    let fy = fx
    let cx = width / 2.0
    let cy = height / 2.0

    return matrix_float3x3(
        SIMD3<Float>(fx, 0, cx),
        SIMD3<Float>(0, fy, cy),
        SIMD3<Float>(0, 0, 1)
    )
}
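For comparison with the fixed 0.8 × diagonal estimate above, here is a minimal sketch that derives the focal length from a horizontal field of view instead (on iOS, `device.activeFormat.videoFieldOfView` reports one in degrees; treating that value as the unzoomed field of view is an assumption). The helper names `focalLengthInPixels` and `intrinsicColumns` are hypothetical, and plain `Double` arrays stand in for `matrix_float3x3` so the arithmetic can be checked anywhere. The sketch also assumes that DockKit expects the column-major layout used by simd and AVFoundation, where the principal point (cx, cy) sits in the third column. With the unlabeled `matrix_float3x3` initializer each `SIMD3` argument is a column, so passing `(fx, 0, cx)` first puts cx in the first column rather than the third. Scaling the focal length linearly with the zoom factor is likewise an assumption.

```swift
import Foundation

/// Focal length in pixels for a pinhole camera, from the image width
/// and the horizontal field of view in degrees.
func focalLengthInPixels(imageWidth: Double, horizontalFOVDegrees: Double) -> Double {
    let halfFOVRadians = horizontalFOVDegrees * .pi / 180 / 2
    return (imageWidth / 2) / tan(halfFOVRadians)
}

/// Intrinsics expressed as three columns (column-major order).
/// Assumption: digital zoom scales the effective focal length linearly.
func intrinsicColumns(width: Double, height: Double,
                      horizontalFOVDegrees: Double, zoom: Double) -> [[Double]] {
    let f = focalLengthInPixels(imageWidth: width,
                                horizontalFOVDegrees: horizontalFOVDegrees) * zoom
    return [
        [f, 0, 0],                    // column 0: fx
        [0, f, 0],                    // column 1: fy (square pixels assumed)
        [width / 2, height / 2, 1],   // column 2: principal point cx, cy
    ]
}
```

With a 1920-pixel-wide frame and a 90° field of view this gives a focal length of about 960 pixels at zoom 1.0, and the three columns could be passed to `matrix_float3x3(columns:)` after converting them to `SIMD3<Float>`.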
I'm calling this function regularly (10-30 times per second) with updated bounding box information. The erratic movement seems to worsen as the zoom factor increases.
Questions:
Why might increasing the zoom factor cause this erratic movement?
I'm currently calculating camera intrinsics based on the current zoom factor. Is this approach correct, or should I be doing something differently?
Are there any other factors I should consider when using DockKit with a variable zoom?
Could the frequency of calls to trackObject (10-30 times per second) be contributing to the erratic movement? If so, what would be an optimal frequency?
Any insights or suggestions would be greatly appreciated. Thanks!
DockKit
API to enable control of motorized iPhone stands from within any Camera app.
Posts under DockKit tag
Hello,
I'm using DockKit within my SwiftUI application with GetStream. Before I updated to iOS 18 yesterday, custom tracking with DockKit worked like a charm, but after the update it unexpectedly stopped working.
What's more curious: it still works on iOS 18 in the official GetStream Video Calls application, but not in my application. I can confirm that my iPhone is still paired, I receive logs about the current docking state, and everything seems fine.
Any suggestions as to what I'm missing here?
We are experimenting with the DockKit API in iOS 18. However, we are unable to retrieve the speakingConfidence, lookingAtCameraConfidence, and saliencyRank for the person being tracked. We are able to get the rect and identifier. Has anyone been able to retrieve speakingConfidence, lookingAtCameraConfidence, and saliencyRank?
Hello!
We want to try the new features of DockKit on an iOS 18 device. However, I am unable to pair the DockKit-compatible device with my iOS 18 device. Is there a way to successfully pair them?
Any source code samples for how to program DockKit?
I have read https://developer.apple.com/documentation/DockKit and would like to see it used in an app. For instance, how do I set up notifications in a SwiftUI-based app running code like this:
do {
    for await accessory in try DockAccessoryManager.shared.accessoryStateChanges {
        // If this is an accessory you're interested in, save it for later use.
    }
} catch {
    log("Failed fetching state changes, \(error)")
}
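As one possible shape for this in a SwiftUI app, the same loop could live in a `.task` modifier so that it starts when the view appears and is cancelled when the view goes away. This is an untested sketch: `DockStatusView` and `statusText` are hypothetical names, and the `state` and `accessory` properties read from each state change reflect my understanding of the DockKit API rather than verified behavior.

```swift
import SwiftUI
import DockKit

struct DockStatusView: View {
    // Hypothetical status string shown to the user.
    @State private var statusText = "No accessory docked"

    var body: some View {
        Text(statusText)
            .task {
                do {
                    // Same loop as in the documentation snippet; it runs for
                    // the lifetime of the view and is cancelled with it.
                    for await stateChange in try DockAccessoryManager.shared.accessoryStateChanges {
                        if stateChange.state == .docked, stateChange.accessory != nil {
                            statusText = "Accessory docked"
                        } else {
                            statusText = "No accessory docked"
                        }
                    }
                } catch {
                    statusText = "Failed fetching state changes: \(error)"
                }
            }
    }
}
```

Keeping the docked accessory in a shared model object instead of a string would let other parts of the app use it for tracking later.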
For example, we use DockKit for birdwatching, so the distance and direction to the subject are unknown.
Distance = ?
Direction = ?
The observation point is, for example, a rock. The task is to recognize the number of birds caught in the frame, add a detection frame, and collect statistics.
Question:
What is the maximum number of frames processed with custom object recognition?
If that is not enough, can I do the calculations myself and pass them to DockKit for fast movement?
I'd love to play around with DockKit, but I didn't see anything mentioned about hardware. I'm assuming Apple isn't releasing their own motorized dock and haven't seen anything about how to get hardware recognized by the accessory manager.
I'd like to prototype a dock myself using an ESP32 and some stepper motors. I've already got this working with Bluetooth communication from iOS via CoreBluetooth, but I don't know whether there are specific service and characteristic UUIDs that the system looks for to decide that a device is compatible with DockKit.
I would really love to start playing with this. Has anyone got any insights on how to get up and running?