Hi Apple Developer Community,
I’m exploring ways to fine-tune the SNSoundClassifier to allow users of my iOS app to personalize the model by adding custom sounds or adjusting predictions. While Apple’s WWDC session on sound classification explains how to train from scratch, I’m specifically interested in using SNSoundClassifier as the base model and building/fine-tuning on top of it.
Here are a few questions I have:
1. Fine-Tuning on SNSoundClassifier:
Is there a way to fine-tune this model programmatically through APIs? The manual approach on macOS, as shown in this documentation, is clear, but how can it be done dynamically, either within the app for users or in a cloud backend (AWS/iCloud)?
Are there APIs or classes that support such on-device/cloud-based fine-tuning or incremental learning? If not directly, can the classifier’s embeddings be used to train a lightweight custom layer?
Training is likely too computationally intensive and battery-draining to run on device, so doing it in the cloud may be the right approach, but I need the right APIs to get this done. Sample code would help; a rough sketch of the kind of pipeline I have in mind follows after these questions.
2. Recommended Approach for In-App Model Customization:
If SNSoundClassifier doesn’t support fine-tuning, would transfer learning on models like MobileNetV2, YAMNet, OpenL3, or FastViT be more suitable?
Given these models (SNSoundClassifier, MobileNetV2, YAMNet, OpenL3, FastViT), which one would be best for accuracy and performance/efficiency on iOS? I aim to maintain real-time performance without sacrificing battery life. It is also important to know how well each architecture and its accuracy are retained after conversion to a Core ML model.
3. Cost-Effective Backend Setup for Training:
Mac EC2 instances on AWS have a 24-hour minimum billing period, which can become expensive for a limited number of user requests. Are there better alternatives for deploying and training models on demand when a user uploads files (training data)?
4. TensorFlow vs PyTorch:
Between TensorFlow and PyTorch, which framework would you recommend for iOS Core ML integration? TensorFlow Lite offers mobile-optimized models, but I’m also curious about PyTorch’s performance when converted to Core ML.
5. Metrics:
The metrics I have in mind while picking a model are: publisher, accuracy, fine-tuning capability, real-time/live use, suitability for iPhone 16, architectural retention after Core ML conversion, reasons for unsuitability, and recommended use case.
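To make question 1 more concrete, here is the rough shape of the pipeline I have in mind, assuming a Mac-based backend that can run the Create ML framework and an iOS app that downloads the resulting model. Everything below is a sketch with placeholder names and paths, not verified code, and it replaces the built-in classifier with a custom one rather than truly fine-tuning SNSoundClassifier.

    // Backend (macOS with the Create ML framework): train a custom sound
    // classifier from labeled folders of audio clips and export a Core ML model.
    import CreateML
    import Foundation

    let trainingDataURL = URL(fileURLWithPath: "/path/to/labeled_audio") // one subfolder per label
    let dataSource = MLSoundClassifier.DataSource.labeledDirectories(at: trainingDataURL)

    do {
        let classifier = try MLSoundClassifier(trainingData: dataSource)
        try classifier.write(to: URL(fileURLWithPath: "/path/to/CustomSounds.mlmodel"))
    } catch {
        print("Training failed: \(error)")
    }

On the device, the downloaded model could then be compiled and used with SoundAnalysis in place of the built-in classifier:

    // iOS: compile the downloaded model and build a classification request from it.
    import CoreML
    import SoundAnalysis

    func makeCustomRequest(from downloadedModelURL: URL) async throws -> SNClassifySoundRequest {
        let compiledURL = try await MLModel.compileModel(at: downloadedModelURL)
        let model = try MLModel(contentsOf: compiledURL)
        return try SNClassifySoundRequest(mlModel: model)
    }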
Any insights or recommended approaches would be greatly appreciated.
Thanks in advance!
Hi everyone,
I’m encountering a strange issue when trying to archive my iOS app for App Store distribution. The project builds and runs fine on “Any iOS Device (arm64)”, but when I try Product → Archive, I get multiple errors related to the preview sections in my SwiftUI view files. The app uses the camera for photo and video capture.
Errors:
• Cannot find 'PreviewCameraModel' in scope
• Cannot infer contextual base in reference to member 'video'
• Cannot infer contextual base in reference to member 'classify'
These errors only appear in code sections inside the #Preview blocks in SwiftUI files. Additionally:
When I click on an error in the Issue Navigator, the file shows the error momentarily but it disappears after less than a second.
The total error count decreases temporarily, but then it returns to the original number when clicking on other errors.
Build and Run works fine without any issues on devices and simulators, but these errors block the archiving process.
Workaround:
For now, I’ve worked around the issue by using #if DEBUG to exclude the preview code from release builds (sketched below), but I’d prefer a cleaner solution if one exists.
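For reference, the workaround looks roughly like this (CameraView and the PreviewCameraModel initializer are placeholders based on the errors above, not the exact project code):

    #if DEBUG
    #Preview("Camera") {
        // PreviewCameraModel is the app's own preview helper; the initializer
        // and CameraView below stand in for the real preview content.
        CameraView(camera: PreviewCameraModel(captureMode: .video))
    }
    #endif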
System Details:
Xcode: 16.0
iOS Deployment Target: 16+
Swift: 5
Architecture: arm64
Has anyone encountered this issue or found a better way to handle SwiftUI preview code when archiving? Any advice on fixing this or insights into why the errors behave inconsistently during the archiving process would be appreciated.
Thanks in advance!
Hi everyone,
I’m experiencing an issue where audio interruptions (e.g., phone calls) are not being handled while running sound classification in an app that uses AVAudioSession. Classification works fine, but interruptions aren’t intercepted, even though I’ve followed Apple’s guidelines on handling audio interruptions [1_Document].
The classification was initially based on [2_Classifier], where it worked perfectly. However, when I adopted classification in a more camera-focused app using [3_Cam], the interruption behavior stopped working. The classification setup is functioning with [3_Cam], but audio interruptions are not triggered.
The listener is invoked before starting sound analysis, as suggested in [2_Classifier]:
startListeningForAudioSessionInterruptions()
try startAnalyzing([(request, observer)])
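For reference, the listener itself follows the standard notification-based pattern from the guidelines; the following is a simplified sketch rather than the exact code (the class name and print statements are placeholders for the real pause/resume handling):

    import AVFoundation

    final class InterruptionMonitor {
        private var interruptionObserver: NSObjectProtocol?

        // Registered before starting sound analysis, as in the snippet above.
        func startListeningForAudioSessionInterruptions() {
            interruptionObserver = NotificationCenter.default.addObserver(
                forName: AVAudioSession.interruptionNotification,
                object: AVAudioSession.sharedInstance(),
                queue: .main
            ) { notification in
                guard let info = notification.userInfo,
                      let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
                      let type = AVAudioSession.InterruptionType(rawValue: typeValue) else {
                    return
                }
                switch type {
                case .began:
                    print("Interruption began") // pause sound analysis here
                case .ended:
                    let optionsValue = info[AVAudioSessionInterruptionOptionsKey] as? UInt ?? 0
                    if AVAudioSession.InterruptionOptions(rawValue: optionsValue).contains(.shouldResume) {
                        print("Interruption ended") // safe to resume analysis
                    }
                @unknown default:
                    break
                }
            }
        }
    }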
FYI, one change I have made for classification is the following; this works fine in all cases.
// try audioSession.setCategory(.record, mode: .default)
try audioSession.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker, .allowBluetooth])
I suspect the issue might be related to the AVAudioSession configuration or how the app handles recording and playback together. Is there anything else I should check related to AVAudioSession? Are there additional APIs I could use to pre-check or better handle audio interruptions?
Any suggestions or guidance would be greatly appreciated!
Platform: Swift 5, Xcode 16, iOS 18.
References:
1_Document
2_Classifier
3_Cam
Best Regards
Question:
When implementing simultaneous video capture and audio processing in an iOS app, does the order of starting these components matter, or can they be initiated in any sequence?
I have an actor responsible for initiating video capture using the setVideoCaptureMode function. In this actor, I also call startAudioEngine to start the audio engine and register a results observer. While the audio engine starts successfully, I notice that the resultObserver is not invoked when startAudioEngine is called synchronously. However, it works correctly when I wrap the call in a Task.
Could you please explain why the synchronous call to startAudioEngine might be blocking the invocation of the resultObserver? What would be the best practice for ensuring both components work together effectively? Additionally, if I were to avoid using Task, what approach would be required? Lastly, is startAudioEngine effective from the start time of the video capture (00:00)?
Platform: Xcode 16, Swift 6, iOS 18
References:
Classifying Sounds in an Audio Stream – In my case, the analyzeAudio() method is not invoked.
Setting Up a Capture Session – Here, the focus is on video capture.
Classifying Sounds in an Audio File
Code snippet (for further detail; setVideoCaptureMode() surfaces the problem):
// ensures all operations happen off of the `@MainActor`.
actor CaptureService {
    ...
    nonisolated private let resultsObserver1 = ResultsObserver1()
    ...
    private func setUpSession() throws { ... }
    ...
    private func setVideoCaptureMode() throws {
        captureSession.beginConfiguration()
        defer { captureSession.commitConfiguration() }

        /* -- Works fine (analyzeAudio is printed)
        Task {
            self.resultsObserver1.startAudioEngine()
        }
        */
        self.resultsObserver1.startAudioEngine() // Does not work - analyzeAudio not printed

        captureSession.sessionPreset = .high
        try addOutput(movieCapture.output)
        if isHDRVideoEnabled {
            setHDRVideoEnabled(true)
        }
        updateCaptureCapabilities()
    }
}
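For context, startAudioEngine() in ResultsObserver1 is modeled on the "Classifying Sounds in an Audio Stream" article; the following is a simplified sketch rather than the exact project code (the buffer size, queue label, and use of the built-in classifier are illustrative):

    import AVFoundation
    import SoundAnalysis

    final class ResultsObserver1: NSObject, SNResultsObserving {
        private let audioEngine = AVAudioEngine()
        private var analyzer: SNAudioStreamAnalyzer?
        private let analysisQueue = DispatchQueue(label: "com.example.AnalysisQueue")

        func startAudioEngine() {
            let inputNode = audioEngine.inputNode
            let format = inputNode.outputFormat(forBus: 0)
            let analyzer = SNAudioStreamAnalyzer(format: format)
            self.analyzer = analyzer

            do {
                // Built-in classifier; a custom Core ML model could be used instead.
                let request = try SNClassifySoundRequest(classifierIdentifier: .version1)
                try analyzer.add(request, withObserver: self)
            } catch {
                print("Unable to prepare the sound classification request: \(error)")
                return
            }

            inputNode.installTap(onBus: 0, bufferSize: 8192, format: format) { buffer, time in
                // Equivalent of the sample's analyzeAudio() path.
                print("analyzeAudio")
                self.analysisQueue.async {
                    analyzer.analyze(buffer, atAudioFramePosition: time.sampleTime)
                }
            }

            do {
                try audioEngine.start()
            } catch {
                print("Unable to start the audio engine: \(error)")
            }
        }

        func request(_ request: SNRequest, didProduce result: SNResult) {
            // Handle SNClassificationResult here.
        }
    }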
Question:
I'm working on a project in Xcode 16.1, using Swift 6 with iOS 18. My code is working fine in Swift 5, but I'm running into concurrency issues when upgrading to Swift 6, particularly around the @preconcurrency import of AVFoundation.
Here is the relevant part of my code:
import SwiftUI
@preconcurrency import AVFoundation

struct OverlayButtonBar: View {
    ...
    let audioTracks = await loadTracks(asset: asset, mediaType: .audio)
    ...

    // Tracks are extracted before crossing concurrency boundaries.
    private func loadTracks(asset: AVAsset, mediaType: AVMediaType) async -> [AVAssetTrack] {
        do {
            return try await asset.load(.tracks).filter { $0.mediaType == mediaType }
        } catch {
            print("Error loading tracks: \(error)")
            return []
        }
    }
}
Issues:
When using @preconcurrency, I get the warning:
'@preconcurrency' attribute on module 'AVFoundation' has no effect. The fix suggested by Xcode is to remove @preconcurrency.
But if I remove @preconcurrency, I get both a warning and an error:
Warning: Add '@preconcurrency' to treat 'Sendable'-related errors from module 'AVFoundation' as warnings.
Error: Non-sendable type '[AVAssetTrack]' returned by implicitly asynchronous call to nonisolated function cannot cross actor boundary (class AVAssetTrack does not conform to the Sendable protocol). This error appears when I directly access the non-Sendable [AVAssetTrack] result in an async context:
let audioTracks = await loadTracks(asset: asset, mediaType: .audio)
How can I resolve this issue while staying compliant with Swift 6 concurrency rules? Is there a recommended approach to handling non-Sendable types like AVAssetTrack in concurrency contexts?
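One direction that seems plausible (a rough sketch only; AudioTrackInfo and the loaded properties are placeholders for whatever the view actually needs) is to load just the Sendable values inside the helper and never return AVAssetTrack across the isolation boundary:

    import AVFoundation

    // Only Sendable value types cross the isolation boundary.
    struct AudioTrackInfo: Sendable {
        let trackID: CMPersistentTrackID
        let durationSeconds: Double
    }

    func loadAudioTrackInfo(asset: AVAsset) async -> [AudioTrackInfo] {
        do {
            let tracks = try await asset.loadTracks(withMediaType: .audio)
            var infos: [AudioTrackInfo] = []
            for track in tracks {
                let timeRange = try await track.load(.timeRange)
                infos.append(AudioTrackInfo(trackID: track.trackID,
                                            durationSeconds: timeRange.duration.seconds))
            }
            return infos
        } catch {
            print("Error loading tracks: \(error)")
            return []
        }
    }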
Appreciate any guidance on making this work in Swift 6, especially considering it worked fine in Swift 5.
Thanks in advance!
Hello,
I’m encountering an issue with the PHPhotoLibrary API in Swift 6 and iOS 18. The code I’m using worked fine in Swift 5, but I’m now seeing the following error:
Sending main actor-isolated value of type '() -> Void' with later accesses to nonisolated context risks causing data races
Here is the problematic code:
Button("Save to Camera Roll") {
saveToCameraRoll()
}
...
private func saveToCameraRoll() {
guard let overlayFileURL = mediaManager.getOverlayURL() else {
return
}
Task {
do {
let status = await PHPhotoLibrary.requestAuthorization(for: .addOnly)
guard status == .authorized else {
return
}
try await PHPhotoLibrary.shared().performChanges({
if let creationRequest = PHAssetCreationRequest.creationRequestForAssetFromVideo(atFileURL: overlayFileURL) {
creationRequest.creationDate = Date()
}
})
await MainActor.run {
saveSuccessMessage = "Video saved to Camera Roll successfully"
}
} catch {
print("Error saving video to Camera Roll: \(error.localizedDescription)")
}
}
}
Problem Description:
The error message suggests that a main actor-isolated value of type () -> Void is being accessed in a nonisolated context, potentially leading to data races.
This issue arises specifically at the call to PHPhotoLibrary.shared().performChanges.
Questions:
How can I address the data race issues related to main actor isolation when using PHPhotoLibrary.shared().performChanges?
What changes, if any, are required to adapt this code for Swift 6 and iOS 18 while maintaining thread safety and actor isolation?
Are there any recommended practices for managing main actor-isolated values in asynchronous operations to avoid data races?
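Regarding the questions above, one possible restructuring (a rough sketch only, not verified; PhotoLibrarySaver is a placeholder name, and the success-message update would still need to hop back to the main actor) is to keep the Photos work in a type that is not main actor-isolated, so the change block handed to performChanges is not a main actor-isolated closure:

    import Photos

    enum PhotoLibrarySaver {
        // Runs outside any actor; the closure passed to performChanges is
        // therefore not main actor-isolated when Photos invokes it.
        static func saveVideo(at fileURL: URL) async throws {
            let status = await PHPhotoLibrary.requestAuthorization(for: .addOnly)
            guard status == .authorized else { return }
            try await PHPhotoLibrary.shared().performChanges {
                if let creationRequest = PHAssetCreationRequest.creationRequestForAssetFromVideo(atFileURL: fileURL) {
                    creationRequest.creationDate = Date()
                }
            }
        }
    }

The button action would then call something like Task { try? await PhotoLibrarySaver.saveVideo(at: overlayFileURL) } and update saveSuccessMessage on the main actor afterwards.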
I appreciate any pointers or suggestions to resolve this issue effectively.
Thank you!
Hello,
Which API can be used to programmatically fetch the ID of the user who installed/purchased the app?
This is useful if an app has to create a path hierarchy for different users who have installed/purchased the app, for instance /AppName//user_files. How can I get this uniqueUserID, and how can I later determine which user a set of files belongs to based on that uniqueUserID?
App is using Swift, SwiftUI.
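To make the path-hierarchy part concrete, here is a minimal sketch of how the per-user folder could be laid out once some uniqueUserID is available; the identifier source is left open (UIDevice.current.identifierForVendor is shown purely as a stand-in, and it identifies the device/vendor pair rather than the purchasing user):

    import Foundation
    import UIKit

    // Stand-in identifier; NOT the App Store purchaser. Replace with whatever
    // uniqueUserID turns out to be appropriate.
    let uniqueUserID = UIDevice.current.identifierForVendor?.uuidString ?? "unknown-user"

    // Builds .../Documents/<appName>/<uniqueUserID>/user_files
    func makeUserFilesDirectory(appName: String, uniqueUserID: String) throws -> URL {
        let documents = try FileManager.default.url(for: .documentDirectory,
                                                    in: .userDomainMask,
                                                    appropriateFor: nil,
                                                    create: true)
        let userFiles = documents
            .appendingPathComponent(appName, isDirectory: true)
            .appendingPathComponent(uniqueUserID, isDirectory: true)
            .appendingPathComponent("user_files", isDirectory: true)
        try FileManager.default.createDirectory(at: userFiles, withIntermediateDirectories: true)
        return userFiles
    }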
Thanks.