Post

Replies

Boosts

Views

Activity

AVAssetWriterInput -- inserting sample buffers with pauses in between not working
Hi, I'm trying to insert CMSampleBuffers, with pauses between them, into an AVAssetWriterInput configured with expectsMediaDataInRealTime = false. That is, I insert fixed-length audio at specific (increasing and non-overlapping) time points with large gaps in between. E.g., 5 seconds of audio at t=3.0, 5 seconds of audio at t=12.0, etc.

The first audio sample plays at t=3 in the final output video as expected. But then all the other samples are bunched up immediately after it instead of being scheduled at the correct time.

Below is my code. I'm just loading the asset and then readjusting its timestamps to be correct in the absolute timeline. Why do they not get scheduled correctly when the timestamps and durations are definitely correct and non-overlapping?

    func addFrame(_ pixelBuffer: CVPixelBuffer) {
        guard CGSize(width: pixelBuffer.width, height: pixelBuffer.height) == outputSize else {
            return
        }
        let frameTime = CMTimeMake(value: frameCount, timescale: frameRate)
        if videoInput?.isReadyForMoreMediaData == true {
            pixelBufferAdaptor?.append(pixelBuffer, withPresentationTime: frameTime)
            frameCount += 1
            currentTime = frameTime
        }
    }

    func addMP3AudioClip(_ audioData: Data) async throws {
        let tempURL = FileManager.default.temporaryDirectory.appendingPathComponent(UUID().uuidString + ".mp3")
        defer {
            try? FileManager.default.removeItem(at: tempURL)
        }
        try audioData.write(to: tempURL)

        let asset = AVAsset(url: tempURL)
        let duration = try await asset.load(.duration)
        let audioTrack = try await asset.loadTracks(withMediaType: .audio).first!

        let audioReader = try AVAssetReader(asset: asset)
        let outputSettings: [String: Any] = [
            AVFormatIDKey: kAudioFormatLinearPCM,
            AVSampleRateKey: 44100,
            AVNumberOfChannelsKey: 2,
            AVLinearPCMBitDepthKey: 16,
            AVLinearPCMIsFloatKey: false,
            AVLinearPCMIsBigEndianKey: false,
            AVLinearPCMIsNonInterleaved: false
        ]
        let audioReaderOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: outputSettings)
        audioReader.add(audioReaderOutput)
        guard audioReader.startReading() else {
            throw NSError(domain: "AudioReaderError", code: 0, userInfo: [NSLocalizedDescriptionKey: "Failed to start reading audio"])
        }

        let baseInsertionTime = currentTime.convertScale(duration.timescale, method: .default) // Capture the current video time when the method is called
        print("Adding audio clip at \(baseInsertionTime.seconds) seconds, duration: \(duration.seconds) seconds")

        var audioTime = CMTime.zero
        var totalDuration: Double = 0
        while let sampleBuffer = audioReaderOutput.copyNextSampleBuffer() {
            let bufferDuration = CMSampleBufferGetDuration(sampleBuffer)
            let adjustedBuffer = adjustTimeStamp(of: sampleBuffer, by: baseInsertionTime)
            while !audioInput!.isReadyForMoreMediaData {
                try await Task.sleep(nanoseconds: 100_000_000) // 0.1 second
            }
            audioInput!.append(adjustedBuffer)
            print(" Adjusted time: \(adjustedBuffer.presentationTimeStamp.seconds)")
            audioTime = CMTimeAdd(audioTime, bufferDuration)
            totalDuration += bufferDuration.seconds
        }
        print("Finished adding audio clip. Last sample at: \(CMTimeAdd(baseInsertionTime, audioTime).seconds) seconds")
        print(" totalDuration=\(totalDuration)")
    }

    private func adjustTimeStamp(of sampleBuffer: CMSampleBuffer, by timeOffset: CMTime) -> CMSampleBuffer {
        var count: CMItemCount = 0
        CMSampleBufferGetSampleTimingInfoArray(sampleBuffer, entryCount: 0, arrayToFill: nil, entriesNeededOut: &count)
        var timingInfo = [CMSampleTimingInfo](repeating: CMSampleTimingInfo(), count: count)
        CMSampleBufferGetSampleTimingInfoArray(sampleBuffer, entryCount: count, arrayToFill: &timingInfo, entriesNeededOut: nil)
        for i in 0..<count {
            timingInfo[i].presentationTimeStamp = CMTimeAdd(timingInfo[i].presentationTimeStamp, timeOffset)
            if timingInfo[i].decodeTimeStamp != .invalid {
                timingInfo[i].decodeTimeStamp = CMTimeAdd(timingInfo[i].decodeTimeStamp, timeOffset)
            } else {
                timingInfo[i].decodeTimeStamp = timingInfo[i].presentationTimeStamp
            }
        }
        var adjustedBuffer: CMSampleBuffer?
        CMSampleBufferCreateCopyWithNewTiming(allocator: nil, sampleBuffer: sampleBuffer, sampleTimingEntryCount: count, sampleTimingArray: &timingInfo, sampleBufferOut: &adjustedBuffer)
        return adjustedBuffer!
    }
0
0
208
Oct ’24
How to insert multiple AVAssets into AVMutableCompositionTrack with silence in between?
Hi, I'm recording videos frame by frame and occasionally a sound plays (from an MP3 asset). I want to composite these sounds into the video at the correct timings. But this doesn't work. Really pulling my hair out here. I've tried everything, including adding one after another and then inserting silence in between (allegedly this pushes subsequent clips back) but nothing works.

Here, _currentTime is the current time according to the video frames added, which are added at 20Hz. You can see I am adding silence long enough to cover the time from the end of the last audio clip to now, plus extra padding to contain the audio we are about to add. Doesn't matter if I remove this, it just doesn't work. Sometimes I can get two pieces of audio to play but never a third and usually, only the first audio plays, and then nothing after. I'm completely stumped.

    func addFrame(_ pixelBuffer: CVPixelBuffer) {
        guard CGSize(width: pixelBuffer.width, height: pixelBuffer.height) == _outputSize else {
            return
        }
        let frameTime = CMTimeMake(value: Int64(_frameCount), timescale: _frameRate)
        if _videoInput?.isReadyForMoreMediaData == true {
            _pixelBufferAdaptor?.append(pixelBuffer, withPresentationTime: frameTime)
            _frameCount += 1
            _currentTime = frameTime
        }
    }

    func addMP3AudioClip(_ audioData: Data) async throws {
        let tempURL = FileManager.default.temporaryDirectory.appendingPathComponent(UUID().uuidString + ".mp3")
        try audioData.write(to: tempURL)

        let asset = AVAsset(url: tempURL)
        let duration = try await asset.load(.duration)
        let audioTrack = try await asset.loadTracks(withMediaType: .audio).first!

        let currentAudioTime = _currentTime.convertScale(duration.timescale, method: .default)
        _audioTrack?.insertEmptyTimeRange(CMTimeRangeFromTimeToTime(start: _lastAudioClipEndTime, end: currentAudioTime))
        _audioTrack?.insertEmptyTimeRange(CMTimeRangeFromTimeToTime(start: currentAudioTime, end: CMTimeAdd(currentAudioTime, duration)))

        let timeRange = CMTimeRangeMake(start: .zero, duration: duration)
        try _audioTrack?.insertTimeRange(timeRange, of: audioTrack, at: currentAudioTime)
        _lastAudioClipEndTime = CMTimeAdd(currentAudioTime, duration)

        try FileManager.default.removeItem(at: tempURL)
        _audioClipTimeRanges.append(CMTimeRangeMake(start: _currentTime, duration: duration))
    }

Thank you,
-- B.
0
0
180
Oct ’24
WatchConnectivity: Sending from Watch (in audio background mode) -> iPhone (backgrounded) not working
Hi, I have an app that performs long-duration audio recording on the Watch and needs to communicate with the phone occasionally to:

- Request an auth token (login happens on the phone app) when needing to upload a recording.
- Occasionally poke the iPhone app to sample the current location (I don't do this on Watch).

Most of the time, both the Watch and iPhone apps would be backgrounded, but the Watch app has background audio enabled and is recording, so processing continues. I'm finding that WatchConnectivity isn't connected to the phone in these cases and cannot send a ping. That is, on the Watch side, isReachable == false, and as a result the messages are never received on the phone.

I'm not sure how else the apps should communicate this information. How are these scenarios typically handled?

Thank you,
-- B.
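One pattern that may fit the isReachable == false case is to fall back from interactive messaging to a queued transfer. The following is a minimal sketch, assuming the WCSession has already been activated elsewhere. The names `Delivery`, `chooseDelivery`, and `sendToPhone` are hypothetical and not from the post, and whether queued delivery is timely enough for the auth-token request is not something the sketch establishes.

```swift
import WatchConnectivity

/// Hypothetical names throughout; none of this is from the original post.
enum Delivery { case immediate, queued, unavailable }

/// Pure decision step: use interactive messaging only when the counterpart is reachable.
func chooseDelivery(activated: Bool, reachable: Bool) -> Delivery {
    guard activated else { return .unavailable }
    return reachable ? .immediate : .queued
}

func sendToPhone(_ payload: [String: Any], session: WCSession = .default) {
    let activated = session.activationState == .activated
    switch chooseDelivery(activated: activated, reachable: session.isReachable) {
    case .immediate:
        session.sendMessage(payload, replyHandler: nil, errorHandler: { error in
            // If the interactive send fails, queue the same payload instead.
            print("sendMessage failed, queueing instead: \(error)")
            _ = session.transferUserInfo(payload)
        })
    case .queued:
        // Queued for background delivery at the system's discretion; not immediate.
        _ = session.transferUserInfo(payload)
    case .unavailable:
        print("WCSession is not activated")
    }
}
```

On the phone side, queued dictionaries would arrive through the session delegate's `session(_:didReceiveUserInfo:)` callback rather than the message callback.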
0
0
317
Aug ’24
watchOS: Notification never shows up on watch face and rarely appears as a banner at all
Hi, I'm triggering a notification from the audio interruption handler (but also have a debug button set to trigger it manually) and it frequently does not show up. I don't think I have ever seen it show up when the watch face is off.

I have created a singleton class to trigger this notification as follows. Note that I use a UUID in the identifier because an old thread here suggests this is necessary, but it makes no difference as far as I can tell.

Any ideas? I'd like this notification to be reliable. I'm also surprised that with trigger set to nil, it does not trigger instantaneously.

Any help would be much appreciated!

Thanks,
-- B.

    import Foundation
    import UserNotifications

    class NotificationSender: NSObject, UNUserNotificationCenterDelegate {
        static let shared = NotificationSender()

        override init() {
            super.init()
            let center = UNUserNotificationCenter.current()
            center.requestAuthorization(options: [.sound, .badge]) { granted, error in
                if granted {
                    print("Notification permission granted")
                } else {
                    print("Notification permission denied")
                }
            }
            center.delegate = self

            // Define the action to open the app
            let openAction = UNNotificationAction(identifier: "openAction", title: "Open App", options: [.foreground])

            // Add the action to the notification content
            let category = UNNotificationCategory(identifier: "resumeAudioCategory", actions: [openAction], intentIdentifiers: [], options: [])
            center.setNotificationCategories([category])
        }

        func sendNotification() {
            let center = UNUserNotificationCenter.current()
            let content = UNMutableNotificationContent()
            content.title = "Recording Interrupted"
            content.body = "You will need to restart it manually."
            content.categoryIdentifier = "resumeAudioCategory"

            // Create the notification request
            //let trigger = UNTimeIntervalNotificationTrigger(timeInterval: 5, repeats: false)
            let request = UNNotificationRequest(identifier: "ResumeAudioNotification-\(UUID().uuidString)", content: content, trigger: nil)
            center.add(request) { error in
                if let error = error {
                    print("Error adding notification request: \(error)")
                }
            }
        }

        // Handle notification when the app is in the foreground
        func userNotificationCenter(_ center: UNUserNotificationCenter, willPresent notification: UNNotification, withCompletionHandler completionHandler: @escaping (UNNotificationPresentationOptions) -> Void) {
            // Display the notification while the app is in the foreground
            completionHandler([.sound, .badge, .banner, .list])
        }

        // Handle notification response
        func userNotificationCenter(_ center: UNUserNotificationCenter, didReceive response: UNNotificationResponse, withCompletionHandler completionHandler: @escaping () -> Void) {
            // Handle the user's response to the notification
            // For example, navigate to a specific screen in your app
            completionHandler()
        }
    }
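One thing that may be worth checking: the authorization request in the class above asks for .sound and .badge but not .alert, which is the option that covers displaying alerts. The following is a small diagnostic sketch, assuming it is called once at startup. `checkAlertAuthorization` is a hypothetical name, and this is offered as something to rule out rather than a confirmed cause.

```swift
import UserNotifications

/// Hypothetical diagnostic helper (not from the original post).
func checkAlertAuthorization() {
    let center = UNUserNotificationCenter.current()
    // Request .alert in addition to the .sound and .badge options used in the post.
    center.requestAuthorization(options: [.alert, .sound, .badge]) { granted, error in
        if let error = error {
            print("Authorization error: \(error)")
        }
        print("Granted: \(granted)")
        // Inspect what the system actually allows for this app.
        center.getNotificationSettings { settings in
            print("authorizationStatus: \(settings.authorizationStatus.rawValue)")
            print("alertSetting: \(settings.alertSetting.rawValue)")
            print("soundSetting: \(settings.soundSetting.rawValue)")
        }
    }
}
```

If alertSetting comes back as disabled, that would at least explain missing banners independently of the trigger or identifier used.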
0
0
434
Apr ’24
watchOS: Resume recording from AudioInterruption in background mode
Hi, I have a watchOS app that records audio for an extended period of time and because the mic is active, continues to record in background mode when the watch face is off. However, when a call comes in or Siri is activated, recording stops because of an audio interruption.

Here is my code for setting up the session:

    private func setupAudioSession() {
        let audioSession = AVAudioSession.sharedInstance()
        do {
            try audioSession.setCategory(.playAndRecord, mode: .default, options: [.overrideMutedMicrophoneInterruption])
            try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
        } catch {
            print("Audio Session error: \(error)")
        }
    }

Before this I register an interruption handler that holds a reference to my AudioEngine (which I start and stop each time recording is activated by the user):

    _audioInterruptionHandler = AudioInterruptionHandler(audioEngine: _audioEngine)

And here is how this class implements recovery:

    fileprivate class AudioInterruptionHandler {
        private let _audioEngine: AVAudioEngine

        public init(audioEngine: AVAudioEngine) {
            _audioEngine = audioEngine

            // Listen to interrupt notifications
            NotificationCenter.default.addObserver(self, selector: #selector(handleAudioInterruption(notification:)), name: AVAudioSession.interruptionNotification, object: nil)
        }

        @objc private func handleAudioInterruption(notification: Notification) {
            guard let userInfo = notification.userInfo,
                  let interruptionTypeRawValue = userInfo[AVAudioSessionInterruptionTypeKey] as? UInt,
                  let interruptionType = AVAudioSession.InterruptionType(rawValue: interruptionTypeRawValue) else {
                return
            }

            switch interruptionType {
            case .began:
                print("[AudioInterruptionHandler] Interruption began")
            case .ended:
                print("[AudioInterruptionHandler] Interruption ended")
                print("Interruption ended")
                do {
                    try AVAudioSession.sharedInstance().setActive(true)
                } catch {
                    print("[AudioInterruptionHandler] Error resuming audio session: \(error.localizedDescription)")
                }
            default:
                print("[AudioInterruptionHandler] Unknown interruption: \(interruptionType.rawValue)")
            }
        }
    }

Unfortunately, it fails with:

    Error resuming audio session: Session activation failed

Is this even possible to do on watchOS? This code worked for me on iOS.

Thank you,
-- B.
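The handler above reactivates the session on every .ended notification. The notification's userInfo can also carry an options value saying whether resuming is appropriate, and logging it may help narrow down the failure. The following is a hedged sketch of reading that flag. `interruptionShouldResume` is a hypothetical helper, and whether watchOS ever sets the flag while the app is in the background is not something this sketch answers.

```swift
import AVFoundation

/// Hypothetical helper: true when an interruption-ended notification carries .shouldResume.
func interruptionShouldResume(_ userInfo: [AnyHashable: Any]) -> Bool {
    guard let rawValue = userInfo[AVAudioSessionInterruptionOptionKey] as? UInt else {
        // No options present: treat as "do not resume automatically".
        return false
    }
    return AVAudioSession.InterruptionOptions(rawValue: rawValue).contains(.shouldResume)
}
```

In the .ended case this could gate the call to setActive(true), with the flag's value printed alongside the existing log line so the failing cases can be compared against the succeeding ones.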
2
0
704
Apr ’24
How to compress an AVAudioPCMBuffer to m4a file format without writing to disk?
Hi, I'd like to upload audio samples to the OpenAI Whisper API or others that take m4a, mp3, and wav data. I capture from the microphone and perform some basic signal processing to try to filter out non-voice samples and am left with AVAudioPCMBuffer.

It seems there are two approaches:

1. Use AVAudioConverter to create AVAudioCompressedBuffer in MPEG-4 audio format. I've gotten this working (although I can't verify that the compressed data is valid because I'm unable to export a file with proper MPEG-4 headers).
2. Use AVAssetWriter, but this writes to disk, which strikes me as inefficient.

Neither of these readily produces a memory buffer with a .m4a file inside. Am I missing some obvious way to do this? How do people upload compressed audio data to remote endpoints?

I've also explored just trying to create my own MPEG-4 compliant header but I was unable to produce a valid file.

Thanks,
-- B.
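Since the post notes that the target APIs also accept wav, one way to sidestep the MPEG-4 container is to wrap the PCM samples in a WAV header in memory. This is uncompressed, so it does not replace m4a where upload size matters. The following is a minimal sketch, assuming 16-bit interleaved samples have already been extracted from the buffer. `makeWAVData` is a hypothetical name.

```swift
import Foundation

/// Hypothetical helper: wraps 16-bit PCM samples in a canonical 44-byte WAV header.
func makeWAVData(samples: [Int16], sampleRate: UInt32, channels: UInt16) -> Data {
    let bitsPerSample: UInt16 = 16
    let blockAlign = channels * (bitsPerSample / 8)       // bytes per frame
    let byteRate = sampleRate * UInt32(blockAlign)        // bytes per second
    let dataSize = UInt32(samples.count * MemoryLayout<Int16>.size)

    var data = Data()
    // Appends an integer in little-endian byte order, as the RIFF format requires.
    func append<T: FixedWidthInteger>(_ value: T) {
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }

    data.append(contentsOf: Array("RIFF".utf8))
    append(36 + dataSize)          // size of everything after this field
    data.append(contentsOf: Array("WAVE".utf8))
    data.append(contentsOf: Array("fmt ".utf8))
    append(UInt32(16))             // fmt chunk size for PCM
    append(UInt16(1))              // audio format 1 = linear PCM
    append(channels)
    append(sampleRate)
    append(byteRate)
    append(blockAlign)
    append(bitsPerSample)
    data.append(contentsOf: Array("data".utf8))
    append(dataSize)
    for sample in samples { append(sample) }
    return data
}
```

The resulting Data can be used directly as an upload body. With an AVAudioPCMBuffer in a 16-bit integer format, the samples could be read from its int16ChannelData; a float buffer would need converting first.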
3
0
972
May ’23
Bluetooth Background Mode: Network-related errors and sporadic failures
Hi, I've read over many of the helpful posts on background modes ([1], [2], and responses to my thread [3]) but am encountering something that doesn't quite line up with my understanding.

Background

My app is using Bluetooth background mode in order to receive data from a peripheral periodically and then perform a network request (eventually, this will involve two network requests, one of them being an upload of audio, which will be a much larger transfer of up to a couple hundred KB, possibly requiring m4a compression first).

In background mode, all the Bluetooth messages (currently a small string) are received reliably. The app is woken up and from my understanding, given some time to run in the background. I perform a POST request using URLSession.shared.dataTask(with: request).

Problem Encountered

These often succeed but sporadically fail with a timeout in background mode and with the error message "connection was lost". Furthermore, nearly every request in background mode is associated with some sort of connection error that appears in the log. Given that many of the requests still succeed, I'm not sure what to make of these. I see a variety of different ones. For example:

Case 1:

    2023-05-23 16:33:49.428668-0700 ChatGPT for Monocle[4775:1088487] [connection] nw_read_request_report [C1] Receive failed with error "Socket is not connected"
    2023-05-23 16:33:49.429633-0700 ChatGPT for Monocle[4775:1088487] [connection] nw_read_request_report [C1] Receive failed with error "Socket is not connected"
    2023-05-23 16:33:49.431677-0700 ChatGPT for Monocle[4775:1088487] [connection] nw_read_request_report [C1] Receive failed with error "Socket is not connected"
    2023-05-23 16:33:49.440358-0700 ChatGPT for Monocle[4775:1088487] [quic] quic_conn_send_frames_for_key_state_block_invoke [C1.1.1.1:2] [-0151a6eeef6cab8c6b53cceead6cada7cf118a4e] unable to request outbound data
    2023-05-23 16:33:49.440528-0700 ChatGPT for Monocle[4775:1088487] [quic] quic_conn_send_frames_for_key_state_block_invoke [C1.1.1.1:2] [-0151a6eeef6cab8c6b53cceead6cada7cf118a4e] unable to request outbound data
    2023-05-23 16:33:49.440640-0700 ChatGPT for Monocle[4775:1088487] [quic] quic_conn_send_frames_for_key_state_block_invoke [C1.1.1.1:2] [-0151a6eeef6cab8c6b53cceead6cada7cf118a4e] unable to request outbound data
    2023-05-23 16:33:49.440784-0700 ChatGPT for Monocle[4775:1088487] [quic] quic_conn_send_frames_for_key_state_block_invoke [C1.1.1.1:2] [-0151a6eeef6cab8c6b53cceead6cada7cf118a4e] unable to request outbound data
    2023-05-23 16:33:49.441459-0700 ChatGPT for Monocle[4775:1088487] [quic] quic_conn_send_frames_for_key_state_block_invoke [C1.1.1.1:2] [-0151a6eeef6cab8c6b53cceead6cada7cf118a4e] unable to request outbound data
    2023-05-23 16:33:49.441684-0700 ChatGPT for Monocle[4775:1088487] [quic] quic_conn_send_frames_for_key_state_block_invoke [C1.1.1.1:2] [-0151a6eeef6cab8c6b53cceead6cada7cf118a4e] unable to request outbound data
    2023-05-23 16:33:49.445598-0700 ChatGPT for Monocle[4775:1088487] [connection] nw_endpoint_handler_add_write_request [C1.1.1.1 104.18.6.192:443 failed channel-flow (satisfied (Path is satisfied), viable, interface: en0[802.11], ipv4, ipv6, dns)] Cannot send after flow table is released
    2023-05-23 16:33:49.445773-0700 ChatGPT for Monocle[4775:1088487] [connection] nw_write_request_report [C1] Send failed with error "Socket is not connected"
    2023-05-23 16:33:49.446925-0700 ChatGPT for Monocle[4775:1088487] Connection 1: received failure notification
    2023-05-23 16:33:49.447348-0700 ChatGPT for Monocle[4775:1088487] Connection 1: write error 1:57
    2023-05-23 16:33:49.464436-0700 ChatGPT for Monocle[4775:1088487] [connection] nw_endpoint_handler_unregister_context [C1.1.1.1 104.18.6.192:443 failed channel-flow (satisfied (Path is satisfied), viable, interface: en0[802.11], ipv4, ipv6, dns)] Cannot unregister after flow table is released
    2023-05-23 16:33:49.464974-0700 ChatGPT for Monocle[4775:1088487] [] nw_endpoint_flow_fillout_data_transfer_snapshot copy_info() returned NULL

Case 2:

    2023-05-23 16:34:09.422783-0700 ChatGPT for Monocle[4775:1088784] [connection] nw_read_request_report [C3] Receive failed with error "Socket is not connected"
    2023-05-23 16:34:09.423511-0700 ChatGPT for Monocle[4775:1088784] [connection] nw_read_request_report [C3] Receive failed with error "Socket is not connected"
    2023-05-23 16:34:09.425478-0700 ChatGPT for Monocle[4775:1088784] [connection] nw_read_request_report [C3] Receive failed with error "Socket is not connected"
    2023-05-23 16:34:09.434263-0700 ChatGPT for Monocle[4775:1088784] [quic] quic_conn_send_frames_for_key_state_block_invoke [C3.1.1.1:2] [-01809801f02b5c3795812501322b6f9d3c91236f] unable to request outbound data

The code that is generating these requests is fairly straightforward:

    public func send(query: String, apiKey: String, model: String, completion: @escaping (String, ChatGPTError?) -> Void) {
        let requestHeader = [
            "Authorization": "Bearer \(apiKey)",
            "Content-Type": "application/json"
        ]

        _payload["model"] = model
        if var messages = _payload["messages"] as? [[String: String]] {
            messages.append([ "role": "user", "content": "\(query)" ])
            _payload["messages"] = messages
        }

        let jsonPayload = try? JSONSerialization.data(withJSONObject: _payload)
        let url = URL(string: "https://api.openai.com/v1/chat/completions")!
        var request = URLRequest(url: url)
        request.httpMethod = "POST"
        request.allHTTPHeaderFields = requestHeader
        request.httpBody = jsonPayload

        _task = URLSession.shared.dataTask(with: request) { data, response, error in
            if let error = error {
                DispatchQueue.main.async {
                    completion("", ChatGPTError.networkRequestFailed(error: error))
                }
                return
            }

            if let data = data {
                let (contentError, response) = self.extractContent(from: data)
                if let contentError = contentError {
                    DispatchQueue.main.async {
                        completion("", contentError)
                    }
                } else if let response = response {
                    DispatchQueue.main.async {
                        completion(response, nil)
                    }
                }
                return
            }

            DispatchQueue.main.async {
                completion("", ChatGPTError.responsePayloadParseError)
            }
        }
        _task?.resume()
    }

Questions

Is there any way to make this process more reliable? I assume the error messages are meaningful and should not be ignored. If not, will I end up encountering issues when trying to upload larger payloads?

Thank you!
-- B.
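For the larger audio upload mentioned under Background, one commonly used alternative to URLSession.shared is a background session configuration, which hands the transfer to the system instead of depending on the app staying awake. The following is a hedged sketch. `UploadDelegate`, `startBackgroundUpload`, and the session identifier are hypothetical, and whether this removes the logged connection errors is not established here. Background sessions do not accept completion-handler tasks and upload from a file, which is why the body is written to disk first.

```swift
import Foundation

/// Hypothetical delegate that collects the response body for a background upload.
/// For simplicity it assumes one upload at a time.
final class UploadDelegate: NSObject, URLSessionDataDelegate {
    private var received = Data()

    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) {
        received.append(data)
    }

    func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?) {
        if let error = error {
            print("Upload failed: \(error)")
        } else {
            print("Upload finished, \(received.count) response bytes")
        }
    }
}

/// Hypothetical helper: writes the body to a temporary file and starts a file-based upload.
func startBackgroundUpload(request: URLRequest, body: Data, session: URLSession) throws -> URLSessionUploadTask {
    let fileURL = FileManager.default.temporaryDirectory.appendingPathComponent(UUID().uuidString)
    try body.write(to: fileURL)
    let task = session.uploadTask(with: request, fromFile: fileURL)
    task.resume()
    return task
}

// The identifier is a placeholder; it should be unique to the app.
let configuration = URLSessionConfiguration.background(withIdentifier: "com.example.monocle.upload")
let backgroundSession = URLSession(configuration: configuration, delegate: UploadDelegate(), delegateQueue: nil)
```

The existing request construction could stay as it is, with request.httpBody left unset and the JSON payload passed as the body argument instead.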
2
0
1.3k
May ’23
Are there any background processing restrictions for Audio background mode?
Hi, I'd like to develop an iOS application that keeps the mic open for voice recording and processing even when the screen is off. I want to perform speech-to-text requests whenever samples of voice are detected (using a voice activity detection library) and also send requests to the cloud based on what is spoken. I've enabled the Audio background mode and preliminary testing seems to indicate that this is working. That is, I can press "record" in my app, switch to another app then shut the screen off, and speak for several seconds before auto-stopping the recording and sending it to a SFSpeechRecognizer task, which appears to succeed. However, I have read that this should not be supported so before going further down this path, I wanted to understand what exactly are the processing limitations in this mode? The documentation doesn't seem very clear to me. Thanks, -- B.
5
0
1.4k
May ’23
ARKit Body Tracking Sample: Wrist joints (left_hand_joint, right_hand_joint) are not updating
Hi, on my iPhone 12 Pro Max (iOS 16.2, Xcode 14.2), I'm noticing that the body tracking sample produces ARBodyAnchors whose wrist joint transforms (left_hand_joint, right_hand_joint) never change, for example when waving my hand or twisting my wrist. Other joints work (I haven't checked digits, however). Has anyone experienced this? I'm not even sure how to begin debugging this head scratcher.

Thanks,
Bart
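As a possible starting point for debugging, the joint transforms can be compared between consecutive anchor updates outside the sample's rendering path. The following is a minimal sketch, assuming an ARSession already running ARBodyTrackingConfiguration with this object set as its delegate. `WristJointLogger` is a hypothetical name.

```swift
import ARKit

/// Hypothetical debugging delegate: reports whether each wrist joint's local
/// transform differs from the previous update.
final class WristJointLogger: NSObject, ARSessionDelegate {
    private var lastLocalTransforms: [String: simd_float4x4] = [:]
    private let jointNames = ["left_hand_joint", "right_hand_joint"]

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let bodyAnchor as ARBodyAnchor in anchors {
            for name in jointNames {
                let joint = ARSkeleton.JointName(rawValue: name)
                // localTransform(for:) returns nil if the joint name is unknown.
                guard let local = bodyAnchor.skeleton.localTransform(for: joint) else {
                    print("\(name): no transform available")
                    continue
                }
                if let previous = lastLocalTransforms[name] {
                    print("\(name) changed since last update: \(previous != local)")
                }
                lastLocalTransforms[name] = local
            }
        }
    }
}
```

Running the same comparison on a joint known to work, such as an elbow, would show whether the logger itself behaves as expected.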
0
0
682
Jan ’23