Reply to Are there any background processing restrictions for Audio background mode?
Hi Quinn,

CPU and network usage. I would like to at least:

1. Continuously perform voice activity detection (this does seem to work with a basic VAD algo, and I imagine streaming apps are doing more work decoding audio than this anyway).
2. Send voice to a server for processing.
3. Receive and store (with minimal processing) JSON responses.
4. Play back synthesized voice.

Ideally, rather than sending voice to the server, I'd like to perform Siri speech-to-text transcription on the way up and speech synthesis on the way back, allowing me to upload only text and receive text responses.

My understanding is there are some limitations on CPU usage for at least some of these cases. However, I imagine that audio streaming apps (YouTube, Spotify, etc.) must be doing a fair bit of decoding work themselves?

Thank you,
-- B.
May ’23
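For readers wondering what a "basic VAD algo" might look like, here is a minimal sketch of one common approach: an RMS energy threshold with a short hangover so brief pauses do not end an utterance. This is not necessarily the algorithm used in the post above. The names `rmsLevel`, `EnergyVAD`, and the default threshold and hangover values are all assumptions chosen for illustration, and samples are assumed to be `Float` values in the range -1...1.

```swift
import Foundation

/// Root-mean-square level of one frame of samples (assumed range -1...1).
func rmsLevel(of frame: [Float]) -> Float {
    guard !frame.isEmpty else { return 0 }
    let sumOfSquares = frame.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(frame.count)).squareRoot()
}

/// Hypothetical energy-threshold VAD with a hangover counter.
struct EnergyVAD {
    let threshold: Float        // assumed RMS threshold; tune per microphone
    let hangoverFrames: Int     // quiet frames tolerated before speech ends
    private var quietFrames = 0
    private(set) var isSpeaking = false

    init(threshold: Float = 0.02, hangoverFrames: Int = 10) {
        self.threshold = threshold
        self.hangoverFrames = hangoverFrames
    }

    /// Feeds one frame and returns whether speech is currently considered active.
    mutating func process(frame: [Float]) -> Bool {
        if rmsLevel(of: frame) >= threshold {
            // Loud frame: speech starts or continues.
            isSpeaking = true
            quietFrames = 0
        } else if isSpeaking {
            // Quiet frame during speech: count it and stop once the hangover is exceeded.
            quietFrames += 1
            if quietFrames > hangoverFrames { isSpeaking = false }
        }
        return isSpeaking
    }
}
```

The per-frame cost is one pass over the samples, which is why a detector of this kind is usually cheap compared with audio decoding. Whether it stays within background CPU limits would still need to be measured on a device.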
Reply to Are there any background processing restrictions for Audio background mode?
Thanks, Quinn, that is incredibly helpful!

Re: Item 1, I currently have my own VAD but was also thinking of just using Siri speech-to-text as well. Will test it out. CPU limits are definitely a concern but I can test and see what happens. It becomes a trade-off between CPU and network (which has a stable CPU cost). On-device voice transcription is highly desirable from an economic perspective because doing this on the server is costly (not to mention the user's data plan bandwidth caps).

I did not know about the constrained vs. expensive network flags. Very helpful.

Another related question (but maybe I should start a new thread on this?):

Background Bluetooth mode (separate but related project): apps can receive Bluetooth events in the background, but do similar constraints apply? That is, can I safely perform a REST API request and be confident that I will have time to process the response?

Specific use case:

1. Receive an audio sample from a Bluetooth peripheral (not headphones nor anything that can present itself as such).
2. Upload audio to a voice-to-text API (or use Siri speech-to-text).
3. Receive result of [2].
4. Hit a REST service with text obtained from [2].
5. Receive result of [4].
6. Send result of [4] (just some text data) back to the peripheral.
May ’23
Reply to Bluetooth Background Mode: Network-related errors and sporadic failures
I was able to eliminate these errors by configuring a custom URLSession for background mode; however, there is still an issue when I attempt to throw SFSpeech into the mix to perform a voice transcription request before uploading its results to the server. I get this error:

    Lost connection to background transfer service

This post mentions that dataTask isn't supposed to work with a background configuration, but for me it does. I don't think I can use downloadRequest because I'm making a POST request. Why would using SFSpeech (and therefore starting the dataTask within one of its delegate calls) cause this issue?

Once again, the sequence of events is:

1. Bluetooth data received in background mode.
2. SFSpeech kicked off to convert speech to text.
3. POST request fired on a background URLSession in the SFSpeech delegate handler.

Here is how I configure the URLSession:

    let configuration = URLSessionConfiguration.background(withIdentifier: "ChatGPT")
    configuration.isDiscretionary = false
    configuration.shouldUseExtendedBackgroundIdleMode = true
    configuration.sessionSendsLaunchEvents = true
    configuration.allowsConstrainedNetworkAccess = true
    configuration.allowsExpensiveNetworkAccess = true
    _session = URLSession(configuration: configuration, delegate: self, delegateQueue: nil)

Thank you,
-- B.
May ’23
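One thing that may be worth trying for the POST step is an upload task fed from a file, since background sessions are documented as supporting uploads from a file rather than from in-memory data. The sketch below is not a confirmed fix for the error above. The `TranscriptPayload` type, the function names, and the JSON shape are hypothetical, and the session passed in is assumed to be the background session configured above.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLSession lives here on non-Apple platforms
#endif

/// Hypothetical payload; the real server's JSON schema is not given in this thread.
struct TranscriptPayload: Codable {
    let text: String
}

/// Builds a POST request and writes its JSON body to a file in `directory`.
func makeTranscriptUpload(text: String, endpoint: URL, directory: URL) throws
    -> (request: URLRequest, bodyFile: URL)
{
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    // Unique file name so concurrent uploads do not overwrite each other.
    let bodyFile = directory.appendingPathComponent(UUID().uuidString + ".json")
    try JSONEncoder().encode(TranscriptPayload(text: text)).write(to: bodyFile)
    return (request, bodyFile)
}

/// Enqueues the upload on an existing session and starts it.
/// No completion handler is used; results arrive through the session delegate.
@discardableResult
func enqueueTranscriptUpload(text: String, endpoint: URL, session: URLSession) throws
    -> URLSessionUploadTask
{
    let upload = try makeTranscriptUpload(
        text: text,
        endpoint: endpoint,
        directory: FileManager.default.temporaryDirectory)
    let task = session.uploadTask(with: upload.request, fromFile: upload.bodyFile)
    task.resume()
    return task
}
```

With this shape, the SFSpeech handler would call `enqueueTranscriptUpload(text:endpoint:session:)` with the final transcription string, and the response body would be collected in the delegate's `urlSession(_:dataTask:didReceive:)` callback. The temporary directory is used here only for brevity; a location the app controls may be a safer place for the body file while the transfer is pending.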