Speech Recognition Problem in iOS 18.0

It looks like Apple has added some new API(s) to SFSpeechRecognizer. My app, which is currently listed on the App Store, features speech recognition, yet using it under iOS 18.0 throws errors:

-[SFSpeechRecognitionTask localSpeechRecognitionClient:speechRecordingDidFail:]_block_invoke Ignoring subsequent local speech recording error: Error Domain=kAFAssistantErrorDomain Code=1101 "(null)"

What happens is that after several words are transcribed and displayed, the next sentence makes the previous words disappear. That's probably what the "Ignoring subsequent local speech recording error: Error Domain=kAFAssistantErrorDomain Code=1101 "(null)"" portion of the error text means.

The problem occurs ONLY when the app is running under iOS 18.0. Even when it's compiled in Xcode 16.0, everything works fine under iOS 17.5. Any suggestions?
I may have come up with a solution for now. I looked closer into SFSpeechRecognitionResult -> SFSpeechRecognitionMetadata and saw that there is a property, speechDuration.
It turns out that speechDuration reports how long the previous utterance was, while it defaults to nil as speech is still coming in. So with that, I created another published var, accumulatedTranscript, and checked whether speechDuration != nil; if so, I append whatever the current transcript is, then reset the transcript to an empty string (to clear out the UI's text).
For the UI I'm using a combined var of accumulatedTranscript + transcript to give the appearance of a continuous stream of text. And from my screenshots you can see it will use the last transcript/final result that comes in after the pause.
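For reference, here's a minimal sketch of that approach, assuming a Combine-style ObservableObject driving the UI; the class and property names are hypothetical, and only speechRecognitionMetadata?.speechDuration comes from the actual Speech API:

import Combine
import Speech

// Hypothetical accumulator illustrating the workaround described above.
final class TranscriptAccumulator: ObservableObject {
    @Published var accumulatedTranscript = ""   // finalized earlier utterances
    @Published var transcript = ""              // utterance currently in flight

    // What the UI binds to: earlier utterances plus the live transcript.
    var displayText: String {
        accumulatedTranscript.isEmpty ? transcript : accumulatedTranscript + " " + transcript
    }

    // Call this from the recognition task's result handler.
    func handle(_ result: SFSpeechRecognitionResult) {
        let text = result.bestTranscription.formattedString
        // speechRecognitionMetadata is nil while speech is still coming in;
        // a non-nil speechDuration means the previous utterance just ended.
        if result.speechRecognitionMetadata?.speechDuration != nil {
            accumulatedTranscript += (accumulatedTranscript.isEmpty ? "" : " ") + text
            transcript = ""   // clear the live portion of the UI
        } else {
            transcript = text
        }
    }
}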
Some things worth noting:
- I haven't seen iOS 17 display a non-nil speech duration, so this solution shouldn't affect how iOS 17 works, but there may be edge cases I'm not able to think of right now.
- The newly appended transcript will begin with a capital letter; you'll want to deal with this however suits your app (for me, I'll just lowercase everything past the first word, since the pause timer is finicky; see the sketch after this list).
- I haven't done a robust test of this solution yet; so far I've tested only on the iOS 18 simulator, an iOS 18 physical device, and the iOS 17 simulator.
- I'm not sure how this workaround will interact with any changes Apple might make to address this, so keep that in mind.
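Here's a hedged sketch of that capitalization cleanup, under the same assumptions as the code above (the helper name is made up):

// Lowercase the leading letter of a chunk being appended mid-stream,
// since recognition restarts each utterance with a capital.
func normalizedChunk(_ text: String, isFirstChunk: Bool) -> String {
    guard !isFirstChunk, let first = text.first else { return text }
    return first.lowercased() + text.dropFirst()
}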
Thanks for those bug numbers (FB15166325, FB15192539). Those are both quite new, filed within the last few days, so there's no news to report on that front yet.
Does anyone have a bug they filed earlier in the beta cycle?
Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
@DTS Engineer
Another one: FB15245186, though it's even more recent than the previous ones.
Wow, it's pretty incredible that this bug snuck into iOS 18.
There are also FB15110263 and FB15110251.
I'm experiencing the same issue on iOS 18, although it works fine on older versions. The problem is that I receive partial results, but the text disappears and comes back empty in the later repeated callbacks.
Adding the screenshot and code for reference here.
import UIKit
import Speech

public protocol SpeechRecognizerWrapperDelegate: AnyObject {
    func speechRecognitionFinished(transcription: String)
    func speechRecognitionPartialResult(transcription: String)
    func speechRecognitionRecordingNotAuthorized(statusMessage: String)
    func speechRecognitionTimedOut()
}

public class SpeechRecognizerWrapper: NSObject, SFSpeechRecognizerDelegate {
    public weak var delegate: SpeechRecognizerWrapperDelegate?

    private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: (LocalData.sharedInstance.UPAppLanguage == LanguageCode.Hindi.rawValue) ? "hi-IN" : "en-IN"))!
    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var recognitionTask: SFSpeechRecognitionTask?
    private let audioEngine = AVAudioEngine()

    var notAuthorise = true
    var noAuthStatus = ""
    var allPermissionGranted: (() -> ())?

    public override init() {
        super.init()
        setupSpeechRecognition()
    }

    private func setupSpeechRecognition() {
        speechRecognizer.delegate = self
    }

    func requestAuthorization() {
        if SFSpeechRecognizer.authorizationStatus() == .authorized && AVAudioSession.sharedInstance().recordPermission == .granted {
            self.notAuthorise = false
            return
        }
        self.notAuthorise = true
        SFSpeechRecognizer.requestAuthorization { [weak self] authStatus in
            guard let self = self else { return }
            /*
             The callback may not be called on the main thread. Add an
             operation to the main queue to update the record button's state.
             */
            OperationQueue.main.addOperation {
                if authStatus != .authorized {
                    self.notAuthorise = true
                    self.noAuthStatus = ""
                    if authStatus == .denied {
                        self.noAuthStatus = "User denied access to speech recognition"
                    } else if authStatus == .restricted {
                        self.noAuthStatus = "Speech recognition restricted on this device"
                    }
                } else {
                    self.checkTheRecord()
                    self.notAuthorise = false
                }
            }
        }
    }

    func checkTheRecord() {
        switch AVAudioSession.sharedInstance().recordPermission {
        case .granted:
            // self.allPermissionGranted?()
            break
        case .denied:
            break
        case .undetermined:
            AVAudioSession.sharedInstance().requestRecordPermission({ [weak self] granted in
                if granted {
                    // self?.allPermissionGranted?()
                } else {
                    self?.notAuthorise = true
                }
            })
        default:
            break
        }
    }

    private var speechRecognitionTimeout: Timer?

    public var speechTimeoutInterval: TimeInterval = 2 {
        didSet {
            restartSpeechTimeout()
        }
    }

    private func restartSpeechTimeout() {
        speechRecognitionTimeout?.invalidate()
        speechRecognitionTimeout = Timer.scheduledTimer(timeInterval: speechTimeoutInterval, target: self, selector: #selector(timedOut), userInfo: nil, repeats: false)
    }

    public func startRecording() throws {
        // Tear down any in-flight session before starting a new one.
        if let recognitionTask = recognitionTask {
            recognitionTask.cancel()
            self.audioEngine.stop()
            self.audioEngine.inputNode.removeTap(onBus: 0)
            self.recognitionTask = nil
            self.recognitionRequest = nil
        }

        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)

        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
        let inputNode = audioEngine.inputNode
        let mixerNode = AVAudioMixerNode()
        audioEngine.attach(mixerNode)
        audioEngine.connect(inputNode, to: mixerNode, format: nil)
        guard let recognitionRequest = recognitionRequest else { return }

        // Configure the request so that results are returned before audio recording is finished.
        recognitionRequest.shouldReportPartialResults = true

        // A recognition task represents a speech recognition session.
        // We keep a reference to the task so that it can be cancelled.
        recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { [weak self] result, error in
            guard let self = self else { return }
            var isFinal = false
            if let result = result {
                print("formattedString: \(result.bestTranscription.formattedString)")
                isFinal = result.isFinal
                self.delegate?.speechRecognitionPartialResult(transcription: result.bestTranscription.formattedString)
            }
            if error != nil || isFinal {
                self.audioEngine.stop()
                inputNode.removeTap(onBus: 0)
                self.recognitionRequest = nil
                self.recognitionTask = nil
            }
            if isFinal {
                self.delegate?.speechRecognitionFinished(transcription: result!.bestTranscription.formattedString)
                self.stopRecording()
            } else if error == nil {
                self.restartSpeechTimeout()
            } else {
                // cancel voice recognition
            }
        }

        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { [weak self] (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            self?.recognitionRequest?.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()
    }

    @objc private func timedOut() {
        stopRecording()
        self.delegate?.speechRecognitionTimedOut()
    }

    public func stopRecording() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0) // Remove the tap on the bus when stopping recording.
        recognitionRequest?.endAudio()
        speechRecognitionTimeout?.invalidate()
        speechRecognitionTimeout = nil
    }
}
iOS 18.1 Beta 5 (22B5054e) seems to have resolved this issue and improved U.S. English language recognition & punctuation.
https://developer.apple.com/download/
Here's hoping its Speech framework makes it into the next release.
18.1 Beta 5 (22B5054e) does not fix it. Not quite.
@jsnbro stated above: "iOS 18.1 Beta 5 (22B5054e) seems to have resolved this issue and improved U.S. English language recognition & punctuation."
I upgraded to 22B5054e to re-test. What I'm seeing is not quite a fix: it seems to have reverted to the behavior I saw (and reported in this thread on page 1) on iOS 17.6, specifically:
- the bug does not manifest if you set requiresOnDeviceRecognition = false
- the bug does manifest if you set requiresOnDeviceRecognition = true
As before I am using Apple's SpokenWord example app to test.
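For anyone reproducing this, the toggle in question is the one-line request setting in the sample's startRecording(); here's a hedged sketch (the supportsOnDeviceRecognition guard is my addition, not part of the sample):

// Sketch of the relevant request configuration, as in Apple's SpokenWord sample.
let recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
recognitionRequest.shouldReportPartialResults = true
if speechRecognizer.supportsOnDeviceRecognition {
    recognitionRequest.requiresOnDeviceRecognition = true    // bug manifests
    // recognitionRequest.requiresOnDeviceRecognition = false // bug does not manifest
}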
My first bug report here was using: (Context: iPhone 12 running 17.6.1, Xcode Version 15.4 (15F31d))
For this update: (Context: iPhone 12 running 18.1 Beta (22B5054e), Xcode Version 16.0 (16A242d))
Tagging you, @DTS Engineer. Looks like your efforts are helping.
Looks like your efforts are helping.
Nah, I’m just watching the bugs go by |-:
Seriously though folks, if you have a product that’s affected by this issue and you haven’t already filed a bug, please do so, and post your bug number here, just for the record.
Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
FB15166325, FB15192539, FB15245186, FB15110263, FB15110251
@DTS Engineer
I did receive a request from Apple to clarify the framework(s):
"Apple Sep 26, 2024 at 1:53 PM Engineering has requested the following information regarding your report:
Is this with mainstream Dictation or Voice Control?"
Sure enough, I clarified that it's Dictation and that the frameworks I use are SFSpeechRecognizer, SFSpeechAudioBufferRecognitionRequest, etc.
As you can see, that was a week ago, and so far I haven't heard back from them.
What surprises me is that I don't see other reports of that bug: all of the numbers except mine (FB15245186) return 'Not found'. Needless to say, if I get any response, I'll post it here.
All of the numbers except mine … return 'Not found'.
That’s expected. Feedback Assistant only shows you bugs that you filed [1]. I address this explicitly in Bug Reporting: How and Why?.
Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
[1] Or members of your team, if you use that feature.
@DTS Engineer
I've read your post about the rules of Feedback Assistant. Thank you for clarifying certain points. Nevertheless, I can't help asking a question: if "Feedback Assistant only shows you bugs that you filed", then what is the purpose of the page header "Recent Similar Reports: None / Resolution: Open"?
I guess that boils down to your definition of “shows”. IMO, showing you a count of similar bugs isn’t showing you the other bugs. You can’t, for example, see the titles of those bugs, the initial problem description, the attachments, any communication with the originator, and so on. That’s the definition of “shows” that I’m using.
Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
Another report here: FB15498488
Is anyone seeing progress on their submitted bugs in Feedback Assistant? I just checked mine and was disappointed to see it updated with:
Resolution: Investigation complete - Unable to diagnose with current information
I submitted details about my phone/build. I told them very specifically how to repro the bug using the Apple-written example app 'SpokenWord'. I provided an MP4 showing that app running and manifesting the bug. I provided links to other reports of this same bug (other FB* submissions) in this thread.
I'm not sure what is missing with respect to being able to diagnose it.
Is anyone else having better luck than me?
As of the latest beta (18.1 22B5075a), the 'dropping words' bug as reported here still occurs if requiresOnDeviceRecognition is set to true.