Whenever I switch the AVSpeechSynthesizer voice to a (different) German voice, the app hangs for a few seconds (depending on the device) before speech output starts.
Looking at the console output, I see that the German language rule data is about five to eight times larger than, e.g., the English or Italian data:
10:55:16.137820+0200 ... #MobileAsset listing ...'[Available: true, Language: de-DE]'
...
10:55:07.636488+0200 ... Loading on disk rule data: 4392529
10:55:10.818661+0200 ... processing rules: 459, NS: 28669
10:55:10.840711+0200 ... Creating playback session rate: 22050, channels 1
This shows that loading the on-disk rule data took about 3.2 seconds for the German voice.
When I load an English or Italian voice, the load times are much shorter:
10:55:16.137820+0200 ... #MobileAsset listing ...'[Available: true, Language: en-US]'
...
10:55:16.148741+0200 ... Loading on disk rule data: 862210
10:55:16.407063+0200 ... processing rules: 12192, NS: 1611
10:55:16.428606+0200 ... Creating playback session rate: 22050, channels 1
10:55:16.137820+0200 ... #MobileAsset listing ...'[Available: true, Language: it-IT]'
...
10:54:50.816431+0200 ... Loading on disk rule data: 536565
10:54:51.493129+0200 ... processing rules: 2567, NS: 4149
10:54:51.514703+0200 ... Creating playback session rate: 22050, channels 1
These show load times of only about 0.25 and 0.7 seconds!
Interestingly, in a small test app that has exactly the same setup and usage of AVSpeechSynthesizer as my main app, I can NOT reproduce those lengthy load times or the resulting lag before speech output starts after switching to a (different) German voice.
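For reference, the load times and size ratios quoted above can be recomputed from the log lines with a small sketch like the following. The helper names (`secondsSinceMidnight`, `elapsed`) are made up for illustration, and the sketch assumes both timestamps fall on the same day and that only the time part (without the `+0200` offset) is passed in.

```swift
import Foundation

/// Converts a console timestamp such as "10:55:07.636488" into seconds since midnight.
/// Hypothetical helper, not part of any Apple API.
func secondsSinceMidnight(_ stamp: String) -> Double? {
    let parts = stamp.split(separator: ":").map(String.init)
    guard parts.count == 3,
          let hours = Double(parts[0]),
          let minutes = Double(parts[1]),
          let seconds = Double(parts[2]) else { return nil }
    return hours * 3600 + minutes * 60 + seconds
}

/// Elapsed seconds between two timestamps taken on the same day.
func elapsed(from start: String, to end: String) -> Double? {
    guard let s = secondsSinceMidnight(start),
          let e = secondsSinceMidnight(end) else { return nil }
    return e - s
}

// Values copied from the console output above:
// (language, "Loading on disk rule data" time, "processing rules" time, rule data size)
let loads: [(String, String, String, Int)] = [
    ("de-DE", "10:55:07.636488", "10:55:10.818661", 4_392_529),
    ("en-US", "10:55:16.148741", "10:55:16.407063", 862_210),
    ("it-IT", "10:54:50.816431", "10:54:51.493129", 536_565),
]

let germanSize = Double(loads[0].3)
for (language, start, end, size) in loads {
    guard let seconds = elapsed(from: start, to: end) else { continue }
    // Ratio of the German rule data size to this language's rule data size.
    let ratio = germanSize / Double(size)
    print(language, String(format: "load %.2f s, de-DE is %.1fx this size", seconds, ratio))
}
```

This only restates the arithmetic on the numbers already in the logs. It does not explain why the German data takes longer to load.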
This is my code for calling AVSpeechSynthesizer.speak(_):
func speak(_ textToSpeak: String) {
    appState.isPrePause = true

    // Detect the language of the incoming text; fall back to English if detection fails.
    let lang = NLLanguageRecognizer.dominantLanguage(for: textToSpeak)?.rawValue ?? "en"

    // Select a voice based on the detected language.
    let voice = AVSpeechSynthesisVoice(language: lang)
    if voice == nil {
        print("WARNING: no voice for the current language \(lang). Falling back to default voice.")
    }

    // Configure the utterance and hand it to the synthesizer.
    let utterance = AVSpeechUtterance(string: textToSpeak)
    utterance.voice = voice
    utterance.preUtteranceDelay = appState.preUtteranceDelay
    utterance.postUtteranceDelay = appState.postUtteranceDelay
    avSpeechSynth.speak(utterance)
}
The described lag occurs between the call to avSpeechSynth.speak(utterance) and the synthesizer delegate callback "didStart".
I filed a feedback report with Apple (FB11380447) about three weeks ago and have updated it regularly since.
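To put a number on that lag from inside the app, something along these lines could be used. It is only a sketch: `SpeechStartTimer` and `TimedSpeaker` are hypothetical names, not AVFoundation types, and it assumes `speak` and the delegate callback both run on the main thread. The AVFoundation part is wrapped in `canImport` so the timing logic can be tried on its own.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

/// Remembers when an utterance was queued and reports the delay until it started.
/// Hypothetical helper; not thread-safe, intended for use from a single thread.
struct SpeechStartTimer {
    private var pending: [ObjectIdentifier: TimeInterval] = [:]

    /// Call right before handing the utterance to the synthesizer.
    mutating func markQueued(_ utterance: AnyObject,
                             at now: TimeInterval = ProcessInfo.processInfo.systemUptime) {
        pending[ObjectIdentifier(utterance)] = now
    }

    /// Call from the "didStart" callback. Returns nil if the utterance was never marked as queued.
    mutating func markStarted(_ utterance: AnyObject,
                              at now: TimeInterval = ProcessInfo.processInfo.systemUptime) -> TimeInterval? {
        guard let queuedAt = pending.removeValue(forKey: ObjectIdentifier(utterance)) else { return nil }
        return now - queuedAt
    }
}

#if canImport(AVFoundation)
/// Wraps a synthesizer and logs the time between speak(_:) and the didStart callback.
final class TimedSpeaker: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()
    private var timer = SpeechStartTimer()

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func speak(_ utterance: AVSpeechUtterance) {
        timer.markQueued(utterance)
        synthesizer.speak(utterance)
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didStart utterance: AVSpeechUtterance) {
        guard let delay = timer.markStarted(utterance) else { return }
        let voiceID = utterance.voice?.identifier ?? "default voice"
        print("didStart after \(delay) s for \(voiceID)")
    }
}
#endif
```

Logging the voice identifier next to the delay would make it easier to compare the main app and the test app for the same German voice. Note that the measured delay also includes any preUtteranceDelay set on the utterance.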
Has anybody experienced something like this? Any suggestions on where to dig further?