How to make AVSpeechSynthesizer work for write and delegate (Catalina)

I am unable to get AVSpeechSynthesizer to write to a buffer or to call the didFinish delegate method.
When I call the function, it merely speaks the string aloud.
I am running macOS 10.15.7 (Catalina).
What am I missing?
```swift
import AVFoundation

class SpeakerTest: NSObject, AVSpeechSynthesizerDelegate {
    let synth = AVSpeechSynthesizer()

    override init() {
        super.init()
        synth.delegate = self
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("Utterance didFinish")
    }

    func speak(_ string: String) {
        let utterance = AVSpeechUtterance(string: string)
        synth.speak(utterance)
    }

    func writeToBuffer(_ string: String) {
        let utterance = AVSpeechUtterance(string: string)
        synth.write(utterance) { (buffer: AVAudioBuffer) in
            guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
                fatalError("unknown buffer type: \(buffer)")
            }
            if pcmBuffer.frameLength == 0 {
                print("buffer is empty")
            } else {
                print("buffer has content \(buffer)")
            }
        }
    }
}

SpeakerTest().writeToBuffer("This should write to buffer and call didFinish delegate.")
```
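One pitfall worth flagging in the snippet above, independent of any OS-level bug: `SpeakerTest()` is created as an unretained temporary, so the synthesizer and its delegate can be deallocated before `write` delivers any buffers. A minimal sketch that keeps a strong reference and appends the delivered buffers to an `AVAudioFile` might look like this (the output path, and the assumption that all buffers share one format, are mine):

```swift
import AVFoundation

// Keep a strong reference so the synthesizer outlives the call.
let speaker = SpeakerTest()
var outputFile: AVAudioFile?   // created lazily from the first buffer's format

let utterance = AVSpeechUtterance(string: "Write me to disk.")
speaker.synth.write(utterance) { buffer in
    guard let pcmBuffer = buffer as? AVAudioPCMBuffer,
          pcmBuffer.frameLength > 0 else { return }
    if outputFile == nil {
        // Illustrative output location.
        let url = URL(fileURLWithPath: NSTemporaryDirectory())
            .appendingPathComponent("speech.caf")
        outputFile = try? AVAudioFile(forWriting: url,
                                      settings: pcmBuffer.format.settings)
    }
    try? outputFile?.write(from: pcmBuffer)
}
```

This only applies on systems where `write(_:toBufferCallback:)` actually delivers buffers; per the replies below, that requires macOS 11 or later.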



Replies

This was fixed in macOS 11.

Fixed -- YES! Thank you for that information. I'm not going mad :-).

Hopefully this is not a silly question, but is there a way to resolve or work around this in Catalina? Requiring a major OS upgrade as the fix is a big ask.

Also, does AVSpeechSynthesizer support offline rendering faster than real time? It's not something I can test under Catalina, and I haven't been able to find documentation for this feature, though it does exist in NSSpeechSynthesizer.

Why? I have been using NSSpeechSynthesizer for years, with an AVAudioEngine + speech mixing workflow for recording audio to a file with faster-than-real-time rendering. I have a working system, but I must work around some challenges with pronunciations.

I can't make the speech dictionary (addSpeechDictionary) work under NSSpeechSynthesizer (I don't know if it's a known problem or just me) and have resorted to butchered spellings to get a voice to pronounce words correctly. My hope is that I can use AVSpeechSynthesizer in IPA mode or some other method to pronounce words correctly while also rendering to disk, all faster than real time.
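For the IPA route, AVSpeechUtterance can be built from an attributed string whose marked range carries the `AVSpeechSynthesisIPANotationAttribute` key, so that range is pronounced from the IPA value rather than from spelling. A minimal sketch (the IPA string itself is illustrative, and the synthesizer should be retained in real code):

```swift
import AVFoundation

// Pronounce "tomato" from an IPA value instead of its spelling.
let text = "I say tomato."
let attributed = NSMutableAttributedString(string: text)
let range = (text as NSString).range(of: "tomato")
attributed.addAttribute(
    NSAttributedString.Key(AVSpeechSynthesisIPANotationAttribute),
    value: "tə.ˈmɑ.toʊ",   // illustrative IPA
    range: range
)

let utterance = AVSpeechUtterance(attributedString: attributed)
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
AVSpeechSynthesizer().speak(utterance)
```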
I took the plunge today and upgraded to Big Sur (11.2.3), and unfortunately neither write nor the callbacks for didFinish nor willSpeakRangeOfSpeechString get called. I only upgraded to use this framework. Is this working for anyone, and if so, what is the secret? Is there an entitlement that must be enabled?

In the writeToBuffer function, none of the print statements are executed, so the synth.write(utterance) callback never runs.

Any advice?
```swift
func writeToBuffer(_ string: String) {
    let utterance = AVSpeechUtterance(string: string)
    utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
    synth.write(utterance) { (buffer: AVAudioBuffer) in
        print("we made it to the utterance \(utterance)")
        guard let pcmBuffer = buffer as? AVAudioPCMBuffer else {
            fatalError("unknown buffer type: \(buffer)")
        }
        print("we are here")
        if pcmBuffer.frameLength == 0 {
            print("buffer is empty")
        } else {
            print("buffer has content \(buffer)")
        }
    }
}
```
```swift
func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
    print("Utterance didFinish")
}

func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                       willSpeakRangeOfSpeechString characterRange: NSRange,
                       utterance: AVSpeechUtterance) {
    print("speaking range: \(characterRange)")
}
```
I also considered that the callbacks didn't execute because I was using the write method, so I executed only the speak method.
The speech was heard; however, the callbacks were still not executed. I am at a loss as to what is needed to make this work. I will gladly continue to use NSSpeechSynthesizer if someone has information on addSpeechDictionary. This is my only impediment in speech: getting the correct pronunciation without resorting to spelling workarounds.
```swift
func speak(_ string: String) {
    let utterance = AVSpeechUtterance(string: string)
    utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
    synth.speak(utterance)
}
```
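On the addSpeechDictionary question: for reference, the expected shape of a call is a dictionary keyed by `NSSpeechSynthesizer.DictionaryKey`, with an `.entries` array mapping spellings to Apple phoneme strings (not IPA). A hedged sketch; the identifier, date, and phoneme string below are illustrative, and I cannot confirm this resolves the reported problem:

```swift
import AppKit

let synth = NSSpeechSynthesizer()

// One pronunciation entry: spelling -> Apple phoneme string (illustrative).
let entry: [NSSpeechSynthesizer.DictionaryKey: Any] = [
    .entrySpelling: "tomato",
    .entryPhonemes: "tUXmEYtOW"
]

let dictionary: [NSSpeechSynthesizer.DictionaryKey: Any] = [
    .identifier: "com.example.pronunciations",  // hypothetical identifier
    .localeIdentifier: "en_US",
    .modificationDate: "2021-04-01",
    .entries: [entry]
]

synth.addSpeechDictionary(dictionary)
synth.startSpeaking("I like tomato.")
```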