Unable to Install Tap on Input Node in macOS for Live Audio Speech Recognition

I'm developing a game that will use speech recognition to execute various commands. I am using code from Apple's Recognizing Speech in Live Audio documentation page.

When I run this in a Swift Playground, it works just fine. However, when I make a SpriteKit game application (basic setup from Xcode's "New Project" menu option), I get the following error:

required condition is false: IsFormatSampleRateAndChannelCountValid(hwFormat)

Upon further research, it appears that my input node has no channels. The following is the relevant portion of my code, along with debug output:

// audioEngine is an AVAudioEngine instance created elsewhere in this class.
let inputNode = audioEngine.inputNode
print("Number of inputs: \(inputNode.numberOfInputs)") 
// 1
print("Input Format: \(inputNode.inputFormat(forBus: 0))")
// <AVAudioFormat 0x600001bcf200:  0 ch,      0 Hz, 'lpcm' (0x00000029) 32-bit little-endian float, deinterleaved>

let channelCount = inputNode.inputFormat(forBus: 0).channelCount
print("Channel Count: \(channelCount)")
// 0 <== Agrees with the inputFormat output listed previously

// Configure the microphone input.
print("Number of outputs: \(inputNode.numberOfOutputs)")
// 1
let recordingFormat = inputNode.outputFormat(forBus: 0)
print("Output Format: \(recordingFormat)")
// <AVAudioFormat 0x600001bf3160:  2 ch,  44100 Hz, Float32, non-inter>

inputNode.installTap(onBus: 0, bufferSize: 256, format: recordingFormat, block: audioTap) // <== This is where the error occurs.

// NOTE: 'audioTap' is a method defined in this class, used here instead of an inline closure.

The code snippet lives in the game's AppDelegate class (which imports Cocoa, AVFoundation, and Speech) and executes during its applicationDidFinishLaunching function. I'm having trouble understanding why the Playground works but the game app doesn't. Do I need to do something specific to get the application to recognize the microphone?
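For completeness, the surrounding setup looks roughly like this (simplified; error handling is omitted, and startListening/audioTap are my own names for the pieces around the snippet above):

import Cocoa
import AVFoundation
import Speech

class AppDelegate: NSObject, NSApplicationDelegate {

    let audioEngine = AVAudioEngine()

    func applicationDidFinishLaunching(_ aNotification: Notification) {
        // Ask for speech-recognition authorization before starting the engine.
        SFSpeechRecognizer.requestAuthorization { status in
            guard status == .authorized else { return }
            DispatchQueue.main.async { self.startListening() }
        }
    }

    func startListening() {
        let inputNode = audioEngine.inputNode
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 256, format: recordingFormat, block: audioTap)
        audioEngine.prepare()
        try? audioEngine.start()
    }

    // Tap block: receives microphone buffers; in the full implementation these
    // would be appended to an SFSpeechAudioBufferRecognitionRequest.
    func audioTap(buffer: AVAudioPCMBuffer, when: AVAudioTime) {
        // ...
    }
}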

NOTE: This is for macOS, NOT iOS. While the "How To" documentation cited earlier targets iOS, Apple stated at WWDC19 that speech recognition is now supported on macOS.

NOTE: I have included the NSSpeechRecognitionUsageDescription key in the application's Info.plist, and successfully acknowledged the authorization request for the microphone.

Accepted Reply

!! SOLVED !!

It turns out that preparing an application for speech recognition using live audio from the microphone is a two-step process. The first step is well documented and involves adding a key to the app's Info.plist. The following screenshot shows the entry that is needed in the app's project file.
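In raw XML, the Info.plist entries look roughly like the following. NSSpeechRecognitionUsageDescription is the key I mentioned above; I'm assuming that's what the screenshot shows (macOS also has a separate NSMicrophoneUsageDescription key that covers microphone access):

<key>NSSpeechRecognitionUsageDescription</key>
<string>This game uses speech recognition to execute voice commands.</string>
<key>NSMicrophoneUsageDescription</key>
<string>This game needs microphone access to hear voice commands.</string>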

To add a key via the UI, hover over any existing key and click the plus icon that appears to its right, then choose the needed key from the drop-down list. After choosing the key, add some descriptive text; this text is shown in the permission request window when the app is first launched on the computer/device.

The second step in this process is not well documented; I stumbled upon it accidentally while researching a different matter. Evidently, macOS apps run in a sandbox and must register which capabilities they want to use. The following screenshot is from the Signing & Capabilities tab of the app's project file. Look for the section labelled 'App Sandbox'; if it isn't there, add it via the '+ Capability' button found just below the tabs. Then find 'Audio Input' in the 'Hardware' group and ensure that its checkbox is checked.
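Under the hood, checking that box adds an entitlement to the app's .entitlements file. If you prefer to verify it in the raw file, the relevant keys look like this (com.apple.security.device.audio-input is the documented entitlement behind the Audio Input checkbox):

<key>com.apple.security.app-sandbox</key>
<true/>
<key>com.apple.security.device.audio-input</key>
<true/>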
