Unable to Install Tap on Input Node in macOS for Live Audio Speech Recognition

I'm developing a game that will use speech recognition to execute various commands. I am using code from Apple's Recognizing Speech in Live Audio documentation page.

When I run this in a Swift Playground, it works just fine. However, when I make a SpriteKit game application (basic setup from Xcode's "New Project" menu option), I get the following error:

required condition is false: IsFormatSampleRateAndChannelCountValid(hwFormat)

Upon further research, it appears that my input node has no channels. The following is the relevant portion of my code, along with debug output:

// audioEngine is an AVAudioEngine instance created elsewhere in this class.
let inputNode = audioEngine.inputNode
print("Number of inputs: \(inputNode.numberOfInputs)") 
// 1
print("Input Format: \(inputNode.inputFormat(forBus: 0))")
// <AVAudioFormat 0x600001bcf200:  0 ch,      0 Hz, 'lpcm' (0x00000029) 32-bit little-endian float, deinterleaved>

let channelCount = inputNode.inputFormat(forBus: 0).channelCount
print("Channel Count: \(channelCount)")
// 0 <== Agrees with the inputFormat output listed previously

// Configure the microphone input.
print("Number of outputs: \(inputNode.numberOfOutputs)")
// 1
let recordingFormat = inputNode.outputFormat(forBus: 0)
print("Output Format: \(recordingFormat)")
// <AVAudioFormat 0x600001bf3160:  2 ch,  44100 Hz, Float32, non-inter>

inputNode.installTap(onBus: 0, bufferSize: 256, format: recordingFormat, block: audioTap) // <== This is where the error occurs.

// NOTE: 'audioTap' is a method defined in this class, used here instead of an inline closure.

The code snippet lives in the game's AppDelegate class (which imports Cocoa, AVFoundation, and Speech) and executes during its applicationDidFinishLaunching function. I'm having trouble understanding why the Playground works but the game app doesn't. Do I need to do something specific to get the application to recognize the microphone?
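For completeness, the surrounding setup looks roughly like this (simplified; error handling is omitted, and startListening/audioTap are my own names for the pieces around the snippet above):

import Cocoa
import AVFoundation
import Speech

class AppDelegate: NSObject, NSApplicationDelegate {

    let audioEngine = AVAudioEngine()

    func applicationDidFinishLaunching(_ aNotification: Notification) {
        // Ask for speech-recognition authorization before starting the engine.
        SFSpeechRecognizer.requestAuthorization { status in
            guard status == .authorized else { return }
            DispatchQueue.main.async { self.startListening() }
        }
    }

    func startListening() {
        let inputNode = audioEngine.inputNode
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 256, format: recordingFormat, block: audioTap)
        audioEngine.prepare()
        try? audioEngine.start()
    }

    // Tap block: receives microphone buffers; in the full implementation these
    // would be appended to an SFSpeechAudioBufferRecognitionRequest.
    func audioTap(buffer: AVAudioPCMBuffer, when: AVAudioTime) {
        // ...
    }
}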

NOTE: This is for macOS, NOT iOS. While the "How To" documentation cited earlier targets iOS, Apple stated at WWDC19 that speech recognition is now supported on macOS.

NOTE: I have included the NSSpeechRecognitionUsageDescription key in the application's Info.plist, and successfully acknowledged the authorization request for the microphone.

Accepted Reply

!! SOLVED !!

It turns out that preparing an application for speech recognition using live audio from the microphone is a two-step process. The first step is well documented and involves adding a key to the app's Info.plist. The following screenshot shows the entry that is needed in the app's project file.
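In raw XML, the Info.plist entries look roughly like the following. NSSpeechRecognitionUsageDescription is the key I mentioned above; I'm assuming that's what the screenshot shows (macOS also has a separate NSMicrophoneUsageDescription key that covers microphone access):

<key>NSSpeechRecognitionUsageDescription</key>
<string>This game uses speech recognition to execute voice commands.</string>
<key>NSMicrophoneUsageDescription</key>
<string>This game needs microphone access to hear voice commands.</string>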

To add a key via the UI, hover over any existing key and click the plus icon that appears to its right, then choose the needed key from the drop-down list. After choosing the key, add some descriptive text; this text is shown in the permission request window when the app is first launched on the computer/device.

The second step in this process is not well documented; I stumbled upon it accidentally while researching a different matter. Evidently, macOS apps run in a sandbox and must register which capabilities they want to use. The following screenshot is from the Signing & Capabilities tab of the app's project file. Look for the section labelled 'App Sandbox'; if it isn't there, add it via the '+ Capability' button found just below the tabs. Then find 'Audio Input' in the 'Hardware' group and ensure that its checkbox is checked.
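Under the hood, checking that box adds an entitlement to the app's .entitlements file. If you prefer to verify it in the raw file, the relevant keys look like this (com.apple.security.device.audio-input is the documented entitlement behind the Audio Input checkbox):

<key>com.apple.security.app-sandbox</key>
<true/>
<key>com.apple.security.device.audio-input</key>
<true/>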
