I'm having trouble getting SFSpeechRecognizer and SFSpeechRecognitionTask to transcribe the words from an audio file. I found a solution on Stack Overflow that suggests splitting the audio file into smaller files. How would I do that programmatically using Swift in a macOS app Xcode project?
I would prefer not to split the file into smaller files. I'll submit another post with more details about that.
You can use AVAudioFile for that. Below is an example. The firstHalfUrl and secondHalfUrl variables are URLs where you'd want to save the first and second halves of the original file. Instead of saving smaller files, you might want to append(_:) the AVAudioPCMBuffers directly to an SFSpeechAudioBufferRecognitionRequest; a sketch of that approach follows the example below.
import AVFoundation

let url = Bundle.main.url(forResource: "file", withExtension: "ext")!
let file = try AVAudioFile(forReading: url, commonFormat: .pcmFormatInt16, interleaved: true)
// Capacity covers the larger second half when the frame count is odd.
let frames = file.length / 2
let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                              frameCapacity: AVAudioFrameCount(file.length - frames))!
// Open each output with the same Int16 interleaved format so write(from:) accepts the buffer.
try file.read(into: buffer, frameCount: AVAudioFrameCount(frames))
try AVAudioFile(forWriting: firstHalfUrl, settings: file.fileFormat.settings,
                commonFormat: .pcmFormatInt16, interleaved: true).write(from: buffer)
// Seek past the first half and reuse the buffer for the remainder.
file.framePosition = frames
try file.read(into: buffer, frameCount: AVAudioFrameCount(file.length - frames))
try AVAudioFile(forWriting: secondHalfUrl, settings: file.fileFormat.settings,
                commonFormat: .pcmFormatInt16, interleaved: true).write(from: buffer)
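If you'd rather not write anything to disk, the buffer route might look something like the sketch below. The audioFileUrl name is a placeholder for your file's URL, and it assumes speech recognition authorization has already been granted via SFSpeechRecognizer.requestAuthorization(_:).

import AVFoundation
import Speech

let recognizer = SFSpeechRecognizer()!   // nil when the current locale is unsupported
let request = SFSpeechAudioBufferRecognitionRequest()
request.shouldReportPartialResults = false

// Keep a reference to the task if you need to cancel it later.
let task = recognizer.recognitionTask(with: request) { result, error in
    if let result = result, result.isFinal {
        print(result.bestTranscription.formattedString)
    }
}

// Feed the file to the recognizer in fixed-size chunks. A fresh buffer is
// allocated each pass because the request may hold on to appended buffers.
let audioFile = try AVAudioFile(forReading: audioFileUrl)
let chunkFrames: AVAudioFrameCount = 4096
while audioFile.framePosition < audioFile.length {
    let chunk = AVAudioPCMBuffer(pcmFormat: audioFile.processingFormat,
                                 frameCapacity: chunkFrames)!
    try audioFile.read(into: chunk, frameCount: chunkFrames)
    request.append(chunk)
}
request.endAudio()   // signal that no more audio is coming

Partial results are disabled here so only the final transcription prints; leave shouldReportPartialResults alone if you want intermediate transcriptions as the audio is processed.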