Updated info below
Full disclosure: I do have this question over on StackOverflow, but I am at a standstill till I find a way to move forward, debug, etc.
I am trying to recognize prerecorded speech in Swift. Essentially it either detects no speech, detects blank speech, or works on the one prerecorded file where I screamed a few words.
I can't tell where the headache lies and can't figure out if there's a more detailed way to debug this. I can't find any properties that give more detailed info.
Someone on SO did recommend I go through Apple's demo, here. This works just fine, and my code is very similar to it. So the open question is whether something about the way I save my audio files, or something else entirely, is causing my headaches.
If anyone has any insight into this I would very much appreciate any hints.
My question over on StackOverflow
Updated info below, and new code
Updated info: It appears that I was calling SFSpeechURLRecognitionRequest too often, starting new requests before the first one had completed. Perhaps I need to create a new instance of SFSpeechRecognizer? Unsure.
Regardless, I quickly/sloppily adjusted the code to only run it once the previous instance returned its results.
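For anyone wanting to see the "one at a time" idea in isolation, here is a minimal sketch of it. It is not my actual code: `SequentialTranscriber` and the `Transcriber` closure type are hypothetical names made up for this illustration, and the closure stands in for the real call to `recognitionTask`. It also assumes the closure calls `done` exactly once per file.

```swift
import Foundation

// Hypothetical stand-in for the real transcription call: it receives a file
// name and must call `done` exactly once, with the recognized text or nil.
typealias Transcriber = (_ file: String, _ done: @escaping (String?) -> Void) -> Void

/// Runs `transcribe` on each file in order, starting the next file only
/// after the previous one has reported back.
final class SequentialTranscriber {
    private var pending: [String]
    private let transcribe: Transcriber
    private(set) var results: [(file: String, text: String?)] = []

    init(files: [String], transcribe: @escaping Transcriber) {
        self.pending = files
        self.transcribe = transcribe
    }

    func start(completion: @escaping ([(file: String, text: String?)]) -> Void) {
        // Nothing left to do: hand back everything collected so far.
        guard !pending.isEmpty else {
            completion(results)
            return
        }
        let file = pending.removeFirst()
        transcribe(file) { text in
            self.results.append((file: file, text: text))
            // Advance whether or not text came back, so one bad file
            // does not stall the rest of the queue.
            self.start(completion: completion)
        }
    }
}
```

In the real code the closure would presumably wrap `recognitionTask(with:)` and call `done` from both the `isFinal` branch and the error branch, instead of posting a notification.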
The results were much better, except one audio file still came back with no results. Not an error, just no text.
This file comes from the same recording as the previous file: I took one audio recording and split it in two. So the formats and volumes are the same.
So I still need a better way to debug this, to find out what is going wrong with that file.
The code where I grab the files and attempt to read them:
func findAudioFiles() {
    let fm = FileManager.default
    print("\(urlPath)")
    do {
        // List the documents directory and keep only the .m4a files whose names contain "SS-X-".
        let items = try fm.contentsOfDirectory(atPath: documentsPath)
        let matching = items.filter { $0.hasSuffix(".m4a") && $0.contains("SS-X-") }
        // Sort by name so the files are processed in a predictable order.
        audioFiles.append(contentsOf: matching.sorted())
        // Post goAndRead to start transcribing the first file.
        NotificationCenter.default.post(name: Notification.Name("goAndRead"), object: nil, userInfo: myDic)
    } catch {
        print("\(error)")
    }
}
@objc func goAndRead() {
    // Move on to the next file; do nothing once every file has been handled.
    audioIndex += 1
    if audioIndex < audioFiles.count {
        let fileURL = URL(fileURLWithPath: documentsPath + "/" + audioFiles[audioIndex], isDirectory: false)
        transcribeAudio(url: fileURL, item: audioFiles[audioIndex])
    }
}
func requestTranscribePermissions() {
    SFSpeechRecognizer.requestAuthorization { authStatus in
        DispatchQueue.main.async {
            if authStatus == .authorized {
                print("Good to go!")
            } else {
                print("Transcription permission was declined.")
            }
        }
    }
}
func transcribeAudio(url: URL, item: String) {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")) else { return }
    let request = SFSpeechURLRecognitionRequest(url: url)
    // Bail out early if on-device recognition is unsupported or the recognizer is unavailable.
    if !recognizer.supportsOnDeviceRecognition { print("offline not available"); return }
    if !recognizer.isAvailable { print("not available"); return }
    request.requiresOnDeviceRecognition = true
    request.shouldReportPartialResults = true
    recognizer.recognitionTask(with: request) { (result, error) in
        guard let result = result else {
            // No result at all. Note that goAndRead is not posted here,
            // so the remaining files are not processed after an error.
            print("\(item) : There was an error: \(error.debugDescription)")
            return
        }
        // Partial results are ignored; only the final transcription is printed.
        if result.isFinal {
            print("\(item) : \(result.bestTranscription.formattedString)")
            // Trigger the next file now that this one is finished.
            NotificationCenter.default.post(name: Notification.Name("goAndRead"), object: nil, userInfo: self.myDic)
        }
    }
}
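One thing that might squeeze a bit more detail out of the error branch above is to log the error's domain, code and user info rather than only its `debugDescription`. This is just a sketch: `describe(_:)` is a hypothetical helper name, and it relies on nothing beyond the standard bridging of `Error` to `NSError`.

```swift
import Foundation

/// Builds a more detailed, single-line description of an optional error.
/// `describe` is a hypothetical helper, not part of the Speech framework.
func describe(_ error: Error?) -> String {
    guard let error = error else { return "no error object" }
    let ns = error as NSError
    var parts = [
        "domain=\(ns.domain)",
        "code=\(ns.code)",
        "message=\(ns.localizedDescription)"
    ]
    // Keep user-info entries in a stable order so logs are easy to compare.
    for key in ns.userInfo.keys.sorted() {
        parts.append("\(key)=\(ns.userInfo[key]!)")
    }
    return parts.joined(separator: ", ")
}
```

In `transcribeAudio` this would be used as `print("\(item) : \(describe(error))")` inside the `guard ... else` branch. For results that do arrive, each `SFTranscriptionSegment` in `result.bestTranscription.segments` also carries `substring`, `timestamp`, `duration` and `confidence`, which may be worth printing for the file that comes back empty.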