I am using SFSpeechRecognizer on macOS Catalina to transcribe audio files that may be longer than one minute. In my tests, the timestamps of the SFTranscriptionSegments start again at zero partway through the file, as if the audio clock had been reset. This sometimes happens after a minute, and later in the file it happens again at varying times. This renders the timestamps useless.

Is this something that can be configured or worked around? Chopping audio files into one-minute segments risks splitting words, which would hurt result quality.

The documentation states that there is a limit of one minute, but that seems to apply to mobile devices, since recognition clearly works for longer files (I am reliably transcribing 8-minute files with results that contain the entire text).

Does anyone have any insight to share? It doesn't feel right that such general-purpose AI functionality should be so limited on macOS. Transcribing long audio files using the power of a multi-core Mac Pro, for example, looks like a perfectly valid use case.

Thanks
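
To make the kind of workaround being asked about concrete, here is a minimal sketch in Swift that re-bases segment timestamps after a reset. It is an illustration under stated assumptions, not a confirmed fix: `Segment` is a hypothetical stand-in for the `substring`, `timestamp`, and `duration` properties of SFTranscriptionSegment (so the sketch runs without the Speech framework), `rebaseTimestamps` is a hypothetical helper, all segments are assumed to be available in order, every backward jump in `timestamp` is assumed to be a clock reset, and the new window is assumed to begin exactly where the previous segment ended. Any silence around a reset is therefore lost, so the corrected times are only approximate.

```swift
import Foundation

// Hypothetical stand-in for the SFTranscriptionSegment properties used here.
struct Segment {
    let substring: String
    let timestamp: TimeInterval  // start time as reported by the recognizer
    let duration: TimeInterval
}

// Hypothetical helper: returns segments with monotonically increasing timestamps.
// Assumption: a timestamp lower than the previous raw timestamp marks a reset,
// and the reset window starts where the previous (corrected) segment ended.
func rebaseTimestamps(_ segments: [Segment]) -> [Segment] {
    var offset: TimeInterval = 0            // time to add to raw timestamps
    var previousRawStart: TimeInterval = 0  // last raw timestamp seen
    var previousEnd: TimeInterval = 0       // corrected end of the last segment
    var result: [Segment] = []

    for segment in segments {
        if segment.timestamp < previousRawStart {
            // The raw clock went backwards: treat it as a reset.
            offset = previousEnd
        }
        let start = segment.timestamp + offset
        result.append(Segment(substring: segment.substring,
                              timestamp: start,
                              duration: segment.duration))
        previousRawStart = segment.timestamp
        previousEnd = start + segment.duration
    }
    return result
}

// Usage with made-up values: the third segment restarts at zero.
let raw = [
    Segment(substring: "hello", timestamp: 0.0, duration: 0.5),
    Segment(substring: "world", timestamp: 0.5, duration: 0.5),
    Segment(substring: "again", timestamp: 0.0, duration: 0.5),
    Segment(substring: "here",  timestamp: 0.5, duration: 0.25),
]
let fixed = rebaseTimestamps(raw)
print(fixed.map { $0.timestamp })
```

With real recognizer output, the raw array would be built from `bestTranscription.segments`; whether all segments of a long file are actually delivered in one list is itself something to verify.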
Posted by lpde.