Hi,
I'd like to upload audio samples to the OpenAI Whisper API, or other endpoints that accept m4a, mp3, and wav data. I capture audio from the microphone, perform some basic signal processing to filter out non-voice samples, and am left with an AVAudioPCMBuffer.
It seems there are two approaches:
- Use AVAudioConverter to create an AVAudioCompressedBuffer in MPEG-4 audio format. I've gotten this working (although I can't verify that the compressed data is valid, because I'm unable to export a file with proper MPEG-4 headers). A simplified sketch is below.
- Use AVAssetWriter, but this writes to disk, which strikes me as inefficient.
Neither of these readily produces a memory buffer with a .m4a file inside.
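For reference, the converter path I have is roughly this shape (simplified, with error handling trimmed; compressToAAC is just a placeholder name):

```swift
import AVFoundation

func compressToAAC(_ pcmBuffer: AVAudioPCMBuffer) -> AVAudioCompressedBuffer? {
    // Describe the target format: MPEG-4 AAC at the source sample rate / channel count.
    var aacDescription = AudioStreamBasicDescription(
        mSampleRate: pcmBuffer.format.sampleRate,
        mFormatID: kAudioFormatMPEG4AAC,
        mFormatFlags: 0,
        mBytesPerPacket: 0,
        mFramesPerPacket: 0,
        mBytesPerFrame: 0,
        mChannelsPerFrame: pcmBuffer.format.channelCount,
        mBitsPerChannel: 0,
        mReserved: 0
    )
    guard let aacFormat = AVAudioFormat(streamDescription: &aacDescription),
          let converter = AVAudioConverter(from: pcmBuffer.format, to: aacFormat) else {
        return nil
    }

    // AAC packs 1024 PCM frames per packet, so size the output accordingly.
    let packetCapacity = AVAudioPacketCount(pcmBuffer.frameLength / 1024 + 1)
    let compressed = AVAudioCompressedBuffer(
        format: aacFormat,
        packetCapacity: packetCapacity,
        maximumPacketSize: converter.maximumOutputPacketSize
    )

    // Feed the single PCM buffer once, then signal end of stream.
    var fedInput = false
    var conversionError: NSError?
    let status = converter.convert(to: compressed, error: &conversionError) { _, inputStatus in
        if fedInput {
            inputStatus.pointee = .endOfStream
            return nil
        }
        fedInput = true
        inputStatus.pointee = .haveData
        return pcmBuffer
    }

    guard status != .error, conversionError == nil else { return nil }
    // The buffer now holds raw AAC packets, with no MPEG-4 (.m4a) container around them.
    return compressed
}
```

The resulting AVAudioCompressedBuffer seems to contain raw AAC packets, but there's no container around them, which is where I'm stuck.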
Am I missing some obvious way to do this? How do people upload compressed audio data to remote endpoints?
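For completeness, the disk-based fallback I'm weighing looks roughly like this (AVAudioFile is standing in for the AVAssetWriter route just to keep the sketch short; the temp-file round trip is the part I'd rather avoid):

```swift
import AVFoundation

func m4aData(from pcmBuffer: AVAudioPCMBuffer) throws -> Data {
    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent(UUID().uuidString)
        .appendingPathExtension("m4a")
    defer { try? FileManager.default.removeItem(at: url) }

    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatMPEG4AAC,
        AVSampleRateKey: pcmBuffer.format.sampleRate,
        AVNumberOfChannelsKey: pcmBuffer.format.channelCount
    ]

    do {
        // Assumes the buffer uses one of the common PCM formats (e.g. Float32 from a mic tap).
        let file = try AVAudioFile(
            forWriting: url,
            settings: settings,
            commonFormat: pcmBuffer.format.commonFormat,
            interleaved: pcmBuffer.format.isInterleaved
        )
        try file.write(from: pcmBuffer)
    } // AVAudioFile finalizes the container when it goes out of scope.

    return try Data(contentsOf: url)
}
```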
I've also explored writing my own MPEG-4-compliant header by hand, but I was unable to produce a valid file.
Thanks,
-- B.