How to compress an AVAudioPCMBuffer to m4a file format without writing to disk?

Hi,

I'd like to upload audio samples to the OpenAI Whisper API, or to other services that accept m4a, mp3, and wav data. I capture from the microphone, perform some basic signal processing to filter out non-voice samples, and am left with an AVAudioPCMBuffer.

It seems there are two approaches:

  1. Use AVAudioConverter to create an AVAudioCompressedBuffer in MPEG-4 audio format. I've gotten this working (although I can't verify that the compressed data is valid, because I'm unable to export a file with proper MPEG-4 headers).

  2. Use AVAssetWriter, but this writes to disk, which strikes me as inefficient.

Neither of these readily produces a memory buffer with a .m4a file inside.
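For reference, approach 1 can be sketched roughly as below. This is a minimal sketch, not a verified implementation: `encodeToAAC` is a hypothetical helper name, the packet-capacity estimate is an assumption, and it assumes the PCM sample rate is one the AAC encoder accepts (for example 44.1 kHz). The result is a run of raw AAC packets in memory with no MPEG-4 container around them, which is exactly the gap described above.

```swift
import AVFoundation

/// Hypothetical helper: encodes one PCM buffer to raw AAC packets in memory.
/// The output has no MPEG-4 container, so it is not yet a valid .m4a file.
func encodeToAAC(_ pcm: AVAudioPCMBuffer) throws -> AVAudioCompressedBuffer {
    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatMPEG4AAC,
        AVSampleRateKey: pcm.format.sampleRate,          // assumed to be encoder-compatible
        AVNumberOfChannelsKey: Int(pcm.format.channelCount)
    ]
    guard let outFormat = AVAudioFormat(settings: settings),
          let converter = AVAudioConverter(from: pcm.format, to: outFormat) else {
        throw NSError(domain: "EncodeExample", code: -1, userInfo: nil)
    }

    // AAC-LC carries 1024 PCM frames per packet; the "+ 3" headroom for
    // encoder priming/flush packets is an assumption, not a documented bound.
    let packetCapacity = AVAudioPacketCount(pcm.frameLength / 1024 + 3)
    let output = AVAudioCompressedBuffer(format: outFormat,
                                         packetCapacity: packetCapacity,
                                         maximumPacketSize: converter.maximumOutputPacketSize)

    var supplied = false
    var conversionError: NSError?
    let status = converter.convert(to: output, error: &conversionError) { _, inputStatus in
        // Hand the single PCM buffer over once, then signal end of stream
        // so the encoder flushes its remaining packets.
        if supplied {
            inputStatus.pointee = .endOfStream
            return nil
        }
        supplied = true
        inputStatus.pointee = .haveData
        return pcm
    }
    if status == .error {
        throw conversionError ?? NSError(domain: "EncodeExample", code: -2, userInfo: nil)
    }
    return output
}
```

The encoded bytes can then be copied out with `Data(bytes: output.data, count: Int(output.byteLength))`, but without the container a server expecting an .m4a file will most likely reject them.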

Am I missing some obvious way to do this? How do people upload compressed audio data to remote endpoints?

I've also explored just trying to create my own MPEG-4 compliant header but I was unable to produce a valid file.

Thanks,

-- B.

Replies

I just had a quick glance at the OpenAI Whisper API - the translation and transcription endpoints expect a file, not a stream. Your path of least resistance is likely to be to create a RAM disk using hdiutil. hth, Stuart

  • Thank you! Will look into that! They do indeed expect a file. That is why I was hoping there was a way to simply write the file data to memory. Theoretically, it's possible to wrap the data output by AVAudioConverter in the appropriate headers but that's a fairly deep rabbit hole I spent a few hours going down without being able to produce a functioning m4a file.

    I did not know about hdiutil and will look into this. A RAM disk would probably work!

Oh gosh, I forgot to say explicitly (although I did tag it) that this is iOS. Just realized that RAM disks aren't an option there. There really isn't any API for producing asset files in memory?
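In case it helps anyone landing here: failing an in-memory API, one fallback is a short-lived file in the app's temporary directory that is read back into a `Data` and deleted immediately. The sketch below assumes that is acceptable for the use case. `m4aData(from:)` is a hypothetical helper name, and it relies on `AVAudioFile` finalizing the file when the object is released at the end of the inner scope, which is an assumption worth verifying on the target OS version.

```swift
import AVFoundation

/// Hypothetical helper: writes the PCM buffer to a temporary .m4a file,
/// reads the finished file back into memory, then removes it.
func m4aData(from pcm: AVAudioPCMBuffer) throws -> Data {
    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent(UUID().uuidString)
        .appendingPathExtension("m4a")
    // Remove the temporary file whether or not the function succeeds.
    defer { try? FileManager.default.removeItem(at: url) }

    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatMPEG4AAC,
        AVSampleRateKey: pcm.format.sampleRate,          // assumed to be encoder-compatible
        AVNumberOfChannelsKey: Int(pcm.format.channelCount)
    ]

    do {
        // The processing format is chosen to match the buffer, so write(from:)
        // can take it directly; AVAudioFile encodes to AAC on the way out.
        let file = try AVAudioFile(forWriting: url,
                                   settings: settings,
                                   commonFormat: pcm.format.commonFormat,
                                   interleaved: pcm.format.isInterleaved)
        try file.write(from: pcm)
    } // `file` goes out of scope here, which is assumed to close and finalize it.

    return try Data(contentsOf: url)
}
```

The returned `Data` can be attached to a multipart upload as the file body. This does touch the file system briefly, so it is a workaround for the missing in-memory path rather than an answer to it.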

Use the AudioToolbox API AudioConverterFillComplexBuffer(). It can convert a PCM buffer to compressed audio formats such as AAC.