How to specify bit rate when writing with AVAudioFile

I can currently write, using AVAudioFile, to any of the file formats specified by Core Audio.

It can create files in all formats (except one, see below) that can be read into iTunes, QuickTime and other apps and played back.

However some formats appear to be ignoring values in the AVAudioFile settings dictionary.



For example:

• An MP4 or AAC will save and write successfully at any sample rate but any bit rates I attempt to specify are ignored.

• WAV files saved with floating-point data are always converted to Int32 even though I specify float, and even though the PCM buffers on both the input and output side of the sample rate conversion are float. So AVAudioFile is taking Float input but converting it to Int for some reason I can’t fathom.

• The only crash/exception/failure I see is if I attempt to create an AVAudioFile as WAV/64 bit float. … bang, AVAudioFile isn’t having that one!



The technique I’m using is:

• Create AVAudioFile for writing with a settings dictionary.

• Get processing and file format from AVAudioFile

• The client format is always 32-bit Float. AVAudioFile generally reports its processing format as a Float format of some other word size, at the sample rate and size I’ve specified in the fileFormat.

• Create a converter to convert from client format to processing format.

• Process the input data through the converter to the file using converter.convert(to:error:withInputFrom:).
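
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the original poster's code: the helper name writeThroughConverter and the WriteError type are hypothetical, and it assumes the whole input fits in one buffer, so a single convert call followed by end-of-stream is enough.

```swift
import AVFoundation

// Hypothetical error type for this sketch.
enum WriteError: Error { case converterUnavailable, bufferUnavailable }

/// Hypothetical helper: converts `input` (client format) to the file's
/// processing format and writes the result to `url`.
func writeThroughConverter(_ input: AVAudioPCMBuffer,
                           to url: URL,
                           settings: [String: Any]) throws {
    // 1. Create the file for writing from a settings dictionary.
    let file = try AVAudioFile(forWriting: url, settings: settings)

    // 2. Ask the file which PCM format it expects to be fed.
    let processingFormat = file.processingFormat

    // 3. Build a converter from the client format to the processing format.
    guard let converter = AVAudioConverter(from: input.format, to: processingFormat) else {
        throw WriteError.converterUnavailable
    }

    // Size the output for the sample-rate ratio, with a little headroom.
    let ratio = processingFormat.sampleRate / input.format.sampleRate
    let capacity = AVAudioFrameCount((Double(input.frameLength) * ratio).rounded(.up)) + 16
    guard let output = AVAudioPCMBuffer(pcmFormat: processingFormat, frameCapacity: capacity) else {
        throw WriteError.bufferUnavailable
    }

    // 4. Pull the input through the converter: hand it over once,
    //    then report end of stream so the converter can flush.
    var supplied = false
    var conversionError: NSError?
    let status = converter.convert(to: output, error: &conversionError) { _, inputStatus in
        if supplied {
            inputStatus.pointee = .endOfStream
            return nil
        }
        supplied = true
        inputStatus.pointee = .haveData
        return input
    }
    if status == .error, let failure = conversionError { throw failure }

    // 5. AVAudioFile converts from its processing format to the file format on write.
    try file.write(from: output)
}
```

Note that nothing in this path hands an encoder bit rate to anything other than the settings dictionary, which is where the behaviour described below shows up.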



So this works … sort of

The files ( be they wav, aiff, flac, pp3, aac, mp4 etc ) are written out and will play back just fine.

… but …

If the processing word format is Float, in a PCM file like WAV, the AVAudioFile will always report its fileFormat as Int32.

And if the file is a compressed format such as mp4/aac, any bit rates I attempt to specify are simply ignored, but the sample rate appears to be respected, as if the converters/encoders just choose a bit rate based on the sample rate.



So after all that waffle, I've missed something that's probably meant to be obvious, so my questions are …


• For LPCM float formats, why is Int32 data written even though the AVAudioFile settings dictionary has AVLinearPCMIsFloatKey set to true?


• How do I arrange the setup so that I can specify the bit rate for compressed audio?



The only buffers I actually create are both PCM, the client output buffer, and the AVAudioConverter/AVAudioFile processing buffer.

I’ve attempted using AVAudioCompressedBuffer but haven’t had any luck.



I hope someone has some clues because I’ve spent more hours on this than anyone should ever need to!

For my Christmas present I’d like Core Audio to be fully and comprehensively documented please!

Answered by ForumsContributor in

I agree that the documentation for working with audio types is incomplete and messy. I've done some playing around with audio queues (hidden in the AudioToolbox framework), so figured maybe I could give this a shot.


On specifying the bit rate for compressed audio: I doubt you can actually do this. In CoreAudio, the AudioStreamBasicDescription structure has a property called mBitsPerChannel, and the discussion of that property includes this line: "Set the number of bits to 0 for compressed formats." So if I were to open an audio queue for playback that expects data to come in a compressed audio format (as indicated by the mFormatID property), I wouldn't specify the number of bits per audio sample. Something similar may be happening on your end. Maybe your desired bit rate is ignored because the audio format itself determines what the bit rate will be?


On trying to write float PCM data: I have no idea what might be going wrong. Judging by the documentation, it should work like you described, but just to make sure, are you doing something along the lines of (in Swift)

let audioFile = try! AVAudioFile(forWriting: url,
                                   settings: [AVLinearPCMIsFloatKey: true],
                                   commonFormat: .pcmFormatFloat32,
                                   interleaved: true)

or are you doing something totally different in the settings dictionary?


Sorry for the late reply, and it looks like you didn't get the Christmas present you wanted either. But I hope this helps a little bit.

Thanks Scott, Since originally writing this post I've discovered 2 things ...


1) Integers being written to WAV files when float is specified is actually a bug in Core Audio. It's been there a very long time from what I can see, others have reported it in the past and I have also just reported it. But I'm not holding my breath for a fix.


At the moment it seems one must use AIFC or CAF to successfully write floating-point audio files using Core Audio.
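
A minimal sketch of that workaround, assuming a CAF container and 32-bit little-endian float LPCM. The helper name makeFloatCAF, the sample rate, the channel count and the output path are all illustrative, not taken from the original post.

```swift
import AVFoundation

// Settings for 32-bit float linear PCM. Sample rate and channel count are illustrative.
let floatSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatLinearPCM,
    AVSampleRateKey: 44_100.0,
    AVNumberOfChannelsKey: 1,
    AVLinearPCMBitDepthKey: 32,
    AVLinearPCMIsFloatKey: true,
    AVLinearPCMIsBigEndianKey: false
]

/// Hypothetical helper: opens a CAF file for writing with the float settings above.
/// AVAudioFile picks the container type from the ".caf" extension of `url`.
func makeFloatCAF(at url: URL) throws -> AVAudioFile {
    try AVAudioFile(forWriting: url, settings: floatSettings)
}

let cafURL = URL(fileURLWithPath: NSTemporaryDirectory())
    .appendingPathComponent("float-sketch.caf")
let cafFile = try makeFloatCAF(at: cafURL)

// Inspect what the file says it will store, to check whether float survived.
print(cafFile.fileFormat)
```

Printing fileFormat straight after creating the file is a quick way to see whether the container kept the float flag or fell back to an integer format.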


2) You're quite right. It doesn't seem possible to specify an actual bit rate in compressed forms. (So I've stopped worrying about that! 😉)

Accepted Answer

Some time later ... ...

In the end my solution to the compression bit-rate was to revert to using the old ExtAudioFile API with an input consisting of one or two 32 bit uncompressed float channels and utilising the kAudioConverterEncodeBitRate property on the compressed formats I needed.
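
For anyone landing here, the ExtAudioFile approach might look roughly like the sketch below. It is not the poster's actual code: the helper names check and makeAACFile, the M4A container and the interleaved client layout are assumptions made for illustration.

```swift
import AudioToolbox
import Foundation

/// Hypothetical helper: turns a non-zero OSStatus into a thrown NSError.
func check(_ status: OSStatus) throws {
    guard status == noErr else {
        throw NSError(domain: NSOSStatusErrorDomain, code: Int(status))
    }
}

/// Hypothetical helper: creates an AAC (.m4a) file that accepts interleaved
/// Float32 client data and asks the encoder for `bitRate` bits per second.
func makeAACFile(at url: URL, sampleRate: Double,
                 channels: UInt32, bitRate: UInt32) throws -> ExtAudioFileRef {
    // File (destination) format: AAC, so the PCM-only fields stay at 0.
    var fileFormat = AudioStreamBasicDescription(
        mSampleRate: sampleRate, mFormatID: kAudioFormatMPEG4AAC, mFormatFlags: 0,
        mBytesPerPacket: 0, mFramesPerPacket: 1024, mBytesPerFrame: 0,
        mChannelsPerFrame: channels, mBitsPerChannel: 0, mReserved: 0)

    var created: ExtAudioFileRef?
    try check(ExtAudioFileCreateWithURL(url as CFURL, kAudioFileM4AType, &fileFormat,
                                        nil, AudioFileFlags.eraseFile.rawValue, &created))
    guard let file = created else {
        throw NSError(domain: NSOSStatusErrorDomain, code: Int(kAudio_ParamError))
    }

    // Client (input) format: interleaved 32-bit float PCM.
    let bytesPerFrame = 4 * channels
    var clientFormat = AudioStreamBasicDescription(
        mSampleRate: sampleRate, mFormatID: kAudioFormatLinearPCM,
        mFormatFlags: kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
        mBytesPerPacket: bytesPerFrame, mFramesPerPacket: 1, mBytesPerFrame: bytesPerFrame,
        mChannelsPerFrame: channels, mBitsPerChannel: 32, mReserved: 0)
    try check(ExtAudioFileSetProperty(file, kExtAudioFileProperty_ClientDataFormat,
                                      UInt32(MemoryLayout<AudioStreamBasicDescription>.size),
                                      &clientFormat))

    // Setting the client format creates the internal converter; fetch it.
    var converter: AudioConverterRef?
    var converterSize = UInt32(MemoryLayout<AudioConverterRef?>.size)
    try check(ExtAudioFileGetProperty(file, kExtAudioFileProperty_AudioConverter,
                                      &converterSize, &converter))
    guard let encoder = converter else {
        throw NSError(domain: NSOSStatusErrorDomain, code: Int(kAudio_ParamError))
    }

    // Request the encoder bit rate in bits per second.
    var requestedBitRate = bitRate
    try check(AudioConverterSetProperty(encoder, kAudioConverterEncodeBitRate,
                                        UInt32(MemoryLayout<UInt32>.size), &requestedBitRate))

    // Tell ExtAudioFile the converter changed so it resynchronises with it.
    var converterConfig: CFArray? = nil
    try check(ExtAudioFileSetProperty(file, kExtAudioFileProperty_ConverterConfig,
                                      UInt32(MemoryLayout<CFArray?>.size), &converterConfig))
    return file
}
```

Audio would then be written with ExtAudioFileWrite and the file finished with ExtAudioFileDispose. Not every bit rate is valid for every sample rate and channel count; the converter's kAudioConverterApplicableEncodeBitRates property can be queried for the ranges the encoder accepts.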

The problem with WAV files being written in the wrong format turned out to be a bug, which Apple acknowledged and has now fixed :)

Full example of generation and saving:

import AVFoundation
import Foundation

struct FMSynth {

    static let sampleRate = 44100.0
    static let carrierFrequency: Float32 = 440.0
    static let unitVelocity = Float32(2.0 * .pi / sampleRate)
    static let modulatorFrequency: Float32 = 679.0
    static let channels: UInt32 = 1

    let modulatorAmplitude: Float32 = 0.8
    let carrierVelocity = carrierFrequency * unitVelocity
    let modulatorVelocity = modulatorFrequency * unitVelocity
    let samplesPerBuffer: AVAudioFrameCount = 1024 * 16
    let engine = AVAudioEngine()
    let player = AVAudioPlayerNode()
    let format = AVAudioFormat(
        standardFormatWithSampleRate: sampleRate,
        channels: channels
    )

    func generateAndPlay() {
        do {
            if let buffer = AVAudioPCMBuffer(pcmFormat: format!, frameCapacity: samplesPerBuffer) {

                // generate: write the same FM sample to every channel in the buffer
                // (indexing channel 1 directly is out of bounds when channels == 1)
                let channelCount = Int(buffer.format.channelCount)
                var sampleTime: Float32 = 0
                for sampleIndex in 0..<Int(samplesPerBuffer) {
                    let sample = sin(carrierVelocity * sampleTime + modulatorAmplitude * sin(modulatorVelocity * sampleTime))
                    for channel in 0..<channelCount {
                        buffer.floatChannelData?[channel][sampleIndex] = sample
                    }
                    sampleTime += 1.0
                }
                buffer.frameLength = samplesPerBuffer

                // save to file
                let settings: [String: Any] = [
                    AVFormatIDKey         : buffer.format.settings[AVFormatIDKey]          ?? kAudioFormatLinearPCM,
                    AVNumberOfChannelsKey : buffer.format.settings[AVNumberOfChannelsKey]  ?? 1,
                    AVSampleRateKey       : buffer.format.settings[AVSampleRateKey]        ?? 44100,
                    AVLinearPCMBitDepthKey: buffer.format.settings[AVLinearPCMBitDepthKey] ?? 16
                ]
                let fileURL = URL(filePath: "/tmp/out.wav")
                let file = try AVAudioFile(forWriting: fileURL, settings: settings, commonFormat: .pcmFormatFloat32, interleaved: true)
                try file.write(from: buffer)
                file.close()

                // play
                engine.attach(player)
                engine.connect(player, to: engine.mainMixerNode, format: format)
                try engine.start()
                player.scheduleBuffer(buffer)
                player.play()

            }
        } catch {
            print("Error: \(error).")
        }
    }

}

let fmSynth = FMSynth()
fmSynth.generateAndPlay()