Recording stereo audio with `AVCaptureAudioDataOutput`

Question

mrousavy OP

Created Apr ’24

Hey all!

I'm building a Camera app using AVFoundation, and I am using AVCaptureVideoDataOutput and AVCaptureAudioDataOutput delegates. (I cannot use AVCaptureMovieFileOutput because I am doing some processing in between.)

When writing the audio CMSampleBuffers to the AVAssetWriter, I noticed that the resulting recordings are mono, whereas the stock iOS Camera app records stereo audio.

I wonder how recording stereo audio works. Are there any guides or documentation available for that?

Is a stereo audio frame still one CMSampleBuffer, or will it be multiple CMSampleBuffers? Do I need to synchronize them? Do I need to set up the AVAssetWriter/AVAssetWriterInput differently?
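As far as I understand, a single audio CMSampleBuffer describes all of its channels through its format description, so one way to check what the capture output actually delivers is to read the channel count from each buffer. The following is a minimal diagnostic sketch; `channelCount(of:)` is a hypothetical helper name, not part of my existing code:

```swift
import CoreMedia

/// Hypothetical helper: returns the number of channels described by an audio
/// sample buffer's format, or nil if the buffer is not audio.
func channelCount(of sampleBuffer: CMSampleBuffer) -> Int? {
  guard let formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer),
        CMFormatDescriptionGetMediaType(formatDescription) == kCMMediaType_Audio,
        let asbd = CMAudioFormatDescriptionGetStreamBasicDescription(formatDescription) else {
    return nil
  }
  // mChannelsPerFrame is 1 for mono and 2 for stereo.
  return Int(asbd.pointee.mChannelsPerFrame)
}
```

Calling this from the audio delegate callback and logging the result should show whether the buffers are already mono when they arrive, or whether the channels are lost later when writing.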

This is my Audio Session code:

func configureAudioSession(configuration: CameraConfiguration) throws {
  ReactLogger.log(level: .info, message: "Configuring Audio Session...")

  // Prevent iOS from automatically configuring the Audio Session for us
  audioCaptureSession.automaticallyConfiguresApplicationAudioSession = false
  let enableAudio = configuration.audio != .disabled

  // Check microphone permission
  if enableAudio {
    let audioPermissionStatus = AVCaptureDevice.authorizationStatus(for: .audio)
    if audioPermissionStatus != .authorized {
      throw CameraError.permission(.microphone)
    }
  }

  // Remove all current inputs
  for input in audioCaptureSession.inputs {
    audioCaptureSession.removeInput(input)
  }
  audioDeviceInput = nil

  // Audio Input (Microphone)
  if enableAudio {
    ReactLogger.log(level: .info, message: "Adding Audio input...")
    guard let microphone = AVCaptureDevice.default(for: .audio) else {
      throw CameraError.device(.microphoneUnavailable)
    }
    let input = try AVCaptureDeviceInput(device: microphone)
    guard audioCaptureSession.canAddInput(input) else {
      throw CameraError.parameter(.unsupportedInput(inputDescriptor: "audio-input"))
    }
    audioCaptureSession.addInput(input)
    audioDeviceInput = input
  }

  // Remove all current outputs
  for output in audioCaptureSession.outputs {
    audioCaptureSession.removeOutput(output)
  }
  audioOutput = nil

  // Audio Output
  if enableAudio {
    ReactLogger.log(level: .info, message: "Adding Audio Data output...")
    let output = AVCaptureAudioDataOutput()
    guard audioCaptureSession.canAddOutput(output) else {
      throw CameraError.parameter(.unsupportedOutput(outputDescriptor: "audio-output"))
    }
    output.setSampleBufferDelegate(self, queue: CameraQueues.audioQueue)
    audioCaptureSession.addOutput(output)
    audioOutput = output
  }
}

This is how I activate the audio session just before I start recording:

let audioSession = AVAudioSession.sharedInstance()

try audioSession.updateCategory(AVAudioSession.Category.playAndRecord,
                                mode: .videoRecording,
                                options: [.mixWithOthers,
                                          .allowBluetoothA2DP,
                                          .defaultToSpeaker,
                                          .allowAirPlay])

if #available(iOS 14.5, *) {
  // prevents the audio session from being interrupted by a phone call
  try audioSession.setPrefersNoInterruptionsFromSystemAlerts(true)
}

if #available(iOS 13.0, *) {
  // allow haptics and system sounds (e.g. notification sounds) to play while recording
  try audioSession.setAllowHapticsAndSystemSoundsDuringRecording(true)
}

audioCaptureSession.startRunning()
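One thing that may be relevant at this step: since iOS 14, AVAudioSession lets an app request a stereo polar pattern on a built-in microphone data source. The sketch below shows what that request could look like. `preferStereoInput(on:)` is a hypothetical helper, the fixed `.portrait` orientation is only an illustrative choice, and I have not confirmed that this alone fixes the mono output or how it interacts with an AVCaptureSession whose `automaticallyConfiguresApplicationAudioSession` is `false`:

```swift
import AVFoundation

/// Hypothetical helper: asks the audio session to capture stereo from the
/// built-in microphone. Returns false if no stereo-capable data source exists.
@available(iOS 14.0, *)
func preferStereoInput(on session: AVAudioSession = .sharedInstance()) throws -> Bool {
  // Find the built-in microphone among the currently available inputs.
  guard let builtInMic = session.availableInputs?.first(where: { $0.portType == .builtInMic }) else {
    return false
  }
  try session.setPreferredInput(builtInMic)

  // Pick a data source (e.g. front, back, bottom) that advertises the stereo polar pattern.
  guard let dataSource = builtInMic.dataSources?.first(where: {
    $0.supportedPolarPatterns?.contains(.stereo) == true
  }) else {
    return false
  }
  try dataSource.setPreferredPolarPattern(.stereo)
  try builtInMic.setPreferredDataSource(dataSource)

  // Tell the session how the device is held so left and right are mapped correctly.
  // Assumption: portrait; a real app would derive this from the UI orientation.
  try session.setPreferredInputOrientation(.portrait)
  return true
}
```

This would presumably be called after the category has been set (the available inputs depend on it) and before the capture session starts running.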

And this is how I set up the AVAssetWriter:

let audioSettings = audioOutput.recommendedAudioSettingsForAssetWriter(writingTo: options.fileType)
let format = audioInput.device.activeFormat.formatDescription

audioWriter = AVAssetWriterInput(mediaType: .audio, outputSettings: audioSettings, sourceFormatHint: format)
audioWriter!.expectsMediaDataInRealTime = true
assetWriter.add(audioWriter!)
ReactLogger.log(level: .info, message: "Initialized Audio AssetWriter.")
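To narrow down where the second channel goes missing, it may help to look at the channel count in the recommended settings, and to compare against explicitly stereo output settings. This is only a sketch: both helper names are hypothetical, and the sample rate and bit rate are illustrative values rather than anything the recommended settings are known to contain:

```swift
import AVFoundation

/// Hypothetical helper: reads the channel count from an audio settings dictionary.
func channelCount(in settings: [String: Any]?) -> Int? {
  return (settings?[AVNumberOfChannelsKey] as? NSNumber)?.intValue
}

/// Hypothetical helper: builds explicit two-channel AAC output settings.
func makeStereoAACSettings() -> [String: Any] {
  // Describe a plain left/right stereo channel layout.
  var layout = AudioChannelLayout()
  layout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo
  let layoutData = Data(bytes: &layout, count: MemoryLayout<AudioChannelLayout>.size)

  return [
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVSampleRateKey: 44_100,        // illustrative value
    AVNumberOfChannelsKey: 2,
    AVChannelLayoutKey: layoutData,
    AVEncoderBitRateKey: 128_000    // illustrative value
  ]
}
```

If `channelCount(in: audioSettings)` already reports 1, the capture output itself is probably configured for mono. Note that forcing two-channel output settings would not create real stereo if the incoming sample buffers only contain one channel.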

The rest is trivial: I receive the audio CMSampleBuffers in my delegate's callback and write them to the audioWriter, and they end up in the .mov file. But the result is mono, not stereo.

Is there anything I'm missing here?
