AVAudioMixerNode not mixing >1 node with voice processing formats

Hi there, I'm having trouble with AVAudioMixerNode: it only works when there is a single input, and outputs silence or very quiet buzzing when more than one input node is connected. My setup has voice processing enabled, with the input going to a sink and N source nodes going to the main mixer node, which goes to the output node. In all cases I am connecting nodes in the graph with the same declared format: 48 kHz, 1-channel, Float32 PCM.

This is working great for 1 source node, but as soon as I add a second it breaks. I can reproduce this behaviour in the SignalGenerator sample, when the same format is used everywhere. Again, it'll work fine with 1 source node even in this configuration, but add another and there's silence.

Am I doing something wrong with formats here? Is this expected? As I understood it, with voice processing on and a mixer node in use, I should be able to use my own format essentially everywhere in my graph?

My modified SignalGenerator repro follows:

import Foundation
import AVFoundation

// True replicates my real app's behaviour, which is broken.
// You can remove one source node connection
// to make it work even when this is true.
let showBrokenState: Bool = true

// SignalGenerator constants.
let frequency: Float = 440
let amplitude: Float = 0.5
let duration: Float = 5.0
let twoPi = 2 * Float.pi
let sine = { (phase: Float) -> Float in
    return sin(phase)
}
let whiteNoise = { (phase: Float) -> Float in
    return ((Float(arc4random_uniform(UINT32_MAX)) / Float(UINT32_MAX)) * 2 - 1)
}

// My "application" format.
let format: AVAudioFormat = .init(commonFormat: .pcmFormatFloat32,
                                  sampleRate: 48000,
                                  channels: 1,
                                  interleaved: true)!

// Engine setup.
let engine = AVAudioEngine()
let mainMixer = engine.mainMixerNode
let output = engine.outputNode
try! output.setVoiceProcessingEnabled(true)
let outputFormat = engine.outputNode.inputFormat(forBus: 0)
let sampleRate = Float(format.sampleRate)
let inputFormat = format

var currentPhase: Float = 0
let phaseIncrement = (twoPi / sampleRate) * frequency

let srcNodeOne = AVAudioSourceNode { _, _, frameCount, audioBufferList -> OSStatus in
    let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
    for frame in 0..<Int(frameCount) {
        let value = sine(currentPhase) * amplitude
        currentPhase += phaseIncrement
        if currentPhase >= twoPi {
            currentPhase -= twoPi
        }
        if currentPhase < 0.0 {
            currentPhase += twoPi
        }
        for buffer in ablPointer {
            let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
            buf[frame] = value
        }
    }
    return noErr
}

let srcNodeTwo = AVAudioSourceNode { _, _, frameCount, audioBufferList -> OSStatus in
    let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
    for frame in 0..<Int(frameCount) {
        let value = whiteNoise(currentPhase) * amplitude
        currentPhase += phaseIncrement
        if currentPhase >= twoPi {
            currentPhase -= twoPi
        }
        if currentPhase < 0.0 {
            currentPhase += twoPi
        }
        for buffer in ablPointer {
            let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
            buf[frame] = value
        }
    }
    return noErr
}
engine.attach(srcNodeOne)
engine.attach(srcNodeTwo)
engine.connect(srcNodeOne, to: mainMixer, format: inputFormat)
engine.connect(srcNodeTwo, to: mainMixer, format: inputFormat)
engine.connect(mainMixer, to: output, format: showBrokenState ? inputFormat : outputFormat)

// Route the input node to a sink just to match the formats and make VP happy.
// The sink discards the captured audio and reports success.
let sink: AVAudioSinkNode = .init { _, _, _ in
    noErr
}
engine.attach(sink)
engine.connect(engine.inputNode, to: sink, format: showBrokenState ? inputFormat : outputFormat)
mainMixer.outputVolume = 0.5

try! engine.start()
CFRunLoopRunInMode(.defaultMode, CFTimeInterval(duration), false)
engine.stop()

Bothering to write the repro for this post got me thinking. If I don't use my format everywhere, and instead use the output format for the mixer->output connection, then add another mixer on the input side so that input->input mixer uses the output format and input mixer->sink uses my desired format, I can mix N nodes and hear them! However, I now get this on the input side: from AU (0x92e7e0b2): auou/vpio/appl, render err: -10,874 (kAudioUnitErr_TooManyFramesToProcess).
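
For anyone following along, the wiring described above could be sketched roughly like this. It is only a sketch of the graph as described in this post, not a confirmed fix (the TooManyFramesToProcess error still shows up on the input side with it). The function name `connectWorkaroundGraph` and its parameters are made up for illustration; the engine calls are the same ones the repro already uses.

```swift
import AVFoundation

/// Hypothetical helper (not an Apple API): wires the graph so that only the
/// connections that touch the voice-processing I/O use the output node's format.
func connectWorkaroundGraph(engine: AVAudioEngine,
                            sources: [AVAudioSourceNode],
                            appFormat: AVAudioFormat,
                            sink: AVAudioSinkNode) throws -> AVAudioMixerNode {
    // Enable voice processing before querying the format, as in the repro.
    try engine.outputNode.setVoiceProcessingEnabled(true)
    let outputFormat = engine.outputNode.inputFormat(forBus: 0)

    // Output side: sources -> main mixer in the app format,
    // main mixer -> output in the output node's own format.
    for source in sources {
        engine.attach(source)
        engine.connect(source, to: engine.mainMixerNode, format: appFormat)
    }
    engine.connect(engine.mainMixerNode, to: engine.outputNode, format: outputFormat)

    // Input side: input -> extra mixer in the output format,
    // extra mixer -> sink in the app format.
    let inputMixer = AVAudioMixerNode()
    engine.attach(inputMixer)
    engine.attach(sink)
    engine.connect(engine.inputNode, to: inputMixer, format: outputFormat)
    engine.connect(inputMixer, to: sink, format: appFormat)
    return inputMixer
}
```

With the repro above, this would be called as `try connectWorkaroundGraph(engine: engine, sources: [srcNodeOne, srcNodeTwo], appFormat: format, sink: sink)` in place of the existing attach/connect calls, before `engine.start()`.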

How did you get the error code from the throw? I'm trying to do voice processing and track down an error so I know what's going wrong. I just get the following:

AU (0x10811fb40): auou/vpio/appl, render err: 18,446,744,073,709,551,615

18446744073709551615 seems to be the max value for 64 bits...
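
One thing that may help when reading that number: 18446744073709551615 is UInt64.max, and the same bit pattern read as a signed integer is -1. So it looks as if the log could be printing a status of -1 as unsigned, though that is a guess about the logging and not something confirmed here. A small sketch of the reinterpretation follows, plus a hypothetical `describeStatus` helper (not an Apple API) that prints a status as a four-character code when its bytes are printable. It uses Int32 directly because OSStatus is a 32-bit signed integer.

```swift
// The value from the log line, which equals UInt64.max.
let logged: UInt64 = 18_446_744_073_709_551_615

// Reinterpret the same 64 bits as a signed integer, then narrow to 32 bits.
let asSigned = Int64(bitPattern: logged)
let asStatus = Int32(truncatingIfNeeded: asSigned)

/// Hypothetical helper: shows a status as a four-character code if all four
/// bytes are printable ASCII, otherwise as a plain signed decimal.
func describeStatus(_ status: Int32) -> String {
    let bits = UInt32(bitPattern: status)
    let shifts: [UInt32] = [24, 16, 8, 0]
    let bytes = shifts.map { UInt8((bits >> $0) & 0xFF) }
    if bytes.allSatisfy({ $0 >= 0x20 && $0 < 0x7F }) {
        return "'" + String(decoding: bytes, as: UTF8.self) + "'"
    }
    return String(status)
}

print(asSigned, asStatus, describeStatus(asStatus))
```

Codes like the -10874 above stay as plain decimals with this helper, since their bytes are not printable characters.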

What are you attaching the sink for?
