Realtime audio processing within an AVAudioEngine

Hello All,


On my way toward creating a simple sampler instrument in Swift for iOS, I came across a problem that I could not find a solution for: realtime audio processing.

First of all, I am pretty new to programming (about 7 months of Swift, no experience in Obj-C or C++), but I have several years of hands-on sound engineering experience.


Scope

My goal is to create a simple sampler with the following processing graph:

AVAudioPlayerNode -> "Volume Envelope" -> AVAudioEngine.mainMixerNode


My idea here was to use the AVAudioPlayerNode to play my sound files at different pitches, then trigger a volume envelope that fades the volume in over a certain period of time to avoid the noisy click at the start of the sound. After a note is stopped, I wanted to create a fade-out effect by triggering a fade-out volume ramp.

By using multiple AVAudioPlayerNode + volume envelope pairs I wanted to achieve polyphony. The volume envelope should be processed at the per-sample level to leave room for other processing extensions such as filtering.
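To make this more concrete, here is a minimal sketch of the kind of per-sample envelope I have in mind. It is plain Swift over one channel of float samples; the function name and the simple linear ramp are only placeholders for whatever curve ends up sounding right:

// Linear fade-in over the first attackFrames samples and linear fade-out
// over the last releaseFrames samples of one channel. Purely illustrative.
func applyVolumeEnvelope(channel: UnsafeMutablePointer<Float>, frameCount: Int,
                         attackFrames: Int, releaseFrames: Int) {
    for n in 0..<frameCount {
        var gain: Float = 1.0
        if n < attackFrames {
            gain = Float(n) / Float(attackFrames)                           // fade in
        }
        let framesFromEnd = frameCount - 1 - n
        if framesFromEnd < releaseFrames {
            gain = min(gain, Float(framesFromEnd) / Float(releaseFrames))   // fade out
        }
        channel[n] *= gain
    }
}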


Unfortunately, I have not yet found a way to access the samples that come out of the AVAudioPlayerNode in order to process them in real time.


Question

Is there a way to access the audio samples inside one of the AVAudioNodes and process them in realtime? The way I imagine it working is similar to scheduling buffers on the AVAudioPlayerNode: install a tap, for example on the microphone input, and process the samples directly before scheduling them on an AVAudioPlayerNode. http://stackoverflow.com/questions/24383080/realtime-audio-with-avaudioengine-in-ios-8
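For reference, the pattern I mean looks roughly like the following (my own paraphrase of that answer, assuming engine and playerNode are already attached and connected; a tap like this is not a realtime insert, it only shows the kind of sample access I am after):

let input = engine.inputNode
let format = input.outputFormatForBus(0)
input.installTapOnBus(0, bufferSize: 512, format: format) { buffer, time in
    // The tap delivers buffers asynchronously, so this is not sample-accurate realtime processing.
    let leftChannel = buffer.floatChannelData[0]
    for n in 0..<Int(buffer.frameLength) {
        leftChannel[n] *= 0.5              // placeholder "processing"
    }
    playerNode.scheduleBuffer(buffer, completionHandler: nil)
}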


Current Workaround

My current workaround for this issue is to do all the processing before scheduling the buffers on the AVAudioPlayerNode. I got the basic idea from this FMSynthesizer: https://www.snip2code.com/Snippet/168625/An-FM-Synthesizer-in-Swift-using-AVAudio. I changed the approach to fill the audio buffers based on the length of the audio file instead of an endless while loop. Unfortunately this gets very complex and leaves almost no room for modular extension:

import AVFoundation

// kSamplesPerBuffer and kInFlightAudioBuffers are set at global scope. Usually 512 and 3 are good values to start with.
let kSamplesPerBuffer: AVAudioFrameCount = 512
let kInFlightAudioBuffers: Int = 3

class AudioSampler {
     
    let audioFilePath: String
    let audioFilePathURL: NSURL
    let audioFile: AVAudioFile


    let engine: AVAudioEngine = AVAudioEngine()
    let playerNode: AVAudioPlayerNode = AVAudioPlayerNode()

    let audioFormat = AVAudioFormat(standardFormatWithSampleRate: 44100.0, channels: 2)
    var audioBuffers: [AVAudioPCMBuffer] = [AVAudioPCMBuffer]()
    var bufferIndex: Int = 0

    let audioQueue: dispatch_queue_t = dispatch_queue_create("FMSynthesizerQueue", DISPATCH_QUEUE_SERIAL)
    let audioSemaphore: dispatch_semaphore_t = dispatch_semaphore_create(kInFlightAudioBuffers)

    let buffersInFile: Int

    // Load file, init buffers, setup and start engine
    init(){
        audioFilePath = NSBundle.mainBundle().pathForResource("pong2.caf", ofType: nil)!
        audioFilePathURL = NSURL(fileURLWithPath: audioFilePath)!
        audioFile = AVAudioFile(forReading: audioFilePathURL, error: nil)
        buffersInFile = Int(Int(audioFile.length) / Int(kSamplesPerBuffer)) + 1
     
        for _ in 0..<kInFlightAudioBuffers {
            let audioBuffer = AVAudioPCMBuffer(PCMFormat: audioFormat, frameCapacity: kSamplesPerBuffer)
            audioBuffers.append(audioBuffer)
        }
     
        engine.attachNode(playerNode)
        engine.connect(playerNode, to: engine.mainMixerNode, format: audioFormat)
     
        var error: NSError? = nil
        if !engine.startAndReturnError(&error) {
            NSLog("Error starting audio engine: \(error)")
        }
    }

    // Load file into buffers, process individual samples of each buffer
    func play() {
        audioFile.framePosition = 0
        dispatch_async(audioQueue) {
         
            for i in 0..<self.buffersInFile {
             
                dispatch_semaphore_wait(self.audioSemaphore, DISPATCH_TIME_FOREVER)
             
                let audioBuffer = self.audioBuffers[self.bufferIndex]
                let leftChannel = audioBuffer.floatChannelData[0]
                let rightChannel = audioBuffer.floatChannelData[1]

                // Read the next chunk of the file into a temporary buffer
                let tempBuffer = AVAudioPCMBuffer(PCMFormat: self.audioFormat, frameCapacity: kSamplesPerBuffer)
                self.audioFile.readIntoBuffer(tempBuffer, frameCount: kSamplesPerBuffer, error: nil)
                let leftChannelTemp = tempBuffer.floatChannelData[0]
                let rightChannelTemp = tempBuffer.floatChannelData[1]

                // Copy the samples into the playback buffer - per-sample processing goes here
                for n in 0..<Int(kSamplesPerBuffer) {
                    leftChannel[n] = leftChannelTemp[n]
                    rightChannel[n] = rightChannelTemp[n]
                }
                audioBuffer.frameLength = kSamplesPerBuffer
             
                self.playerNode.scheduleBuffer(audioBuffer) {
                    dispatch_semaphore_signal(self.audioSemaphore)
                }
             
                self.bufferIndex = (self.bufferIndex + 1) % self.audioBuffers.count
            }
        }
        playerNode.play()
    }
}
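For completeness, the class above is used like this; the semaphore makes sure that at most kInFlightAudioBuffers buffers are scheduled ahead of the player at any time:

let sampler = AudioSampler()
sampler.play()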


Thank you for your time.

Best regards,

Tobias


Replies

While there is currently no “realtime tap” in AVAudioEngine, one way you may want to go about tackling this is by writing your own v3 effect AudioUnit (an Audio Unit Extension) and then inserting that into the engine.


The idea would be to create and publish the AudioUnit as an effect, then create an instance of it in AVAudioEngine using a new iOS 9 AVAudioUnit method:


+ (void)instantiateWithComponentDescription:(AudioComponentDescription)audioComponentDescription options:(AudioComponentInstantiationOptions)options completionHandler:(void (^)(__kindof AVAudioUnit * __nullable audioUnit, NSError * __nullable error))completionHandler NS_AVAILABLE(10_11, 9_0);
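From Swift, that would look roughly like the following. The component description values are placeholders for whatever type/subtype/manufacturer you register your effect under, and engine, playerNode, and audioFormat are assumed to exist already:

var effectDesc = AudioComponentDescription()
effectDesc.componentType = kAudioUnitType_Effect
effectDesc.componentSubType = 0x64656d6f            // placeholder subtype
effectDesc.componentManufacturer = 0x64656d6f       // placeholder manufacturer
effectDesc.componentFlags = 0
effectDesc.componentFlagsMask = 0

AVAudioUnit.instantiateWithComponentDescription(effectDesc, options: []) { avAudioUnit, error in
    guard let effect = avAudioUnit else {
        NSLog("Could not instantiate effect: \(error)")
        return
    }
    engine.attachNode(effect)
    engine.connect(playerNode, to: effect, format: audioFormat)
    engine.connect(effect, to: engine.mainMixerNode, format: audioFormat)
}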


Of course this is more work than just having an AVAudioEngine-provided realtime processing tap, so please file an enhancement request for that feature at https://bugreport.apple.com/.


Audio Unit Extensions were discussed at WWDC 2015 and are going to provide a very powerful way to extend audio processing on iOS 9 (AU components have of course been around for years on OS X, and El Capitan also gets Audio Unit Extensions). Check out the session. I'm sure many other audio developers are going to find this capability quite exciting and may be very interested in making some pioneering progress.


https://developer.apple.com/videos/wwdc/2015/?id=508

I am also looking for a solution to the noisy clicks produced by AVAudioPlayerNode. Did you ever find an easy solution, or did you go with the Audio Unit Extension approach? If so, could you give me a hint on where to start with creating your own v3 effect AudioUnit for volume ramps?

I'm trying the solution mentioned, but whenever I set the outputProvider I get a "-[AUAudioUnitV2Bridge setOutputProvider:]: unrecognized selector sent to instance" crash.

// TEST
    AudioComponentDescription mixerDesc;
    mixerDesc.componentType = kAudioUnitType_Generator;
    mixerDesc.componentSubType = kAudioUnitSubType_ScheduledSoundPlayer;
    mixerDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
    mixerDesc.componentFlags = 0;
    mixerDesc.componentFlagsMask = 0;
   
    [AVAudioUnit instantiateWithComponentDescription:mixerDesc options:kAudioComponentInstantiation_LoadInProcess completionHandler:^(__kindof AVAudioUnit * _Nullable audioUnit, NSError * _Nullable error) {
        NSLog(@"here");
       
        // Crashes here
        audioUnit.AUAudioUnit.outputProvider = ^AUAudioUnitStatus(AudioUnitRenderActionFlags *actionFlags, const AudioTimeStamp *timestamp, AUAudioFrameCount frameCount, NSInteger inputBusNumber, AudioBufferList *inputData)
        {
            const double amplitude = 0.2;
            static double theta = 0.0;
            double theta_increment = 2.0 * M_PI * 880.0 / 44100.0;
            const int channel = 0;
            Float32 *buffer = (Float32 *)inputData->mBuffers[channel].mData;
           
            memset(inputData->mBuffers[channel].mData, 0, inputData->mBuffers[channel].mDataByteSize);
            memset(inputData->mBuffers[1].mData, 0, inputData->mBuffers[1].mDataByteSize);
           
            // Generate the samples
            for (UInt32 frame = 0; frame < frameCount; frame++)   // frameCount, not inputBusNumber
            {
                buffer[frame] = sin(theta) * amplitude;
               
                theta += theta_increment;
                if (theta >= 2.0 * M_PI)
                {
                    theta -= 2.0 * M_PI;
                }
            }
           
            return noErr;
        };
       
    }];

The idea would be to subclass AUAudioUnit yourself and use it as a node in the AVAudioEngine. I would suggest looking at the sample code, noting that you wouldn't have to build/package an extension; you can just subclass AUAudioUnit directly in your app. The outputProvider property is the block an output unit calls to get audio to send to the output hardware. So, if you created an instance of an AUAudioUnit that was the RemoteIO (RIO) unit, you could do that -- I believe this was a demo from the 2015 WWDC audio presentation.
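To sketch that RemoteIO + outputProvider route in Swift (treat everything here as an assumption rather than a recipe: the names, the 880 Hz test tone, and the bus format choice are mine, and error handling is stripped down):

import AVFoundation

func startSineThroughRemoteIO() throws -> AUAudioUnit {
    var ioDesc = AudioComponentDescription()
    ioDesc.componentType = kAudioUnitType_Output
    ioDesc.componentSubType = kAudioUnitSubType_RemoteIO
    ioDesc.componentManufacturer = kAudioUnitManufacturer_Apple
    ioDesc.componentFlags = 0
    ioDesc.componentFlagsMask = 0

    let io = try AUAudioUnit(componentDescription: ioDesc)

    // The output unit pulls the audio it plays from its input bus;
    // using the standard deinterleaved float format here is an assumption.
    let format = AVAudioFormat(standardFormatWithSampleRate: 44100.0, channels: 2)
    try io.inputBusses[0].setFormat(format)

    var theta = 0.0
    let thetaIncrement = 2.0 * M_PI * 880.0 / 44100.0

    io.outputProvider = { actionFlags, timestamp, frameCount, busNumber, bufferList in
        let buffers = UnsafeMutableAudioBufferListPointer(bufferList)
        for frame in 0..<Int(frameCount) {                 // frameCount, not the bus number
            let sample = Float32(sin(theta) * 0.2)
            theta += thetaIncrement
            if theta >= 2.0 * M_PI { theta -= 2.0 * M_PI }
            for buffer in buffers {                        // same sample on every channel
                UnsafeMutablePointer<Float32>(buffer.mData)[frame] = sample
            }
        }
        return noErr
    }

    try io.allocateRenderResources()
    try io.startHardware()
    return io
}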

I too would be interested in hearing of a solution to the noisy clicks produced by AVAudioPlayerNode.