I am looping through an audio file; below is my very simple code.
I'm reading 400 frames at a time, but I picked 400 as an arbitrary number.
I would prefer to read by time instead, say a quarter of a second. So I was wondering: how can I determine the time length of each frame in the audio file?
I assume determining this might differ based on the audio format? I know almost nothing about audio.
// Buffer sized to hold one 400-frame chunk.
guard let myAudioBuffer = AVAudioPCMBuffer(pcmFormat: input.processingFormat, frameCapacity: 400) else {
    return nil
}

while input.framePosition < input.length {
    // Read up to 400 frames, or whatever remains at the end of the file.
    let framesLeft = input.length - input.framePosition
    let frameCount = min(AVAudioFrameCount(framesLeft), 400)
    try? input.read(into: myAudioBuffer, frameCount: frameCount)
    let volume = getVolume(from: myAudioBuffer, bufferSize: myAudioBuffer.frameLength)
    // ...manipulation code
}
The property you're looking for is input.processingFormat.sampleRate.
When you use AVAudioFile to read audio data, it converts whatever format is contained in the file into a "processing" format, which is just an audio format that's more convenient for … processing. The important point is that the processing format is always a linear PCM format, which consists of sampleRate samples per second of audio (per channel, if there's more than one channel).
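For instance, here's a minimal sketch showing where the processing format and its sample rate come from (the file URL is hypothetical; substitute your own):

import AVFoundation

// Hypothetical path; substitute your own audio file.
let url = URL(fileURLWithPath: "/path/to/audio.m4a")
let input = try AVAudioFile(forReading: url)

let format = input.processingFormat
print(format.sampleRate)    // samples per second, per channel (e.g. 48000.0)
print(format.channelCount)  // e.g. 2 for stereo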
For example, if the sample rate of the processing format is 48000, there are 48000 audio samples in each channel for each second of audio.
So, the "time length" of each sample is 1 / sampleRate seconds, and the time length of 400 of them is 400 / sampleRate seconds. It's very straightforward, because the processing format is deliberately chosen to make things simple for you.
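So to read a quarter of a second at a time, as you wanted, you'd derive the frame count from the sample rate rather than hard-coding 400 (a sketch, assuming input is your AVAudioFile):

// Frames in 0.25 seconds of audio, at the processing format's rate.
let sampleRate = input.processingFormat.sampleRate          // e.g. 48000.0
let framesPerChunk = AVAudioFrameCount(sampleRate * 0.25)   // e.g. 12000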
Note that I've been talking about "samples" so far. AVAudioPCMBuffers count audio in "frames", not samples, but a frame just consists of one sample from each channel at a given time. If your audio is 48K stereo, for example, there are 48000 frames every second, with 2 samples per frame.
Keeping track of the difference between frames and samples is the only "gotcha" here. Usually, it's easiest to think in frames, not samples. In that sense, the sampleRate property can be thought of as the "frame rate" as well, and the time length of each frame is also 1 / sampleRate seconds.
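Putting it together, your loop could read quarter-second chunks like this (a sketch that keeps your getVolume helper and the structure of your original code):

let sampleRate = input.processingFormat.sampleRate
let framesPerChunk = AVAudioFrameCount(sampleRate * 0.25) // 0.25 s per read

guard let chunkBuffer = AVAudioPCMBuffer(pcmFormat: input.processingFormat, frameCapacity: framesPerChunk) else {
    return nil
}

while input.framePosition < input.length {
    // Read a quarter second, or whatever remains at the end of the file.
    let framesLeft = AVAudioFrameCount(input.length - input.framePosition)
    try? input.read(into: chunkBuffer, frameCount: min(framesPerChunk, framesLeft))
    let volume = getVolume(from: chunkBuffer, bufferSize: chunkBuffer.frameLength)
    // ...manipulation code
}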