AVAudioPlayerNode playback delay/current time tracking

I use AVAudioEngine with multiple AVAudioPlayerNodes to play several audio files.

The setup is straightforward: besides the player nodes, the graph contains mixer units and pitch units.


Every audio file is played back in sync with the others.


I now want to track the current playing time very precisely.

To compute the current time I use this code:


// Bail out and report the last seek position if the player is not
// attached to a running engine or has no valid render time yet.
guard player.engine != nil,
  let lastRenderTime = player.lastRenderTime,
  lastRenderTime.isSampleTimeValid,
  lastRenderTime.isHostTimeValid else {
  return seekTime
}

let sampleRate = player.outputFormat(forBus: 0).sampleRate
// Convert the engine's node time to the player's own timeline.
let sampleTime = player.playerTime(forNodeTime: lastRenderTime)?.sampleTime ?? 0
if sampleTime > 0 && sampleRate != 0 {
  // Elapsed playback time since the last seek, offset by that seek position.
  return max(seekTime + (Double(sampleTime) / sampleRate), seekTime)
}
return seekTime


where seekTime is the time offset of the player's last seek (0 at start).
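For reference, seekTime is updated roughly like this when seeking (a simplified sketch; `file`, `player`, and the seekTime property stand in for my actual objects):

```swift
import AVFoundation

// Simplified sketch of a seek: stop, reschedule the file from the new
// offset, remember that offset, and restart playback.
func seek(to time: TimeInterval, player: AVAudioPlayerNode, file: AVAudioFile) {
    let sampleRate = file.processingFormat.sampleRate
    let startFrame = AVAudioFramePosition(time * sampleRate)
    let frameCount = AVAudioFrameCount(file.length - startFrame)

    player.stop()
    player.scheduleSegment(file,
                           startingFrame: startFrame,
                           frameCount: frameCount,
                           at: nil,
                           completionHandler: nil)
    seekTime = time  // offset added back when computing the current time
    player.play()
}
```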


While this code produces a roughly correct value, I observe a significant difference, around 100 ms, between it and the real time.

It seems lastRenderTime starts advancing before the first frames are actually rendered.

I verified this in an audio editor: in the click track, the first click falls at 3.82 s, while in my application the current-time value is around 3.95 s at the moment I hear the click.

I suspect it takes some time for the engine to decode and buffer the audio data.

Can anyone confirm this, and suggest how I could/should compute this delay to return a correct current-time value?


EDIT:


Here are more details about the setup.

I use scheduleSegment() on the player nodes to provide audio data. Once scheduled, I call prepare(withFrameCount:) on each; I don't use buffers myself at all. Waiting before the call to play(at:) doesn't make any difference.
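Concretely, the per-player setup looks roughly like this (a simplified sketch; `players`, `files`, `startFrame`, and `frameCount` are placeholders for my real bookkeeping):

```swift
import AVFoundation

// Schedule a segment on every player, prepare it, then start all
// players at the same future host time so they stay in sync.
for (player, file) in zip(players, files) {
    player.scheduleSegment(file,
                           startingFrame: startFrame,
                           frameCount: frameCount,
                           at: nil,
                           completionHandler: nil)
    player.prepare(withFrameCount: frameCount)
}

// Anchor all players to one shared start time slightly in the future.
let now = mach_absolute_time()
let startTime = AVAudioTime(hostTime: now + AVAudioTime.hostTime(forSeconds: 0.1))
for player in players {
    player.play(at: startTime)
}
```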

The AVAudioEngine never stops, and it is started before I schedule the first segment on any player node.

Segments are different parts of the audio files, several seconds long.

Scheduling shorter segments (I tried 0.5 s segments) doesn't seem to make any difference.

Once I call play(at:) on the player nodes, lastRenderTime immediately starts advancing, while the audible playback takes some time to start, hence the noticeable delay (~100 ms).

Can someone explain how this latency could relate to buffer size? A tap block I install on an AVAudioPlayerNode gives me buffers of 4410 frames at 44100 Hz; can I rely on that being the latency I observe?
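For reference, this is how I inspect the buffers (a minimal sketch; `player` is one of my player nodes, and the requested bufferSize is only a hint, the delivered frameLength is what I quoted):

```swift
import AVFoundation

// Tap used to inspect the buffers coming out of one player node.
// 4410 frames at 44100 Hz is exactly 100 ms, which matches the
// delay I observe.
let format = player.outputFormat(forBus: 0)
player.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, when in
    let ms = Double(buffer.frameLength) / format.sampleRate * 1000
    print("tap: \(buffer.frameLength) frames (\(ms) ms) at sample \(when.sampleTime)")
}
```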