Is there a reason mediafilesegmenter generates fMP4s with non-0 tfdt?

We are writing our own distribution platform, which, for the most part, works well with HLS on iOS/macOS/Safari. However, while trying to generate DASH-compatible playlist files, we noticed that the fMP4s mediafilesegmenter generates have a non-0 `tfdt` box (the track fragment decode time) in the first segment. Typically this value is exactly 10 seconds (seen in video), but sometimes it is 9.945 seconds (seen in audio), so we were wondering what the reason for this is. Would it be a bad idea to "correct" this value to 0 for the first segment and adjust the downstream segments accordingly? Does the first segment's `tfdt` actually depend on the source?


Thank you for any insight you may be able to provide,

Dimitri

Accepted Reply

Seems like all my questions were answered in https://developer.apple.com/videos/play/wwdc2020/10011

Basically, the audio and video tracks are shifted forward by ~10 seconds to account for audio priming. Although the audio timeline starts slightly earlier, HLS waits for the video track to start before starting its presentation clock.
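To make the arithmetic concrete, here is a minimal sketch using the two values from the question (10 s for video, 9.945 s for audio). The function name `presentation_offsets` is made up for illustration, and this only models the behaviour described above; it is not how any real player is implemented.

```python
from fractions import Fraction

def presentation_offsets(first_tfdt_seconds):
    """Express each track's first timestamp relative to the video start.

    Assumption: the presentation clock starts at the first video timestamp,
    as described in the reply above.
    """
    video_start = first_tfdt_seconds["video"]
    return {track: t - video_start for track, t in first_tfdt_seconds.items()}

# Values observed in the question; exact fractions avoid float rounding.
starts = {"video": Fraction(10), "audio": Fraction(9945, 1000)}
offsets = presentation_offsets(starts)

for track, offset in offsets.items():
    print(track, float(offset))
```

With these inputs the video offset is 0 and the audio offset is -0.055 s, meaning the audio timeline begins 55 ms before presentation time zero.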

Replies

HLS segments, even in VOD assets, come from a wide variety of sources. This includes VODs "recorded" from live streams and excerpts from longer assets (clips). In general, HLS clients cannot assume any relationship between the media timestamp of the first segment (`tfdt` in fMP4, ES PTS in MPEG-2 TS, clock time in WebVTT) and presentation time.

As a reference tool, mediafilesegmenter is intended to support interoperability. Creating example streams that do not start at media timestamp 0 by default helps reinforce the fact that in general they do not. This initial value is hard-coded and does not depend on the input file.

It won't harm HLS playback to start the first segment at timestamp 0. So if there's some non-HLS reason to do so, go ahead and do that. Maybe double-check that it's really necessary first though.
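If rebasing does turn out to be necessary, something along these lines could work. This is a rough sketch, not a tested tool: the helper names (`iter_boxes`, `find_tfdt`, `rebase_tfdt`) are invented here, it assumes each `tfdt` sits in the usual `moof` > `traf` location, and it expects the offset in the track's timescale ticks (the timescale comes from the init segment, which this code does not read). It also does not touch a `sidx` box; if your segments have one, its `earliest_presentation_time` would presumably need the same shift.

```python
import struct

CONTAINERS = {b"moof", b"traf"}  # boxes we descend into to reach tfdt

def iter_boxes(buf, start, end):
    """Yield (type, payload_start, box_end) for each box in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # 64-bit largesize follows the type field
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of the enclosing range
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type, pos + header, pos + size
        pos += size

def find_tfdt(buf, start=0, end=None):
    """Yield the payload offset of every tfdt box under moof/traf."""
    end = len(buf) if end is None else end
    for box_type, payload, box_end in iter_boxes(buf, start, end):
        if box_type == b"tfdt":
            yield payload
        elif box_type in CONTAINERS:
            yield from find_tfdt(buf, payload, box_end)

def rebase_tfdt(segment, offset_ticks):
    """Return a copy of the segment with offset_ticks subtracted from each tfdt."""
    buf = bytearray(segment)
    for payload in find_tfdt(buf):
        version = buf[payload]  # FullBox: 1 byte version, 3 bytes flags
        fmt = ">Q" if version == 1 else ">I"  # 64-bit time in version 1
        (decode_time,) = struct.unpack_from(fmt, buf, payload + 4)
        if decode_time < offset_ticks:
            raise ValueError("offset is larger than baseMediaDecodeTime")
        struct.pack_into(fmt, buf, payload + 4, decode_time - offset_ticks)
    return bytes(buf)
```

For example, with a hypothetical track timescale of 90000, a 10-second shift would be `rebase_tfdt(data, 900000)`, applied with the same offset to every segment of that track. Note that audio and video would need their own offsets if their first `tfdt` values differ and you want both to start at 0, which also changes their relative timing.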
Thank you for clarifying! Up until now, I was hard-coding it in our other players, but since we control the whole pipeline for our media, it’s good to know that we can simplify things.