We're using AVAssetReader-based code to get decoded video frames through AVFoundation. The decoding itself works great, but seeking just doesn't work reliably. For a given H.264 file (in a MOV container), the presentation timestamps of the decoded frames sometimes don't match the frames that are actually delivered.
For example: a sample buffer comes back with a PTS of 2002/24000, but the image it carries is the frame that belongs at 6006/24000. The frames have burnt-in timecode, so we can tell unambiguously which frame we're looking at.
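For reference, this is roughly how we read the PTS that we compare against the burnt-in timecode (a minimal sketch; sampleBuffer is the buffer returned by the copyNextSampleBuffer call in the decode snippet further down):

// Sketch: read the PTS of a delivered sample buffer and derive a frame
// index from it. The 1001/24000 frame duration is an assumption based on
// the timestamps quoted above (23.976 fps material).
CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
CMTimeShow(pts); // prints something like {2002/24000 = 0.083}
int64_t deliveredFrameIndex = pts.value / 1001;
NSLog(@"reader delivered frame %lld", deliveredFrameIndex);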
Here is our code:
- (BOOL) setupAssetReaderForFrameIndex:(int32_t) frameIndex
{
    NSError* theError = nil;
    // Ask for precise timing so durations/timestamps aren't estimated.
    NSDictionary* assetOptions = @{ AVURLAssetPreferPreciseDurationAndTimingKey: @YES };
    self.movieAsset = [[AVURLAsset alloc] initWithURL:self.filePath options:assetOptions];
    // An AVAssetReader can't be rewound or reused, so tear down any old one.
    if (self.assetReader)
        [self.assetReader cancelReading];
    self.assetReader = [AVAssetReader assetReaderWithAsset:self.movieAsset error:&theError];
    if (!self.assetReader)
        return NO;
    NSArray<AVAssetTrack*>* videoTracks = [self.movieAsset tracksWithMediaType:AVMediaTypeVideo];
    if ([videoTracks count] == 0)
        return NO;
    self.videoTrack = [videoTracks objectAtIndex:0];
    [self retrieveMetadata];
    // Request decoded frames in our preferred pixel format.
    NSDictionary* outputSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey: @(self.cvPixelFormat) };
    self.videoTrackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:self.videoTrack outputSettings:outputSettings];
    self.videoTrackOutput.alwaysCopiesSampleData = NO;
    [self.assetReader addOutput:self.videoTrackOutput];
    // Convert the frame index into a start time in the track's timescale.
    CMTimeScale timeScale = self.videoTrack.naturalTimeScale;
    CMTimeValue frameDuration = (CMTimeValue)round(timeScale / (double)self.videoTrack.nominalFrameRate);
    CMTimeValue startTimeValue = (CMTimeValue)frameIndex * frameDuration;
    CMTimeRange timeRange = CMTimeRangeMake(CMTimeMake(startTimeValue, timeScale), kCMTimePositiveInfinity);
    self.assetReader.timeRange = timeRange;
    return [self.assetReader startReading];
}
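To make the start-time arithmetic concrete, here it is worked through for 23.976 fps material (an assumption consistent with the 24000 timescale in the PTS values above):

// Worked example of the math in setupAssetReaderForFrameIndex:, assuming
// naturalTimeScale == 24000 and nominalFrameRate == 23.976.
CMTimeValue frameDuration = (CMTimeValue)round(24000.0 / 23.976); // 1001
CMTime startFor2 = CMTimeMake(2 * frameDuration, 24000); // 2002/24000 -- the PTS we receive
CMTime startFor6 = CMTimeMake(6 * frameDuration, 24000); // 6006/24000 -- the frame we actually see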
The setup is then followed by this code to actually decode a frame:
CMSampleBufferRef sampleBuffer = [self.videoTrackOutput copyNextSampleBuffer];
if (!sampleBuffer)
{
    // copyNextSampleBuffer returns NULL at the end of the range or on
    // failure; the reader's status and error tell us which.
    AVAssetReaderStatus theStatus = self.assetReader.status;
    NSError* theError = self.assetReader.error;
    NSLog(@"[AVAssetReaderTrackOutput copyNextSampleBuffer] didn't deliver a frame (status %ld) - %@", (long)theStatus, theError);
    return NO;
}
CVPixelBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
if (!imageBuffer)
{
    CMSampleBufferInvalidate(sampleBuffer);
    CFRelease(sampleBuffer); // copyNextSampleBuffer returns a retained buffer
    return NO;
}
Is the timeRange approach by itself the correct way to seek, and if not: what is the correct way?
Thanks!