I need some clarity on AVPlayer/AVAssetReader...

Hello -- I've been trying to use AVFoundation to read frames from a QuickTime file and display them in Metal as texture maps. It's turning out to be more difficult than I expected. One approach I've tried uses AVPlayerItemVideoOutput.hasNewPixelBuffer() to grab frames as they are playing. This sort of works, but it only plays for a fraction of a second before it starts freezing and skipping. It's also not clear how to pause the video and display one arbitrary frame this way -- it seems like AVPlayer is only meant for dynamic, playing video. I want to be able to display a frame, then step forward or backward, or just jump to the frame at 3.5 seconds, for example.


I thought that maybe AVAssetReader would be better, but I cannot get it to work at all (see my other thread). But even setting that problem aside, it seems like AVAssetReader is not equipped for random access: it only has copyNextSampleBuffer() for stepping through frames sequentially.


Why is it so hard to just access the frame data for one particular point in the timeline? Anyone have any suggestions on this? Am I understanding AVPlayer and AVAssetReader correctly? Neither one seems to be able to just give me the frame that would be showing at a requested time in the video.


Thanks

Bob

Replies

AVPlayer isn't going to work, because you almost certainly can't grab the frames fast enough (as you seem to have experienced).


An AVAssetReader is (AFAIK) intended primarily for sequential access, as when you're converting a video frame-by-frame. You could presumably use this to step through the frames until you find the one you want, but I agree it's not ideal.
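If you do end up stepping through with AVAssetReader, you can at least limit the amount of stepping by setting the reader's timeRange before you start reading, then pulling sample buffers until you reach the presentation time you want. A rough, untested Swift sketch (the one-second back-off before the target is just a guess at where a preceding keyframe might be, and copyFrame is a made-up helper name):

import AVFoundation

// Untested sketch: decode the frame at (or just after) `target` from a local file.
// Assumes a single video track; error handling is trimmed.
func copyFrame(at target: CMTime, from url: URL) throws -> CVPixelBuffer? {
    let asset = AVAsset(url: url)
    guard let track = asset.tracks(withMediaType: .video).first else { return nil }

    let reader = try AVAssetReader(asset: asset)
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
    ])
    reader.add(output)

    // Start a little before the target in the hope of catching a keyframe.
    let backoff = CMTime(seconds: 1, preferredTimescale: 600)
    let start = CMTimeMaximum(.zero, CMTimeSubtract(target, backoff))
    reader.timeRange = CMTimeRange(start: start, duration: CMTime(seconds: 2, preferredTimescale: 600))
    guard reader.startReading() else { return nil }

    // Step forward until we reach the requested presentation time.
    // (Samples arrive in decode order, so B-frame content may need a small reorder buffer.)
    while let sample = output.copyNextSampleBuffer() {
        if CMSampleBufferGetPresentationTimeStamp(sample) >= target {
            return CMSampleBufferGetImageBuffer(sample)
        }
    }
    return nil
}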


Looking at the documentation, I see there is an AVAssetImageGenerator class that looks like it does what you want. Note, though, that it may not be frame-accurate by default, because finding an exact frame can be hard in some video formats. It also uses a default combination of tracks, so if you want something different, you may have to construct your own AV[Mutable]Asset with a known set of tracks.


I've never tried to use AVAssetImageGenerator, but it does sort of look like what you want.
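For what it's worth, a minimal sketch (untested, just going from the docs) might look like this -- movieURL is a placeholder, and note the tolerance properties, which default to "some nearby cheap frame" and need to be zeroed if you want the exact frame at 3.5 seconds:

import AVFoundation

// Untested sketch; movieURL stands in for your file's URL.
let asset = AVAsset(url: movieURL)
let generator = AVAssetImageGenerator(asset: asset)
generator.appliesPreferredTrackTransform = true

// By default the generator may return a cheaper nearby frame;
// zero the tolerances to get the frame actually displayed at the requested time.
generator.requestedTimeToleranceBefore = .zero
generator.requestedTimeToleranceAfter = .zero

do {
    var actualTime = CMTime.zero
    let requested = CMTime(seconds: 3.5, preferredTimescale: 600)
    let cgImage = try generator.copyCGImage(at: requested, actualTime: &actualTime)
    print("Got a \(cgImage.width)x\(cgImage.height) frame at \(actualTime.seconds)s")
    // Hand cgImage off to Metal from here.
} catch {
    print("Frame extraction failed: \(error)")
}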

Thanks, I will check out AVAssetImageGenerator. I had not seen that mentioned anywhere before.


I would agree about AVPlayer if it stuttered right from the beginning, but it plays a good chunk of a second before it halts -- if it's able to get those first frames, why can't it keep going? I've seen some other threads where they say CVMetalTextureCacheCreate may be reusing textures before the renderer is finished with them -- but I've tried holding on to references to the textures and it hasn't helped.

>> if it's able to get those first frames, why can't it keep going?


Because playback is in Real™ real-time. (Maybe not "kills your grandmother" real-time, but at least "stutters your video" real-time.) I assume the playback decoding is synchronized to the I/O, not the other way around. You put yourself in the mix, and bam! — dead granny.

Well, I can't say I really understand that answer, but I understand that AVPlayer is probably not going to work.

I've started to try AVAssetImageGenerator -- I'm able to get the first frame, but the colors are wrong. It looks like the CGImage is in a different pixel format from what MTKTextureLoader thinks it is -- even though MTKTextureLoader is supposed to get its pixel format from the image data. I'm going to try not using MTKTextureLoader.
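The plan, roughly, is to redraw the CGImage into a bitmap context whose layout I control and upload those bytes to the texture myself. Something like this untested sketch (the function name is just a placeholder):

import Metal
import CoreGraphics

// Untested sketch: build an MTLTexture from a CGImage by hand, skipping MTKTextureLoader.
// Redrawing into our own CGContext forces a known BGRA byte order regardless of the
// source image's pixel format.
func makeTexture(from cgImage: CGImage, device: MTLDevice) -> MTLTexture? {
    let width = cgImage.width
    let height = cgImage.height
    let bytesPerRow = width * 4

    var pixels = [UInt8](repeating: 0, count: bytesPerRow * height)
    let bitmapInfo = CGImageAlphaInfo.premultipliedFirst.rawValue | CGBitmapInfo.byteOrder32Little.rawValue
    pixels.withUnsafeMutableBytes { buffer in
        guard let context = CGContext(data: buffer.baseAddress, width: width, height: height,
                                      bitsPerComponent: 8, bytesPerRow: bytesPerRow,
                                      space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: bitmapInfo)
        else { return }
        context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    }

    let descriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .bgra8Unorm,
                                                              width: width, height: height,
                                                              mipmapped: false)
    guard let texture = device.makeTexture(descriptor: descriptor) else { return nil }
    texture.replace(region: MTLRegionMake2D(0, 0, width, height),
                    mipmapLevel: 0, withBytes: pixels, bytesPerRow: bytesPerRow)
    return texture
}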


Ideally, though, I'd like to use CVMetalTextureCacheCreateTextureFromImage, but that needs a CVImageBufferRef, and converting a CGImage to one doesn't look like a simple operation, unfortunately. Gah!
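As far as I can tell, the conversion would be something along these lines (untested, error handling stripped, and the helper name is made up) -- it's more ceremony than I'd like:

import CoreVideo
import CoreGraphics

// Untested sketch: wrap a CGImage in a CVPixelBuffer so it can be handed to
// CVMetalTextureCacheCreateTextureFromImage.
func makePixelBuffer(from cgImage: CGImage) -> CVPixelBuffer? {
    let width = cgImage.width
    let height = cgImage.height
    let attrs = [
        kCVPixelBufferCGImageCompatibilityKey as String: true,
        kCVPixelBufferMetalCompatibilityKey as String: true
    ] as CFDictionary

    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                                     kCVPixelFormatType_32BGRA, attrs, &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }

    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }

    // Draw the CGImage straight into the pixel buffer's backing memory.
    let context = CGContext(data: CVPixelBufferGetBaseAddress(buffer),
                            width: width, height: height,
                            bitsPerComponent: 8,
                            bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
                            space: CGColorSpaceCreateDeviceRGB(),
                            bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue |
                                        CGBitmapInfo.byteOrder32Little.rawValue)
    context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    return buffer
}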

Using AVPlayer and AVPlayerItemVideoOutput has worked flawlessly for me for playing out a full video. While it may not suit your request for random access anywhere in the movie, I can't see why my approach couldn't be adapted, provided you appropriately adjust the 'outputTime' you're feeding it.


Anyway, I use the render loop of a CVDisplayLink to query for available pixel buffers...

// 'outputTime' is the CVTimeStamp from the master display link callback
AVPlayerItemVideoOutput *output = (AVPlayerItemVideoOutput *)myAVPlayer.currentItem.outputs.lastObject;
CMTime curTime = [output itemTimeForCVTimeStamp:outputTime];
if ([output hasNewPixelBufferForItemTime:curTime]) {
    CVPixelBufferRef cvBuff = [output copyPixelBufferForItemTime:curTime itemTimeForDisplay:nil];
    // ... hand cvBuff off to the MTKView here ...
    CVBufferRelease(cvBuff);   // it's a copy, so release it when done
}


I then pass this pixel buffer off to the MTKView for processing. The pixel buffer can (and must) then safely be discarded.
I should mention that I have the MTKView set up for manual drawing -- i.e., when setting up the view, I have this:

self.enableSetNeedsDisplay = NO;
self.paused = YES;

which means I have to issue the draw call manually from the CVDisplayLink render loop each time a pixel buffer is available. You'll get warnings about calling UI off the main thread (i.e., for the draw call), but they can be safely ignored.


BTW, an advantage to using a CVDisplayLink is that you don't incur any stutter when mousing up on your application's menus (or when an NSCollectionView populates). Other solutions (and even some of Apple's code samples) seem not to avoid this issue -- my experience has shown that, if you want to avoid that stuttering, you must issue the draw call off the main thread.