How can I prevent AVCaptureVideoDataOutput from automatically downscaling image buffers?

Hi,


In our iOS App, we need to access uncompressed full-resolution frames from the back-facing video camera in an AVCaptureSession with an AVCaptureVideoDataOutput. Up to this point, we've been configuring our session using AVCaptureSessionPresetInputPriority (through the -setActiveFormat: API), but according to WWDC 2016 Session 501 "Advances in iOS Photography", RAW and Live Photo capture in iOS 10 (using AVCapturePhotoOutput) will only be supported for AVCaptureSessionPresetPhoto, which is why we're forced to make a switch to the -setSessionPreset: API.


Unfortunately, this change somehow causes AVCaptureVideoDataOutput to apply an internal downscaling operation (see example below) to the image buffers prior to delivering them to our AVCaptureVideoDataOutputSampleBufferDelegate callback! The delivered frame turns out to be much lower in resolution compared to the active device format and doesn't even match the format description provided by the AVCaptureVideoDataOutput's AVCaptureInputPort.


Code example (for iOS 9.3/Xcode 7):


#import <UIKit/UIKit.h>
@import AVFoundation;

@interface ViewController : UIViewController<AVCaptureVideoDataOutputSampleBufferDelegate>
@property (nonatomic) AVCaptureSession* captureSession;
@property (nonatomic) AVCaptureVideoDataOutput* captureVideoDataOutput;
@property (nonatomic) AVCaptureDevice* videoDevice;
@property (nonatomic) dispatch_queue_t  sampleBufferQueue;
@end

@implementation ViewController

- (id)initWithCoder:(NSCoder *)aDecoder
{
    if (self = [super initWithCoder:aDecoder])
    {
        self.captureSession = [[AVCaptureSession alloc] init];
        // Use a serial queue so that frames are delivered to the delegate in order.
        self.sampleBufferQueue = dispatch_queue_create("sampleBufferQueue", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (void)configureCaptureSession
{
    NSArray *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
    self.videoDevice = devices[[devices indexOfObjectPassingTest:^BOOL(AVCaptureDevice* device, NSUInteger idx, BOOL *stop) {
        if(device.position == AVCaptureDevicePositionBack) {
            *stop = TRUE; return(YES);
        }
        return(NO);
    }]];
    assert(self.videoDevice);

    [self.captureSession beginConfiguration];
    NSError* error;
    AVCaptureDeviceInput *videoDeviceInput = [AVCaptureDeviceInput deviceInputWithDevice:self.videoDevice error:&error];

    assert(videoDeviceInput && [self.captureSession canAddInput:videoDeviceInput]);
    [self.captureSession addInput:videoDeviceInput];

    self.captureSession.sessionPreset = AVCaptureSessionPresetPhoto;
    NSLog(@"videoDevice.activeFormat: %@", self.videoDevice.activeFormat);

    // Create and configure the video data output.
    self.captureVideoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
    {
        assert([self.captureSession canAddOutput:self.captureVideoDataOutput]);
        [self.captureSession addOutput:self.captureVideoDataOutput];

        [self.captureVideoDataOutput setAlwaysDiscardsLateVideoFrames:YES];
        [self.captureVideoDataOutput setVideoSettings:@{ (__bridge NSString*)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)}];

        [self.captureVideoDataOutput setSampleBufferDelegate:self queue:self.sampleBufferQueue];
    }

    [self.captureSession commitConfiguration];
    NSLog(@"captureVideoDataOutput.videoSettings: %@", self.captureVideoDataOutput.videoSettings);
}
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    const size_t pixelBufferWidth = CVPixelBufferGetWidth(imageBuffer);
    const size_t pixelBufferHeight = CVPixelBufferGetHeight(imageBuffer);
    NSLog(@"Pixel buffer dimensions in delegate callback: %zu x %zu", pixelBufferWidth, pixelBufferHeight);

    for(AVCaptureInputPort* port in connection.inputPorts)
    {
        if([port.mediaType isEqualToString:AVMediaTypeVideo])
        {
            CMVideoDimensions videoDimensions = CMVideoFormatDescriptionGetDimensions(port.formatDescription);
            NSLog(@"AVCaptureInputPort video format dimensions: %d x %d", videoDimensions.width, videoDimensions.height);
            break;
        }
    }
}

- (void)viewDidDisappear:(BOOL)animated
{
    [self.captureSession stopRunning];
    [super viewDidDisappear:animated];
}

- (void)viewWillAppear:(BOOL)animated
{
    [super viewWillAppear:animated];
    [self.captureSession startRunning];
}

- (void) viewDidAppear:(BOOL)animated
{
    [super viewDidAppear:animated];
    [AVCaptureDevice requestAccessForMediaType:AVMediaTypeVideo completionHandler:nil];
}

- (void)viewDidLoad
{
    [super viewDidLoad];
    [self configureCaptureSession];
}

@end


Running the example above on an iPhone 6 (under iOS 9.3) produces the following console output:


--START--


videoDevice.activeFormat: <AVCaptureDeviceFormat: 0x12ee051f0 'vide'/'420f' 3264x2448, { 2- 30 fps}, fov:58.040, max zoom:153.00 (upscales @1.00), AF System:2, ISO:29.0-1856.0, SS:0.000025-0.500000>


captureVideoDataOutput.videoSettings: {
    AVVideoScalingModeKey = AVVideoScalingModeResize;
    Height = 750;
    PixelFormatType = 875704422;
    Width = 1000;
}


Pixel buffer dimensions in delegate callback: 1000 x 750

AVCaptureInputPort video format dimensions: 3264 x 2448


--END--


Note the discrepancies between the active device format (3264x2448) and the frame size inside the delegate callback (1000x750)!

Also note the undocumented dictionary keys AVVideoScalingModeKey, Width and Height in captureVideoDataOutput.videoSettings. (I tried using a combination of those keys as input to setVideoSettings:, but that only resulted in the error "- videoSettings dictionary contains one or more unsupported (ignored) keys:").
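
As an aside for reading the log, here is a minimal Swift sketch that decodes the numeric PixelFormatType into its FourCC and compares the two frame sizes. The helper name fourCCString is hypothetical; the numbers are copied from the console output above.

```swift
/// Converts a FourCC code into its four-character ASCII form.
/// (Hypothetical helper for this sketch, not an AVFoundation API.)
func fourCCString(_ code: UInt32) -> String {
    let shifts: [UInt32] = [24, 16, 8, 0]
    let bytes = shifts.map { UInt8(truncatingIfNeeded: code >> $0) }
    return String(decoding: bytes, as: UTF8.self)
}

// PixelFormatType from the videoSettings dictionary.
let pixelFormat = fourCCString(875704422)
print(pixelFormat) // prints "420f"

// Cross-multiplying shows that 3264x2448 and 1000x750 share the same
// aspect ratio, which is consistent with a uniform resize.
let sameAspectRatio = 3264 * 750 == 2448 * 1000
print(sameAspectRatio) // prints "true"
```

So the delivered buffers keep the requested '420f' pixel format and the 4:3 aspect ratio; only the resolution differs from the active device format.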



So, here are my questions:


  • How can I prevent this unwanted downscaling from happening with AVCaptureSessionPresetPhoto?
  • Will there be any way to support RAW and Live Photo capture (in iOS 10) through the -setActiveFormat: API?

Accepted Reply

For the time being, RAW support and LivePhoto support are limited to AVCaptureSessionPresetPhoto, which unfortunately means you will not be able to get full-res video data output AND get RAW still images. Please file an enhancement request at bugreport.apple.com.

Replies

You have a fundamental misunderstanding about AVCaptureSessionPresetInputPriority. This is not a preset you're meant to set. It's a preset that gets set automatically when you choose an AVCaptureDeviceFormat yourself using AVCaptureDevice setActiveFormat:.


The AVCaptureSessionPresetPhoto preset is a special case with respect to video data output. It always provides preview-sized buffers to video data output. Always has. This is because most applications doing photographic things use video data output as a stand-in for video preview (perhaps they want to show a filter in real time by drawing the preview themselves). Real-time preview filtering would be nigh unto impossible with full-resolution buffers.


If you want the full-resolution photo-sized buffers, you can use AVCaptureDevice's setActiveFormat: to set the format to whatever you want, including 420f 4032x3024 — and in that case, you will get 12 MP buffers out of VideoDataOutput. But full res VDO is *incompatible* with LivePhoto.
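
A minimal sketch of that approach, assuming iOS and current Swift API names: pick the largest '420f' format and make it the device's active format. FormatInfo, fullRange420, indexOfLargestFormat and selectLargestFormat are hypothetical names for illustration. Per this thread, going through setActiveFormat: means the Photo preset is not in use, so RAW and Live Photo capture are not offered.

```swift
import Foundation
#if os(iOS)
import AVFoundation
#endif

/// Framework-free description of a capture format (hypothetical helper type).
struct FormatInfo {
    let width: Int
    let height: Int
    let pixelFormat: UInt32 // FourCC media subtype
}

/// FourCC '420f', the value of kCVPixelFormatType_420YpCbCr8BiPlanarFullRange.
let fullRange420: UInt32 = 0x3432_3066

/// Returns the index of the format with the most pixels among those
/// that use the requested pixel format, or nil if none matches.
func indexOfLargestFormat(in formats: [FormatInfo], pixelFormat: UInt32) -> Int? {
    return formats.indices
        .filter { formats[$0].pixelFormat == pixelFormat }
        .max { formats[$0].width * formats[$0].height
             < formats[$1].width * formats[$1].height }
}

#if os(iOS)
/// Makes the largest '420f' format the active format of `device`.
/// Setting activeFormat switches the session to input priority.
func selectLargestFormat(on device: AVCaptureDevice) throws {
    let formats = device.formats
    let infos = formats.map { format -> FormatInfo in
        let description = format.formatDescription
        let dimensions = CMVideoFormatDescriptionGetDimensions(description)
        return FormatInfo(width: Int(dimensions.width),
                          height: Int(dimensions.height),
                          pixelFormat: CMFormatDescriptionGetMediaSubType(description))
    }
    guard let index = indexOfLargestFormat(in: infos, pixelFormat: fullRange420) else {
        return // No '420f' format on this device; leave the format untouched.
    }
    try device.lockForConfiguration()
    defer { device.unlockForConfiguration() }
    device.activeFormat = formats[index]
}
#endif
```

The selection logic is kept separate from the AVFoundation calls so it can be checked without a camera; on a device, calling selectLargestFormat(on:) after the input has been added to the session should be enough.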


So in summary, if you want Live Photo, which is only supported in Photo preset, your video data output will be scaled down to the screen resolution. That's just the way it is. There are technical reasons for it.

Thanks for your quick reply, but I don't think I "fundamentally misunderstood" AVCaptureSessionPresetInputPriority. In my question above, I used the terms "AVCaptureSessionPresetInputPriority" and "AVCaptureDevice setActiveFormat:" interchangeably, which perhaps caused some confusion. Of course, as is evident from the SDK documentation, the correct way to enable AVCaptureSessionPresetInputPriority is to call "AVCaptureDevice setActiveFormat:".


That was not the point of my question, though.


We are actually working on an application doing "photographic things" and I'm very interested in enabling RAW (i.e. Bayer-mosaic frame buffer) capture using the new AVCapturePhotoOutput API. For now, we don't really care about Live Photos.


From my tests with the iOS 10 Beta SDK, I discovered that unless I use AVCaptureSessionPresetPhoto to configure my AVCaptureSession, AVCapturePhotoOutput won't offer me any RAW pixel formats (i.e. availableRawPhotoPixelFormatTypes returns an empty array). This behavior is consistent with the information from the WWDC 2016 Session 501 presentation.


For a live preview of the video data, we don't use AVCaptureVideoPreviewLayer (as many other apps probably do), but our own Metal-based renderer (using CVMetalTextureCacheCreateTextureFromImage to obtain a CVMetalTexture object from the CVImageBuffer we get from the AVCaptureVideoDataOutputSampleBufferDelegate callback). In addition, we're applying our own ROI-based image processing to certain parts of the frame buffer. That's our main reason for wanting full-resolution video data output from AVCaptureVideoDataOutput. We don't really care if the VDO frame rate is 30fps, 15fps, or just 5fps.


What I described above already works fine with "AVCaptureDevice setActiveFormat:" (a.k.a. AVCaptureSessionPresetInputPriority), with the important exception that AVCapturePhotoOutput refuses to capture RAW images (because the capture session was not configured with AVCaptureSessionPresetPhoto). However, when I use AVCaptureSessionPresetPhoto, I'm not getting full-resolution video data output from AVCaptureVideoDataOutput.


I accept that full-resolution VDO is "incompatible" with Live Photos for technical reasons. However, the same should not be true for capturing RAW (Bayer-mosaic) frame buffers. Would it not be possible in principle to get full-res VDO when Live Photo capture is disabled (i.e. AVCapturePhotoOutput setLivePhotoCaptureEnabled:FALSE)?

For the time being, RAW support and LivePhoto support are limited to AVCaptureSessionPresetPhoto, which unfortunately means you will not be able to get full-res video data output AND get RAW still images. Please file an enhancement request at bugreport.apple.com.

Filed as enhancement request #28027968.

Thanks, Brad, but I find myself needing to set the preset to inputPriority, otherwise when I add the device to the session, it gets reset. My code is:


movieOutput = AVCaptureMovieFileOutput()
session = AVCaptureSession()
try! device.lockForConfiguration()
defer { device.unlockForConfiguration() }
device.setFormatWithHighestIso()
// This iterates through the formats and sets the activeFormat to one with the highest maxISO,
// which is 2176 on the iPhone 5s.


device.activeVideoMinFrameDuration = CMTime(value: 1, timescale: 24)

session.sessionPreset = AVCaptureSessionPresetInputPriority
// If you remove this, the next line of code resets the format to one with maxISO 544
// (on the iPhone 5s).

session.addInput(try! AVCaptureDeviceInput(device: device))

session.addOutput(movieOutput)
session.startRunning()


What am I doing wrong?

Add your device to the session first. Then configure it to your liking.


Whenever you add an AVCaptureDeviceInput to a session, it configures the device's activeFormat to whatever is appropriate for its current sessionPreset. When the session's preset is input priority, it does not change the active format, since it is assumed the client (you) set the format you wanted already. But be aware that several device properties are reset when you add it to the session, such as its zoom factor, and active min/max frame rates, whether the preset is input priority or not. This is because AVCaptureDevices are singletons, and the principle of least surprise would dictate that a previous use of a device in another session not leave it in a specialized (wonky) state that's contrary to expected/standard use. This behavior is documented in AVCaptureDevice.h.


The most predictable way to get what you want is to add all your inputs and outputs first, then perform all your configuration, including setting of session presets, or setting of device active formats.
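
A minimal sketch of that ordering, assuming iOS and current Swift API names. makeSession and CaptureSetupError are hypothetical names, and the format to use is passed in by the caller rather than chosen here.

```swift
#if os(iOS)
import AVFoundation

/// Hypothetical error type for this sketch.
enum CaptureSetupError: Error {
    case cannotAddInput
    case cannotAddOutput
    case unsupportedFrameRate
}

/// Adds the input and output first, then configures the device last,
/// so that adding the input cannot reset the chosen settings.
func makeSession(device: AVCaptureDevice,
                 format: AVCaptureDevice.Format,
                 framesPerSecond: Int32) throws -> AVCaptureSession {
    let session = AVCaptureSession()

    // 1. Add inputs and outputs. Adding the input may reset the device's
    //    active format, zoom factor, and frame durations.
    do {
        session.beginConfiguration()
        defer { session.commitConfiguration() }

        let input = try AVCaptureDeviceInput(device: device)
        guard session.canAddInput(input) else { throw CaptureSetupError.cannotAddInput }
        session.addInput(input)

        let movieOutput = AVCaptureMovieFileOutput()
        guard session.canAddOutput(movieOutput) else { throw CaptureSetupError.cannotAddOutput }
        session.addOutput(movieOutput)
    }

    // 2. Check that the format supports the requested frame rate before
    //    applying it; an unsupported duration is rejected by the device.
    let fps = Double(framesPerSecond)
    guard format.videoSupportedFrameRateRanges.contains(where: {
        $0.minFrameRate <= fps && fps <= $0.maxFrameRate
    }) else { throw CaptureSetupError.unsupportedFrameRate }

    // 3. Configure the device. Setting activeFormat puts the session into
    //    input priority, so the preset does not need to be set by hand.
    try device.lockForConfiguration()
    defer { device.unlockForConfiguration() }
    device.activeFormat = format
    let frameDuration = CMTime(value: 1, timescale: framesPerSecond)
    device.activeVideoMinFrameDuration = frameDuration
    device.activeVideoMaxFrameDuration = frameDuration

    return session
}
#endif
```

With a format chosen beforehand (for example the one with the highest maxISO), the call would look like `let session = try makeSession(device: device, format: chosenFormat, framesPerSecond: 24)`, followed by `session.startRunning()`.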


Hope that helps.