naturalSize is (4.0, 3.0) even though the source is (1280x960)

I need to merge multiple videos into a single track and, at the same time, save the last frame of each source track.

Here is what I'm doing:

Code Block swift
var composition = AVMutableComposition()
guard let track = self.composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid) else {
    return
}
for i in 0..<self.videos.count {
    guard let asset_track = self.videos[i].asset.tracks(withMediaType: .video).first else {
        return
    }
    let trackStart = CMTime(seconds: self.videos[i].date.timeIntervalSince(start), preferredTimescale: 10000)
    try? track.insertTimeRange(asset_track.timeRange, of: asset_track, at: trackStart)
    let p = AVPlayerItem(asset: track.asset!)
    let q = AVPlayer(playerItem: p)
    print(track.naturalSize)
    framesTimeRanges.append(NSValue(time: CMTime(value: 0, timescale: 10000)))
}
let generator = AVAssetImageGenerator(asset: track.asset!)
generator.requestedTimeToleranceBefore = CMTime.zero
generator.requestedTimeToleranceAfter = CMTime.zero
var i = self.videos.count - 1
generator.generateCGImagesAsynchronously(forTimes: framesTimeRanges.reversed(), completionHandler: { time, image, actual, result, error in
    if result == .succeeded {
        self.videos[i].lastFrame = NSImage(cgImage: image!, size: NSSize(width: 1280, height: 960))
    }
    i = i - 1
})


The print(track.naturalSize) call in the loop outputs (4.0, 3.0), and that's where things start to go sideways.

After that, every CGImage I get in the generateCGImagesAsynchronously completion handler is a tiny 4x3 image.

This happens only if I use the composition track (track in the code above). If I use each individual source track (asset_track in the loop) to get its last frame, it works fine whether I use generateCGImagesAsynchronously or copyCGImage, but I was hoping to call generateCGImagesAsynchronously only once to improve performance.
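For reference, the per-asset approach that does work for me looks roughly like the sketch below. The helper names lastFrameTime and copyLastFrame are made up for this post, and stepping back one minFrameDuration from the end of the track is only my assumption about where the last frame sits.

```swift
import AVFoundation

/// Hypothetical helper: a time one frame before the end of `timeRange`.
/// Falls back to the end of the range when the frame duration is unknown,
/// which may not resolve to a frame when the tolerances are zero.
func lastFrameTime(in timeRange: CMTimeRange, frameDuration: CMTime) -> CMTime {
    guard frameDuration.isValid, frameDuration > .zero else {
        return timeRange.end
    }
    // Never step back past the start of the range.
    return max(timeRange.end - frameDuration, timeRange.start)
}

/// Hypothetical helper: copies the last frame of one source asset synchronously.
func copyLastFrame(of asset: AVAsset) throws -> CGImage? {
    guard let track = asset.tracks(withMediaType: .video).first else {
        return nil
    }
    let generator = AVAssetImageGenerator(asset: asset)
    generator.requestedTimeToleranceBefore = .zero
    generator.requestedTimeToleranceAfter = .zero
    let time = lastFrameTime(in: track.timeRange, frameDuration: track.minFrameDuration)
    return try generator.copyCGImage(at: time, actualTime: nil)
}
```

Calling copyLastFrame(of: self.videos[i].asset) once per video gives correctly sized images, at the cost of one generator per asset.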

I should add that the composition actually plays in its AVPlayerLayer; it's just that its internal size representation seems to be (4.0, 3.0) for some reason.

How can I generate frames from the composition track at the correct size of the actual video in the track?

After more digging, it seems that the videos I'm working with do indeed have a naturalSize of (4.0, 3.0).

When I open them in QuickTime and bring up the Inspector, it says they are displayed at a Current Scale of 272x or so, while a regular HD video I took from the internet is displayed at 0.62x, for example.
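To compare the two sizes in code, something like the sketch below could be used. It reads the pixel dimensions from the track's first format description and sets them against naturalSize. The helper names encodedSize and scaleFactor are made up for this post, and I'm assuming the format description reports the 1280x960 that the files are supposed to have.

```swift
import AVFoundation

/// Hypothetical helper: pixel dimensions stored in the track's first
/// format description, or nil if the track has none.
func encodedSize(of track: AVAssetTrack) -> CGSize? {
    guard let first = track.formatDescriptions.first else {
        return nil
    }
    // formatDescriptions is typed [Any]; its elements are CMFormatDescription values.
    let description = first as! CMFormatDescription
    let dimensions = CMVideoFormatDescriptionGetDimensions(description)
    return CGSize(width: CGFloat(dimensions.width), height: CGFloat(dimensions.height))
}

/// Hypothetical helper: per-axis ratio between the encoded size and the natural size.
func scaleFactor(from naturalSize: CGSize, to encoded: CGSize) -> CGSize? {
    guard naturalSize.width > 0, naturalSize.height > 0 else {
        return nil
    }
    return CGSize(width: encoded.width / naturalSize.width,
                  height: encoded.height / naturalSize.height)
}
```

If the encoded size really is 1280x960, then scaleFactor(from: track.naturalSize, to: encoded) with a naturalSize of (4.0, 3.0) comes out at 320 on both axes, so the mismatch would be a uniform scale rather than an aspect ratio change.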

Does anybody know:
  • what metadata that information comes from and how to deal with it?

  • why it is different on the original track than when the track is put in a composition?

  • how I can extract a frame from such videos?
