In the code below, the sampleBufferProcessor closure is where the Vision body pose detection occurs.
    /// Transfers the sample data from the AVAssetReaderOutput to the AVAssetWriterInput,
    /// processing via a CMSampleBufferProcessor.
    ///
    /// - Parameters:
    ///   - readerOutput: The source sample data.
    ///   - writerInput: The destination for the sample data.
    ///   - queue: The DispatchQueue.
    ///   - completionHandler: The completion handler to run when the transfer finishes.
    /// - Tag: transferSamplesAsynchronously
    private func transferSamplesAsynchronously(from readerOutput: AVAssetReaderOutput,
                                               to writerInput: AVAssetWriterInput,
                                               onQueue queue: DispatchQueue,
                                               sampleBufferProcessor: SampleBufferProcessor,
                                               completionHandler: @escaping () -> Void) {
        /*
         The writerInput continuously invokes this closure until finished or
         cancelled. It throws an NSInternalInconsistencyException if called more
         than once for the same writer.
         */
        writerInput.requestMediaDataWhenReady(on: queue) {
            var isDone = false
            /*
             While the writerInput accepts more data, process the sampleBuffer
             and then transfer the processed sample to the writerInput.
             */
            while writerInput.isReadyForMoreMediaData {
                if self.isCancelled {
                    isDone = true
                    break
                }
                // Get the next sample from the asset reader output.
                guard let sampleBuffer = readerOutput.copyNextSampleBuffer() else {
                    // The asset reader output has no more samples to vend.
                    isDone = true
                    break
                }
                // Process the sample, if requested.
                do {
                    try sampleBufferProcessor?(sampleBuffer)
                } catch {
                    /*
                     The `readingAndWritingDidFinish()` function picks up this
                     error.
                     */
                    self.sampleTransferError = error
                    isDone = true
                }
                // Append the sample to the asset writer input.
                guard writerInput.append(sampleBuffer) else {
                    /*
                     The writer could not append the sample buffer.
                     The `readingAndWritingDidFinish()` function handles any
                     error information from the asset writer.
                     */
                    isDone = true
                    break
                }
            }
            if isDone {
                /*
                 Calling `markAsFinished()` on the asset writer input does the
                 following:
                 1. Unblocks any other inputs needing more samples.
                 2. Cancels further invocations of this "request media data"
                    callback block.
                 */
                writerInput.markAsFinished()
                /*
                 Tell the caller the reader output and writer input finished
                 transferring samples.
                 */
                completionHandler()
            }
        }
    }
The processor closure runs body pose detection on every sample buffer. Later, in the VNDetectHumanBodyPoseRequest completion handler, the VNHumanBodyPoseObservation results are fed into a custom Core ML action classifier.
    private func videoProcessorForActivityClassification() -> SampleBufferProcessor {
        let videoProcessor: SampleBufferProcessor = { sampleBuffer in
            do {
                let requestHandler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer)
                try requestHandler.perform([self.detectHumanBodyPoseRequest])
            } catch {
                print("Unable to perform the request: \(error.localizedDescription).")
            }
        }
        return videoProcessor
    }
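For context, here is a minimal sketch of how a request like detectHumanBodyPoseRequest and its completion handler could be wired up. It is not the exact code from the project: the PoseRequestSketch class and its poseWindow property are hypothetical names used only for illustration, and the hand-off to the Core ML action classifier is left as a comment because the model itself is not shown here.

```swift
import CoreML
import Vision

/// Hypothetical holder for the pose request; all names here are illustrative.
final class PoseRequestSketch {
    /// Keypoint arrays collected so far, one per frame in which a body was found.
    private(set) var poseWindow: [MLMultiArray] = []

    /// The request that the sample buffer processor performs for each frame.
    lazy var detectHumanBodyPoseRequest: VNDetectHumanBodyPoseRequest =
        VNDetectHumanBodyPoseRequest(completionHandler: { [weak self] request, error in
            // Skip the frame if Vision reported an error or found no body.
            guard error == nil,
                  let observations = request.results as? [VNHumanBodyPoseObservation],
                  let observation = observations.first,
                  // Convert the detected joints into the array form a classifier consumes.
                  let keypoints = try? observation.keypointsMultiArray() else {
                return
            }
            self?.poseWindow.append(keypoints)
            // Assumption: once enough frames are collected, the window would be
            // passed to the custom Core ML action classifier at this point.
        })
}
```

With this shape, the processor closure above would perform the holder's detectHumanBodyPoseRequest once per sample buffer, so the handler runs once for every frame of the video.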
How can I improve the performance of this pipeline? When I tested it with an hour-long 4K video at 60 FPS, processing took several hours running as a Mac Catalyst app on an M1 Max.