Hello everyone,
I have a PyTorch model that outputs an image. I converted it to Core ML using coremltools, and the resulting model runs inference in my iOS project through MLModel's prediction function, which returns a result of type CVPixelBuffer.
I want to avoid allocating memory on every prediction call. Instead, I would like to reuse a pre-allocated buffer. I noticed that MLModel provides an overloaded prediction function that accepts an MLPredictionOptions object, and its outputBackings property lets me supply a pre-allocated CVPixelBuffer for the output.
However, when I attempt to do this, I encounter the following error:
Copy from tensor to pixel buffer (pixel_format_type: BGRA, image_pixel_type: BGR8, component_dtype: INT, component_pack: FMT_32) is not supported.
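To narrow it down, I ran one prediction without output backings and printed the pixel format of the buffer Core ML allocates on its own (a sketch reusing the same model and RifeInput wrapper shown below):

// Sketch: let Core ML allocate the output itself, then print its FourCC
// pixel format (kCVPixelFormatType_32BGRA would print as 0x42475241).
if let plain = try? model.prediction(from: RifeInput(img0: image1, img1: image2)),
   let buf = plain.featureValue(for: "out")?.imageBufferValue {
    print(String(format: "default output format: 0x%08x",
                 CVPixelBufferGetPixelFormatType(buf)))
}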
Could someone point out what I might be doing wrong? How can I make MLModel write into my pre-allocated CVPixelBuffer instead of creating a new one each time?
Here is the Python code I used to convert the PyTorch model to Core ML, where I specified the output's color_layout as ct.colorlayout.BGR:
import coremltools as ct
import torch


def export_ml(model, resolution="640x360"):
    ml_path = "model.mlpackage"
    print("exporting ml model")
    width, height = map(int, resolution.split('x'))

    # Trace the model with two dummy frames.
    img0 = torch.randn(1, 3, height, width)
    img1 = torch.randn(1, 3, height, width)
    traced_model = torch.jit.trace(model, (img0, img1))

    # Declare both inputs and the output as BGR images. Note: scale/bias are
    # only supported on ImageType inputs, so the output just sets its layout.
    input_shape = ct.Shape(shape=(1, 3, height, width))
    input_type_img0 = ct.ImageType(name="img0", shape=input_shape, color_layout=ct.colorlayout.BGR)
    input_type_img1 = ct.ImageType(name="img1", shape=input_shape, color_layout=ct.colorlayout.BGR)
    output_type_img = ct.ImageType(name="out", color_layout=ct.colorlayout.BGR)

    ml_model = ct.convert(
        traced_model,
        inputs=[input_type_img0, input_type_img1],
        outputs=[output_type_img],
    )
    ml_model.save(ml_path)
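For reference, this is how I checked what the converted model actually declares for its "out" feature on the device (a sketch; modelURL is a placeholder for wherever the compiled model lives):

import CoreML

// Sketch: print the image constraint the converted model declares for "out".
// `modelURL` is a placeholder, not from my real project.
let model = try! MLModel(contentsOf: modelURL)
if let constraint = model.modelDescription.outputDescriptionsByName["out"]?.imageConstraint {
    print("declared size: \(constraint.pixelsWide)x\(constraint.pixelsHigh)")
    // FourCC of the expected output format, e.g. 0x42475241 for 32BGRA
    print(String(format: "declared pixel format: 0x%08x", constraint.pixelFormatType))
}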
Here is the Swift code in my iOS project that calls the MLModel's prediction function:
func prediction(image1: CVPixelBuffer, image2: CVPixelBuffer, model: MLModel) -> CVPixelBuffer? {
    let options = MLPredictionOptions()
    guard let outputBuffer = outputBacking else {
        fatalError("Failed to create CVPixelBuffer.")
    }
    options.outputBackings = ["out": outputBuffer]

    // Perform the prediction
    guard let prediction = try? model.prediction(from: RifeInput(img0: image1, img1: image2), options: options) else {
        Log.i("Failed to perform prediction")
        return nil
    }

    // Extract the result
    guard let cvPixelBuffer = prediction.featureValue(for: "out")?.imageBufferValue else {
        Log.i("Failed to get results from the model")
        return nil
    }
    return cvPixelBuffer
}
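The intended call site looks roughly like this (frameA and frameB are stand-ins for my real input buffers):

// Hypothetical usage: if the output backing is honored, every call should
// land in the same pre-allocated buffer instead of a fresh allocation.
if let out = prediction(image1: frameA, image2: frameB, model: mlModel),
   let backing = outputBacking {
    assert(out === backing)
}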
Here is the code I used to create the outputBacking:
var outputBacking: CVPixelBuffer?

let attributes: [String: Any] = [
    kCVPixelBufferCGImageCompatibilityKey as String: true,
    kCVPixelBufferCGBitmapContextCompatibilityKey as String: true,
    kCVPixelBufferWidthKey as String: 640,
    kCVPixelBufferHeightKey as String: 360,
    // IOSurface backing, which output backings require
    kCVPixelBufferIOSurfacePropertiesKey as String: [:]
]
let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                 640,
                                 360,
                                 kCVPixelFormatType_32BGRA,
                                 attributes as CFDictionary,
                                 &outputBacking)
guard status == kCVReturnSuccess, let outputBuffer = outputBacking else {
    fatalError("Failed to create CVPixelBuffer.")
}
Any help or guidance would be greatly appreciated!
Thank you!