Issue with Using Pre-Allocated CVPixelBuffer for CoreML Model Prediction
Hello everyone,

I have a PyTorch model that outputs an image. I converted it to CoreML with coremltools, and I use the resulting model in my iOS project by calling MLModel's prediction function, which returns the result as a CVPixelBuffer.

I want to avoid allocating memory every time I call the prediction function and use a pre-allocated buffer instead. MLModel provides an overloaded prediction function that accepts an MLPredictionOptions object. This object has an outputBackings member, which lets me pass a pre-allocated CVPixelBuffer. However, when I do this, I get the following error:

    Copy from tensor to pixel buffer (pixel_format_type: BGRA, image_pixel_type: BGR8, component_dtype: INT, component_pack: FMT_32) is not supported.

Could someone point out what I might be doing wrong? How can I make MLModel use my pre-allocated CVPixelBuffer instead of creating a new one each time?

Here is the Python code I used to convert the PyTorch model to CoreML, where I specified the color_layout as coremltools.colorlayout.BGR:

    import torch
    import coremltools as ct

    def export_ml(model, resolution="640x360"):
        ml_path = "model.mlpackage"
        print("exporting ml model")
        width, height = map(int, resolution.split('x'))

        # Dummy inputs used only for tracing
        img0 = torch.randn(1, 3, height, width)
        img1 = torch.randn(1, 3, height, width)
        traced_model = torch.jit.trace(model, (img0, img1))

        input_shape = ct.Shape(shape=(1, 3, height, width))
        output_type_img = ct.ImageType(name="out", scale=1.0, bias=[0, 0, 0], color_layout=ct.colorlayout.BGR)

        # input_type_img0 and input_type_img1 are defined elsewhere (omitted here)
        ml_model = ct.convert(
            traced_model,
            inputs=[input_type_img0, input_type_img1],
            outputs=[output_type_img]
        )
        ml_model.save(ml_path)

Here is the Swift code in my iOS project that calls the MLModel's prediction function:

    func prediction(image1: CVPixelBuffer, image2: CVPixelBuffer, model: MLModel) -> CVPixelBuffer? {
        let options = MLPredictionOptions()
        guard let outputBuffer = outputBacking else {
            fatalError("Failed to create CVPixelBuffer.")
        }
        options.outputBackings = ["out": outputBuffer]

        // Perform the prediction
        guard let prediction = try? model.prediction(from: RifeInput(img0: image1, img1: image2), options: options) else {
            Log.i("Failed to perform prediction")
            return nil
        }

        // Extract the result
        guard let cvPixelBuffer = prediction.featureValue(for: "out")?.imageBufferValue else {
            Log.i("Failed to get results from the model")
            return nil
        }
        return cvPixelBuffer
    }

Here is the code I used to create the outputBacking:

    let attributes: [String: Any] = [
        kCVPixelBufferCGImageCompatibilityKey as String: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey as String: true,
        kCVPixelBufferWidthKey as String: Int(640),
        kCVPixelBufferHeightKey as String: Int(360),
        kCVPixelBufferIOSurfacePropertiesKey as String: [:]
    ]
    let status = CVPixelBufferCreate(kCFAllocatorDefault, 640, 360, kCVPixelFormatType_32BGRA, attributes as CFDictionary, &outputBacking)
    guard let outputBuffer = outputBacking else {
        fatalError("Failed to create CVPixelBuffer.")
    }

Any help or guidance would be greatly appreciated! Thank you!
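
For reference, the export resolution and the pre-allocated buffer have to describe the same image: the "640x360" string in export_ml becomes a (1, 3, 360, 640) tensor shape, while the Swift side allocates a 640 by 360 buffer in kCVPixelFormatType_32BGRA at 4 bytes per pixel. A minimal Python sketch of that bookkeeping follows. The helper names parse_resolution, nchw_shape, and min_bgra_bytes are hypothetical and not part of coremltools or CoreML, and the byte count assumes no row padding, which a real CVPixelBuffer may add.

```python
def parse_resolution(resolution: str) -> tuple[int, int]:
    """Split a "WIDTHxHEIGHT" string into (width, height), as export_ml does."""
    width, height = map(int, resolution.split("x"))
    return width, height


def nchw_shape(resolution: str) -> tuple[int, int, int, int]:
    """Shape of the dummy tracing tensors: (batch, channels, height, width)."""
    width, height = parse_resolution(resolution)
    return (1, 3, height, width)


def min_bgra_bytes(resolution: str) -> int:
    """Lower bound on the size of a 32-bit BGRA buffer (4 bytes per pixel).

    Assumption: rows are tightly packed. A real CVPixelBuffer may report a
    larger bytes-per-row value because of alignment padding.
    """
    width, height = parse_resolution(resolution)
    return width * height * 4


if __name__ == "__main__":
    print(parse_resolution("640x360"))
    print(nchw_shape("640x360"))
    print(min_bgra_bytes("640x360"))
```

Note that the tensor shape lists height before width, whereas CVPixelBufferCreate takes width first, so the two call sites are easy to get out of sync when the resolution changes.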
Sep ’24