I have trained a model to classify some symbols using Create ML.
In my app I am using VNImageRequestHandler and VNCoreMLRequest to classify image data.
If I use a CVPixelBuffer obtained from an AVCaptureSession then the classifier runs as I would expect. If I point it at the symbols it will work fairly accurately, so I know the model is trained fairly correctly and works in my app.
If I try to use a cgImage that is obtained by cropping a section out of a larger image (from the gallery), then the classifier does not work. It always seems to return the same result (although the confidence is not a 1.0 and varies for each image, it will be to within several decimal points of it, eg 9.9999).
If I pause the app when I have the cropped image and use the debugger to obtain the cropped image (via the little eye icon and then open in preview), then drop the image into the Preview section of the MLModel file or in Create ML, the model correctly classifies the image.
If I scale the cropped image to be the same size as I get from my camera, and convert the cgImage to a CVPixelBuffer with same size and colour space to be the same as the camera (1504, 1128, kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange) then I get some difference in ouput, it's not accurate, but it returns different results if I specify the 'centerCrop' or 'scaleFit' options. So I know that 'something' is happening, but it's not the correct thing.
I was under the impression that passing a cgImage to the VNImageRequestHandler would perform the necessary conversions, but experimentation shows this is not the case. However, when using the preview tool on the model or in Create ML this conversion is obviously being done behind the scenes because the cropped part is being detected.
What am I doing wrong.
tl;dr
- my model works, as backed up by using video input directly and also dropping cropped images into preview sections
- passing the cropped images directly to the VNImageRequestHandler does not work
- modifying the cropped images can produce different results, but I cannot see what I should be doing to get reliable results.
- I'd like my app to behave the same way the preview part behaves, I give it a cropped part of an image, it does some processing, it goes to the classifier, it returns a result same as in Create ML.