When is Preprocessing necessary

I was under the impression that Vision handles scaling of images; am I to understand that it does not? If I have a Core ML model whose image input is 416 x 416 and the image being passed to the model is 1920 x 1080, should I be using Core Image to scale it down to the size the input expects?

Vision does indeed resize your images, according to the VNImageCropAndScaleOption that you set on the VNCoreMLRequest object.

If you're not using Vision, but you're using the Core ML API, you'll have to do the resizing yourself. Or you can give the model flexible inputs so that it can handle images of different sizes.
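
For example, here is a minimal sketch of the Vision path. The model name "ObjectDetector" and the observation type are placeholders; substitute your own compiled model and its actual output type.

```swift
import Foundation
import CoreML
import CoreGraphics
import Vision

// Sketch only: "ObjectDetector" stands in for your own compiled model,
// and the observation type depends on what the model actually outputs.
func detectObjects(in cgImage: CGImage) throws {
    guard let modelURL = Bundle.main.url(forResource: "ObjectDetector",
                                         withExtension: "mlmodelc") else { return }
    let visionModel = try VNCoreMLModel(for: MLModel(contentsOf: modelURL))

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        if let observations = request.results as? [VNRecognizedObjectObservation] {
            print(observations)
        }
    }
    // Vision resizes the 1920 x 1080 frame to the model's 416 x 416 input:
    // .scaleFill stretches, .scaleFit letterboxes, .centerCrop crops the middle.
    request.imageCropAndScaleOption = .scaleFill

    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
}
```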

Accepted Answer
As @kerfuffle mentioned, you can use the imageCropAndScaleOption property on Vision's VNCoreMLRequest to control how Vision scales inputs to match the requirements of the Core ML model. See: https://developer.apple.com/documentation/vision/vncoremlrequest/2890144-imagecropandscaleoption


If you are using Core ML directly, you can also use some of the MLFeatureValue constructors to help with resizing. In particular, the constructors that take a CGImage or an image URL let you specify the desired size either in pixels or via the corresponding MLImageConstraint for the input feature. These constructors also take an options dictionary in which you can supply .cropAndScale as the key and a VNImageCropAndScaleOption as the value. See: https://developer.apple.com/documentation/coreml/mlfeaturevalue/3200161-init
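
A sketch of that Core ML path is below. The input feature name "image" is an assumption; check your model's description for the real name.

```swift
import CoreML
import CoreGraphics
import Vision

// Sketch only: assumes the model has an image input feature named "image".
func imageFeature(for model: MLModel, from cgImage: CGImage) throws -> MLFeatureValue? {
    guard let constraint = model.modelDescription
            .inputDescriptionsByName["image"]?.imageConstraint else { return nil }

    // Core ML resizes the CGImage to match the constraint (e.g. 416 x 416),
    // using the crop-and-scale behavior passed in the options dictionary.
    return try MLFeatureValue(
        cgImage: cgImage,
        constraint: constraint,
        options: [.cropAndScale: VNImageCropAndScaleOption.scaleFill.rawValue]
    )
}
```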

Also note that with the Xcode 12 beta, the code-generated interface for your model lets you supply image inputs as CGImages or URLs, and it performs a default resizing for you via this MLFeatureValue mechanism.
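
As a rough illustration of how that generated interface can be used (the "ObjectDetector" class and the init(imageWith:) initializer below are hypothetical; the names Xcode actually generates depend on your model and its input feature names):

```swift
import CoreML
import CoreGraphics

// Hypothetical: "ObjectDetector" and "ObjectDetectorInput(imageWith:)" stand in
// for whatever Xcode 12 generates from your .mlmodel and its input names.
func predict(from cgImage: CGImage) throws {
    let model = try ObjectDetector(configuration: MLModelConfiguration())
    // The generated input type accepts a CGImage directly and resizes it
    // to the model's expected input size via MLFeatureValue under the hood.
    let input = try ObjectDetectorInput(imageWith: cgImage)
    let output = try model.prediction(input: input)
    print(output)
}
```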
