Unable to replicate CreateML performance outside of CreateML

I've trained an MLImageClassifier with a directory of images, and it reports 100% accuracy and recall on the training set. However, when I test the model outside of the MLImageClassifier evaluation, I get poor performance on that same training set.


Originally I used the Vision framework with an exported MLModel, which is where I first noticed the degraded accuracy. However, I can reproduce the same inaccurate results with the `prediction(from:)` method on MLImageClassifier itself.


I suspect it has something to do with the way MLImageClassifier loads the images from their URLs - they are being prepared in some way I don't know how to replicate.


I believe I've accounted for image orientation. The images all have an EXIF orientation of 1, which seems to correspond to `.up`. Explicitly setting this in a VNImageRequestHandler produces the same result.
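For what it's worth, this is roughly how the orientation tag can be verified with ImageIO (a sketch; `exifOrientation(of:)` is just a helper name, not an API):

```swift
import Foundation
import ImageIO

// Sketch: read the EXIF/TIFF orientation tag from an image file.
// A value of 1 corresponds to CGImagePropertyOrientation.up.
func exifOrientation(of url: URL) -> CGImagePropertyOrientation? {
    guard let source = CGImageSourceCreateWithURL(url as CFURL, nil),
          let props = CGImageSourceCopyPropertiesAtIndex(source, 0, nil) as? [CFString: Any],
          let raw = props[kCGImagePropertyOrientation] as? UInt32 else {
        return nil  // no orientation tag; consumers generally assume .up
    }
    return CGImagePropertyOrientation(rawValue: raw)
}
```

Checking that `exifOrientation(of: testImageUrl)` is `.up` (or nil, meaning untagged) confirms the file really is stored upright.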


guard let builder = try? MLImageClassifier(trainingData: .labeledDirectories(at: fileUrl)) else {
    fatalError("Could not load data source")
}

// 100% precision and recall
let metrics = builder.evaluation(on: .labeledDirectories(at: URL(fileURLWithPath: "/Users/rob/test_images/")))

let testImageUrl = URL(fileURLWithPath:"/Users/rob/test_images/5.2.jpeg")
let image = NSImage(contentsOf: testImageUrl)!
let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil)!

let r1 = try! builder.prediction(from: testImageUrl)   //CORRECT!
let r2 = try! builder.prediction(from: cgImage)          //INCORRECT!
print("predictions: \(r1) and \(r2)")

////////

// Let's try again using the Vision framework:

func detect(cgImage: CGImage, model: MLModel) {
    guard let vnModel = try? VNCoreMLModel(for: model) else { return }

    let request = VNCoreMLRequest(model: vnModel) { request, error in
        guard let results = request.results as? [VNClassificationObservation],
            let topResult = results.first else {
                fatalError("unexpected result type from VNCoreMLRequest")
        }
        print(topResult)
    }

    request.imageCropAndScaleOption = .scaleFit

    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            try handler.perform([request])
        } catch {
            print(error)
        }
    }
}

let model = builder.model

detect(cgImage: cgImage, model: model)  // Prints the same inaccurate result as the r2 prediction above



====== UPDATE =======



It seems I can get the correct results by converting my image to Data and then using the `VNImageRequestHandler(data:)` initializer.


On macOS:


let handler = VNImageRequestHandler(data: nsImage.tiffRepresentation!, options: [:])


On iOS:


let imageData = UIImage(cgImage: cgImage).pngData()!
let handler = VNImageRequestHandler(data: imageData, options: [:])


I also tried the pixelBuffer-based initializer, but it had the same poor results as the CGImage one. The initializer that takes a URL worked well, but it isn't practical for me because I want to handle images that aren't saved anywhere.
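For completeness, an in-memory CGImage can still go through the data-based path on macOS without touching disk, by encoding it to PNG with ImageIO first (a sketch; `pngData(from:)` is my own helper, not an API):

```swift
import Foundation
import ImageIO

// Sketch: encode an in-memory CGImage as PNG data so it can be handed to
// VNImageRequestHandler(data:) without saving the image anywhere.
func pngData(from cgImage: CGImage) -> Data? {
    let data = NSMutableData()
    // "public.png" is the PNG uniform type identifier.
    guard let dest = CGImageDestinationCreateWithData(data as CFMutableData, "public.png" as CFString, 1, nil) else {
        return nil
    }
    CGImageDestinationAddImage(dest, cgImage, nil)
    guard CGImageDestinationFinalize(dest) else { return nil }
    return data as Data
}

// Usage:
// if let png = pngData(from: cgImage) {
//     let handler = VNImageRequestHandler(data: png, options: [:])
//     try handler.perform([request])
// }
```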