Here is the setup.
I have a UIImageView in which I draw some text, using UIGraphicsBeginImageContext.
I pass this image to the OCR function:
import UIKit
import Vision

func ocrText(onImage: UIImage?) {
    let request = VNRecognizeTextRequest { request, error in
        guard let observations = request.results as? [VNRecognizedTextObservation] else {
            fatalError("Received invalid observations")
        }
        print("observations", observations.count)
        for observation in observations {
            if observation.topCandidates(1).isEmpty {
                continue
            }
        }
    } // end of request handler
    request.recognitionLanguages = ["fr"]
    let requests = [request]
    DispatchQueue.global(qos: .userInitiated).async {
        let ocrGroup = DispatchGroup()
        guard let img = onImage?.cgImage else { return } // Conversion to cgImage works OK
        print("img", img, img.width)
        let (_, _) = onImage!.logImageSizeInKB(scale: 1) // custom helper, not shown here
        ocrGroup.enter()
        let handler = VNImageRequestHandler(cgImage: img, options: [:])
        do {
            try handler.perform(requests)
        } catch {
            // Report the failure instead of silently discarding it with try?
            print("OCR request failed:", error)
        }
        ocrGroup.leave()
        ocrGroup.wait()
    }
}
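The completion handler above only counts the observations. As a minimal sketch of how the recognized strings could be read out once observations do come back, the helper below (the name recognizedStrings is hypothetical, not part of Vision) keeps the best candidate of each observation:

```swift
import Vision

/// Hypothetical helper: returns the best candidate string of each observation.
func recognizedStrings(from observations: [VNRecognizedTextObservation]) -> [String] {
    // topCandidates(1) returns at most one candidate per observation;
    // observations without any candidate are skipped by compactMap.
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

It could be called inside the request handler, for example print(recognizedStrings(from: observations)).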
The problem is that observations is an empty array. I get the following logs:
img <CGImage 0x7fa53b350b60> (DP)
<<CGColorSpace 0x6000032f1e00> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1)>
width = 398, height = 164, bpc = 8, bpp = 32, row bytes = 1600
kCGImageAlphaPremultipliedFirst | kCGImageByteOrder32Little | kCGImagePixelFormatPacked
is mask? No, has masking color? No, has soft mask? No, has matte? No, should interpolate? Yes 398
ImageSize(KB): 5 ko
2022-06-02 17:21:03.734258+0200 App[6949:2718734] Metal API Validation Enabled
observations 0
This shows that the image is loaded and converted to a CGImage correctly, yet there are no observations.
Now, if I use the same function on a screenshot of the text drawn on screen, it works correctly.
Is there a difference between an image created by the camera and an image drawn in a CGContext?
Here is how mainImageView!.image (used in ocr) is created in a subclass of UIImageView:
override func touchesEnded(_ touches: Set<UITouch>, with event: UIEvent?) {
    // Merge tempImageView into mainImageView
    UIGraphicsBeginImageContext(mainImageView!.frame.size)
    mainImageView!.image?.draw(in: CGRect(x: 0, y: 0, width: frame.size.width, height: frame.size.height), blendMode: .normal, alpha: 1.0)
    tempImageView!.image?.draw(in: CGRect(x: 0, y: 0, width: frame.size.width, height: frame.size.height), blendMode: .normal, alpha: opacity)
    mainImageView!.image = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    tempImageView?.image = nil
}
I also draw the created image in a test UIImageView and get the correct image.
Here are the logs for the drawn text and for the screenshot:
Drawing: doesn't work
img <CGImage 0x7fb96b81a030> (DP)
<<CGColorSpace 0x600003322160> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1)>
width = 398, height = 164, bpc = 8, bpp = 32, row bytes = 1600
kCGImageAlphaPremultipliedFirst | kCGImageByteOrder32Little | kCGImagePixelFormatPacked
is mask? No, has masking color? No, has soft mask? No, has matte? No, should interpolate? Yes 398
ImageSize(KB): 5 ko
2022-06-02 15:38:51.115476+0200 Numerare[5313:2653328] Metal API Validation Enabled
observations 0
Screenshot: works
img <CGImage 0x7f97641720f0> (IP)
<<CGColorSpace 0x60000394c960> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; iMac)>
width = 570, height = 276, bpc = 8, bpp = 32, row bytes = 2280
kCGImageAlphaNoneSkipLast | 0 (default byte order) | kCGImagePixelFormatPacked
is mask? No, has masking color? No, has soft mask? No, has matte? No, should interpolate? Yes 570
ImageSize(KB): 5 ko
2022-06-02 15:43:32.158701+0200 Numerare[5402:2657059] Metal API Validation Enabled
2022-06-02 15:43:33.122941+0200 Numerare[5402:2657057] [WARNING] Resource not found for 'fr_FR'. Character language model will be disabled during language correction.
observations 1
Is there an issue with kCGColorSpaceModelRGB?
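Both logs report kCGColorSpaceModelRGB, so the color model itself is the same. What does differ is the alpha information: the drawn image is kCGImageAlphaPremultipliedFirst (it carries an alpha channel), while the screenshot is kCGImageAlphaNoneSkipLast (no alpha). A small check like the one below could make that difference explicit before calling ocrText. The helper name hasAlphaChannel is hypothetical, and whether alpha is really what leaves Vision without observations is an assumption, not something the logs prove.

```swift
import CoreGraphics

/// Hypothetical helper: true when the image carries a real alpha channel.
func hasAlphaChannel(_ image: CGImage) -> Bool {
    switch image.alphaInfo {
    case .none, .noneSkipFirst, .noneSkipLast:
        // No alpha stored, or the alpha byte is ignored
        return false
    default:
        // first, last, premultipliedFirst, premultipliedLast, alphaOnly
        return true
    }
}
```

For the two images logged above, this would return true for the drawn image and false for the screenshot.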
I finally found a way to make it work.
I save the image to a file as JPEG and read the file back.
This didn't work with PNG, but it works with JPEG.
Here is the simple code (in case someone has a better solution to propose):
var formattedImage: UIImage?
let imageData = imageView?.image?.jpegData(compressionQuality: 1.0) // image is drawn in an imageView
let fileManager = FileManager.default
let paths = NSSearchPathForDirectoriesInDomains(
    FileManager.SearchPathDirectory.documentDirectory,
    FileManager.SearchPathDomainMask.userDomainMask, true)
let documentsDirectory = paths[0] as NSString
let fileExt = "TempJpegImage.jpg"
let fileName = documentsDirectory.appendingPathComponent(fileExt) as String
fileManager.createFile(atPath: fileName, contents: imageData, attributes: nil) // Let's create a temp file
if let imageJPEG = UIImage(contentsOfFile: fileName) { // Read the image back
    formattedImage = imageJPEG
}
Now formattedImage is passed successfully to ocrText.
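A possible explanation, offered as an assumption rather than a confirmed diagnosis: JPEG cannot store an alpha channel while PNG can, so the JPEG round trip strips the transparency that UIGraphicsBeginImageContext leaves in the drawn image. If that is the cause, the temporary file may not be needed. The sketch below shows two alternatives, both with hypothetical helper names: doing the JPEG round trip in memory, and redrawing the image on an opaque white background with UIGraphicsImageRenderer.

```swift
import UIKit

/// Hypothetical helper: JPEG-encodes and decodes in memory, without a temp file.
/// The decoded image has no alpha channel because JPEG cannot store one.
func jpegRoundTrip(_ image: UIImage) -> UIImage? {
    guard let data = image.jpegData(compressionQuality: 1.0) else { return nil }
    return UIImage(data: data)
}

/// Hypothetical helper: redraws the image on an opaque white background.
func flattenedOnWhite(_ image: UIImage) -> UIImage {
    let format = UIGraphicsImageRendererFormat.default()
    format.opaque = true        // no alpha channel in the rendered bitmap
    format.scale = image.scale  // keep the pixel dimensions of the source
    let renderer = UIGraphicsImageRenderer(size: image.size, format: format)
    return renderer.image { context in
        // Fill the whole canvas with white, then draw the text image on top
        UIColor.white.setFill()
        context.fill(CGRect(origin: .zero, size: image.size))
        image.draw(at: .zero)
    }
}
```

Either result could then be passed on, for example ocrText(onImage: flattenedOnWhite(drawnImage)), where drawnImage stands for the image taken from the image view. Neither variant has been verified against the original project.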