Hello,
I'm wondering if there's any image manipulation done internally by the Vision framework itself, before the text recognition is performed?
I'm asking because I know there's a CIDocumentEnhancer filter that I could apply to the image before passing it over to the VNRecognizeTextRequest. However, I would like to avoid it, if something similar is already done internally.
If no manipulation is done internally, would you recommend performing something like the following?
    let capturedImage = CIImage(cvPixelBuffer: pixelBuffer)
    let filter = CIFilter(name: "CIDocumentEnhancer")
    filter?.setValue(capturedImage, forKey: kCIInputImageKey)
    filter?.setValue(5, forKey: kCIInputAmountKey)
    let filteredImage = filter?.outputImage
Also, in my testing, the text recognition request consistently took ~10% longer when given the filtered image than when given the original captured image. Any ideas why?
Looking forward to hearing your thoughts. Thanks!
VNRecognizeTextRequest does not do any preprocessing. If you know what kind of content you are trying to read, you can significantly improve the results by preprocessing the image with Core Image, for example by enhancing contrast, or by correcting perspective in combination with the rectangle detector.
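As a rough illustration of the preprocessing described above, here is a minimal sketch that detects a rectangle, corrects perspective, boosts contrast, and then runs text recognition. The function name `recognizeText(in:)`, the choice of `CIColorControls`, and the contrast value of 1.2 are assumptions made for this example, not recommendations from the Vision team. It also assumes the iOS 15 / macOS 12 SDK or later, where request results are typed.

```swift
import CoreImage
import Vision

/// Hypothetical helper: preprocesses a captured frame, then recognizes text in it.
func recognizeText(in pixelBuffer: CVPixelBuffer) throws -> [String] {
    var image = CIImage(cvPixelBuffer: pixelBuffer)

    // 1. Look for the most prominent rectangle (e.g. a page or a sign).
    let rectangleRequest = VNDetectRectanglesRequest()
    rectangleRequest.maximumObservations = 1
    try VNImageRequestHandler(ciImage: image, options: [:]).perform([rectangleRequest])

    // 2. If one was found, straighten it with perspective correction.
    if let rect = rectangleRequest.results?.first {
        let size = image.extent.size
        // Vision reports normalized corners (origin at bottom-left),
        // so scale them to pixel coordinates for Core Image.
        func vector(_ p: CGPoint) -> CIVector {
            CIVector(x: p.x * size.width, y: p.y * size.height)
        }
        image = image.applyingFilter("CIPerspectiveCorrection", parameters: [
            "inputTopLeft": vector(rect.topLeft),
            "inputTopRight": vector(rect.topRight),
            "inputBottomLeft": vector(rect.bottomLeft),
            "inputBottomRight": vector(rect.bottomRight)
        ])
    }

    // 3. Boost contrast. The value 1.2 is arbitrary; tune it for your content.
    image = image.applyingFilter("CIColorControls", parameters: [
        kCIInputContrastKey: 1.2
    ])

    // 4. Run text recognition on the preprocessed image.
    let textRequest = VNRecognizeTextRequest()
    textRequest.recognitionLevel = .accurate
    try VNImageRequestHandler(ciImage: image, options: [:]).perform([textRequest])

    // Keep the top candidate string for each recognized text observation.
    return (textRequest.results ?? []).compactMap { $0.topCandidates(1).first?.string }
}
```

Whether each step helps depends on the input, so it is worth comparing recognition results with and without each filter on your own images.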