I am using VNRecognizeTextRequest to read Chinese characters. It works fine with text written horizontally, but if even two characters are stacked vertically, nothing is recognized. Does anyone know how to get the Vision framework to either handle vertical text or recognize characters individually when working with Chinese?
I am setting recognitionLevel to .accurate, since .fast does not recognize any Chinese characters at all. I would love to be able to use fast recognition and handle the characters individually, but it just doesn't seem to work with Chinese. With .accurate, if I take a picture of any amount of text that is arranged vertically, nothing is recognized. I can take a picture of one character and it works, but if I add just one more character below it, nothing is recognized. It's bizarre.
I've tried setting usesLanguageCorrection = false and tried using VNRecognizeTextRequestRevision3, ...Revision2 and ...Revision1. Strangely enough, revision 2 seems to recognize some vertical text, but the bounding boxes are off and the recognized text is sometimes wrong.
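For anyone trying to reproduce this, one way to see which languages each recognition level and revision claims to support is to ask the request itself. This is only a diagnostic sketch: the function name printSupportedLanguages is made up for illustration, and it assumes iOS 16 / macOS 13 or later so that all three revisions and the instance method supportedRecognitionLanguages() are available.

```swift
import Vision

// Hypothetical helper (not part of Vision): prints the languages Vision
// reports for every recognition level / revision combination.
@available(iOS 16.0, macOS 13.0, *)
func printSupportedLanguages() {
    let levels: [(name: String, level: VNRequestTextRecognitionLevel)] = [
        (name: "fast", level: .fast),
        (name: "accurate", level: .accurate),
    ]
    let revisions = [
        VNRecognizeTextRequestRevision1,
        VNRecognizeTextRequestRevision2,
        VNRecognizeTextRequestRevision3,
    ]

    for (name, level) in levels {
        for revision in revisions {
            // Configure a throwaway request the same way the real one would be.
            let request = VNRecognizeTextRequest()
            request.recognitionLevel = level
            request.revision = revision
            do {
                // The reported list depends on the level and revision set above.
                let languages = try request.supportedRecognitionLanguages()
                print("\(name), revision \(revision): \(languages)")
            } catch {
                print("\(name), revision \(revision): failed with \(error)")
            }
        }
    }
}
```

If "zh-Hans" and "zh-Hant" are missing from the list for a given combination, that combination cannot be expected to return Chinese text regardless of layout.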
I tried playing with DataScannerViewController and it's able to recognize characters in vertical text, but I can't figure out how to replicate it with VNRecognizeTextRequest. The problem with using DataScannerViewController is that it treats the whole text block as one item, and it uses the live camera buffer. As soon as I capture a photo, I still have to use VNRecognizeTextRequest.
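On the point about handling characters individually: VNRecognizedText can report a bounding box for a sub-range of its string through boundingBox(for:), so a recognized line can in principle be split into per-character boxes after the fact. The sketch below shows that idea. The helper name characterBoxes is hypothetical, it does nothing to make vertical text get recognized in the first place, and the boxes Vision returns may be coarser than a single character.

```swift
import Vision

// Hypothetical helper (not part of Vision): splits the top candidate of one
// observation into (character, normalized bounding box) pairs.
func characterBoxes(
    in observation: VNRecognizedTextObservation
) -> [(character: Character, box: CGRect)] {
    guard let candidate = observation.topCandidates(1).first else { return [] }

    let text = candidate.string
    var result: [(character: Character, box: CGRect)] = []
    var index = text.startIndex

    while index < text.endIndex {
        let next = text.index(after: index)
        // boundingBox(for:) can throw or return nil; such characters are skipped.
        if let rectangle = try? candidate.boundingBox(for: index..<next) {
            // boundingBox is in normalized image coordinates (origin bottom-left).
            result.append((character: text[index], box: rectangle.boundingBox))
        }
        index = next
    }
    return result
}
```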
Below is a code snippet of how I'm using VNRecognizeTextRequest. There's not really much to it and there aren't many other parameters I can try out (plus I've already played around with them). I've also attached a sample image with text laid out vertically.
import Vision

func detectText(
    in sourceImage: CGImage,
    oriented orientation: CGImagePropertyOrientation
) async throws -> [VNRecognizedTextObservation] {
    return try await withCheckedThrowingContinuation { continuation in
        let request = VNRecognizeTextRequest { request, error in
            // ...
            continuation.resume(returning: observations)
        }
        request.recognitionLevel = .accurate
        request.recognitionLanguages = ["zh-Hant", "zh-Hans"]
        // doesn't seem to have any impact
        // request.usesLanguageCorrection = false
        do {
            let requestHandler = VNImageRequestHandler(
                cgImage: sourceImage,
                orientation: orientation
            )
            try requestHandler.perform([request])
        } catch {
            continuation.resume(throwing: error)
        }
    }
}
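For testing outside the async wrapper, the same configuration can be run synchronously by reading the results from the request after perform returns. This is a minimal sketch rather than a fix for the vertical-text problem. The name detectTextSynchronously is made up here, and it assumes iOS 15 / macOS 12 or later, where results is typed as [VNRecognizedTextObservation]?. It also side-steps the question of whether the completion handler and the catch block could both try to resume the continuation.

```swift
import ImageIO
import Vision

// Hypothetical synchronous variant of detectText, with the same settings.
// perform(_:) blocks, so call this off the main thread.
func detectTextSynchronously(
    in sourceImage: CGImage,
    oriented orientation: CGImagePropertyOrientation
) throws -> [VNRecognizedTextObservation] {
    // No completion handler: results are read from the request afterwards.
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.recognitionLanguages = ["zh-Hant", "zh-Hans"]

    let requestHandler = VNImageRequestHandler(
        cgImage: sourceImage,
        orientation: orientation
    )
    try requestHandler.perform([request])

    // results is nil when nothing was produced; treat that as "no text found".
    return request.results ?? []
}
```

An empty array here for a vertical sample image would confirm that the behavior comes from the request itself and not from the continuation plumbing.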