




VNRecognizedText returns wrong bounding box
I am trying to parse text from an image, split it into words, and store the words in a String array. I also want to store the bounding box of each recognized word. My code works, but for some reason the bounding boxes come out wrong for words that are separated by an apostrophe rather than a space. Here is the complete code of my VNRecognizeTextRequest handler:

    let request = VNRecognizeTextRequest { request, error in
        guard let observations = request.results as? [VNRecognizedTextObservation] else { return }

        // split recognized text into words and store each word with corresponding observation
        let wordObservations = observations.flatMap { observation in
            observation.topCandidates(1).first?.string.unicodeScalars
                .split(whereSeparator: { CharacterSet.letters.inverted.contains($0) })
                .map { (observation, $0) } ?? []
        }

        // store recognized words as strings
        recognizedWords = wordObservations.map { (observation, word) in String(word) }

        // calculate bounding box for each word
        recognizedWordRects = wordObservations.map { (observation, word) in
            guard let candidate = observation.topCandidates(1).first else { return .zero }
            let stringRange = word.startIndex..<word.endIndex
            guard let rect = try? candidate.boundingBox(for: stringRange)?.boundingBox else { return .zero }
            let bottomLeftOriginRect = VNImageRectForNormalizedRect(rect, Int(captureRect.width), Int(captureRect.height))

            // adjust coordinate system to start in top left corner
            let topLeftOriginRect = CGRect(
                origin: CGPoint(x: bottomLeftOriginRect.minX,
                                y: captureRect.height - bottomLeftOriginRect.height - bottomLeftOriginRect.minY),
                size: bottomLeftOriginRect.size)

            print("BoundingBox for word '\(String(word))': \(topLeftOriginRect)")
            return topLeftOriginRect
        }
    }
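To check the splitting step on its own, here is a minimal sketch that runs without Vision. The helper `wordRanges(in:)` and the sample string are hypothetical and made up for illustration (the sample assumes the recognized line reads "In un'intervista del"); only the split logic is copied from my handler.

```swift
import Foundation

// Hypothetical helper (not part of Vision): splits a line the same way as the
// handler and returns each word together with its index range in the full string.
func wordRanges(in line: String) -> [(word: String, range: Range<String.Index>)] {
    // Everything that is not a letter (spaces, apostrophes, punctuation) separates words.
    let separators = CharacterSet.letters.inverted
    return line.unicodeScalars
        .split(whereSeparator: { separators.contains($0) })
        .map { (word: String($0), range: $0.startIndex..<$0.endIndex) }
}

// Assumed sample text; the real string comes from topCandidates(1).first?.string.
let line = "In un'intervista del"
let scalars = line.unicodeScalars

for (word, range) in wordRanges(in: line) {
    // Convert the indices to plain scalar offsets so the ranges are easy to compare.
    let start = scalars.distance(from: scalars.startIndex, to: range.lowerBound)
    let end = scalars.distance(from: scalars.startIndex, to: range.upperBound)
    print("\(word): scalar offsets \(start)..<\(end)")
}
```

If this prints distinct ranges for 'un' and 'intervista', the split itself hands separate ranges to `boundingBox(for:)`, so the identical rectangles would have to come from a later step.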
When I process the following image, the request handler above produces the following output:

    BoundingBox for word 'In': (23.00069557577264, 5.718113962610181, 45.89460636656961, 32.78087073878238)
    BoundingBox for word 'un': (71.19064286904202, 6.289275587192936, 189.16024359557852, 34.392966621800475)
    BoundingBox for word 'intervista': (71.19064286904202, 6.289275587192936, 189.16024359557852, 34.392966621800475)
    BoundingBox for word 'del': (262.64622870703477, 8.558512219726875, 54.733978711037985, 32.79967358237818)

Notice how the bounding boxes of the words 'un' and 'intervista' are exactly the same. This happens consistently for words that are separated by an apostrophe. Why is that?

Thank you for any help,
Elias
Dec ’23