freedmand’s Profile | Apple Developer Forums

VisionKit - get bounding boxes from ImageAnalysis

I am developing a command-line application to extract text from images and PDF files. The ImageAnalysis class from VisionKit provides high-quality OCR but does not appear to expose the position of the extracted text (words, etc.). This functionality appears to exist in a private, unexposed API, since ImageAnalysisOverlayView is able to use it to show the Live Text interface. Is there any way to get this information in a terminal application with no displayed UI? (Note: I filed a feedback request for this over 3 months ago and have yet to hear back.)
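For reference, here is a minimal sketch of one possible workaround. It does not use VisionKit's ImageAnalysis at all: it swaps in the separate Vision framework, whose VNRecognizeTextRequest runs without any UI and returns a normalized bounding box for each recognized line. Whether its OCR quality matches ImageAnalysis is not established here. The `RecognizedLine` type and the `pixelRect` and `recognizeText` helpers are hypothetical names introduced for illustration, and the sketch assumes a macOS 12 or later SDK, where `results` is typed as `[VNRecognizedTextObservation]?`.

```swift
import Foundation
import CoreGraphics
import ImageIO
import Vision

// Hypothetical container for one recognized line and its pixel-space box.
struct RecognizedLine {
    let text: String
    let box: CGRect
}

// Vision reports boxes normalized to 0...1 with a bottom-left origin.
// This converts one to pixel coordinates with a top-left origin.
func pixelRect(fromNormalized rect: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)
    return CGRect(x: rect.minX * w,
                  y: (1 - rect.maxY) * h,
                  width: rect.width * w,
                  height: rect.height * h)
}

// Runs text recognition on the image at `url`, with no UI involved.
func recognizeText(at url: URL) throws -> [RecognizedLine] {
    guard let source = CGImageSourceCreateWithURL(url as CFURL, nil),
          let image = CGImageSourceCreateImageAtIndex(source, 0, nil) else {
        throw CocoaError(.fileReadCorruptFile)
    }

    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = true

    // perform(_:) is synchronous; results are available once it returns.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Assumes the macOS 12+ SDK typing of `results`.
    let observations = request.results ?? []
    return observations.compactMap { observation -> RecognizedLine? in
        guard let candidate = observation.topCandidates(1).first else { return nil }
        let box = pixelRect(fromNormalized: observation.boundingBox,
                            imageWidth: image.width,
                            imageHeight: image.height)
        return RecognizedLine(text: candidate.string, box: box)
    }
}
```

Each observation corresponds to a line of text. For finer-grained boxes, VNRecognizedText also offers `boundingBox(for:)`, which takes a range of the candidate string and returns a rectangle observation for that span, so word-level positions could be derived by splitting the string and querying each range.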

Machine Learning & AI General VisionKit

864

Dec ’22

freedmand
