Can we use the Vision API + NLP to read text from an image and categorize it, like the Cloud Vision API?

Can we use the Vision API to extract text from an image and then use that text to categorize it? For example, to get important information from a scanned document.

Replies

Nope. Vision only gives you the rectangles that contain text; it does not have an API to convert those image regions to text.

Actually, it's possible by adding a Core ML model, such as one trained on MNIST (handwritten digits).

So, with Vision, you detect the bounding boxes, then you extract the portion of the image inside each bounding box and pass it to the MNIST model.
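
To make the steps above concrete, here is a rough sketch rather than a finished implementation. It assumes you already have a CGImage and a loaded MLModel that behaves as a classifier (for example one trained on MNIST). The helper names pixelRect, characterCrops and classify are made up for illustration. Keep in mind that Vision reports bounding boxes in normalized coordinates with the origin at the bottom-left, while CGImage cropping uses pixels with the origin at the top-left, so the box has to be flipped vertically before cropping.

```swift
import CoreGraphics
import CoreML
import Vision

/// Hypothetical helper: converts a Vision bounding box (normalized 0...1,
/// origin at bottom-left) into a pixel rectangle with a top-left origin,
/// which is what CGImage.cropping(to:) expects.
func pixelRect(forNormalized box: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)
    return CGRect(x: box.minX * w,
                  y: (1 - box.maxY) * h,   // flip the vertical axis
                  width: box.width * w,
                  height: box.height * h)
}

/// Hypothetical helper: detects text regions and returns one cropped image
/// per character box that Vision reports.
func characterCrops(in image: CGImage) throws -> [CGImage] {
    let request = VNDetectTextRectanglesRequest()
    request.reportCharacterBoxes = true   // ask for per-character boxes too

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    let observations = (request.results as? [VNTextObservation]) ?? []
    var crops: [CGImage] = []
    for observation in observations {
        for box in observation.characterBoxes ?? [] {
            let rect = pixelRect(forNormalized: box.boundingBox,
                                 imageWidth: image.width,
                                 imageHeight: image.height)
            if let crop = image.cropping(to: rect) {
                crops.append(crop)
            }
        }
    }
    return crops
}

/// Hypothetical helper: runs one crop through a Core ML classifier and
/// returns the top label, if any. Assumes the model outputs class labels.
func classify(_ crop: CGImage, with model: MLModel) throws -> String? {
    let visionModel = try VNCoreMLModel(for: model)
    let request = VNCoreMLRequest(model: visionModel)
    request.imageCropAndScaleOption = .scaleFill   // Vision resizes to the model's input size

    try VNImageRequestHandler(cgImage: crop, options: [:]).perform([request])
    let results = request.results as? [VNClassificationObservation]
    return results?.first?.identifier
}
```

You would call characterCrops(in:) once per image and then classify(_:with:) on each crop. An MNIST-trained model only knows the digits 0 to 9, and usually expects light strokes on a dark background, so the crops may need inverting or other preprocessing before the results are usable.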


One precompiled model for CoreML is here:

https://github.com/ph1ps/MNIST-CoreML

Take a look at my blog:


neurosurg dot de


It also covers training and includes samples.