Can we use the Vision API + NLP to read text from an image and categorize it, like the Cloud Vision API?

Question

Can we use the Vision API to extract text from an image and then use that text to categorize it? For example, to get important information from a scanned document.


Posted by

LeelaRamala


Answer 1

Nope. Vision only gives you the rectangles that contain text; it does not have an API to convert these image regions to text.
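
For reference, the rectangle detection described above might look roughly like this. It is only a minimal sketch, assuming Vision's `VNDetectTextRectanglesRequest`; the function name `detectTextRectangles` is made up for illustration and error handling is left to the caller.

```swift
import Vision
import CoreGraphics

/// Returns the bounding boxes of text regions Vision finds in `image`.
/// The boxes are normalized (0...1) with the origin at the bottom-left.
/// `detectTextRectangles` is an illustrative name, not part of Vision.
func detectTextRectangles(in image: CGImage) throws -> [CGRect] {
    let request = VNDetectTextRectanglesRequest()
    // Also ask for per-character boxes; they are exposed on each
    // observation as `characterBoxes`.
    request.reportCharacterBoxes = true

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // The request only reports where text is, not what it says.
    let observations = request.results as? [VNTextObservation] ?? []
    return observations.map { $0.boundingBox }
}
```

Each returned rectangle only marks where text sits in the image; nothing in the result contains the characters themselves.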

Posted by

kerfuffle

Answer 2

Actually, it's possible by adding a Core ML model, such as one trained on MNIST.

So, with Vision, you detect the bounding box, then you extract the portion of the image inside the bounding box and give it to the MNIST model.
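
A rough sketch of the crop-and-classify step, under a few assumptions: `model` is any Core ML image classifier you have already loaded (an MNIST-trained one would only recognize the digits 0 to 9), `box` is a normalized bounding box from a Vision text observation, and the helper names `pixelRect` and `classifyCharacter` are made up for illustration.

```swift
import Vision
import CoreML
import CoreGraphics

/// Converts a Vision bounding box (normalized, origin at bottom-left)
/// into a pixel rectangle with a top-left origin, as CGImage cropping expects.
func pixelRect(forNormalized box: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)
    return CGRect(x: box.minX * w,
                  y: (1 - box.maxY) * h,   // flip the vertical axis
                  width: box.width * w,
                  height: box.height * h)
}

/// Crops `box` out of `image` and runs it through `model`.
/// Returns the top classification label, or nil if the crop fails
/// or the model does not produce classification results.
func classifyCharacter(in image: CGImage, box: CGRect, model: MLModel) throws -> String? {
    let rect = pixelRect(forNormalized: box,
                         imageWidth: image.width,
                         imageHeight: image.height)
    guard let crop = image.cropping(to: rect.integral) else { return nil }

    let request = VNCoreMLRequest(model: try VNCoreMLModel(for: model))
    // Let Vision scale the crop to the model's input size.
    request.imageCropAndScaleOption = .scaleFit
    try VNImageRequestHandler(cgImage: crop, options: [:]).perform([request])

    let results = request.results as? [VNClassificationObservation]
    return results?.first?.identifier
}
```

You would call `classifyCharacter` once per character box and join the labels into a string. Depending on how the model was trained, the crop may also need preprocessing first (for example, MNIST-style models usually expect a light digit on a dark background).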

One precompiled model for CoreML is here:

Posted by

EinharchAltPlus

Answer 3

Take a look at my blog:

neurosurg dot de

It also includes training and samples.

Posted by

DrNeurosurg
