How do CoreML and Vision classify an image?

Question

Hello! I am new at Machine Learning and Vision, and since WWDC I have been interested in the image classification Apple showed off as well as CreateML. I immediately created a very simple ML model that classified between a hamburger, fries, or a cup of soda, and worked well. I never looked back into it, as I could not think of a use for it. Recently, I have been playing again with it, but I have not had much success with classifying objects.

My first attempt was classifying a traffic light to determine if it is red, yellow, or green, but it was not able to successfully classify images of other traffic lights. My second attempt was classifying 3 hand gestures: holding 1 finger up, 2 fingers up, and 3 fingers up. I had pictures of the hand from 5 different angles (top, front, back, right, left), and again, it was not able to classify them correctly.

After this, I kept wondering if CoreML classifies an image by color, shape, size, or all and if there is a way to only classify by only one, such as color in the traffic light example, or shape as in the hand gesture example. I was using CreateML as specified in this documentation: https://developer.apple.com/documentation/vision/training_a_create_ml_model_to_classify_flowers

Any help or suggestions on how that could be achieved would be highly appreciated. Thanks 🙂

Core ML

1.2k

Posted by

FrankFabregat

Reply

Add a Comment

Answer 1

Create ML uses a pre-trained feature extractor called Vision FeaturePrint. It takes in an image and returns a vector of 2048 numbers -- the features -- that say something interesting about that image. Exactly what these features are is unknown, but they contain information about all kinds of things you can expect to see in photos of everyday objects. Create ML then trains a classifier on top of this that maps these features to your classes.

It's important that you have enough training data, otherwise Create ML cannot learn to make this mapping between the features and your classes. You should enable data augmentation while training, which help with this.

It's also possible that Vision FeaturePrint isn't suitable for your data. In that case you should use a tool like Keras to train the model yourself.

By the way, I wrote a book about this (with my two co-authors). It's called Machine Learning by Tutorials and is sold by Ray Wenderlich. I won't link directly to it, because advertisements on forums are usually frowned upon, but I do believe the book answers your questions in more detail.

Posted by

kerfuffle

Add a Comment

Answer 2

Look at this which deals with colors in CoreML (I have not experienced it)

h ttps://www.raywenderlich.com/188-beginning-machine-learning-with-keras-core-ml

See this thread as well

https://forums.developer.apple.com/thread/94324

Posted by

Claude31

Add a Comment

Answer 3

Create ML uses a pre-trained feature extractor called Vision FeaturePrint. It takes in an image and returns a vector of 2048 numbers -- the features -- that say something interesting about that image. Exactly what these features are is unknown, but they contain information about all kinds of things you can expect to see in photos of everyday objects. Create ML then trains a classifier on top of this that maps these features to your classes.

It's important that you have enough training data, otherwise Create ML cannot learn to make this mapping between the features and your classes. You should enable data augmentation while training, which help with this.

It's also possible that Vision FeaturePrint isn't suitable for your data. In that case you should use a tool like Keras to train the model yourself.

By the way, I wrote a book about this (with my two co-authors). It's called Machine Learning by Tutorials and is sold by Ray Wenderlich. I won't link directly to it, because advertisements on forums are usually frowned upon, but I do believe the book answers your questions in more detail.

Posted by

kerfuffle

Add a Comment

How do CoreML and Vision classify an image?

Accepted Reply

Replies