How to classify different foods with CoreML machine learning

Hi,


I was wondering if there was a way to take a user-supplied image of a food and have machine learning classify it. For instance, if I gave it a picture of a banana it would return "banana" or something like that using CoreML. I haven't gotten into any of this stuff yet, but I would like to. Does anyone know if there is a simple model for this, or if there is a way I could create my own and train it with images, and how to get this into an Xcode app using Swift?


Thanks in advance. Any help is appreciated.

Replies

I would suggest starting with a “Hotdog or Not Hotdog” classifier and working your way up from there. Just kidding. Yes, this is definitely possible. However, there is a huge variety of food out there, so you'd have a hard time covering all of it. But if you limit it to, say, the 100 most common types of food, that should be relatively simple. I would suggest taking a course in machine learning, such as Andrew Ng's courses; I believe they're free.

I'm building a model to do exactly this. Existing models like Inception and VGG only work for certain subsets of food. If you want all popular foods, you have to roll your own. I have a convolutional neural network in Keras working with a few food items now.


This is a good place to start: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
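For reference, here's a minimal sketch of the kind of small Keras convnet that tutorial builds up. The layer sizes, the 150×150 input, and the class count of 101 are placeholder assumptions for illustration, not values from this thread:

```python
# Sketch of a small Keras image classifier for food photos.
# Input size (150x150 RGB) and class count (101) are placeholders.
from tensorflow.keras import layers, models

NUM_CLASSES = 101  # e.g. Food101-style label set

model = models.Sequential([
    layers.Input(shape=(150, 150, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    # One probability per food class; this layer's size must match
    # the number of class labels you give the CoreML converter later.
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

From there you'd train with `model.fit` on your food images, following the data-augmentation approach in that blog post.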

Working on that now.


It's quite hard because of the number of foods and how similar they can look (there are many varieties of pasta in red sauce, unless you'd be happy with just the word "pasta").


Now, if you just need to recognize simple foods (like fruits, cake, etc.), there are a few models out there:

- A simple model is Food101; there is even a precompiled version for CoreML:

https://github.com/ph1ps/Food101-CoreML


- One of the best is UECFood256; you can find the dataset here:

http://foodcam.mobi/dataset256.html

This one recognizes Asian food especially well, and can even differentiate between different types of noodles, for example. The bummer is that there is no pre-trained model that I can find (if anyone has one, feel free to share), so you'll have to train your own.


Some limitations:

- Most of the models recognize one food at a time (so if you have 5 things on your plate, you'd need 5 photos, one of each item, or find a way to locate the food items and cut out their bounding boxes, maybe by asking the user to select areas of the image)

- UECFood256 can solve the problem of detecting multiple foods at the same time, but you'll have a more complex model to work with.

- Most of the models can't count items (like "3 bananas" or ["detected" => {"banana", "banana", "banana"}]) nor estimate quantity (like "small banana", "large serving", etc.)

- When confidence levels are close, it can be hard to know whether the model detected multiple foods or detected the same food but is unsure.
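To make that last point concrete, here's one possible heuristic (a pure-Python sketch; the function name, labels, and the 0.15 margin are all invented for illustration) for flagging a prediction as ambiguous when the top two confidences are close:

```python
# Sketch: flag a classification as ambiguous when the top two class
# confidences are within a margin of each other. Threshold and label
# names are made up for illustration.
def top_prediction(probs, margin=0.15):
    """probs: dict of label -> confidence. Returns (label, is_ambiguous)."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_p = ranked[0]
    # Ambiguous if the runner-up is close: could be two foods in frame,
    # or one food the model is simply unsure about.
    ambiguous = len(ranked) > 1 and (best_p - ranked[1][1]) < margin
    return best_label, ambiguous

print(top_prediction({"banana": 0.48, "plantain": 0.45, "mango": 0.07}))
# ('banana', True)  -- scores are close, needs a second look
print(top_prediction({"banana": 0.90, "mango": 0.10}))
# ('banana', False)
```

In an app you might respond to the ambiguous case by showing the user the top few candidates instead of a single answer.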

I used this as a starting point to build up my own model with Keras. Unfortunately I'm getting a CoreML error: The size of the output layer 'classLabelProbs' in the neural network does not match the number of classes in the classifier.


Has anyone had success converting this to CoreML?
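That error usually means the list of class labels handed to the converter has a different length than the network's final softmax layer. A quick sanity check before converting can surface the mismatch with a clearer message (pure-Python sketch; the function name and the 101-label example are placeholders, not from the thread):

```python
# Sketch: the "classLabelProbs" size error typically means the
# class-labels list passed to the CoreML converter doesn't match the
# size of the model's final output layer. Check before converting.
def check_labels(class_labels, output_size):
    """Raise early, with a clearer message than the converter's."""
    if len(class_labels) != output_size:
        raise ValueError(
            f"Got {len(class_labels)} class labels but the output layer "
            f"has {output_size} units; they must match."
        )

# Placeholder example: a 101-unit softmax needs exactly 101 labels.
labels = [f"food_{i}" for i in range(101)]
check_labels(labels, 101)    # passes silently
# check_labels(labels, 100)  # would raise ValueError
```

Common causes of the mismatch are a stale labels file from an earlier training run, or a label set that includes a background/extra class the final Dense layer doesn't have.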

  • thank you - this is bananas helpful
