Does CoreML use the Neural Engine?

Hi,


I have an iPhone 8, model no. MQ6H2GH/A.


This phone has an A11 bionic chip. I created a toy caffe model with just two convolutional layers.

I calculated that this net needs ~150 million MACs per inference.

The Neural Engine is supposed to have a capacity of 600 billion operations per second (or half that in MAC/s, if one MAC counts as 2 operations).

However, my toy network takes 40–50 ms per inference! That implies only 3–4 GMAC/s of effective throughput, way below the advertised figure.

I checked, and the usesCPUOnly option is set to false.

How can I tell whether CoreML is using the built-in Neural Engine?
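One way to probe this (a sketch, not an official diagnostic): on iOS 12 and later you can restrict which compute units CoreML may use via MLModelConfiguration, and compare latencies. `MyToyModel` below is a placeholder for whatever class Xcode generates from your .mlmodel; on iOS 11 this API does not exist and only `usesCPUOnly` on MLPredictionOptions is available.

```swift
import CoreML

// Sketch (iOS 12+): time predictions under different compute-unit settings.
// `MyToyModel` stands in for the class Xcode generated from your .mlmodel.
let config = MLModelConfiguration()
config.computeUnits = .all         // CoreML may pick CPU, GPU, or Neural Engine
// config.computeUnits = .cpuAndGPU  // excludes the Neural Engine
// config.computeUnits = .cpuOnly    // CPU-only baseline

// let model = try MyToyModel(configuration: config)
// Run and time the same prediction under .all and .cpuAndGPU.
// If both settings give identical latency, the Neural Engine is
// probably not being used for this model.
```

This only gives indirect evidence (identical timings), since CoreML does not report which device actually ran the model.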


For reference, the same model does an 18 ms forward pass on an Nvidia 960M GPU.

Regards


Gabor

Replies

Correction: the model I created actually has 1.2 GMAC complexity per inference. That implies ~30 GMAC/s of throughput, which is still 1/10 of the advertised figure.
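The throughput estimate here is just complexity divided by latency; a quick sanity check of the arithmetic, using the numbers from this thread:

```python
def throughput_gmacs(macs_per_inference, latency_s):
    """Effective throughput in GMAC/s for a single inference."""
    return macs_per_inference / latency_s / 1e9

# Toy model: 1.2 GMAC per inference at ~40 ms per forward pass
phone = throughput_gmacs(1.2e9, 0.040)
print(phone)            # -> 30.0 GMAC/s

# Advertised: 600 GOPS, i.e. 300 GMAC/s if 1 MAC = 2 ops
print(phone / 300.0)    # -> 0.1, i.e. one tenth of the advertised rate
```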

Now I am doing a 193 GMAC model.

Now, with the bigger model, I reached 148 GMAC/s, which is just about half of the published figure (600 GOPS = 300 GMAC/s).

I wonder if that is because only one of the cores is used, or because higher precision is used (FP16 instead of INT8).

So to sum it up:

The iPhone 8 tops out at about 150 GMAC/s.

Over longer runs the performance drops to ~100 GMAC/s, presumably due to thermal throttling.

I ran my inference continuously for an hour, and the battery dropped to 50%.

From that I estimate ~3 W of power consumption for the 100 GMAC/s. That's ~33 GMAC/s per watt.

And the precision is FP16.

(For comparison, I measured the same Caffe model on my Nvidia 960M: 600 GMAC/s for 60 W, i.e. 10 GMAC/s per watt.)
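The power figure can be reproduced from the battery drain. The 6.96 Wh battery capacity below is my own assumption for the iPhone 8 (roughly 1821 mAh at 3.82 V); everything else comes from the measurements above:

```python
# Assumed iPhone 8 battery: 1821 mAh * 3.82 V (my assumption, not measured)
battery_wh = 1.821 * 3.82            # ~6.96 Wh total capacity

# 50% of the battery drained over 1 hour of continuous inference
drain_w = battery_wh * 0.5 / 1.0     # ~3.5 W average draw

efficiency_phone = 100 / drain_w     # GMAC/s per watt on the phone
efficiency_960m = 600 / 60           # GMAC/s per watt on the 960M

print(round(drain_w, 2))             # ~3.48 W, consistent with the ~3 W estimate
print(round(efficiency_phone, 1))    # ~28.8 GMAC/s per watt
print(efficiency_960m)               # 10.0
```

So even with the rough battery assumption, the phone comes out roughly 3x more efficient per watt than the 960M, which matches the comparison above.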

That's hardly a neural accelerator; it seems like GPU inference to me.

I don't believe that CoreML utilizes the A11 Neural Engine.


I have a suspicion that this chip is quite locked down due to its use in Face ID.

Has there been any updates regarding this?

Seems like a waste not to let Apple Developers access this power with our own CoreML apps.