Correction: the model I created has a complexity of 1.2 GMAC per inference. That works out to about 30 GMAC/s of runtime throughput, still only 1/10 of the advertised figure.
Now I am testing a bigger, 193 GMAC model.
With it I reached 148 GMAC/s, which is just about half of the published figure (600 GOP/s = 300 GMAC/s).
I wonder whether that is because only one of the cores is used, or because higher precision is used (16-bit instead of 8-bit).
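For anyone who wants to reproduce the numbers, this is roughly how I time it; a minimal sketch, assuming you load a compiled model yourself and already have a matching MLFeatureProvider input (the names and the GMAC figure are placeholders):

```swift
import CoreML
import Foundation

/// Rough throughput measurement: run `runs` predictions back to back
/// and convert the per-inference MAC count into GMAC/s.
/// `gmacsPerInference` (e.g. 193 for my test net) comes from the
/// model architecture, not from Core ML.
func measureThroughput(model: MLModel, input: MLFeatureProvider,
                       gmacsPerInference: Double, runs: Int = 100) throws -> Double {
    let start = CFAbsoluteTimeGetCurrent()
    for _ in 0..<runs {
        _ = try model.prediction(from: input)
    }
    let elapsed = CFAbsoluteTimeGetCurrent() - start
    return gmacsPerInference * Double(runs) / elapsed
}

// Usage (modelURL/input stand in for your own model):
// let model = try MLModel(contentsOf: modelURL)
// let gmacs = try measureThroughput(model: model, input: input,
//                                   gmacsPerInference: 193.0)
// print("\(gmacs) GMAC/s")
```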
So to sum it up:
The iPhone 8 can do about 150 GMAC/s at peak.
Over longer runs the performance drops to about 100 GMAC/s, I guess due to thermal throttling.
I have been running my inference continuously for an hour and the battery dropped to 50%.
From that I calculated roughly 3 W of power consumption for the 100 GMAC/s. That's about 30 GMAC/s per watt.
And the precision is FP16.
(For comparison, I measured the same Caffe model on my Nvidia 960M. It gives 600 GMAC/s for 60 W, i.e. 10 GMAC/s per watt.)
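The 3 W number is only a back-of-the-envelope estimate from the battery drain; here is the arithmetic, assuming the iPhone 8's nominal 1821 mAh / 3.8 V battery (a spec-sheet figure, not something I measured, and it ignores screen and system overhead):

```swift
// Power estimate from battery drain. The battery capacity is an
// assumption taken from the iPhone 8 spec sheet, not a measurement.
let batteryWh = 1.821 * 3.8             // ≈ 6.9 Wh nominal capacity
let drainedFraction = 0.5               // battery fell to 50%...
let hours = 1.0                         // ...over one hour of inference

let watts = batteryWh * drainedFraction / hours   // ≈ 3.5 W, rounded to ~3 W
let efficiency = 100.0 / watts                    // sustained 100 GMAC/s
print("≈ \(efficiency) GMAC/s per watt")          // ≈ 30 GMAC/s/W
```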
That's hardly a neural accelerator. It looks like plain GPU inference to me.
I don't believe that CoreML utilizes the A11 Neural Engine.
I suspect this chip is quite locked down because of its use in FaceID.
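One thing worth trying (a sketch, not an official ANE switch): compare default execution against CPU-only via MLPredictionOptions.usesCPUOnly. As far as I know, Core ML on iOS 11 only lets you force the CPU; there is no public API to choose between the GPU and the Neural Engine, which is part of why I'm suspicious.

```swift
import CoreML
import Foundation

/// Time `runs` predictions with or without the CPU-only flag.
/// This can rule the CPU in or out, but it cannot distinguish
/// the GPU from the Neural Engine.
func timedRun(model: MLModel, input: MLFeatureProvider,
              cpuOnly: Bool, runs: Int = 50) throws -> Double {
    let options = MLPredictionOptions()
    options.usesCPUOnly = cpuOnly
    let start = CFAbsoluteTimeGetCurrent()
    for _ in 0..<runs {
        _ = try model.prediction(from: input, options: options)
    }
    return CFAbsoluteTimeGetCurrent() - start
}

// Usage (model/input are placeholders):
// let cpuTime   = try timedRun(model: model, input: input, cpuOnly: true)
// let accelTime = try timedRun(model: model, input: input, cpuOnly: false)
// If accelTime is only GPU-fast, nothing suggests the ANE is involved.
```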
Have there been any updates regarding this?
It seems like a waste not to let Apple developers access this power in our own CoreML apps.