Training LSTM: Low Accuracy on M1 GPU vs CPU

Summary:

I have noticed low test accuracy during and after training TensorFlow LSTM models on M1-based Macs when the GPU is used via tensorflow-metal. Although I have observed this across more than one model, I chose a public GitHub repo to reproduce the problem: a community/educational example based on the Sequential Models course of the Deep Learning Specialization (Andrew Ng).

Steps to Reproduce:

  1. git clone https://github.com/omerbsezer/LSTM_RNN_Tutorials_with_Demo.git

  2. cd LSTM_RNN_Tutorials_with_Demo/SentimentAnalysisProject

  3. python main.py

  4. Results

Test accuracy (CPU only, without tensorflow-metal): ~83%
Test accuracy (GPU, with tensorflow-metal): ~37%

A similar gap can be observed in the per-epoch metrics (accuracy, loss, etc.) during training.
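
For comparison without uninstalling tensorflow-metal, the CPU-only baseline above can also be reproduced by hiding the GPU device from TensorFlow before any model or data pipeline is created. This is a minimal sketch, assuming it is placed at the very top of the repo's main.py (that placement is my assumption, not part of the original report):

import tensorflow as tf

# Hide all GPU devices so the rest of the script falls back to the CPU.
# Must run before any tensors, datasets, or models are created.
tf.config.set_visible_devices([], "GPU")
print("Visible GPUs:", tf.config.get_visible_devices("GPU"))  # expected: []

Running the unmodified script exercises the tensorflow-metal GPU path; running it with this snippet at the top gives the CPU-only number.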

System Details

  • Model: MacBook Pro (16-inch, 2021)
  • Chip: Apple M1 Max
  • Memory: 64 GB
  • OS: macOS 12.0.1
  • Key libraries: tensorflow-metal (0.2), tensorflow-macos (2.6.0), Python (3.9.7)

Hello,

I see exactly the same thing. On recent versions (TF 2.7 / 2.8 / 2.9 / 2.10 and 2.11), it is generally the CPU version that runs without errors.

Even when there are no errors on the GPU, I generally see poor accuracy from my models, while the same Python code produces good models on the CPU.

This is not an isolated bug but a real trend.

Any idea how to track down the root cause and get a fix?
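
One way to start narrowing this down is to confirm where the ops are actually placed. Below is a hedged sketch using device-placement logging; the tiny model and random data are placeholders of my own, not the code from the repo above:

import numpy as np
import tensorflow as tf

# Print the device (CPU vs GPU:0) each op is placed on.
tf.debugging.set_log_device_placement(True)

# Tiny stand-in model, only meant to trigger placement logging for the LSTM kernels.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=32),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# One training step on random data is enough to see where the recurrent ops land.
x = np.random.randint(0, 1000, size=(8, 20))
y = np.random.randint(0, 2, size=(8, 1))
model.fit(x, y, epochs=1, verbose=0)

If the LSTM ops show up on GPU:0 and accuracy still collapses, that would point at the Metal kernels themselves rather than at device placement.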

Yes, I can confirm this observation on my M1 Max. Regarding the accuracy of results, training with TF 2.0 on Nvidia cards is far better than on the Apple Silicon GPU. I suspect it is related to the Apple Neural Engine, which isn't yet optimized for TensorFlow processing.
