Training LSTM: Low Accuracy on M1 GPU vs CPU

Question

radagast OP

Created Nov ’21

Summary:

I have noticed low test accuracy during and after training TensorFlow LSTM models on M1-based Macs with tensorflow-metal/GPU. I have seen this with more than one model, but to reproduce the problem I chose a public GitHub repo: a community educational example based on the Sequence Models course of the Deep Learning Specialization (Andrew Ng).

Steps to Reproduce:

git clone https://github.com/omerbsezer/LSTM_RNN_Tutorials_with_Demo.git
cd LSTM_RNN_Tutorials_with_Demo/SentimentAnalysisProject
python main.py

Results

Test accuracy (CPU only, without tensorflow-metal): ~83%
Test accuracy (GPU using tensorflow-metal): ~37%

The same gap shows up in the per-epoch training metrics (accuracy, loss, and so on).
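
For anyone who wants the CPU-only number without uninstalling tensorflow-metal, one option is to hide the GPU from TensorFlow at startup. This is a minimal sketch, not what was used for the figures above. It assumes the snippet runs at the very top of the script (for example main.py), before any tensor or model is created.

```python
import tensorflow as tf

# Hide every GPU from TensorFlow so that all ops fall back to the CPU.
# This has to run before TensorFlow initializes its devices, i.e. before
# any tensor, layer, or model is created; otherwise it raises RuntimeError.
tf.config.set_visible_devices([], "GPU")

# Sanity check: no GPU should be visible to this process any more.
visible_gpus = tf.config.get_visible_devices("GPU")
print("Visible GPUs:", visible_gpus)
```

Running the same training script with and without these lines gives a CPU vs GPU comparison on an otherwise identical environment.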

System Details

Model: MacBook Pro (16-inch, 2021)
Chip: Apple M1 Max
Memory: 64GB
OS: macOS 12.0.1
Key Libraries: tensorflow-metal (0.2), tensorflow-macos (2.6.0), python (3.9.7)

Answer 1

thegodone

Feb ’23

Hello,

I see exactly the same thing. On recent versions (TF 2.7, 2.8, 2.9, 2.10 and 2.11), the CPU version generally works without errors from the start.

When there are no errors on the GPU, I generally see poor accuracy from my models, even though the same Python code produces good models on the CPU.

This is not an isolated bug but a real trend.

Any idea how to track down the root cause and get a fix?
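
One possible way to narrow this down is to take training out of the picture and compare a single LSTM forward pass on both devices using identical weights and inputs. The sketch below is only an assumption about how such a check could look: the function name `max_abs_diff_cpu_vs_gpu` and the tensor shapes are made up for illustration, and it returns None when no GPU is visible.

```python
import numpy as np
import tensorflow as tf


def max_abs_diff_cpu_vs_gpu(batch=8, timesteps=20, features=16, units=32, seed=0):
    """Run one LSTM layer on CPU and GPU with the same weights and input.

    Returns the largest absolute difference between the two outputs,
    or None if TensorFlow does not see a GPU. (Hypothetical helper.)
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, timesteps, features)).astype("float32")

    # The first call builds the layer's weights and computes the CPU result.
    with tf.device("/CPU:0"):
        layer = tf.keras.layers.LSTM(units)
        cpu_out = layer(x).numpy()

    if not tf.config.list_physical_devices("GPU"):
        return None

    # Same layer object, so the same weights, now executed on the GPU.
    with tf.device("/GPU:0"):
        gpu_out = layer(x).numpy()

    return float(np.max(np.abs(cpu_out - gpu_out)))


if __name__ == "__main__":
    diff = max_abs_diff_cpu_vs_gpu()
    if diff is None:
        print("No GPU visible; nothing to compare.")
    else:
        print("Max |CPU - GPU| for one LSTM forward pass:", diff)
```

Tiny differences are expected from float32 rounding. A large difference would suggest the GPU kernel itself diverges, while a small one would point toward something in the training loop instead (optimizer, gradients, or data handling).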

Answer 2

mr.aseeri

Mar ’23

Yes, I can confirm this observation on my M1 Max. Training with TF 2.0 on NVIDIA cards gives far better accuracy than training on the Apple Silicon GPU. I suspect this is due to the Apple Neural Engine, which isn't yet optimized for TF processing.
