CRNN training slower on GPU than on CPU
I was training the CRNN model described at https://keras.io/examples/vision/handwriting_recognition/ in TensorFlow 2.8, with and without tensorflow-metal 0.4. The model has 424,081 trainable parameters. Regardless of batch size, GPU training is always much slower than CPU training, as the graph below shows. Surprisingly, GPU training gets even slower at larger batch sizes. How can I make GPU training faster than CPU training?

System: M1 Max, 64 GB RAM, macOS 12.2.1.

P.S. There were differences in the loss trajectory between CPU and Metal in tensorflow-metal versions prior to 0.4; I am happy to report that this has been resolved, as the graph below also shows.
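To separate the model from the hardware question, a minimal sketch of a CPU-vs-GPU throughput probe (assuming tensorflow-macos and tensorflow-metal are installed as above; the matrix size and iteration count are arbitrary choices, not from the original post):

```python
import time
import tensorflow as tf

def time_matmul(device, n=2048, iters=10):
    """Rough per-step time of a dense matmul on the given device."""
    with tf.device(device):
        a = tf.random.normal((n, n))
        b = tf.random.normal((n, n))
        _ = tf.matmul(a, b)  # warm-up: exclude kernel setup from timing
        start = time.perf_counter()
        for _ in range(iters):
            c = tf.matmul(a, b)
        _ = c.numpy()  # block until execution has actually finished
        return (time.perf_counter() - start) / iters

print("CPU:", time_matmul("/CPU:0"))
if tf.config.list_physical_devices("GPU"):
    print("GPU:", time_matmul("/GPU:0"))
```

If the GPU is slower even on a large matmul, the problem is the device or plugin; if it is faster here but slower on the CRNN, the per-batch dispatch overhead of many small ops is the more likely culprit.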
0 replies · 1 boost · 748 views · Feb ’22
Wrong results when using tensorflow-metal
After installing tensorflow-metal, the loss does not decrease the way it does in CPU-only training or when running the same training in an Nvidia CUDA environment. The results when training on the M1 Max with Metal are completely useless. I get good results again after uninstalling tensorflow-metal via pip uninstall tensorflow-metal and leaving everything else unchanged, but then training is slow and the fans seem louder than during GPU training.

[Graphs: without tensorflow-metal vs. with tensorflow-metal; the x-axis in both is the epoch.]

OS is macOS 12.0.1. TensorFlow version is 2.6, since 2.7 crashes. tensorflow-metal version is 0.3, but the behaviour was the same with 0.2. The network is an RCNN; I am following the recogniser part of this tutorial: https://keras-ocr.readthedocs.io/en/latest/examples/end_to_end_training.html.

I would much appreciate a solution to this issue, since the main reason for buying a 64 GB M1 Max was to use it for neural-network training.
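As a stopgap that avoids reinstalling packages between runs, one sketch (my assumption, not from the original post) is to keep tensorflow-metal installed but hide the GPU from TensorFlow for the affected training script, so it falls back to the CPU path that produces correct losses:

```python
import tensorflow as tf

# Must run before any op touches the GPU: hide all GPU devices from
# this process so training falls back to the (correct) CPU kernels.
tf.config.set_visible_devices([], "GPU")

# Sanity check: no GPU should remain among the visible devices.
print(tf.config.get_visible_devices())
```

This keeps the environment intact for quick A/B comparisons once a fixed tensorflow-metal version ships.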
2 replies · 0 boosts · 2.0k views · Dec ’21