After installing tensorflow-metal, the loss no longer decreases the way it does in CPU-only training or when running the same training in an NVIDIA CUDA environment.
The results when training on an M1 Max with Metal are completely useless. I get good results again when I uninstall tensorflow-metal via pip uninstall tensorflow-metal
and leave everything else unchanged, but then training is slow and the fans seem louder than when doing GPU training.
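As a side note, the CPU-only comparison can probably be reproduced without uninstalling the plugin, by hiding the GPU from TensorFlow at the start of the script. This is only a minimal sketch: use_cpu_only is a name made up for illustration, and it assumes that tf.config.set_visible_devices behaves with the Metal plugin the same way it does with CUDA devices.

```python
def use_cpu_only():
    """Hide all GPU devices from TensorFlow so training runs on the CPU.

    Hypothetical helper for illustration. It must be called before any
    tensors, models, or datasets are created, because TensorFlow raises
    a RuntimeError if devices are changed after the runtime has started.
    """
    import tensorflow as tf  # imported here so the helper is self-contained

    # An empty device list for the "GPU" type makes every GPU invisible.
    tf.config.set_visible_devices([], "GPU")

    # Return what TensorFlow can still see (expected: CPU devices only).
    return tf.config.get_visible_devices()
```

Calling use_cpu_only() as the first statement of the training script and printing its return value should show whether the GPU is really hidden for that run.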
Without tensorflow-metal
With tensorflow-metal
X-axis in both graphs is the epoch.
OS is macOS 12.0.1.
TensorFlow version is 2.6, since 2.7 crashes.
tensorflow-metal version is 0.3, but the behaviour was the same with 0.2.
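In case it helps anyone reproduce this, the installed versions can be read with a few lines of standard-library Python. This is a rough sketch: installed_versions is a made-up helper name, and the tensorflow-macos entry is an assumption about which package provides TensorFlow on Apple Silicon; adjust the list to match your setup.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "tensorflow-macos", "tensorflow-metal")):
    """Map each package name to its installed version, or None if absent.

    Hypothetical helper; the default package list is an assumption.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# Print one line per package so the output can be pasted into a report.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```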
The network is an RCNN. I am following this tutorial, but only the recogniser part, to be exact: https://keras-ocr.readthedocs.io/en/latest/examples/end_to_end_training.html
I would much appreciate a solution to this issue, since my main reason for buying a 64GB M1 Max was to use it for neural network training.