Hello, is there any news on that front?
I'm a total newbie with TF, so I have little sense of what is going on,
but I consistently get this error: "Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support."
either with this test (first reply here)
or "TensorFlow 2 quickstart for beginners"
Strangely, the training does seem to run: simple tests go through epochs pretty fast (I guess), and my AMD GPU usage sits around 30-50%.
My specs are:
Intel MacBook Pro with macOS Monterey and AMD Radeon Pro 5500M 8 GB
Python 3.8.10
Here's an example of the simple test output:
(tensorflow-metal-test) jv@192 tensorflow-metal-test % python /Users/jv/tensorflow-exp/test.py
2021-11-22 23:50:48.066315: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Metal device set to: AMD Radeon Pro 5500M
systemMemory: 32.00 GB
maxCacheSize: 3.99 GB
2021-11-22 23:50:48.067311: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-11-22 23:50:48.067826: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-11-22 23:50:48.505048: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-11-22 23:50:48.505092: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-11-22 23:50:48.712043: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:48.734335: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:48.827487: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:48.858801: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:49.081885: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:49.113821: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:49.169179: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:49.208235: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-11-22 23:50:49.243817: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Train on 469 steps, validate on 79 steps
2021-11-22 23:50:49.282608: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
Epoch 1/12
2021-11-22 23:50:49.309804: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
469/469 [==============================] - ETA: 0s - batch: 234.0000 - size: 1.0000 - loss: 0.1564 - accuracy: 0.9539
/Users/julienvincenot/tensorflow-metal-test/lib/python3.8/site-packages/keras/engine/training.py:2470: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
warnings.warn('`Model.state_updates` will be removed in a future version. '
2021-11-22 23:51:01.268461: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
469/469 [==============================] - 14s 21ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.1564 - accuracy: 0.9539 - val_loss: 0.0707 - val_accuracy: 0.9782
Epoch 2/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0453 - accuracy: 0.9857 - val_loss: 0.0487 - val_accuracy: 0.9848
Epoch 3/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0284 - accuracy: 0.9912 - val_loss: 0.0378 - val_accuracy: 0.9878
Epoch 4/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0191 - accuracy: 0.9939 - val_loss: 0.0346 - val_accuracy: 0.9886
Epoch 5/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0135 - accuracy: 0.9958 - val_loss: 0.0400 - val_accuracy: 0.9892
Epoch 6/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0099 - accuracy: 0.9968 - val_loss: 0.0332 - val_accuracy: 0.9902
Epoch 7/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0069 - accuracy: 0.9978 - val_loss: 0.0376 - val_accuracy: 0.9894
Epoch 8/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0078 - accuracy: 0.9973 - val_loss: 0.0389 - val_accuracy: 0.9889
Epoch 9/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0059 - accuracy: 0.9980 - val_loss: 0.0448 - val_accuracy: 0.9887
Epoch 10/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0047 - accuracy: 0.9985 - val_loss: 0.0434 - val_accuracy: 0.9902
Epoch 11/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0053 - accuracy: 0.9984 - val_loss: 0.0486 - val_accuracy: 0.9873
Epoch 12/12
469/469 [==============================] - 12s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.0047 - accuracy: 0.9984 - val_loss: 0.0383 - val_accuracy: 0.9896
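
One thing worth noting about the log above: the single letter after each timestamp appears to be the severity of the message (I for info, W for warning, E for error, F for fatal), and the NUMA line carries an `I`. That would mean it is informational rather than an actual error, which fits with the training running normally. Below is a minimal sketch in plain Python (no TensorFlow needed) that splits such a line into severity and message. The regular expression and the names `LOG_LINE`, `SEVERITY` and `classify` are my own assumptions for illustration, not part of TensorFlow.

```python
import re

# Assumed layout of a TensorFlow C++ log line, inferred from the output above:
#   "<date> <time>: <severity letter> <source file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>[\d:.]+): "
    r"(?P<level>[IWEF]) (?P<source>\S+?):(?P<lineno>\d+)\] (?P<message>.*)$"
)

# Severity letters used as a prefix in these log lines.
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def classify(line):
    """Return (severity, message) for a log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return SEVERITY[match.group("level")], match.group("message")


# The NUMA line copied from the output above.
numa_line = (
    "2021-11-22 23:50:48.067311: I tensorflow/core/common_runtime/"
    "pluggable_device/pluggable_device_factory.cc:305] Could not identify "
    "NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not "
    "have been built with NUMA support."
)

print(classify(numa_line))
```

Lines that are not in this format, such as the `Epoch 1/12` progress lines, return `None`. If the goal is only to quiet these messages, setting the environment variable `TF_CPP_MIN_LOG_LEVEL=1` before importing TensorFlow is commonly used to hide INFO-level output. I have not checked that it covers everything printed in this particular setup.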