The script is on my GitHub: my script code
Follow the steps below to reproduce the slow case [1] on the GPU and the fast case [2] on the CPU.
How to reproduce:
[1]: SLOW (with tensorflow-metal installed) == GPU
conda create --name lab_slow python=3.9
conda activate lab_slow
conda install -c apple tensorflow-deps
python -m pip install tensorflow-macos
python -m pip install tensorflow-metal
python -m pip install gym
python -m pip install icecream
With [1] I can see the GPU working hard, but the run is very slow and some extra information is printed:
(lab_slow) fernando@minidefernando restml-muzero % python mymuzero.py
Init Plugin
Init Graph Optimizer
Init Kernel
Metal device set to: Apple M1
2021-07-23 17:10:41.158886: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-07-23 17:10:41.159108: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-07-23 17:10:41.301535: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-07-23 17:10:41.303570: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
2021-07-23 17:10:41.304081: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
ic| elapsed: -8.216638209
ic| elapsed: -11.702662209000001
ic| elapsed: -7.210992999999998
ic| elapsed: -7.76780325
ic| elapsed: -6.552902541999998
ic| elapsed: -7.051604083000001
ic| elapsed: -7.802484458000002
Each iteration takes roughly 7 to 8 seconds on the GPU (the elapsed values are printed with a negative sign; their magnitude is the duration).
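To double-check which devices TensorFlow actually sees in each environment, something like the following could be run before the script. This is only a sketch: `visible_devices` is a hypothetical helper name, not part of mymuzero.py, and the import guard exists only so the snippet also runs where TensorFlow is not installed. I would expect a GPU entry in lab_slow and only a CPU entry in lab_fast, but I have not pasted that output here.

```python
import importlib.util


def visible_devices():
    """Return (device_type, name) pairs that TensorFlow can see.

    Hypothetical helper for illustration. Returns an empty list when
    TensorFlow is not installed in the current environment.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return []
    import tensorflow as tf

    # list_physical_devices() with no argument returns every device type.
    return [(d.device_type, d.name) for d in tf.config.list_physical_devices()]


if __name__ == "__main__":
    for device_type, name in visible_devices():
        print(device_type, name)
```

If the GPU does show up, calling `tf.config.set_visible_devices([], "GPU")` at the very top of the script (before any tensors or models are created) should hide it, which would allow a CPU comparison inside the same environment instead of a second conda environment.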
[2]: FAST (without tensorflow-metal installed) == CPU
conda create --name lab_fast python=3.9
conda activate lab_fast
conda install -c apple tensorflow-deps
python -m pip install tensorflow-macos
python -m pip install gym
python -m pip install icecream
With [2] only the CPU is used and the run is fast.
This is the output:
(lab_fast) fernando@minidefernando restml-muzero % python mymuzero.py
2021-07-23 17:12:54.201964: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-07-23 17:12:54.204263: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
ic| elapsed: -0.6991699170000001
ic| elapsed: -1.001101625
ic| elapsed: -0.745720833
ic| elapsed: -0.5131082500000002
ic| elapsed: -0.680853667
ic| elapsed: -0.5841365830000003
ic| elapsed: -0.6107562499999997
ic| elapsed: -0.5290834160000006
ic| elapsed: -1.3289379590000001
ic| elapsed: -0.508716166000001
ic| elapsed: -1.1658726250000004
ic| elapsed: -0.5299397080000006
ic| elapsed: -0.5714273750000007
ic| elapsed: -0.6107442499999998
Each iteration takes ~0.5 to 0.7 seconds (occasionally just over 1 second) using 1 CPU.
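To put a number on the difference, the two sets of `ic| elapsed:` lines can be summarised with a few lines of Python. This is a rough sketch using only the values pasted above; the names `GPU_LOG`, `CPU_LOG` and `durations` are made up for this example, and the sign is dropped on the assumption that the magnitude is the iteration time.

```python
from statistics import median

# Elapsed lines copied from the lab_slow (GPU) run.
GPU_LOG = """\
ic| elapsed: -8.216638209
ic| elapsed: -11.702662209000001
ic| elapsed: -7.210992999999998
ic| elapsed: -7.76780325
ic| elapsed: -6.552902541999998
ic| elapsed: -7.051604083000001
ic| elapsed: -7.802484458000002
"""

# Elapsed lines copied from the lab_fast (CPU) run.
CPU_LOG = """\
ic| elapsed: -0.6991699170000001
ic| elapsed: -1.001101625
ic| elapsed: -0.745720833
ic| elapsed: -0.5131082500000002
ic| elapsed: -0.680853667
ic| elapsed: -0.5841365830000003
ic| elapsed: -0.6107562499999997
ic| elapsed: -0.5290834160000006
ic| elapsed: -1.3289379590000001
ic| elapsed: -0.508716166000001
ic| elapsed: -1.1658726250000004
ic| elapsed: -0.5299397080000006
ic| elapsed: -0.5714273750000007
ic| elapsed: -0.6107442499999998
"""


def durations(log_text):
    """Extract iteration times in seconds, ignoring the printed sign."""
    values = []
    for line in log_text.splitlines():
        if line.startswith("ic| elapsed:"):
            values.append(abs(float(line.split(":", 1)[1])))
    return values


gpu = durations(GPU_LOG)
cpu = durations(CPU_LOG)
gpu_median = median(gpu)
cpu_median = median(cpu)
slowdown = gpu_median / cpu_median

print(f"GPU median: {gpu_median:.2f} s over {len(gpu)} iterations")
print(f"CPU median: {cpu_median:.2f} s over {len(cpu)} iterations")
print(f"GPU is about {slowdown:.1f}x slower per iteration")
```

On these samples the median is about 7.8 seconds per iteration with tensorflow-metal and about 0.6 seconds without it, so the GPU run is roughly 12 to 13 times slower. The sample is small (7 and 14 iterations), so this is only an indication.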