Hi Team,
Any updates? To keep my research moving I am now using an alternative platform (i.e. I have bought a new laptop with a CUDA-capable NVIDIA GPU). However, it would be good to resolve this issue: I cannot use a machine with a technically higher specification without dual-booting it into Ubuntu (which brings its own problems). I should be able to run GPU-optimized TensorFlow on my Mac under its native OS, and I'm sure many other ML/AI people want the same. Are we going to see any progress on this issue?
Kind Regards,
Alze.
Hi Team,
Any updates, please? I would appreciate any findings you have so I can add them to my research log.
Kind Regards,
Alze
deleted.
Hi,
tensorflow-macos 2.9.2
tensorflow-metal 0.5.0
macOS Monterey 12.4 (patched and up to date)
Machine : iMac Retina 5K,
27 Inch, 2020,
3.8GHz 8-Core Intel Core i7,
128 GB 2667 MHz DDR4,
Graphics AMD Radeon Pro 5500 XT 8GB
Command to run (as per documentation)
python3 train.py -c config/stm32f415_tinyaes.json
When running on the GPU, the slowdown occurs at exactly the same epoch (19). As a test, I disabled the GPU in a duplicate script; although it takes considerably longer, it got past epoch 19. As you can see in the output further down, with the GPU enabled the ETA at epoch 19 has gone up to 122:06:17.
Command to run (CPU only, with a slight modification to the script, included here)
python3 train_cpu.py -c config/stm32f415_tinyaes.json
Script modification to disable the GPU (I have left in the first and last lines of the original script so the placement can be identified; otherwise it is identical):
from scaaml.utils import tf_cap_memory
try:
    # Disable all GPUS
    tf.config.set_visible_devices([], 'GPU')
    visible_devices = tf.config.get_visible_devices()
    for device in visible_devices:
        assert device.device_type != 'GPU'
except:
    # Invalid device or cannot modify virtual devices once initialized.
    pass
def train_model(config):
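One caveat on the modification quoted here: the bare `except: pass` also swallows the `AssertionError`, so if a GPU were still visible the script would carry on silently. A minimal sketch of a stricter variant follows. The helper name `disable_gpus` is hypothetical and not part of scaaml or TensorFlow; it assumes `tf.config.set_visible_devices` raises `ValueError` or `RuntimeError` on failure, as the TensorFlow documentation describes. The `tf.config` module is passed in as an argument only so the check can be tried without a GPU present.

```python
def disable_gpus(tf_config):
    """Hide every GPU from TensorFlow and report whether that worked.

    `tf_config` is expected to be the `tf.config` module (assumption:
    the caller has already done `import tensorflow as tf`).
    """
    try:
        tf_config.set_visible_devices([], 'GPU')
    except (ValueError, RuntimeError) as err:
        # ValueError: invalid device; RuntimeError: runtime already initialized.
        print(f"Could not hide GPUs: {err}")
        return False
    # Check outside the try block so a still-visible GPU is not swallowed.
    still_visible = [d for d in tf_config.get_visible_devices()
                     if d.device_type == 'GPU']
    if still_visible:
        print(f"GPUs still visible: {still_visible}")
        return False
    return True


# Usage in the training script (assumes `import tensorflow as tf`):
#     if not disable_gpus(tf.config):
#         raise SystemExit("a GPU is still visible, not a CPU-only run")
```

With this in place, a "CPU only" result in the research log is only recorded when the GPU was actually hidden.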
CPU ONLY
2048/2048 [==============================] - 5014s 2s/step - loss: 1.3966 - acc: 0.4811 - val_loss: 1.5574 - val_acc: 0.4297
Epoch 25/30
1502/2048 [=====================>........] - ETA: 22:02 - loss: 1.3701 - acc: 0.4919
GPU ENABLED
2022-07-05 14:43:20.822168: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 46). These functions will not be directly callable after loading.
2048/2048 [==============================] - 516s 252ms/step - loss: 1.9292 - acc: 0.3521 - val_loss: 1.9108 - val_acc: 0.3503
Epoch 18/30
2048/2048 [==============================] - ETA: 0s - loss: 1.8986 - acc: 0.3598
2022-07-05 14:52:39.447402: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
2022-07-05 14:52:39.450685: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
2048/2048 [==============================] - 546s 267ms/step - loss: 1.8986 - acc: 0.3598 - val_loss: 2.0514 - val_acc: 0.3303
Epoch 19/30
741/2048 [=========>....................] - ETA: 122:06:17 - loss: 1.8543 - acc: 0.3750
/Users/alan/.pyenv/versions/3.9.5/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker
I have run the code on an external Linux system with GPUs and it runs without problems. This is blocking my research project (MSc). While I can still use CPU mode, the idea is to compare and baseline across various platforms and functionalities (also using my own traces), so it matters that I can use all the features of the host system (the GPU in this case).
Hope this helps and you can offer a solution.
Regards,
alz0r
After running the same code on the same samples, it happened again.
Epoch 19/30
504/2048 [======>.......................] - ETA: 21:47:06 - loss: 1.8561 - acc: 0.371
I don't think it is a coincidence.
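For the research log, the stalled ETAs quoted in this thread can be turned into a rough per-step time and compared with the healthy epochs. The sketch below assumes the Keras progress bar prints ETA as `h:mm:ss`, `m:ss`, or `Ns`, and that the ETA is roughly the remaining steps multiplied by an estimated per-step time. The function names are made up for illustration.

```python
def eta_to_seconds(eta: str) -> int:
    """Convert a progress-bar ETA such as '122:06:17', '22:02' or '0s' to seconds."""
    eta = eta.strip()
    if eta.endswith("s"):
        return int(eta[:-1])
    seconds = 0
    for part in eta.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def implied_seconds_per_step(step: int, total: int, eta: str) -> float:
    """Per-step time implied by an ETA shown at `step` out of `total` steps."""
    remaining = total - step
    if remaining <= 0:
        raise ValueError("no steps remaining")
    return eta_to_seconds(eta) / remaining


if __name__ == "__main__":
    # Values copied from the logs in this thread.
    for label, step, eta in [
        ("GPU, epoch 19, first run", 741, "122:06:17"),
        ("GPU, epoch 19, second run", 504, "21:47:06"),
        ("CPU, epoch 25", 1502, "22:02"),
    ]:
        per_step = implied_seconds_per_step(step, 2048, eta)
        print(f"{label}: {per_step:.1f} s/step")
```

On the figures posted here this gives roughly 336 s/step and 51 s/step for the two stalled GPU runs, against the 252 ms/step and 267 ms/step reported for epochs 17 and 18, while the CPU figure of about 2.4 s/step is consistent with the "2s/step" in the CPU log.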