Reply to neural engine for model training?
From my understanding, and from information I've gathered here and there over time: the Neural Engine is inferior to the GPU in every respect for training a TF model, and is kind of useless to us developers. Extrapolating from what I found, it is only useful for tiny models (by today's standards), like Apple's OCR (e.g. copy/pasting text written in an image), speech recognition, trackpad gestures, etc.
Dec ’21
Reply to Odd CPU/GPU behaviour in TF-metal on M1 Pro
We really do lack documentation. I've had weird cases where the CPU was faster than the GPU too. ^^ (I only have the base M1, not the Pro/Max.) To fully disable the GPU I use this, called first, before doing anything else:

```python
tf.config.set_visible_devices([], 'GPU')
```

You might also want to log which device is used for each operation:

```python
tf.debugging.set_log_device_placement(True)
```

It's very verbose, and the first step is usually mostly CPU (function tracing). Also from my experience: don't use float16 (not faster) and don't use mixed_precision (it falls back to CPU), at least on my M1. Give this option a try too:

```python
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)
```
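Putting those pieces together, a minimal setup cell might look like this (a sketch, assuming tensorflow-metal is installed; on a machine with no visible GPU the memory-growth loop simply does nothing):

```python
import tensorflow as tf

# Log which device each op runs on (very verbose; the first step
# is mostly CPU because of function tracing).
tf.debugging.set_log_device_placement(True)

# Let the GPU allocate memory incrementally instead of up front.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# Or, to force CPU-only execution for comparison, hide the GPU entirely:
# tf.config.set_visible_devices([], 'GPU')
```

Run it before building any model, since device visibility and memory growth can only be changed before the devices are initialized.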
Dec ’21
Reply to Why is it so slow?
This isn't unexpected, on any platform, with any device. Sometimes the CPU is faster than the GPU. Sometimes the M1 in my 13" MacBook Air is faster than my Nvidia Quadro, or a Tesla K80. It depends on the workload; it's not specific to tensorflow-metal. To be 100% sure, disable the GPU and test again:

```python
tf.config.set_visible_devices([], 'GPU')
```
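One way to see the "it depends on the workload" point concretely is a quick timing sketch (the device strings and sizes here are illustrative; small ops tend to favor the CPU, large matmuls the GPU):

```python
import time
import tensorflow as tf

def bench(device, n=2048, reps=5):
    """Average the wall-clock time of an n x n matmul on the given device."""
    with tf.device(device):
        a = tf.random.normal((n, n))
        b = tf.random.normal((n, n))
        start = time.perf_counter()
        for _ in range(reps):
            c = tf.matmul(a, b)
        _ = c.numpy()  # pull the result back to force execution to finish
        return (time.perf_counter() - start) / reps

print('CPU:', bench('/CPU:0'))
# Only meaningful if a GPU is actually visible:
if tf.config.list_physical_devices('GPU'):
    print('GPU:', bench('/GPU:0'))
```

Try shrinking `n` to something like 64: the ranking often flips, which is exactly the effect described above.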
Dec ’21
Reply to Cannot install Tensorflow on Mac m1
I've installed TensorFlow multiple times on a Mac M1 using this guide: https://developer.apple.com/metal/tensorflow-plugin/ Just follow it step by step. Don't skip the miniforge3 installation; it is absolutely mandatory to install and use the one provided in the guide. Tested on Python 3.8 and 3.9. TensorFlow is not supported on 3.10 (yet).
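After following the guide, a quick sanity check like this (a sketch; the exact GPU listing depends on your install) confirms the interpreter and the Metal plugin are the right ones:

```python
import sys
import tensorflow as tf

print(sys.version)                             # should report 3.8.x or 3.9.x
print(tf.__version__)                          # confirms the import works
print(tf.config.list_physical_devices('GPU'))  # expect one METAL device on M1
```

If the GPU list comes back empty, you are most likely not inside the miniforge3 environment from the guide.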
Dec ’21
Reply to TensorFlow with Metal start giving wrong results after upgrading macOS from 12.0.1 to 12.1
I'm still on epoch 5, on a MacBook Air M1 2020, but it looks fine to me so far. My other trainings run just fine too. Looks like you just got bad luck on this run? What about the other intermediate results, do they all look bad?

Edit: I also get some very bad results sometimes, weird. Is there a problem with random generation? I have a model that heavily uses random.uniform, I'll check.

Edit again: I need to double-check, but random is broken in some situations.
Dec ’21
Reply to TensorFlow with Metal start giving wrong results after upgrading macOS from 12.0.1 to 12.1
I wrote a minimal use case. This used to generate two different series:

```python
import tensorflow as tf

x = tf.random.uniform((10,))
y = tf.random.uniform((10,))
tf.print(x)
tf.print(y)
```

Output (both tensors are identical):

```
[0.178906798 0.8810848 0.384304762 ... 0.162458301 0.64780426 0.0123682022]
[0.178906798 0.8810848 0.384304762 ... 0.162458301 0.64780426 0.0123682022]
```

It works fine on Colab. It also works fine if I disable the GPU with:

```python
tf.config.set_visible_devices([], 'GPU')
```

WORKAROUND:

```python
g = tf.random.Generator.from_non_deterministic_state()
x = g.uniform((10,))
y = g.uniform((10,))
tf.print(x)
tf.print(y)
```
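A quick way to turn this repro into a pass/fail check (a sketch; on an affected machine the workaround draws should differ while the plain `tf.random.uniform` calls came back identical):

```python
import tensorflow as tf

# Stateful Generator keeps its own state, sidestepping the broken
# global-RNG path on the Metal plugin.
g = tf.random.Generator.from_non_deterministic_state()
x = g.uniform((10,))
y = g.uniform((10,))

# True means the two draws differ, i.e. the workaround is effective.
print(bool(tf.reduce_any(tf.not_equal(x, y))))
```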
Dec ’21
Reply to Getting ModuleNotFoundError: No module named 'tensorflow.python.compiler.mlcompute' error
Reformatting your code:

```python
import tensorflow as tf
from tensorflow.python.compiler.mlcompute import mlcompute

tf.compat.v1.disable_eager_execution()
mlcompute.set_mlc_device(device_name='gpu')
print("is_apple_mlc_enabled %s" % mlcompute.is_apple_mlc_enabled())
print("is_tf_compiled_with_apple_mlc %s" % mlcompute.is_tf_compiled_with_apple_mlc())
print(f"eagerly? {tf.executing_eagerly()}")
print(tf.config.list_logical_devices())
```

This looks like some seriously old code. Just do this instead:

```python
import tensorflow as tf

print(tf.__version__)
physical_devices = tf.config.list_physical_devices('GPU')
tf.print(physical_devices)
```

Example output:

```
2.7.0
Metal device set to: Apple M1
systemMemory: 8.00 GB
maxCacheSize: 2.67 GB
2021-12-20 23:11:09.001976: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-12-20 23:11:09.002466: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
```
Dec ’21
Reply to Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
It's a perfectly normal and harmless message on an M1. I get it too, and my model & code work just fine:

```
2021-12-20 23:19:04.025952: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-12-20 23:19:04.026364: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Metal device set to: Apple M1
systemMemory: 8.00 GB
maxCacheSize: 2.67 GB
2021-12-20 23:19:04.413489: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
Epoch 1/10
2021-12-20 23:19:04.723827: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
32/32 [==============================] - ETA: 0s - loss: 0.0256 - accuracy: 0.9605 - mae: 0.0933 - mse: 0.0256
2021-12-20 23:19:24.073636: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
32/32 [==============================] - 20s 608ms/step - loss: 0.0256 - accuracy: 0.9605 - mae: 0.0933 - mse: 0.0256 - val_loss: 0.0100 - val_accuracy: 0.9855 - val_mae: 0.0650 - val_mse: 0.0100
Epoch 2/10
32/32 [==============================] - 19s 585ms/step - loss: 0.0079 - accuracy: 0.9787 - mae: 0.0568 - mse: 0.0079 - val_loss: 0.0063 - val_accuracy: 0.9869 - val_mae: 0.0534 - val_mse: 0.0063
Epoch 3/10
32/32 [==============================] - 18s 575ms/step - loss: 0.0060 - accuracy: 0.9700 - mae: 0.0506 - mse: 0.0060 - val_loss: 0.0045 - val_accuracy: 0.9776 - val_mae: 0.0438 - val_mse: 0.0045
Epoch 4/10
....
```
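If the INFO-level noise bothers you, TensorFlow's standard `TF_CPP_MIN_LOG_LEVEL` environment variable hides it (a sketch; this only silences the logging, it changes nothing about execution):

```python
import os

# Must be set before TensorFlow is imported; '1' hides INFO messages
# like the NUMA one, '2' also hides warnings.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'

import tensorflow as tf
print(tf.__version__)
```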
Dec ’21
Reply to tf.random is broken since Monterey 12.1
The workaround doesn't work inside a tf.function; this is a real problem. I tried other alternatives, like:

```python
randomgen = tf.random.Generator.from_non_deterministic_state()

for _ in range(10):
    g2 = tf.random.get_global_generator()
    x = g2.uniform((10,), 1, 2)
    y = g2.uniform((10,), 3, 4)
    tf.print(x)
    tf.print(y)
```

But:

```
NotFoundError: No registered 'RngReadAndSkip' OpKernel for 'GPU' devices compatible with node {{node RngReadAndSkip}} . Registered: device='CPU' [Op:RngReadAndSkip]
```

And obviously, calling this inside a tf.function will always generate the same sequence:

```python
tf.random.stateless_uniform((size,), (1, 2), xmin, xmax, tf.float32)
```

This doesn't work either:

```python
randomgen = tf.random.Generator.from_non_deterministic_state()

@tf.function
def MandelbrotDataSet(size=1000, max_depth=100, xmin=-2.0, xmax=0.7, ymin=-1.3, ymax=1.3):
    global randomgen
    x = randomgen.uniform((size,), xmin, xmax, tf.float32)
    y = randomgen.uniform((size,), xmin, xmax, tf.float32)
```

Because of RngReadAndSkip again.
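Since the error says `RngReadAndSkip` only has a CPU kernel, one possible workaround (a sketch I have not verified on an affected machine) is to pin just the generator calls to the CPU inside the tf.function, letting everything else stay on the GPU:

```python
import tensorflow as tf

g = tf.random.Generator.from_non_deterministic_state()

@tf.function
def sample(size=1000, xmin=-2.0, xmax=0.7):
    # Force the stateful RNG ops onto the CPU, where the kernel exists.
    with tf.device('/CPU:0'):
        x = g.uniform((size,), xmin, xmax, tf.float32)
        y = g.uniform((size,), xmin, xmax, tf.float32)
    return x, y

x, y = sample()
print(x.shape, y.shape)
```

The cost is a host-to-device copy of the random tensors each call, but for moderate sizes that may be acceptable until the GPU kernel lands.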
Dec ’21