ker2x’s Profile | Apple Developer Forums

Reply to tf.random is broken since Monterey 12.1

Still broken on 12.3... Hello, Apple?

Machine Learning & AI General

May ’22

Reply to Not able to install tensor flow-macos

How did you install it?

Machine Learning & AI General

Dec ’21

Reply to Error installing tensorflow-macos

You must use Miniforge3 as stated in the guide, not the regular conda. If pip install doesn't work, just install it with conda install instead.

Machine Learning & AI General

Dec ’21

Reply to tf.random is broken since Monterey 12.1

The workaround doesn't work in a tf.function, which is a real problem. I tried other alternatives, like:

    randomgen = tf.random.Generator.from_non_deterministic_state()
    #%%
    for _ in range(10):
        g2 = tf.random.get_global_generator()
        x = g2.uniform((10,),(1,2))
        y = g2.uniform((10,),(3,4))
        tf.print(x)
        tf.print(y)

But this fails with:

    NotFoundError: No registered 'RngReadAndSkip' OpKernel for 'GPU' devices compatible with node {{node RngReadAndSkip}} . Registered: device='CPU' [Op:RngReadAndSkip]

And obviously, calling this in a tf.function will always generate the same sequence:

    tf.random.stateless_uniform((size,),(1,2),xmin,xmax,tf.float32)

This doesn't work either:

    randomgen = tf.random.Generator.from_non_deterministic_state()
    @tf.function
    def MandelbrotDataSet(size=1000, max_depth=100, xmin=-2.0, xmax=0.7, ymin=-1.3, ymax=1.3):
        global randomgen
        x = randomgen.uniform((size,),xmin,xmax,tf.float32)
        y = randomgen.uniform((size,),ymin,ymax,tf.float32)

Because of RngReadAndSkip again.
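The stateless call above repeats itself because its seed, (1,2), is fixed. One possible way around that is to pass a fresh seed into the tf.function as a tensor argument on every call. This is only a sketch: `sample_points` and the step-based seed are hypothetical names chosen for illustration, and whether `tf.random.stateless_uniform` actually runs on the Metal GPU on the affected macOS versions has not been verified here.

```python
import tensorflow as tf

@tf.function
def sample_points(seed, size=1000, xmin=-2.0, xmax=0.7, ymin=-1.3, ymax=1.3):
    # seed is an int32 tensor of shape [2]. Because it is a tensor argument,
    # changing its value does not retrace the function.
    x = tf.random.stateless_uniform((size,), seed, xmin, xmax, tf.float32)
    # Offset the seed so y is not an exact copy of the x stream.
    y = tf.random.stateless_uniform((size,), seed + 1, ymin, ymax, tf.float32)
    return x, y

# Hypothetical usage: derive the seed from a step counter kept in Python.
for step in range(3):
    seed = tf.constant([step, 0], dtype=tf.int32)
    x, y = sample_points(seed, size=5)
    tf.print(x)
```

The same seed always reproduces the same values, so the caller is responsible for never reusing one, for example by tying it to the training step as above.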

Machine Learning & AI General

Dec ’21

Reply to Some resource has been exhausted. For example, this error might be raised if a per-user quota is exhausted, or perhaps the entire file system is out of space. @@__init__ 2 root error(s) found. (0) RESOURCE_EXHAUSTED: OOM when allocating

shape[114389,320]? Are you sure you're not doing something wrong here?

Machine Learning & AI General

Dec ’21

Reply to Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.

It's a perfectly normal and harmless message on an M1. I get it too, and my model and code work just fine.

    2021-12-20 23:19:04.025952: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
    2021-12-20 23:19:04.026364: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
    Metal device set to: Apple M1
    systemMemory: 8.00 GB
    maxCacheSize: 2.67 GB
    __________________________________________________________________________________________________
    2021-12-20 23:19:04.413489: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
    Epoch 1/10
    2021-12-20 23:19:04.723827: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    32/32 [==============================] - ETA: 0s - loss: 0.0256 - accuracy: 0.9605 - mae: 0.0933 - mse: 0.0256
    2021-12-20 23:19:24.073636: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
    32/32 [==============================] - 20s 608ms/step - loss: 0.0256 - accuracy: 0.9605 - mae: 0.0933 - mse: 0.0256 - val_loss: 0.0100 - val_accuracy: 0.9855 - val_mae: 0.0650 - val_mse: 0.0100
    Epoch 2/10
    32/32 [==============================] - 19s 585ms/step - loss: 0.0079 - accuracy: 0.9787 - mae: 0.0568 - mse: 0.0079 - val_loss: 0.0063 - val_accuracy: 0.9869 - val_mae: 0.0534 - val_mse: 0.0063
    Epoch 3/10
    32/32 [==============================] - 18s 575ms/step - loss: 0.0060 - accuracy: 0.9700 - mae: 0.0506 - mse: 0.0060 - val_loss: 0.0045 - val_accuracy: 0.9776 - val_mae: 0.0438 - val_mse: 0.0045
    Epoch 4/10
    ....

Machine Learning & AI General

Dec ’21

Reply to Getting ModuleNotFoundError: No module named 'tensorflow.python.compiler.mlcompute' error

Reformatting your code:

    import tensorflow as tf
    from tensorflow.python.compiler.mlcompute import mlcompute
    tf.compat.v1.disable_eager_execution()
    mlcompute.set_mlc_device(device_name='gpu')
    print("is_apple_mlc_enabled %s" % mlcompute.is_apple_mlc_enabled())
    print("is_tf_compiled_with_apple_mlc %s" % mlcompute.is_tf_compiled_with_apple_mlc())
    print(f"eagerly? {tf.executing_eagerly()}")
    print(tf.config.list_logical_devices())

It looks like some seriously old code. Just do this instead:

    import tensorflow as tf
    print(tf.__version__)
    physical_devices = tf.config.list_physical_devices('GPU')
    tf.print(physical_devices)

Example output:

    2.7.0
    Metal device set to: Apple M1
    systemMemory: 8.00 GB
    maxCacheSize: 2.67 GB
    2021-12-20 23:11:09.001976: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
    2021-12-20 23:11:09.002466: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
    [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Machine Learning & AI General

Dec ’21

Reply to Can´t use tensorflow on Macbook Air M1

See this post, this should help, you have exactly the same problem : https://developer.apple.com/forums/thread/696693

Machine Learning & AI General

Dec ’21

Reply to TensorFlow with Metal start giving wrong results after upgrading macOS from 12.0.1 to 12.1

I wrote a minimal test case. This used to generate two different series:

    import tensorflow as tf
    x = tf.random.uniform((10,))
    y = tf.random.uniform((10,))
    tf.print(x)
    tf.print(y)

Now both are identical:

    [0.178906798 0.8810848 0.384304762 ... 0.162458301 0.64780426 0.0123682022]
    [0.178906798 0.8810848 0.384304762 ... 0.162458301 0.64780426 0.0123682022]

It works fine on Colab. It also works fine if I disable the GPU with:

    tf.config.set_visible_devices([], 'GPU')

WORKAROUND:

    g = tf.random.Generator.from_non_deterministic_state()
    x = g.uniform((10,))
    y = g.uniform((10,))
    tf.print(x)
    tf.print(y)

Graphics & Games General

Dec ’21

Reply to TensorFlow with Metal start giving wrong results after upgrading macOS from 12.0.1 to 12.1

I'm still on Epoch 5, on a MacBook Air M1 2020, but it looks fine to me so far. My other trainings run just fine too. Looks like you just had bad luck on this run? What about the other intermediate results? Do they all look bad?

Edit: I also get some very bad results sometimes. Weird. Is there a problem with random generation? I have a model that heavily uses random.uniform; I'll check.

Edit again: I need to double-check, but random is broken in some situations.

Graphics & Games General

Dec ’21

Reply to TensorFlow with Metal start giving wrong results after upgrading macOS from 12.0.1 to 12.1

I upgraded to 12.1 today. I just launched a DCGAN; I'll let you know. BUT I have another model in training (an autoencoder) and haven't noticed any difference since yesterday.

Graphics & Games General

Dec ’21

Reply to Cannot install Tensorflow on Mac m1

I've installed TensorFlow multiple times on a Mac M1 using this guide: https://developer.apple.com/metal/tensorflow-plugin/ Just follow it step by step and don't skip the Miniforge3 installation; it is absolutely mandatory to install and use the one provided in the guide. Tested on Python 3.8 and 3.9. TensorFlow is not supported on 3.10 (yet).

Machine Learning & AI General

Dec ’21

Reply to Why is it so slow?

This isn't unexpected, on any platform with any device. Sometimes the CPU is faster than the GPU. Sometimes the M1 in my MacBook Air 13" is faster than my Nvidia Quadro or a Tesla K80. It depends on the workload, and it's not specific to TensorFlow Metal. To be 100% sure, you can disable the GPU in order to test:

    tf.config.set_visible_devices([], 'GPU')
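To see how much the answer depends on the workload, a rough timing comparison along these lines may help. It is only a sketch: `time_matmul` is a hypothetical helper, the matrix size and repeat count are arbitrary, and a single matmul is not representative of a full model.

```python
import time
import tensorflow as tf

def time_matmul(device, n=512, repeats=5):
    """Return the average seconds per n x n matmul on the given device."""
    with tf.device(device):
        a = tf.ones((n, n))
        b = tf.ones((n, n))
        tf.matmul(a, b).numpy()  # warm-up run, excluded from the timing
        start = time.perf_counter()
        for _ in range(repeats):
            # .numpy() forces the computation to finish before the clock stops
            tf.matmul(a, b).numpy()
        return (time.perf_counter() - start) / repeats

devices = ["/CPU:0"]
if tf.config.list_physical_devices("GPU"):
    devices.append("/GPU:0")
for d in devices:
    print(d, time_matmul(d))
```

Trying a few values of `n` is worthwhile, since small workloads tend to favour the CPU and larger ones the GPU.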

Machine Learning & AI General

Dec ’21

Reply to Odd CPU/GPU behaviour in TF-metal on M1 Pro

We really do lack documentation. I've had weird cases where the CPU was faster than the GPU too. ^^ I only have the M1 (non Pro/Max).

To fully disable the GPU I use this:

    tf.config.set_visible_devices([], 'GPU')

Call it first, before doing anything else. You might also want to display which device is used for which operation:

    tf.debugging.set_log_device_placement(True)

It's very verbose, and the first step is usually mostly CPU (function tracing).

From my experience too: don't use float16 (not faster) and don't use mixed_precision (it falls back to the CPU), at least on my M1.

Give this option a try too:

    physical_devices = tf.config.list_physical_devices('GPU')
    tf.config.experimental.set_memory_growth(physical_devices[0], True)
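The settings above can be gathered into one startup routine so they are always applied before any tensor is created. This is a sketch under assumptions: `configure_devices` and its parameters are hypothetical names, and it has to run before TensorFlow initializes its runtime, otherwise `set_visible_devices` raises a RuntimeError.

```python
import tensorflow as tf

def configure_devices(use_gpu=True, log_placement=False):
    """Apply device settings; call once, before any op is executed."""
    gpus = tf.config.list_physical_devices("GPU")
    if not use_gpu:
        # Hide every GPU so all ops are placed on the CPU.
        tf.config.set_visible_devices([], "GPU")
    else:
        for gpu in gpus:
            # Allocate GPU memory on demand instead of all at once.
            tf.config.experimental.set_memory_growth(gpu, True)
    # Very verbose when enabled: logs the device chosen for each op.
    tf.debugging.set_log_device_placement(log_placement)
    return tf.config.get_visible_devices()

print(configure_devices(use_gpu=False))
```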

Machine Learning & AI General

Dec ’21

Reply to neural engine for model training?

From my understanding, and from information I've gathered here and there over time: the Neural Engine is inferior to the GPU in every aspect for training a TF model and is... kind of useless to us developers? If I extrapolate from the information I found, it's only useful for tiny models (by today's standards) such as Apple's OCR (e.g. you can copy/paste text written in an image), speech recognition, touchpad gestures, etc.

Machine Learning & AI General

Dec ’21
