Does the M2 chip have 64-bit registers in the GPU?
https://github.com/pytorch/pytorch/issues/77867
Confirmed. It's an Intel-based iMac 27" (2020) with an AMD Radeon Pro 5700 XT GPU running macOS 12.3.1.
% ipython
Python 3.8.5 ...
% pip show tensorflow-macos
WARNING: Ignoring invalid distribution -umpy (/Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages)
Name: tensorflow-macos
Version: 2.8.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages
Requires: absl-py, astunparse, flatbuffers, gast, google-pasta, grpcio, h5py, keras, keras-preprocessing, libclang, numpy, opt-einsum, protobuf, setuptools, six, tensorboard, termcolor, tf-estimator-nightly, typing-extensions, wrapt
Required-by:
(tensorflow-metal) (base) davidlaxer@x86_64-apple-darwin13 top2vec % pip show tensorflow-metal
WARNING: Ignoring invalid distribution -umpy (/Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages)
Name: tensorflow-metal
Version: 0.4.0
Summary: TensorFlow acceleration for Mac GPUs.
Home-page: https://developer.apple.com/metal/tensorflow-plugin/
Author:
Author-email:
License: MIT License. Copyright © 2020-2021 Apple Inc. All rights reserved.
Location: /Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages
Requires: six, wheel
Required-by:
(tensorflow-metal) (base) davidlaxer@x86_64-apple-darwin13 top2vec %
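For anyone comparing setups, the same version information can be read from inside Python instead of with `pip show`. This is only a minimal sketch using the standard library; the helper name `installed_versions` is made up for illustration, and the package names are the two shown in the output above.

```python
from importlib import metadata

# Distribution names taken from the `pip show` output above.
PACKAGES = ("tensorflow-macos", "tensorflow-metal")


def installed_versions(names=PACKAGES):
    """Return {name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running it in the same virtual environment as the notebook kernel helps confirm that the kernel sees the versions you expect.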
Show us the code ...
I tried `optimizer=tfa.optimizers.RectifiedAdam()`. It does not use the GPU on an iMac 27" with an AMD Radeon Pro 5700 XT in this code example:
`import tensorflow_addons as tfa
from tensorflow import keras
from tensorflow.keras import layers

# sequence_length, raw_data, train_dataset and val_dataset are defined earlier in the notebook.
inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = layers.LSTM(32, recurrent_dropout=0.25)(inputs)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
callbacks = [
    keras.callbacks.ModelCheckpoint("jena_lstm_dropout.keras",
                                    save_best_only=True)
]
model.compile(optimizer=tfa.optimizers.RectifiedAdam(), loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=50,
                    validation_data=val_dataset,
                    callbacks=callbacks)`
I tried using the CustomAdam() function (above). It runs, but it doesn't appear to be using the GPU (according to the performance meter).
I'm on an iMac 27" with an AMD Radeon Pro 5700 XT.
`import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])
predictions = model(x_train[:1]).numpy()
tf.nn.softmax(predictions).numpy()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
model.compile(optimizer=CustomAdam(), loss=loss_fn)
model.fit(x_train, y_train, epochs=10)`
Here's the output:
`2022-03-26 12:29:25.682832: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-26 12:29:25.683388: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-03-26 12:29:25.683664: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
Metal device set to: AMD Radeon Pro 5700 XT
Epoch 1/10
145/1875 [=>............................] - ETA: 1:35 - loss: 0.8532
`
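The log above shows that a Metal device was created, but not where individual ops ran. One way to check without relying on a performance meter is to ask TensorFlow directly. This is a rough sketch, not a definitive diagnostic: `device_kind` and `gpu_report` are hypothetical helper names, and `gpu_report` assumes TensorFlow is installed and running in eager mode. It uses `tf.config.list_physical_devices` and the `device` attribute of an eager tensor.

```python
def device_kind(device_name):
    """Extract 'GPU' or 'CPU' from a TensorFlow device string.

    Example input (from the log above):
    "/job:localhost/replica:0/task:0/device:GPU:0"
    """
    return device_name.rsplit("device:", 1)[-1].split(":")[0]


def gpu_report():
    """Return (names of visible GPUs, device kind that ran a small matmul)."""
    import tensorflow as tf  # imported lazily so this file loads without TF

    gpus = tf.config.list_physical_devices("GPU")
    a = tf.random.uniform((256, 256))
    result = tf.matmul(a, a)  # placed by TensorFlow's default device policy
    return [g.name for g in gpus], device_kind(result.device)
```

Calling `gpu_report()` in the notebook should return an empty list if the plugin is not visible at all, and the second value indicates whether a plain matmul was placed on the GPU or the CPU.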
I'm getting some results with the 'Adagrad' optimizer.
At epoch 57:
Epoch 58/100
6332/6332 [==============================] - 1621s 256ms/step - d_loss: 0.6239 - g_loss: 0.8338
Epoch 59/100
1342/6332 [=====>........................] - ETA: 21:07 - d_loss: 0.6044 - g_loss: 0.8635
Did you happen to measure the performance difference between CPU and GPU with the SGD optimizer? Was the CPU 4X faster?
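A rough way to get a CPU vs. GPU number is to time the same op under `tf.device`. The sketch below is only a matmul micro-benchmark under stated assumptions, not a measurement of the GAN or of any optimizer: `time_call` and `compare_devices` are hypothetical names, the matrix size is arbitrary, and `compare_devices` assumes TensorFlow is installed and that a device named `/GPU:0` is available.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_devices(size=2048, repeats=3):
    """Time one square matmul on the CPU and on the GPU; returns {device: seconds}."""
    import tensorflow as tf  # imported lazily so this file loads without TF

    timings = {}
    for device in ("/CPU:0", "/GPU:0"):
        with tf.device(device):
            a = tf.random.uniform((size, size))
            # .numpy() copies the result to the host, so the op must finish
            # before the timer stops.
            timings[device] = time_call(lambda: tf.matmul(a, a).numpy(), repeats)
    return timings
```

For a real comparison it is probably more meaningful to time one epoch of the actual model on each device, since small ops can be dominated by transfer overhead.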
I tried the previous versions of 'tensorflow-macos' and 'tensorflow-metal' and they also had issues, e.g.:
https://developer.apple.com/forums/thread/691339
I installed macOS 12.0 Beta and attempted to train the model with tensorflow_macos and tensorflow_metal 2.6.
Training the model consumed ~110GB DRAM before I killed the kernel (the iMac has 128GB).
I tried reducing the number of input rows in the training set to 18806 rows × 5 columns,
but it still runs out of memory.
Also, I can't tell from Activity Monitor whether the GPU is running.
You can find the notebook in the notebook directory in this repository:
https://github.com/ddangelov/Top2Vec
Also, why doesn't the GPU run when TensorFlow is in eager execution mode?
In the prior version of 'tensorflow-metal', I had to disable eager mode for the GPU to run:
tf.compat.v1.enable_v2_behavior()
from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()
How do I send you a Jupyter notebook?
Several of the profiler tools are not functional, e.g.:
Kernel Stats: There is no GPU data to display because there were no kernels in the capture duration.
tf_data_bottleneck_analysis: No tf.data activity captured in your profile. If your job uses tf.data, try to capture a longer profile.
Here's what was generated. Any idea why the TensorBoard Profile tab is empty?
% ls -lR train/plugins/profile/2021_08_29_11_07_46
total 232
-rw-r--r-- 1 davidlaxer staff 4867 Aug 29 11:07 BlueDiamond.local.input_pipeline.pb
-rw-r--r-- 1 davidlaxer staff 0 Aug 29 11:07 BlueDiamond.local.kernel_stats.pb
-rw-r--r-- 1 davidlaxer staff 1501 Aug 29 11:07 BlueDiamond.local.memory_profile.json.gz
-rw-r--r-- 1 davidlaxer staff 5938 Aug 29 11:07 BlueDiamond.local.overview_page.pb
-rw-r--r-- 1 davidlaxer staff 4013 Aug 29 11:07 BlueDiamond.local.tensorflow_stats.pb
-rw-r--r-- 1 davidlaxer staff 14817 Aug 29 11:07 BlueDiamond.local.trace.json.gz
-rw-r--r-- 1 davidlaxer staff 74605 Aug 29 11:07 BlueDiamond.local.xplane.pb
(base) davidlaxer@x86_64-apple-darwin13 20210829-102538 %
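In the listing above, `kernel_stats.pb` is 0 bytes, which is consistent with the "no GPU data to display" message. A small standard-library sketch for spotting empty profile files in a capture directory follows; the function name `empty_profile_files` is made up for illustration, and it only reports file sizes, it does not explain why the profiler captured no kernels.

```python
from pathlib import Path


def empty_profile_files(profile_dir):
    """Return the sorted names of zero-byte files directly inside profile_dir."""
    return sorted(
        p.name
        for p in Path(profile_dir).iterdir()
        if p.is_file() and p.stat().st_size == 0
    )
```

For example, `empty_profile_files("train/plugins/profile/2021_08_29_11_07_46")` run against the directory listed above should report only `BlueDiamond.local.kernel_stats.pb`.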