Will tensorflow-metal ever work with AMD chips?

Question

Created Jan ’23

It's been years, and I keep trying different hacks to get things installed. Things do install, but nothing ever runs to completion. I would expect Apple to want to play a bigger part in this rather than leaving us to move to Linux with Nvidia. I wish Apple would just put some resources behind this.

Answer 1

raymondjiii OP

Jan ’23

tensorboard                   2.11.2   pypi_0   pypi
tensorboard-data-server       0.6.1    pypi_0   pypi
tensorboard-plugin-wit        1.8.1    pypi_0   pypi
tensorflow-estimator          2.11.0   pypi_0   pypi
tensorflow-io-gcs-filesystem  0.29.0   pypi_0   pypi
tensorflow-macos              2.11.0   pypi_0   pypi
tensorflow-metal              0.7.0    pypi_0   pypi

Running the same Python test script from the Apple Metal page gives:

2023-01-20 12:52:34.536215: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
169001437/169001437 [==============================] - 7s 0us/step
2023-01-20 12:53:02.967585: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Metal device set to: AMD Radeon Pro 5600M

systemMemory: 64.00 GB
maxCacheSize: 3.99 GB

2023-01-20 12:53:02.968211: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-01-20 12:53:02.968256: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
Epoch 1/5
/opt/anaconda3/envs/applemetal/lib/python3.10/site-packages/keras/backend.py:5585: UserWarning: "sparse_categorical_crossentropy received from_logits=True, but the output argument was produced by a Softmax activation and thus does not represent logits. Was this intended?
  output, from_logits = _get_logits(
2023-01-20 12:53:16.475463: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
2023-01-20 12:53:25.908578: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x7f9194090a60
2023-01-20 12:53:25.908651: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x7f9194090a60

....

Traceback (most recent call last):
  File "/Users/ray/test.py", line 13, in <module>
    model.fit(x_train, y_train, epochs=5, batch_size=64)
  File "/opt/anaconda3/envs/applemetal/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/opt/anaconda3/envs/applemetal/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.NotFoundError: Graph execution error:

Detected at node 'StatefulPartitionedCall_212' defined at (most recent call last):
    File "/Users/ray/test.py", line 13, in <module>
      model.fit(x_train, y_train, epochs=5, batch_size=64)
    File "/opt/anaconda3/envs/applemetal/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/opt/anaconda3/envs/applemetal/lib/python3.10/site-packages/keras/engine/training.py", line 1650, in fit
      tmp_logs = self.train_function(iterator)
    File "/opt/anaconda3/envs/applemetal/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in train_function
....
....

Answer 2

Frameworks Engineer OP

Apple

Jan ’23

@raymondjiii, in base TensorFlow v2.11 the Optimizer API changed, and this broke the current pluggable architecture because jit_compile=True is now turned on by default for optimizers. That path goes through XLA, which pluggable devices do not support. We are working on a fix for this issue. In the meantime, can you use the legacy optimizer API to work around it:

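import tensorflow as tf
# Legacy optimizer API: avoids the jit_compile=True (XLA) path of the new optimizers
from tensorflow.keras.optimizers.legacy import Adam

# Load CIFAR-100 (32x32 RGB images, 100 classes)
cifar = tf.keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar.load_data()
model = tf.keras.applications.ResNet50(
    include_top=True,
    weights=None,
    input_shape=(32, 32, 3),
    classes=100,
)

# ResNet50's top layer already applies softmax, so the loss receives
# probabilities rather than logits (this also silences the UserWarning)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
model.compile(optimizer=Adam(), loss=loss_fn, metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=64)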
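
If the same script has to run on several TensorFlow versions, the legacy-optimizer workaround could be gated on the installed version. The following is only a rough sketch, not something from Apple: the helper names `needs_legacy_optimizer` and `make_adam` are hypothetical, and treating every release from 2.11 onward as affected is an assumption (the reply above only names v2.11, and later TensorFlow or tensorflow-metal releases may behave differently or drop the legacy namespace).

```python
def needs_legacy_optimizer(tf_version: str) -> bool:
    """Return True if this TensorFlow version is assumed to need the legacy optimizers.

    Assumption: versions >= 2.11 default to the new optimizer API with
    jit_compile=True; only v2.11 is confirmed by the forum reply.
    """
    # Compare only major.minor, so "2.11.0" and "2.11.0-rc1" both parse.
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return (major, minor) >= (2, 11)


def make_adam(**kwargs):
    """Build an Adam optimizer, preferring the legacy API where assumed necessary."""
    # Imported lazily so the version check above can be used without TensorFlow.
    import tensorflow as tf

    if needs_legacy_optimizer(tf.__version__):
        return tf.keras.optimizers.legacy.Adam(**kwargs)
    return tf.keras.optimizers.Adam(**kwargs)
```

With this in place, `model.compile(optimizer=make_adam(), ...)` would replace the direct `Adam()` call in the script above.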
