Reply to Tensorflow metal: The Metal Performance Shaders operations encoded on it may not have completed.
Hi, I am getting this error with the test script from the tensorflow-metal plugin page. I have a Mac with an M3 Pro on macOS 14.4 (the latest at this time). Unfortunately, I also created another thread: https://developer.apple.com/forums/thread/748413. Should I close that one?

tensorflow-metal was working great on my M3 Mac until Tuesday. Then my code started freezing. I ran the test script from https://developer.apple.com/metal/tensorflow-plugin/ and it now crashes. This used to work fine, but all of a sudden it does not. The results are shown below. Were there ever any answers to the previous posts? Could this be a hardware problem?

The test script is just this:

    import tensorflow as tf

    cifar = tf.keras.datasets.cifar100
    (x_train, y_train), (x_test, y_test) = cifar.load_data()
    model = tf.keras.applications.ResNet50(
        include_top=True,
        weights=None,
        input_shape=(32, 32, 3),
        classes=100,)

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
    model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=5, batch_size=64)

The errors I get are like the following:

    Epoch 1/5
    1/782 [..............................] - ETA: 51:53 - loss: 6.0044 - accuracy: 0.0312Error: command buffer exited with error status.
        The Metal Performance Shaders operations encoded on it may not have completed.
        Error:
        (null)
        Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)
        <AGXG15XFamilyCommandBuffer: 0x1172515e0>
        label = <none>
        device = <AGXG15SDevice: 0x1588e6000>
            name = Apple M3 Pro
        commandQueue = <AGXG15XFamilyCommandQueue: 0x17427e400>
            label = <none>
            device = <AGXG15SDevice: 0x1588e6000>
                name = Apple M3 Pro
        retainedReferences = 1
    Error: command buffer exited with error status.
        The Metal Performance Shaders operations encoded on it may not have completed.
        Error:
        (null)
        Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)
        <AGXG15XFamilyCommandBuffer: 0x117257b40>
        label = <none>
        device = <AGXG15SDevice: 0x1588e6000>
            name = Apple M3 Pro
        commandQueue = <AGXG15XFamilyCommandQueue: 0x17427e400>
            label = <none>
            device = <AGXG15SDevice: 0x1588e6000>
                name = Apple M3 Pro
        retainedReferences = 1
Mar ’24
Reply to The Metal Performance Shaders operations encoded on it may not have completed.
I have fixed this with two changes:

- Python 3.8 rather than 3.9 (specifically 3.8.18, which is the latest 3.8 release at this time)
- pandas 1.5.3 rather than 2.x

As a result, I'm on the following TensorFlow package versions:

    tensorboard==2.13.0
    tensorboard-data-server==0.7.2
    tensorflow==2.13.0
    tensorflow-datasets==4.9.2
    tensorflow-estimator==2.13.0
    tensorflow-macos==2.13.0
    tensorflow-metadata==1.14.0
    tensorflow-metal==1.0.1

With these, everything works. I still have no idea why Python 3.9 stopped working after working fine for months, but I wasn't particularly attached to it.
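For anyone comparing their own environment against the versions in this reply, here is a minimal sketch that reports which installed packages differ. It only uses the standard library's importlib.metadata (available from Python 3.8). The names KNOWN_GOOD, installed_version, find_mismatches and report are made up for this example, and the pinned set is just a subset of the versions quoted in this reply, not an official compatibility list.

```python
from importlib import metadata

# Versions reported as working in this reply (a subset; an assumption,
# not an official compatibility matrix).
KNOWN_GOOD = {
    "tensorflow": "2.13.0",
    "tensorflow-macos": "2.13.0",
    "tensorflow-metal": "1.0.1",
    "tensorflow-estimator": "2.13.0",
    "tensorboard": "2.13.0",
    "pandas": "1.5.3",
}


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


def report():
    """Print one line per package that does not match KNOWN_GOOD."""
    for package, (wanted, found) in sorted(find_mismatches(KNOWN_GOOD).items()):
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")


if __name__ == "__main__":
    report()
```

If it prints nothing, the environment matches the pinned set. A mismatch does not prove that package is the cause of the Metal error; it only narrows down what differs from the setup that worked here.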
Mar ’24