M1 native Python is crashing.

Please, I need help getting M1 native Python running again!

I'd been successfully running M1 native Python code on a MacBook Pro (13-inch, M1, 2020) using Jupyter Notebook, but since 10/13/2021 the notebook kernel dies as soon as the M1 CPU is used intensively. The .py version of the same code aborts too.

I have formatted the MacBook several times and followed the instructions at https://developer.apple.com/metal/tensorflow-plugin/, but the problem persists.

I am running macOS Big Sur 11.6. As a workaround I am now running the same code under Anaconda (Rosetta), and it is taking 50% more time.
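For anyone reproducing this, a quick sanity check (a generic snippet, not from Apple's guide) shows which architecture the interpreter is running on and whether the tensorflow-metal plugin has registered a GPU device:

import platform
import tensorflow as tf

print(platform.machine())                 # 'arm64' for native M1 Python, 'x86_64' under Rosetta
print(tf.__version__)                     # tensorflow-macos version in use
print(tf.config.list_physical_devices())  # the Metal plugin should expose a GPU device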

We have more than 50 data scientists in our company, and I am leading research on Core ML and on adopting the new MacBook Pro as the standard platform for our developers.

I would appreciate very much any help from Apple support or the developers community.

Hello rmdev1960,

Can you please share the script you are using, or the piece of code where it crashes?

Hello yuliya808,

Thank you for your feedback. My code kills the kernel as soon as a CPU-intensive function is called.

Please try the following test code, which also kills the kernel:

%%time
import tensorflow.compat.v2 as tf
import tensorflow_datasets as tfds

tf.enable_v2_behavior()

# The example runs in graph mode with eager execution disabled.
from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()

(ds_train, ds_test), ds_info = tfds.load(
    'mnist', split=['train', 'test'],
    shuffle_files=True, as_supervised=True, with_info=True,
)

def normalize_img(image, label):
    """Normalizes images: uint8 -> float32."""
    return tf.cast(image, tf.float32) / 255., label

batch_size = 128

# Training pipeline: normalize, cache, shuffle, batch, prefetch.
ds_train = ds_train.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
ds_train = ds_train.batch(batch_size)
ds_train = ds_train.prefetch(tf.data.experimental.AUTOTUNE)

# Test pipeline: normalize, batch, cache, prefetch.
ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_test = ds_test.batch(batch_size)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

# Small CNN classifier for MNIST.
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(0.001),
    metrics=['accuracy'],
)

model.fit(ds_train, epochs=12, validation_data=ds_test)

Hi rmdev1960,

Can you please install the latest macOS beta and try again?

I installed macOS Monterey RC (21A558) on my 13-inch MacBook Air, followed Getting Started with tensorflow-metal PluggableDevice to install the latest TensorFlow plugin available, and was able to execute the test code you provided without any issues in both Python 3.8 and Python 3.9 environments.

Hi yuliya808,

Thank you very much for your support! Everything went well following your guidance. I had not noticed the OS requirements in Getting Started with tensorflow-metal PluggableDevice. I am actually running an optimization problem using the simulated annealing algorithm. It seems to me that the OS now handles the Python code more efficiently, since I am using just one core and getting the same speed.
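For context, the hot loop in that code is essentially a simulated annealing search; here is a minimal sketch of the pattern (the cost function, neighbor move, and parameters are illustrative placeholders, not the actual workload):

import math
import random

def simulated_annealing(cost, neighbor, x0, t0=1.0, cooling=0.995, steps=10_000):
    """Accept worse moves with probability exp(-delta / T), cooling T geometrically."""
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(steps):
        y = neighbor(x)
        fy = cost(y)
        delta = fy - fx
        # Always accept improvements; accept uphill moves with Boltzmann probability.
        if delta <= 0 or random.random() < math.exp(-delta / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling  # geometric cooling schedule
    return best, fbest

# Toy problem: minimize a 1-D quadratic with small random steps.
print(simulated_annealing(
    cost=lambda x: (x - 3.0) ** 2,
    neighbor=lambda x: x + random.uniform(-0.1, 0.1),
    x0=0.0,
))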

Best regards,

Roberto

Hi Roberto,

Thank you for your feedback! Glad it worked for you!

Best Regards, Yuliya

Hi,

Your example runs for me on Monterey 12.0.1 with Python 3.8 ... if I replace the Adam optimizer with SGD:

model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=tf.keras.optimizers.SGD(0.001),
    metrics=['accuracy'],
)

I've noticed Adam crash the session. The output below is from the successful SGD run.
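(Aside: to check whether the optimizer alone is the trigger, independent of the MNIST input pipeline, a tiny synthetic model along these lines should be enough; the shapes and layer sizes are arbitrary, not taken from the example above.)

import numpy as np
import tensorflow as tf

x = np.random.rand(256, 8).astype('float32')
y = np.random.randint(0, 2, size=(256,))

for name, opt in [('SGD', tf.keras.optimizers.SGD(0.001)),
                  ('Adam', tf.keras.optimizers.Adam(0.001))]:
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation='relu', input_shape=(8,)),
        tf.keras.layers.Dense(2, activation='softmax'),
    ])
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=opt, metrics=['accuracy'])
    print(f'fitting with {name}...')
    model.fit(x, y, epochs=1, verbose=0)  # if the session dies here, that optimizer is the trigger
    print(f'{name} ok')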

Metal device set to: AMD Radeon Pro 5700 XT
2021-10-25 12:01:51.733970: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-25 12:01:51.734526: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-25 12:01:51.734764: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-10-25 12:01:51.902618: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2021-10-25 12:01:51.902647: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
2021-10-25 12:01:52.021880: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.035650: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.081019: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.099696: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.211089: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.229341: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.237014: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.261855: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.279544: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2021-10-25 12:01:52.304527: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
2021-10-25 12:01:52.324218: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
Train on 469 steps, validate on 79 steps
Epoch 1/12
469/469 [==============================] - ETA: 0s - batch: 234.0000 - size: 1.0000 - loss: 2.2622 - accuracy: 0.1993
/Users/davidlaxer/tensorflow-metal/lib/python3.8/site-packages/keras/engine/training.py:2470: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`Model.state_updates` will be removed in a future version. '
2021-10-25 12:02:06.665054: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
469/469 [==============================] - 15s 19ms/step - batch: 234.0000 - size: 1.0000 - loss: 2.2622 - accuracy: 0.1993 - val_loss: 2.2074 - val_accuracy: 0.4665
Epoch 2/12
469/469 [==============================] - 11s 20ms/step - batch: 234.0000 - size: 1.0000 - loss: 2.1208 - accuracy: 0.3812 - val_loss: 1.9072 - val_accuracy: 0.6792
Epoch 3/12
469/469 [==============================] - 11s 21ms/step - batch: 234.0000 - size: 1.0000 - loss: 1.6169 - accuracy: 0.5601 - val_loss: 1.0289 - val_accuracy: 0.8151
Epoch 4/12
469/469 [==============================] - 12s 22ms/step - batch: 234.0000 - size: 1.0000 - loss: 1.0248 - accuracy: 0.6935 - val_loss: 0.5984 - val_accuracy: 0.8613
Epoch 5/12
469/469 [==============================] - 12s 23ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.7831 - accuracy: 0.7570 - val_loss: 0.4718 - val_accuracy: 0.8799
Epoch 6/12
469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.6629 - accuracy: 0.7937 - val_loss: 0.4055 - val_accuracy: 0.8929
Epoch 7/12
469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.6024 - accuracy: 0.8123 - val_loss: 0.3660 - val_accuracy: 0.9007
Epoch 8/12
469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.5541 - accuracy: 0.8301 - val_loss: 0.3380 - val_accuracy: 0.9073
Epoch 9/12
469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.5244 - accuracy: 0.8397 - val_loss: 0.3181 - val_accuracy: 0.9121
Epoch 10/12
469/469 [==============================] - 13s 24ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4910 - accuracy: 0.8500 - val_loss: 0.2988 - val_accuracy: 0.9161
Epoch 11/12
469/469 [==============================] - 13s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4683 - accuracy: 0.8570 - val_loss: 0.2857 - val_accuracy: 0.9186
Epoch 12/12
469/469 [==============================] - 14s 25ms/step - batch: 234.0000 - size: 1.0000 - loss: 0.4562 - accuracy: 0.8600 - val_loss: 0.2736 - val_accuracy: 0.9207
<keras.callbacks.History at 0x7f8758fd9310>