Today I upgraded tensorflow-macos to 2.9.0 and tensorflow-metal to 0.5.0, and found my old notebook failed to run.
It ran well with tensorflow-macos 2.8.0 and tensorflow-metal 0.4.0.
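To confirm which package versions the notebook kernel actually sees, the installed metadata can be queried directly. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the two packages involved in the upgrade.
for name in ("tensorflow-macos", "tensorflow-metal"):
    print(name, installed_version(name) or "not installed")
```

Running this in the same kernel as the notebook rules out the case where the upgrade landed in a different environment.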
Specifically, I found that the groups argument of the Conv2D layer was the cause.
Here is a demo:
import tensorflow as tf
from tensorflow import keras as tfk

# tf.config.set_visible_devices([], 'GPU')

Xs = tf.random.normal((32, 64, 48, 4))
ys = tf.random.normal((32,))

tf.random.set_seed(0)
model = tfk.Sequential([
    tfk.layers.Conv2D(
        filters=16,
        kernel_size=(4, 3),
        groups=4,  # groups arg
        activation='relu',
    ),
    tfk.layers.Flatten(),
    tfk.layers.Dense(1, activation='sigmoid'),
])
model.compile(
    loss=tfk.losses.BinaryCrossentropy(),
    metrics=[
        tfk.metrics.BinaryAccuracy(),
    ],
)
model.fit(Xs, ys, epochs=2, verbose=1)
The error is:
W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at xla_ops.cc:296 : UNIMPLEMENTED: Could not find compiler for platform METAL: NOT_FOUND: could not find registered compiler for platform METAL -- check target linkage
Removing the groups argument makes the code run again.
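For anyone who needs to keep the grouped behaviour, one possible workaround is to emulate it by splitting the input channels, running an ordinary Conv2D (groups=1) on each slice, and concatenating the results. The sketch below is an assumption-laden illustration: the class name `SplitGroupedConv2D` is hypothetical, and I have not verified that it avoids the error on tensorflow-macos 2.9.0 with tensorflow-metal 0.5.0.

```python
import tensorflow as tf
from tensorflow import keras as tfk


class SplitGroupedConv2D(tfk.layers.Layer):
    """Hypothetical stand-in for Conv2D(groups=G): one ordinary Conv2D per channel group."""

    def __init__(self, filters, kernel_size, groups, activation=None, **kwargs):
        super().__init__(**kwargs)
        if filters % groups != 0:
            raise ValueError("filters must be divisible by groups")
        self.groups = groups
        # Each sub-layer produces an equal share of the output filters.
        self.convs = [
            tfk.layers.Conv2D(filters // groups, kernel_size)
            for _ in range(groups)
        ]
        self.activation = tfk.activations.get(activation)

    def call(self, inputs):
        # Split along the channel axis; the channel count must be divisible by groups.
        parts = tf.split(inputs, self.groups, axis=-1)
        outputs = [conv(part) for conv, part in zip(self.convs, parts)]
        # Concatenate per-group outputs back along the channel axis.
        return self.activation(tf.concat(outputs, axis=-1))


layer = SplitGroupedConv2D(filters=16, kernel_size=(4, 3), groups=4, activation='relu')
out = layer(tf.random.normal((2, 64, 48, 4)))
print(out.shape)
```

In the demo above, this layer would take the place of the `tfk.layers.Conv2D(..., groups=4, ...)` entry. It has the same number of parameters as the grouped layer, but the weights are initialised per sub-layer, so results will not match a grouped Conv2D bit for bit.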
Training on CPU, by uncommenting the tf.config.set_visible_devices([], 'GPU') line, gives a different error:
'apple-m1' is not a recognized processor for this target (ignoring processor)
LLVM ERROR: 64-bit code requested on a subtarget that doesn't support it!
Removing the groups argument also makes training on CPU work. However, I did not test training on CPU before the upgrade.
My device is a 14-inch MacBook Pro running macOS 12.4.