Cannot Use tf.zeros_like with tensorflow-metal (Monterey)

Hi,

I can reliably reproduce the following after running pip install tensorflow-metal. Note that I have not trimmed anything from the output, including some device registration messages that only appear the first time you use TensorFlow. Hopefully they are not too distracting; I thought they would give helpful context about my environment in case something is off.

Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:14)
[Clang 12.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
>>> tf.zeros_like([1])
Metal device set to: Apple M1

systemMemory: 8.00 GB
maxCacheSize: 2.67 GB

2022-06-05 18:54:29.515755: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-06-05 18:54:29.516007: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Caskroom/miniforge/base/envs/ml/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/opt/homebrew/Caskroom/miniforge/base/envs/ml/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 7164, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Multiple Default OpKernel registrations match NodeDef '{{node ZerosLike}}': 'op: "ZerosLike" device_type: "DEFAULT" constraint { name: "T" allowed_values { list { type: DT_INT32 } } } host_memory_arg: "y"' and 'op: "ZerosLike" device_type: "DEFAULT" constraint { name: "T" allowed_values { list { type: DT_INT32 } } } host_memory_arg: "y"' [Op:ZerosLike]

Whereas after uninstalling tensorflow-metal (pip uninstall tensorflow-metal) the same commands produce:

Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:14)
[Clang 12.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU')
[]
>>> tf.zeros_like([1])
<tf.Tensor: shape=(1,), dtype=int32, numpy=array([0], dtype=int32)>

It looks like a simple double registration issue, but I've only just found out about the 'PluggableDevice' API, so I don't know whether it has recommendations for resolving multiple registrations. My guess is that a pluggable device extension is not expected to contain default device op registrations at all, but without being able to see the code I cannot say more about what might be wrong.
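To illustrate what I mean by a double registration, here is a toy model in plain Python. It is not TensorFlow's actual registry code: the class name `ToyKernelRegistry`, its methods, and the "core"/"plugin" labels are all made up for this sketch. It only shows how two identical registrations can go unnoticed until the op is looked up.

```python
class ToyKernelRegistry:
    """Hypothetical stand-in for a kernel registry; not TensorFlow's real code."""

    def __init__(self):
        # Maps (op, device_type, dtype) to the list of sources that registered it.
        self._kernels = {}

    def register(self, op, device_type, dtype, source):
        # Registration always succeeds, so a duplicate is not caught here.
        self._kernels.setdefault((op, device_type, dtype), []).append(source)

    def lookup(self, op, device_type, dtype):
        matches = self._kernels.get((op, device_type, dtype), [])
        if not matches:
            raise LookupError(f"No {device_type} kernel registered for {op}")
        if len(matches) > 1:
            # The ambiguity only surfaces when the op is actually used.
            raise ValueError(
                f"Multiple {device_type} kernel registrations match {op}: {matches}"
            )
        return matches[0]


registry = ToyKernelRegistry()
# Assumption for illustration: the same kernel is registered by two packages.
registry.register("ZerosLike", "DEFAULT", "int32", source="core")
registry.register("ZerosLike", "DEFAULT", "int32", source="plugin")

try:
    registry.lookup("ZerosLike", "DEFAULT", "int32")
except ValueError as err:
    message = str(err)
    print(message)
```

In this toy version, importing both packages works fine and the failure only appears on the first call that needs the kernel, which would be consistent with what I see above.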

Answered by Frameworks Engineer in 715938022

Can you try updating tensorflow-macos to 2.9.2?

python -m pip install -U tensorflow-macos==2.9.2

This should fix the ZEROS_LIKE error.
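If it helps, a small script like the following can confirm which tensorflow-macos version an environment actually has before and after the upgrade. It uses only the standard library. The helper names are made up for this sketch, and treating every version at or above 2.9.2 as containing the fix is an assumption; the answer above only mentions 2.9.2 itself.

```python
import re
from importlib import metadata

FIXED_VERSION = (2, 9, 2)  # version named in the answer above


def parse_version(text):
    """Turn a string like '2.9.2' into (2, 9, 2), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_fix(installed, fixed=FIXED_VERSION):
    # Assumption: releases at or after 2.9.2 keep the fix.
    return parse_version(installed) >= fixed


def installed_version(package="tensorflow-macos"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("tensorflow-macos is not installed in this environment")
elif has_fix(version):
    print(f"tensorflow-macos {version} is at or above 2.9.2")
else:
    print(f"tensorflow-macos {version} predates 2.9.2; consider upgrading")
```

Running it inside the same conda environment as the failing session avoids checking a different Python installation by mistake.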

Hi @emdeefive

Thanks for reporting the issue! This double registration is being promptly fixed and we will update the installation wheels once the fix is in. I'll send an update to this thread as soon as the new wheels are available.

I'm getting the exact same error

Thank you for the super prompt reply! That worked!

Hopefully you can spin this into a test to check all op registrations in one go.
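Roughly what I have in mind, as a sketch only: given a flat list of registrations, one pass can report every key that appears more than once. The function name and the sample data are hypothetical. Only the ZerosLike / DEFAULT / int32 entry mirrors the error above, and how one would actually enumerate the real registrations is not something I know.

```python
from collections import Counter


def find_duplicate_registrations(registrations):
    """Return every (op, device_type, dtype) key that occurs more than once."""
    counts = Counter(registrations)
    return sorted(key for key, count in counts.items() if count > 1)


# Made-up sample data for illustration.
sample = [
    ("ZerosLike", "DEFAULT", "int32"),
    ("ZerosLike", "DEFAULT", "int32"),  # duplicate, as in the error message
    ("ZerosLike", "GPU", "float32"),
    ("OnesLike", "DEFAULT", "int32"),
]

duplicates = find_duplicate_registrations(sample)
print(duplicates)
```

A check like this would flag all duplicated ops at once instead of waiting for users to hit them one op at a time.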
