TL;DR: Apple's installation instructions appear to be broken. If you're happy with TensorFlow 2.9.0, set up a Miniconda environment and run:
conda install -c apple tensorflow-deps==2.9.0
python -m pip install tensorflow-macos==2.9.0
python -m pip install tensorflow-metal==0.5.0
The latest version of TensorFlow that's usable with Metal acceleration is TensorFlow 2.10.0. For this, set up a Miniforge environment, and run:
conda install -c apple tensorflow-deps==2.10.0
python -m pip install tensorflow-macos==2.10.0
python -m pip install tensorflow-metal==0.6.0
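To check an existing environment against the two combinations above, the pairings can be encoded in a small lookup. This is a minimal sketch, not something Apple ships: `KNOWN_GOOD` contains only the two pairings reported in this answer, and `installed_version` and `check_pairing` are hypothetical helper names. tensorflow-deps is a conda package and, as noted further down, doesn't seem to show up in pip's package list, so it is not checked here.

```python
from importlib import metadata

# Pairings reported to work in this answer; other versions are untested here.
KNOWN_GOOD = {
    "2.9.0": "0.5.0",
    "2.10.0": "0.6.0",
}


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pairing(macos_version, metal_version):
    """Describe whether a tensorflow-macos / tensorflow-metal pair is listed as working."""
    expected = KNOWN_GOOD.get(macos_version)
    if expected is None:
        return f"tensorflow-macos {macos_version}: no known-good tensorflow-metal pairing listed"
    if metal_version != expected:
        return (f"tensorflow-macos {macos_version} expects tensorflow-metal "
                f"{expected}, found {metal_version}")
    return "ok"


if __name__ == "__main__":
    print(check_pairing(installed_version("tensorflow-macos"),
                        installed_version("tensorflow-metal")))
```

Run inside the activated environment, this prints `ok` only when both packages match one of the two listed pairs.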
10 months later, this appears to still be broken. Here's what happens for me.
I set up a fresh Conda environment with https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh.
conda search -c apple tensorflow-deps --info suggests that the latest version of tensorflow-deps is 2.10.0. But it requires exactly NumPy 1.23.2:
tensorflow-deps 2.10.0 0
------------------------
file name : tensorflow-deps-2.10.0-0.tar.bz2
name : tensorflow-deps
version : 2.10.0
build : 0
build number: 0
size : 2 KB
license : Apache2
subdir : osx-arm64
url : https://conda.anaconda.org/apple/osx-arm64/tensorflow-deps-2.10.0-0.tar.bz2
md5 : 93ab322b1297b4fde0dd1f7071a51652
timestamp : 2022-09-15 21:15:58 UTC
dependencies:
- grpcio >=1.37.0,<2.0
- h5py >=3.6.0,<3.7
- numpy >=1.23.2,<1.23.3
- protobuf >=3.19.1,<3.20
- python
Unfortunately, NumPy 1.23.2 is not actually available from the official Anaconda repository:
$ conda search numpy --info
...
numpy 1.23.1 py39h42add53_0
---------------------------
file name : numpy-1.23.1-py39h42add53_0.conda
name : numpy
version : 1.23.1
build : py39h42add53_0
build number: 0
size : 11 KB
license : BSD-3-Clause
subdir : osx-arm64
url : https://repo.anaconda.com/pkgs/main/osx-arm64/numpy-1.23.1-py39h42add53_0.conda
md5 : 836f58d0108ccd6d56f3325ef870717e
timestamp : 2022-08-02 09:40:05 UTC
dependencies:
- blas * openblas
- libcxx >=12.0.0
- libopenblas >=0.3.20,<1.0a0
- numpy-base 1.23.1 py39hadd41eb_0
- python >=3.9,<3.10.0a0
numpy 1.23.3 py310h220015d_0
----------------------------
file name : numpy-1.23.3-py310h220015d_0.conda
name : numpy
version : 1.23.3
build : py310h220015d_0
build number: 0
size : 11 KB
license : BSD-3-Clause
subdir : osx-arm64
url : https://repo.anaconda.com/pkgs/main/osx-arm64/numpy-1.23.3-py310h220015d_0.conda
md5 : 9db0662d7f232643e2e98d3464e39d1e
timestamp : 2022-10-14 18:54:16 UTC
dependencies:
- blas * openblas
- libcxx >=12.0.0
- libopenblas >=0.3.20,<1.0a0
- numpy-base 1.23.3 py310h742c864_0
- python >=3.10,<3.11.0a0
...
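To make the mismatch concrete, here is a minimal sketch of the range check implied by `numpy >=1.23.2,<1.23.3`. It is not conda's actual solver: `parse_version` and `satisfies_pin` are hypothetical helpers that only handle plain dotted integers, whereas real conda version matching supports more forms.

```python
def parse_version(text):
    """Turn '1.23.2' into (1, 23, 2); only handles plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def satisfies_pin(candidate, lower="1.23.2", upper="1.23.3"):
    """True if lower <= candidate < upper, mirroring 'numpy >=1.23.2,<1.23.3'."""
    return parse_version(lower) <= parse_version(candidate) < parse_version(upper)


# NumPy versions that appear in this answer.
for version in ["1.22.3", "1.23.1", "1.23.2", "1.23.3"]:
    print(version, satisfies_pin(version))
```

Only 1.23.2 passes, so a channel that offers 1.23.1 and 1.23.3 but not 1.23.2 cannot satisfy tensorflow-deps 2.10.0.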
I try installing tensorflow-deps anyway with conda install -c apple tensorflow-deps. This installs tensorflow-deps 2.9.0 and NumPy 1.22.3.
Next, tensorflow-macos. https://pypi.org/project/tensorflow-macos/#history suggests that the latest version is 2.11.0. Sure enough, when I run python -m pip install tensorflow-macos, this is what gets installed. I guess this is unsurprising, because tensorflow-deps doesn't seem to show up in pip's package list, so there's no way for tensorflow-macos to depend on the correct version.
OK, so let's try installing a version of tensorflow-macos that matches tensorflow-deps: python -m pip install tensorflow-macos==2.9.0.
Next, python -m pip install tensorflow-metal. This installs 0.7.0, the latest version.
Then I run python train.py (the train script given at https://developer.apple.com/metal/tensorflow-plugin/). Nope:
$ python train.py
Traceback (most recent call last):
File ".../train.py", line 1, in <module>
import tensorflow as tf
File ".../miniconda/lib/python3.10/site-packages/tensorflow/__init__.py", line 443, in <module>
_ll.load_library(_plugin_dir)
File ".../handwriting/miniconda/lib/python3.10/site-packages/tensorflow/python/framework/load_library.py", line 151, in load_library
py_tf.TF_LoadLibrary(lib)
tensorflow.python.framework.errors_impl.NotFoundError: dlopen(.../miniconda/lib/python3.10/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): symbol not found in flat namespace '__ZN3tsl8internal10LogMessage16VmoduleActivatedEPKci'
The error seems to be related to tensorflow-metal. Maybe we need an older version? https://pypi.org/project/tensorflow-metal/#history suggests that the version corresponding to tensorflow-macos 2.9.0 should be tensorflow-metal 0.5.0. So: python -m pip uninstall tensorflow-metal; python -m pip install tensorflow-metal==0.5.0
train.py now runs successfully. (But it doesn't seem to be using the ANE. Is this intentional? powermetrics, which seems to be the only way to measure ANE activity (https://eclecticlight.co/2022/03/30/the-hunt-for-the-m1s-neural-engine), reports the CPU at about 3 W and the GPU at 7 W, but the ANE at 0 mW. It does at least run faster than if I only install tensorflow-macos: about 2 minutes per epoch vs 8 minutes per epoch.)
But what if I want to use the latest version? I set up conda-forge, which does have NumPy 1.23.2, in a fresh environment from https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh. When I run conda install -c apple tensorflow-deps, version 2.10.0 is now installed as expected. python -m pip install tensorflow-macos still installs 2.11.0, so I have to force it with python -m pip install tensorflow-macos==2.10.0. Then I run python -m pip install tensorflow-metal, which installs 0.7.0, as before. But train.py is broken in the same way as before:
$ python train.py
Traceback (most recent call last):
File ".../train.py", line 1, in <module>
import tensorflow as tf
File ".../miniconda_forge/lib/python3.10/site-packages/tensorflow/__init__.py", line 439, in <module>
_ll.load_library(_plugin_dir)
File ".../miniconda_forge/lib/python3.10/site-packages/tensorflow/python/framework/load_library.py", line 151, in load_library
py_tf.TF_LoadLibrary(lib)
tensorflow.python.framework.errors_impl.NotFoundError: dlopen(.../miniconda_forge/lib/python3.10/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): symbol not found in flat namespace '__ZN3tsl8internal10LogMessage16VmoduleActivatedEPKci'
So again, I have to manually specify the version of tensorflow-metal: python -m pip install tensorflow-metal==0.6.0. Now train.py works.
So I think the full list of problems here is:
There appears to be no CI of the instructions at https://developer.apple.com/metal/tensorflow-plugin/.
tensorflow-macos does not check to see whether it's installed with the same version of tensorflow-deps.
tensorflow-metal similarly does not depend on the specific version of tensorflow-macos that is required.
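A smoke test along the following lines would catch the failure shown in the tracebacks above. This is only a sketch of what such a check might look like, not anything Apple is known to run: `smoke_test` and `SMOKE_SCRIPT` are hypothetical names. It imports TensorFlow in a child process and reports whether the import and the GPU device listing succeeded.

```python
import subprocess
import sys

# Import TensorFlow in a child process so a failing plugin load
# cannot take down the calling script.
SMOKE_SCRIPT = (
    "import tensorflow as tf\n"
    "print(tf.__version__)\n"
    "print(tf.config.list_physical_devices('GPU'))\n"
)


def smoke_test(script=SMOKE_SCRIPT, python=sys.executable, timeout=300):
    """Run `script` with the given interpreter; return (ok, combined output)."""
    try:
        result = subprocess.run(
            [python, "-c", script],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    output = (result.stdout + result.stderr).strip()
    return result.returncode == 0, output


if __name__ == "__main__":
    ok, output = smoke_test()
    print("PASS" if ok else "FAIL")
    print(output)
```

With a mismatched tensorflow-metal, the child process should exit non-zero and the captured output should contain the NotFoundError traceback. A passing run prints the TensorFlow version and the visible GPU devices.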