Given the beta/in-development status of tensorflow-metal, the issues already observed (e.g., see "Tensorflow-metal selected issues" below, which this author noticed in LSTM models alone), and the fact that it is a closed-source library, it is critical to have open and detailed benchmarks of the quality, accuracy, and performance of tensorflow-metal.
If M1-class processors are to be trusted and heavily used by data scientists and ML engineers, we need a commitment to excellence.
Can Apple create a Python pip-based package that can be used to run such benchmarks across tensorflow-metal releases?
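As a rough sketch of what such a package might check, the snippet below (all names illustrative; it assumes a standard tensorflow install with the tensorflow-metal plugin, so that a `/GPU:0` device exists) runs an identical LSTM forward pass on the CPU and the Metal GPU and compares both timing and numerical output, the kind of test that could be repeated between plugin releases:

```python
import time
import numpy as np
import tensorflow as tf

# Illustrative regression check for tensorflow-metal releases:
# run the same LSTM forward pass on CPU and GPU, then compare
# wall-clock time and numerical agreement of the outputs.

x = np.random.rand(64, 50, 32).astype(np.float32)  # (batch, timesteps, features)

def run(device):
    with tf.device(device):
        tf.keras.utils.set_random_seed(42)  # identical weight init on both devices
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(50, 32)),
            tf.keras.layers.LSTM(128),
            tf.keras.layers.Dense(10),
        ])
        model(x[:1])  # warm-up call so graph building is not timed
        start = time.perf_counter()
        out = model(x, training=False).numpy()
        return out, time.perf_counter() - start

cpu_out, cpu_time = run("/CPU:0")
gpu_out, gpu_time = run("/GPU:0")

print(f"CPU: {cpu_time:.3f}s  GPU: {gpu_time:.3f}s")
print("max abs CPU/GPU difference:", np.abs(cpu_out - gpu_out).max())
```

A real benchmark suite would of course cover many more layer types and training as well as inference, but even a check this small would surface the kind of LSTM discrepancies noted below.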
Some useful benchmarking resources
- https://github.com/tensorflow/benchmarks
- https://github.com/tensorflow/models/tree/master/official
- https://github.com/cgnorthcutt/benchmarking-keras-pytorch
- https://github.com/tlkh/tf-metal-experiments
Some of these approaches, notably the TensorFlow official models, may require further work to make tensorflow-addons and tensorflow-text available as binary M1 ARM64 packages; the latter is especially hard to compile on M1 Macs, per existing GitHub issues.
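Before running any of the suites above, it is also worth confirming that the Metal plugin actually registered a GPU device; a minimal check (assuming tensorflow-metal is installed alongside the base tensorflow package):

```python
import tensorflow as tf

# Confirm that the tensorflow-metal PluggableDevice registered a GPU.
print("TensorFlow:", tf.__version__)
print("GPU devices:", tf.config.list_physical_devices("GPU"))
```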
Tensorflow-metal selected issues