Massive issues with TensorFlow GPU, when will Apple do something?

Hello. We all face issues with the latest TensorFlow GPU support: incorrect results, errors, and so on. We all agreed to pay extra for the M1/M2/M3 so we could work on a professional-grade computer, but in the end we must use the CPU. When will Apple actually comment on this and provide updates? I totally understand these issues aren't fixed overnight and take some time, but I've never seen any Apple dev answer to say that they understand and are working on a fix. I basically bought a Mac M3 Pro to be able to run some things on the GPU without having to purchase a server, and it's now useless. It's really frustrating.

I think we would all appreciate some information on this. Many of the issues have been around for well over a year, with no mention of resolution.

Slow training and wrong results with Sonoma, TensorFlow 2.15, tensorflow-metal 1.1

If I'm not mistaken, I think this issue is more apparent on Sonoma than Ventura.

I have added a Python file from the Coursera TensorFlow: Advanced Techniques Specialization, class Generative Deep Learning with TensorFlow, Course 4, Week 3, Assignment 1.

This code runs correctly in Colab or on local CPU cores, but fails drastically on a Metal GPU due to obvious untrapped numerical errors.

To reproduce, switch between environments with the two command lines:

pip install tensorflow-metal
pip uninstall tensorflow-metal
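When switching back and forth like this, it can help to confirm which configuration is actually active before each run. The following is only a minimal sketch: the helper names `installed_version` and `environment_report` are made up for illustration, and the package list is simply the names mentioned in this thread. It reads the installed versions through the standard library and, if TensorFlow can be imported, lists the GPUs it sees.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect package versions and visible GPUs (hypothetical helper)."""
    names = ("tensorflow", "tensorflow-macos", "tensorflow-metal")
    report = {name: installed_version(name) for name in names}
    try:
        # Imported lazily so the report still works if TensorFlow is missing.
        import tensorflow as tf
        report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    except ImportError:
        report["gpus"] = None
    return report


print(environment_report())
```

An empty GPU list together with a missing tensorflow-metal entry would indicate a CPU-only run; pasting the printed dictionary into a bug report also records exactly which versions were compared.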

Feel free to contact me if you require additional information, thanks!

Just wanted to add that I have tried most combinations of TensorFlow 2.16.1, 2.15, and 2.14 with Python 3.9, 3.10, and 3.11, in conda or pip virtual environments, all with and without tensorflow-metal. The results are always the same: every moderately complex model I've tried that uses convolution and batch normalization fails completely with GPU support and succeeds under Colab or CPU-only Mac configurations.

The TensorFlow install page does state that macOS has no GPU support, though it also links to install procedures for the defective tensorflow-metal package.

macOS 10.12.6 (Sierra) or later (no GPU support)

https://www.tensorflow.org/install

Hi, I use an Apple MacBook M3 and hit the same problem. I solved it by using Anaconda to create an environment with Python 3.8.18, after which I could install tensorflow-macos and tensorflow-metal. With Python versions later than 3.8, TensorFlow didn't work.

(AppleTensorflow) antoniothomacelli@mbp-de-antonio Aula001 % python --version
Python 3.8.18
(AppleTensorflow) antoniothomacelli@mbp-de-antonio Aula001 % pip list | grep tensorflow
tensorflow              2.16.1
tensorflow-macos        2.16.1
(AppleTensorflow) antoniothomacelli@mbp-de-antonio Aula001 % 

Try a different activation function than relu (e.g. elu, tanh)
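To make that swap easy to test, the activation can be passed in as a parameter so the same model is rebuilt with relu, elu, or tanh. This is just a rough sketch: `build_model`, the layer sizes, and the input shape are placeholders chosen for illustration and are not taken from the Coursera assignment. Whether a different activation actually avoids the Metal problem is something to check per model.

```python
def build_model(activation="elu", input_shape=(28, 28, 1), num_classes=10):
    """Small conv + batch-norm model with a switchable activation (illustrative)."""
    import tensorflow as tf  # imported here so the function can be defined without it

    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, padding="same"),
        tf.keras.layers.BatchNormalization(),
        # The activation is a separate layer so it can be swapped by name.
        tf.keras.layers.Activation(activation),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(num_classes),
    ])
```

Calling `build_model("relu")` and `build_model("tanh")` with the same data and seed gives two runs that differ only in the activation, which makes it easier to tell whether relu is really the trigger.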

I am still experiencing these issues with relatively simple models catastrophically failing to converge on Apple Silicon machines when using the GPU. When running on the CPU, I get better performance and good convergence. The only change is switching from GPU to CPU.

I am using tensorflow-metal v1.1.0, tensorflow v2.16.2, tensorflow-macos v2.16.2. If anyone has found a solution, or even a temporary work-around that yields good performance on the GPU, please advise.
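For anyone running the same GPU-versus-CPU comparison, the GPU can be hidden from TensorFlow at runtime instead of uninstalling tensorflow-metal. This is not a fix for the GPU results, only a way to A/B test in one environment. A minimal sketch follows; `force_cpu_only` is a made-up helper name, and it has to be called before TensorFlow has created any tensors or models.

```python
def force_cpu_only():
    """Hide all GPUs from TensorFlow. Returns True if no GPU remains visible."""
    try:
        import tensorflow as tf
    except ImportError:
        return False  # TensorFlow is not installed in this environment
    try:
        # An empty device list makes every GPU invisible to this process.
        tf.config.set_visible_devices([], "GPU")
    except RuntimeError:
        # TensorFlow had already initialized its devices; too late to change.
        return False
    return len(tf.config.get_visible_devices("GPU")) == 0


if force_cpu_only():
    print("Running on CPU only")
```

Running the same script once with this call and once without it keeps the package versions identical, so any difference in convergence comes from the device alone.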
