GAN with tensorflow-metal gives different results on GPU and CPU

I'm running the DCGAN example from the TensorFlow site and getting different results on CPU and GPU. The GPU results are obviously wrong (second image). Why?

If I run the code under with tf.device('/cpu:0'), it works as expected, but slower.

It's sufficient to execute these lines on the CPU to fix the issue:

with tf.device('/cpu:0'):
    real_output = discriminator(images, training=True)
    fake_output = discriminator(generated_images, training=True)

Source code: https://www.tensorflow.org/tutorials/generative/dcgan

My complete results: https://disk.yandex.ru/d/E-hU5dpffOmkLg

Answered by mpodzhigai in 701998022

Oops, accidentally marked as the answer.

The issue is the same: calculations on the GPU lead to drastically different results compared to the CPU. A Windows PC with a CUDA GPU gives correct results, similar to M1 CPU-only computation.

Stock prediction source code from: https://www.thepythoncode.com/article/stock-price-prediction-in-python-using-tensorflow-2-and-keras

My implementation and results: https://disk.yandex.ru/d/S0FqJTL582V1Pw

macOS Monterey 12.1, MacBook Air M1, tensorflow-macos 2.7.0, tensorflow-metal 0.3.0

CPU results:

GPU results:

I tested it and had the same problem!

Hi @mpodzhigai,

Thanks for reporting the issue. I will look into what causes the observed discrepancy between CPU and GPU behavior. I'll update here once we have news.
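
One way to put a number on the discrepancy reported above is to run the same deterministic operation on both devices and compare the outputs. The following is only a minimal sketch, not the tutorial's code: the helper names max_abs_diff and run_on are hypothetical, the matmul is an arbitrary stand-in for a model layer, and the device strings assume that tensorflow-metal exposes the GPU as '/gpu:0'.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def run_on(device, size=64):
    """Run a fixed matmul on `device` and return the result as a flat list.

    Hypothetical helper: the op and the input values are arbitrary choices.
    """
    # Imported here so that max_abs_diff can be used without TensorFlow installed.
    import tensorflow as tf

    with tf.device(device):
        # Deterministic inputs, so any difference comes from the device, not the data.
        x = tf.range(size * size, dtype=tf.float32) / float(size * size)
        x = tf.reshape(x, [size, size])
        y = tf.matmul(x, x, transpose_b=True)
    return y.numpy().ravel().tolist()
```

On a machine where both devices are visible, max_abs_diff(run_on('/cpu:0'), run_on('/gpu:0')) gives the largest deviation. Small float32 rounding differences are expected; a large value would point to a device-specific problem in that op.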

Hi. I reported this issue more than a month ago. The comments on my post show that incorrect tf.random behavior causes this problem. It was fine before the upgrade to macOS 12.1, and it also doesn't work on 12.2 beta 2. https://developer.apple.com/forums/thread/696835

There's also a great analysis of this issue here: https://developer.apple.com/forums/thread/697057
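
Since the posts above attribute the problem to tf.random, a quick sanity check is to draw standard-normal samples on each device and look at their mean and standard deviation. This is a rough sketch under assumptions, not a reproduction of the linked analysis: the helper names (summarize, looks_standard_normal, sample_normal_on), the sample count, and the 0.05 tolerance are all hypothetical choices.

```python
import statistics


def summarize(samples):
    """Return (mean, population standard deviation) of a sequence of floats."""
    return statistics.fmean(samples), statistics.pstdev(samples)


def looks_standard_normal(samples, tol=0.05):
    """True if the mean is within `tol` of 0 and the std is within `tol` of 1.

    This checks only the first two moments; it is not a full normality test.
    """
    mean, std = summarize(samples)
    return abs(mean) <= tol and abs(std - 1.0) <= tol


def sample_normal_on(device, n=100_000, seed=0):
    """Draw `n` values from tf.random.normal on `device` (hypothetical helper)."""
    # Imported here so that the statistics helpers work without TensorFlow installed.
    import tensorflow as tf

    with tf.device(device):
        values = tf.random.normal([n], seed=seed)
    return values.numpy().tolist()
```

For example, looks_standard_normal(sample_normal_on('/gpu:0')) can be compared with the same call for '/cpu:0'. If only the GPU call fails the check, that would be consistent with the tf.random explanation, though it would not prove it.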

Hi. After a year, the problem still occurs on:

MacBook Pro 16" (M1 Pro)

tensorflow-macos 2.13.0, tensorflow-metal 1.0.1

macOS Ventura 13.4.1. Caramba :)

Hi, same here.

M3 Pro, tensorflow-macos 2.15.0, tensorflow-metal 1.1.0, macOS Sonoma 14.3.1
