Tensorflow Model Stops Training when Process reaches 47GB (e.g. ru_maxrss)

Details here:

https://stackoverflow.com/questions/68551935/why-does-my-tensorflow-model-stop-training

I can run the VariationalDeepSemantic Hashing model in an Anaconda python 3.85 virtual environment with tensorflow 2.5 on CPUs.

If I run the same code in the tensorflow-metal virtual environment with python 3.82, accessing my AMD Radeon Pro 5700 XT GPU, the process stops training at epoch 5 on the 5200 batch.

Fixed with tensorflow-metal version 0.1.2

Tensorflow Model Stops Training when Process reaches 47GB (e.g. ru_maxrss)
 
 
Q