Post

Replies

Boosts

Views

Activity

Reply to The new tensorflow-macos and tensorflow-metal incapacitate training
Same problem here. The training would simply hang at the same epoch given the same training loop (tested 5 times on two different training loops). The code cell (in jupyter lab) would have the [*] symbol on, indicating the kernel is working, while CPU and GPU usage are at around 0. Another peculiar behaviour: if a training loop would always get stuck at the 47th out of 50 epochs, then changing the number of epochs to 45 would allow this training loop to finish, however the next training loop after it would get stuck at the second epoch (47-45=2 lol), even though it was for a different model and hence a different training loop. Device: macOS 12.6.2 on M2 Software: python==3.10.8 numpy==1.23.2 tensorflow-deps==2.10.0 tensorflow-estimator==2.11.0 tensorflow-macos==2.11.0 tensorflow-metal==0.7.0
Jan ’23