Apple TensorFlow Internal Error (0000000e:Internal Error)
When I train a model (private, for work) using Apple TensorFlow, I get an error like this:

The Metal Performance Shaders operations encoded on it may not have completed.
Error: (null)
Internal Error (0000000e:Internal Error)
<AGXG13XFamilyCommandBuffer: 0x355c49fc0>
    label = <none>
    device = <AGXG13XDevice: 0x10d981400>
        name = Apple M1 Pro
    commandQueue = <AGXG13XFamilyCommandQueue: 0x11dedb600>
        label = <none>
        device = <AGXG13XDevice: 0x10d981400>
            name = Apple M1 Pro
    retainedReferences = 1

The error already occurs during the first epoch. When I run the same script on a server with a GeForce GPU, it works fine.

Memory also seems to leak: usage starts at 3 GB and reaches 20 GB within that first epoch.

Does anyone know how to deal with this problem? Thank you!
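In case it helps anyone trying to reproduce the memory growth, here is a minimal sketch of how it could be logged per training step. It uses only the Python standard library; `peak_rss_gb` and `MemoryLog` are hypothetical helper names, not part of TensorFlow or the Metal plugin, and the numbers only cover the process's resident memory, which may not include every Metal allocation.

```python
import resource
import sys


def peak_rss_gb() -> float:
    """Return this process's peak resident set size in GB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    bytes_used = peak if sys.platform == "darwin" else peak * 1024
    return bytes_used / 1024**3


class MemoryLog:
    """Hypothetical helper: record peak memory every `every` training steps."""

    def __init__(self, every: int = 100):
        self.every = every
        self.samples = []  # list of (step, peak_gb) tuples

    def step(self, step: int) -> None:
        # Sample only on multiples of `every` to keep the log short.
        if step % self.every == 0:
            self.samples.append((step, peak_rss_gb()))
```

With Keras, one way to call it would be something like `tf.keras.callbacks.LambdaCallback(on_batch_end=lambda batch, logs: log.step(batch))` passed to `model.fit`, then printing `log.samples` after the run stops. Since this is a peak value it never decreases, but a steady climb across batches would match the 3 GB to 20 GB growth described above.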
Sep ’22