Tensorflow metal: The Metal Performance Shaders operations encoded on it may not have completed.
This does not seem to be affecting the training, but it looks somewhat important, and I have no idea how to read it:

Error: command buffer exited with error status.
The Metal Performance Shaders operations encoded on it may not have completed.
Error: (null)
Internal Error (0000000e:Internal Error)
<AGXG13XFamilyCommandBuffer: 0x29b027b50>
    label = <none>
    device = <AGXG13XDevice: 0x12da25600>
        name = Apple M1 Max
    commandQueue = <AGXG13XFamilyCommandQueue: 0x106477000>
        label = <none>
        device = <AGXG13XDevice: 0x12da25600>
            name = Apple M1 Max
    retainedReferences = 1

This happens while training a "heavy" model on a "heavy" dataset, so it may be related to a memory issue, but I have no idea how to address it.
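For anyone else trying to read the message, here is a minimal sketch that splits it into the fields that can be identified from the text alone. The helper name `summarize_metal_error` is hypothetical, and the field layout is assumed from this one message rather than from any documented format. It only restructures the text and does not say what the error code means or what caused it.

```python
import re

# The error text from this post, flattened onto one line as it was pasted.
LOG = (
    "Error: command buffer exited with error status. "
    "The Metal Performance Shaders operations encoded on it may not have completed. "
    "Error: (null) Internal Error (0000000e:Internal Error) "
    "<AGXG13XFamilyCommandBuffer: 0x29b027b50> label = <none> "
    "device = <AGXG13XDevice: 0x12da25600> name = Apple M1 Max "
    "commandQueue = <AGXG13XFamilyCommandQueue: 0x106477000> label = <none> "
    "device = <AGXG13XDevice: 0x12da25600> name = Apple M1 Max "
    "retainedReferences = 1"
)


def summarize_metal_error(text):
    """Hypothetical helper: extract the recognizable fields from the message.

    The patterns are assumptions based on this single message.
    """
    # "(0000000e:Internal Error)" -> 8 hex digits, a colon, then a description.
    code = re.search(r"\(([0-9a-fA-F]{8}):([^)]+)\)", text)
    # "name = Apple M1 Max" runs up to the next known key.
    device = re.search(r"name = (.*?)\s+(?:commandQueue|retainedReferences)\b", text)
    # "<ClassName: 0xADDRESS>" entries; "<none>" has no colon, so it is skipped.
    objects = re.findall(r"<(\w+): (0x[0-9a-fA-F]+)>", text)
    return {
        "code": code.group(1) if code else None,
        "description": code.group(2) if code else None,
        "device": device.group(1) if device else None,
        "objects": objects,
    }


if __name__ == "__main__":
    for key, value in summarize_metal_error(LOG).items():
        print(f"{key}: {value}")
```

Read this way, the message names a failed command buffer, the queue it was submitted on, and the GPU device (listed twice, once per object), plus a generic "Internal Error" code.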
Aug ’22