Posts

Post not yet marked as solved
5 Replies
2.2k Views
This does not seem to be effecting the training, but it seems somewhat important (no clue on how to read it however): Error: command buffer exited with error status. The Metal Performance Shaders operations encoded on it may not have completed. Error: (null) Internal Error (0000000e:Internal Error) <AGXG13XFamilyCommandBuffer: 0x29b027b50> label = <none> device = <AGXG13XDevice: 0x12da25600> name = Apple M1 Max commandQueue = <AGXG13XFamilyCommandQueue: 0x106477000> label = <none> device = <AGXG13XDevice: 0x12da25600> name = Apple M1 Max retainedReferences = 1 This is happening during a "heavy" model training on "heavy" dataset, so maybe is related to some memory issue, but I have no clue how to confront it
Posted Last updated
.