I prepared a Python 3.9 environment with tensorflow-macos (2.9.0, 2.9.1) and tensorflow-metal (0.4.0, 0.5.1) to enable GPU acceleration on the M1. The architecture is a tf.keras model trained inside a Celery (5.2.7) worker that runs on the gevent (21.8.0) pool. After training, the model is converted to a Core ML model (coremltools 6.0) and a TFLite model, roughly as in the sketch below.
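For context, here is a minimal sketch of that setup (the task name, broker URL, and toy model are hypothetical, not my actual code); the worker is started with something like `celery -A tasks worker --pool=gevent --concurrency=1`:

```python
import tensorflow as tf
import coremltools as ct
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")  # hypothetical broker

@app.task
def train_task(epochs: int = 100):
    # Build and train a small tf.keras model inside the Celery worker.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    x = tf.random.normal((1024, 32))
    y = tf.random.normal((1024, 1))
    model.fit(x, y, epochs=epochs, verbose=0)

    # Convert the trained model to Core ML and TFLite after training.
    mlmodel = ct.convert(model)  # saved to disk in the real code
    tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(model).convert()
    return len(tflite_bytes)
```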
Based on my experiments, there appear to be memory leaks on two different M1 Macs:
1. Memory grew steadily from 7 GB to 80 GB during a 300-epoch training run.
2. Memory grew slowly from 7 GB to 11 GB during a 100-epoch training run, then rose to 13 GB after model conversion. We added tf.keras.backend.clear_session(), expecting the memory to be released once the Celery worker task finished (see the cleanup sketch after this list), but the memory was still held when the next task was published to the worker.
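The cleanup we run at the end of each task is roughly this (a sketch; the surrounding task code is omitted):

```python
import gc
import tensorflow as tf

def cleanup_after_task():
    # Drop the global Keras graph/session state, then force a GC pass.
    tf.keras.backend.clear_session()
    gc.collect()
```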
I have seen some posts about memory leaks on the M1. Are there any suggestions for dealing with this leak?
I know a workaround may be to run the Celery worker with the prefork pool and set a GPU memory growth limit, but my code cannot be ported directly because of a GPU resource context issue. Is it straightforward to run tf.keras training in threads on the M1?
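The prefork variant I have in mind would look roughly like this, assuming the Metal plugin honors these options (worker started with `celery -A tasks worker --pool=prefork --concurrency=1`); this is the workaround I would like to avoid, not something I have working:

```python
import tensorflow as tf

def configure_gpu_memory():
    # Must run in each forked worker process before any op touches the GPU.
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)
        # Alternatively, cap memory with a hard limit instead of growth:
        # tf.config.set_logical_device_configuration(
        #     gpu, [tf.config.LogicalDeviceConfiguration(memory_limit=8192)])
```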