ML ANE Model unloaded on first call to predict

Hi,

I am trying to take advantage of my device's ANE. I created a model from PyTorch using coremltools and adapted it until the Xcode model performance preview indicated that it would run on my device's ANE.

But when I profile the integration in my app, I can see from the com.apple.ane logs that the model has been loaded on the device:

Timestamp	Type	Process	Category	Message	
00:00.905.087	Debug	MLBench (11135)	client	doLoadModel:options:qos:error:: model[0x2804500c0] : success=1 : progamHandle=10 000 241 581 886: intermediateBufferHandle=10 000 242 143 532 : queueDepth=32 :err=	

But when I call predict on my model, the ANE model is unloaded and the prediction runs on the CPU:

Timestamp	Type	Process	Category	Message	
00:00.996.015	Debug	MLBench (11135)	client	doUnloadModel:options:qos:error:: model[0x2804500c0]=_ANEModel: { modelURL=file:///var/mobile/Containers/Data/Application/0A9F356B-B8C7-4B86-90A5-6812EF48CC94/tmp/math_custom_trans_decoder_seg_0DB63A47-E84E-4887-A606-BC9986B2C662.mlmodelc/ : key={"isegment":0,"inputs":{"extras":{"shape":[2,1,1,1,1]},"memory":{"shape":[128,5,1,1,1]},"proj_key_seg_in":{"shape":[128,39,1,1,1]},"state_in_k":{"shape":[32,1,1,20,2]},"tgt":{"shape":[5,1,1,1,1]},"state_in_v":{"shape":[32,1,1,20,2]},"pos_enc":{"shape":[128,1,1,1,1]}},"outputs":{"attn_seg":{"shape":[1,5,1,4,1]},"state_out_v":{"shape":[32,2,1,20,2]},"output":{"shape":[292,5,1,1,1]},"state_out_k":{"shape":[32,2,1,20,2]},"extras_tmp":{"shape":[2,1,1,1,1]},"proj_key_seg_in_tmp":{"shape":[128,39,1,1,1]},"attn":{"shape":[1,1,1,5,2]},"proj_key_seg":{"shape":[128,1,1,20,1]}}} : string_id=0x70ac000000015257 : program=_ANEProgramForEvaluation: { programHandle=10000241581886 : intermediateBufferHandle=10000242143532 : queueDepth=32 } : state=3 : programHandle=10000241581886 : intermediateBufferHandle=10000242143532 : queueDepth=32 : attr=... : perfStatsMask=0} 	
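For anyone reading along, the two entries above can be lined up by model pointer to see how long the ANE model stayed loaded. The snippet below is only a minimal sketch: the function name `ane_model_events` is hypothetical, the sample lines are shortened copies of the entries above, and it assumes the timestamp column reads as minutes:seconds.milliseconds.microseconds. As the reply further down notes, an unload entry on its own does not prove that prediction fell back to the CPU.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Matches the timestamp column and the doLoadModel/doUnloadModel message.
# Assumption: timestamps are minutes:seconds.milliseconds.microseconds.
EVENT_RE = re.compile(
    r"(?P<min>\d+):(?P<sec>\d+)\.(?P<ms>\d+)\.(?P<us>\d+)"
    r".*?\bdo(?P<op>Load|Unload)Model:\S*\s+model\[(?P<ptr>0x[0-9a-fA-F]+)\]"
)


def ane_model_events(lines: Iterable[str]) -> Dict[str, List[Tuple[str, int]]]:
    """Group load/unload entries by model pointer.

    Returns {pointer: [(operation, timestamp in microseconds), ...]}.
    """
    events: Dict[str, List[Tuple[str, int]]] = {}
    for line in lines:
        m = EVENT_RE.search(line)
        if m is None:
            continue  # not a load/unload entry
        t_us = (
            (int(m["min"]) * 60 + int(m["sec"])) * 1_000_000
            + int(m["ms"]) * 1_000
            + int(m["us"])
        )
        events.setdefault(m["ptr"], []).append((m["op"].lower(), t_us))
    return events


# Shortened copies of the two log entries above (message tails cut off).
sample = [
    "00:00.905.087\tDebug\tMLBench (11135)\tclient\t"
    "doLoadModel:options:qos:error:: model[0x2804500c0] : success=1",
    "00:00.996.015\tDebug\tMLBench (11135)\tclient\t"
    "doUnloadModel:options:qos:error:: model[0x2804500c0]=_ANEModel: {",
]

for ptr, evs in ane_model_events(sample).items():
    print(ptr, evs)
```

Run on an exported log, this makes it easier to spot whether the same pointer is reloaded later, which is worth checking before concluding anything from a single unload entry.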

I don't see any obvious error messages in com.apple.ane, com.apple.coreml or com.apple.espresso.

Where or what should I look for to understand what is going on, and in particular why the ANE model was unloaded?

Thank you

That log may be a bit misleading in this context: it does not necessarily mean that your model is not running on the Neural Engine. The Core ML Instrument is a good way to check where the model actually runs within your app. If the Xcode performance tab shows the model running on the Neural Engine but the Core ML Instrument shows it running elsewhere in your app, can you please file a feedback at https://feedbackassistant.apple.com/? Helpful things to include for diagnosing the issue are a sysdiagnose and an Instruments trace with the Core ML Instrument.

In case someone stumbles on this: the issue was actually with flexible input dimensions. The same model with fixed input shapes behaves as expected.
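As a rough illustration of what "fixed input shapes" can look like at conversion time, here is a minimal sketch using coremltools. It is not the exact code from this project: the input names and shapes are hypothetical placeholders, `assert_fixed` and `convert_with_fixed_shapes` are made-up helper names, and the sketch assumes the model has already been traced with `torch.jit.trace`.

```python
from typing import Dict, Tuple

# Hypothetical input names and fixed shapes; substitute your model's own.
# The order must match the argument order of the traced model's forward().
FIXED_SHAPES: Dict[str, Tuple[int, ...]] = {
    "tgt": (1, 5),
    "memory": (1, 5, 128),
}


def assert_fixed(shapes: Dict[str, tuple]) -> None:
    """Raise if any dimension is not a concrete positive integer.

    A flexible dimension would typically show up here as a ct.RangeDim
    object or a placeholder such as None instead of an int.
    """
    for name, shape in shapes.items():
        for dim in shape:
            if not isinstance(dim, int) or dim <= 0:
                raise ValueError(
                    f"input {name!r} has a non-fixed dimension: {dim!r}"
                )


def convert_with_fixed_shapes(traced_model, shapes=FIXED_SHAPES):
    """Convert a traced PyTorch model with every input shape pinned."""
    import coremltools as ct  # imported lazily; requires coremltools

    assert_fixed(shapes)
    inputs = [
        ct.TensorType(name=name, shape=shape)
        for name, shape in shapes.items()
    ]
    return ct.convert(traced_model, inputs=inputs, convert_to="mlprogram")
```

If the model genuinely needs a few different input sizes, converting one fixed-shape model per size is one option to try, at the cost of extra model files; whether that suits a given app is a judgment call.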
