GPU and ANE implementation of MIL resample ops

Hello, I am a machine learning engineer. I recently needed to run PyTorch's grid_sample operation on an iPhone, so I used coremltools to convert grid_sample to the MIL resample op, which is officially supported. However, when the model runs on the phone, it is dispatched to the CPU instead of the GPU or ANE (measured with the official performance benchmark in Xcode, with the phone connected). Why is there no efficient GPU implementation?

I was hoping for a runtime of around 2 ms, but on the CPU it takes 8 ms.
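For anyone trying to reproduce this, here is a minimal sketch of the kind of conversion described above. The tensor shapes, the input names `x` and `grid`, the output path, and the `grid_sample` arguments are assumptions chosen for illustration, not the settings of the actual model. `torch` and `coremltools` are imported inside the function, so both must be installed before calling it.

```python
# Hypothetical shapes, chosen only for illustration (not from the original model).
INPUT_SHAPE = (1, 3, 64, 64)   # N, C, H_in, W_in
GRID_SHAPE = (1, 32, 32, 2)    # N, H_out, W_out, (x, y) in [-1, 1]


def output_shape(input_shape, grid_shape):
    """Shape that grid_sample returns for 4-D inputs: (N, C, H_out, W_out)."""
    n, c, _, _ = input_shape
    _, h_out, w_out, _ = grid_shape
    return (n, c, h_out, w_out)


def convert_grid_sample(path="GridSample.mlpackage"):
    """Trace a module containing only grid_sample and convert it with coremltools."""
    # Heavy dependencies are imported lazily, only when a conversion is requested.
    import torch
    import torch.nn.functional as F
    import coremltools as ct

    class GridSample(torch.nn.Module):
        def forward(self, x, grid):
            # Assumed settings; the original post does not state which were used.
            return F.grid_sample(x, grid, mode="bilinear",
                                 padding_mode="zeros", align_corners=False)

    x = torch.rand(*INPUT_SHAPE)
    grid = torch.rand(*GRID_SHAPE) * 2 - 1  # normalized coordinates in [-1, 1]
    traced = torch.jit.trace(GridSample().eval(), (x, grid))

    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(name="x", shape=INPUT_SHAPE),
                ct.TensorType(name="grid", shape=GRID_SHAPE)],
        convert_to="mlprogram",
        # Permits CPU, GPU and ANE; the runtime still decides where each op runs.
        compute_units=ct.ComputeUnit.ALL,
    )
    mlmodel.save(path)
    return mlmodel
```

Calling `convert_grid_sample()` writes an `.mlpackage` that can be opened in Xcode to run the performance benchmark and see which compute unit each op is assigned to.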

Here is some information about the mlmodel:

Same problem here. Any updates?
