Run CoreML model with GRU on Neural Engine

There was a coremltools issue in the past that was closed with a note saying this is the appropriate forum for discussing how to get CoreML models to run on the Neural Engine: https://github.com/apple/coremltools/issues/337.

I have a TensorFlow model where the vast majority of layers can run on the GPU or Neural Engine. Conceptually, I don't see why all of it can't use the Neural Engine. I see that a couple of layers associated with the GRU, like get_shape, cannot run on the Neural Engine (even though all of the shapes are known). coremltools spit out the converted model, so I don't have much insight into why dynamic-dimension layers are used instead of static dimensions.

Is there any way to run part of the model on the GPU/NE, or to have coremltools guarantee that a generated model runs entirely on the NE?

I converted the model with coreml_model = ct.convert(probability_model, convert_to='mlprogram', compute_precision=ct.precision.FLOAT16, compute_units=ct.ComputeUnit.ALL), where ct is coremltools and probability_model is a TensorFlow 2 Keras model that contains a GRU.
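For completeness, here is roughly the full conversion script (a minimal sketch; the toy GRU network and the layer sizes below are stand-ins for my actual model):

```python
import coremltools as ct
import tensorflow as tf

# Stand-in for my real network: any Keras model containing a GRU
# reproduces the behavior I'm describing.
probability_model = tf.keras.Sequential([
    tf.keras.Input(shape=(50, 64)),  # (sequence, features) - placeholder sizes
    tf.keras.layers.GRU(128),
    tf.keras.layers.Dense(10, activation="softmax"),
])

coreml_model = ct.convert(
    probability_model,
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
    compute_units=ct.ComputeUnit.ALL,
)
coreml_model.save("ProbabilityModel.mlpackage")  # placeholder name
```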

Some similar models I tried without the GRU run 20-30x faster on the NE.
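For what it's worth, this is roughly how I compared compute units on macOS (a sketch; the model path, input name, and input shape are placeholders for my setup):

```python
import time
import numpy as np
import coremltools as ct

# Placeholder input matching the model's (batch, sequence, features) shape.
x = {"input": np.random.rand(1, 50, 64).astype(np.float32)}

for units in (ct.ComputeUnit.CPU_ONLY, ct.ComputeUnit.CPU_AND_GPU, ct.ComputeUnit.ALL):
    model = ct.models.MLModel("ProbabilityModel.mlpackage", compute_units=units)
    model.predict(x)  # warm up: triggers load/compile before timing
    start = time.perf_counter()
    for _ in range(100):
        model.predict(x)
    print(units, (time.perf_counter() - start) / 100)
```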

Here is an example performance report screenshot:

One thing I noticed that doesn't seem to match my expectations with coremltools is that the storage and compute types differ. I don't know why, because I exported from coremltools with float16 compute precision.

Answered by mengran1 in 733493022


Accepted Answer

Hello, a question about the Performance Report: I am using an iPhone on iOS 16.0.2 and 16.0.3, and it says this version does not support creating performance reports. Do you know why?

I misclicked, and now an off-topic response is the accepted reply...

There is no way to undo or fix this, according to https://developer.apple.com/forums/thread/662659.

@jacobfromchampaignasdf, can you try converting the model again, specifying the full input shapes? That is, ct.convert(..., inputs=[ct.TensorType(shape=(...))], ...). When the shape argument is not provided, the converter takes the shapes from the TF model graphdef, and in this case it's likely that the TF graph has a few unknown dimensions in its input shapes (e.g., batch or sequence) which get propagated down, hence the presence of get_shape-like ops. Hopefully, after providing static shapes for all model inputs to ct.convert, you will not see the get_shape ops.
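Something like this (a sketch; (1, 50, 64) is a placeholder for your actual (batch, sequence, features) sizes, which need to match your model's input):

```python
import coremltools as ct

coreml_model = ct.convert(
    probability_model,
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT16,
    compute_units=ct.ComputeUnit.ALL,
    # Pin every input dimension to a static size so the converter does
    # not need to emit dynamic-shape (get_shape) ops.
    inputs=[ct.TensorType(shape=(1, 50, 64))],  # placeholder static shape
)
```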
