Posts

Hi everyone,

Does anyone know how the device decides which compute unit (GPU, CPU, or ANE) to use when the compute units are set to ALL?

I'm working on optimizing a GPT2 model to run on the ANE. The performance report for the existing model showed several operators not supported by the ANE, so I removed those operators and converted the model to Core ML again. This time the performance report showed that every operator is supported by the ANE, but the device still prefers the GPU when the compute units are set to ALL, and prefers the CPU when the compute units are set to CPU and ANE.

Does anyone know why? Thank you in advance!
Posted by dcdcdc123.
Hi folks,

I'm working on converting a GPT2 model to Core ML with KV caching enabled. I have a GPT2 model running on the GPU with a static input shape. Once I enable a flexible shape (either a range shape or an enumerated shape), the model runs on the CPU according to the performance report. I can see new operators being added (get_shape and general_slice), and they are not supported by the GPU / ANE.

Is there any way to get around this so the model runs on the GPU / ANE? How does the machine decide whether to run the model on the GPU / Neural Engine? Thanks!
Posted by dcdcdc123.