I'm trying to load an mlmodel and run it on the CPU/GPU by passing the ComputeUnit I'd like to use when creating the model:
model = ct.models.MLModel('mymodel.mlmodel', ct.ComputeUnit.CPU_ONLY)
Documentation for coremltools v7.0 says:
compute_units: coremltools.ComputeUnit
coremltools.ComputeUnit.ALL: Use all compute units available, including the neural engine.
coremltools.ComputeUnit.CPU_ONLY: Limit the model to only use the CPU.
coremltools.ComputeUnit.CPU_AND_GPU: Use both the CPU and GPU, but not the neural engine.
coremltools.ComputeUnit.CPU_AND_NE: Use both the CPU and neural engine, but not the GPU. Available only for macOS >= 13.0.
coremltools 7.0 (and the previous versions I've tried) seems to ignore that hint and runs my models only on the ANE. The same model, when loaded into Xcode and run through a performance test with CPU only selected in the Xcode performance tool, runs happily on the CPU.
Is there a way in Python to get our models to run on different compute units?
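
One thing that may be worth checking: the documentation lists compute_units by name, so a ComputeUnit passed as the second positional argument might be binding to a different parameter. Below is a minimal sketch, assuming coremltools is installed and 'mymodel.mlmodel' exists. The helper names bound_parameter_names and load_with_compute_units are hypothetical and not part of coremltools. The sketch prints which parameters the positional call would fill, then loads the model with compute_units passed by keyword.

```python
import inspect


def bound_parameter_names(func, *args, **kwargs):
    """Return the parameter names that the supplied arguments bind to, in order."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    return list(bound.arguments)


def load_with_compute_units(path, unit_name="CPU_ONLY"):
    """Load a model, passing compute_units by keyword rather than by position."""
    # Imported lazily so the helper above can be used without coremltools.
    import coremltools as ct

    unit = getattr(ct.ComputeUnit, unit_name)
    # Show which parameters a two-argument positional call would fill.
    print(bound_parameter_names(ct.models.MLModel, path, unit))
    # Pass the documented compute_units parameter explicitly by name.
    return ct.models.MLModel(path, compute_units=unit)
```

Calling load_with_compute_units('mymodel.mlmodel', 'CPU_AND_GPU') would show whether the second positional slot is actually compute_units in the installed version, and if it is not, the keyword form should at least rule that out as the cause.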