I have exported a Pytorch model into a CoreML mlpackage file and imported the model file into my iOS project. The model is a Music Source Separation model - running prediction on audio-spectrogram blocks and returning separated audio source spectrograms.
Model produces correct results vs. desktop+GPU+Python but the inference on iPhone 15 Pro Max is really, really slow. Using Xcode model Performance tool I can see that the inference isn't automatically managed between compute units - all of it runs on CPU. The Performance tool notation hints all that ops should be supported by both the GPU and Neural Engine.
One thing to note, that when initializing the model with MLModelConfiguration option .cpuAndGPU or .cpuAndNeuralEngine there is an error in Xcode console:
`Error(s) occurred compiling MIL to BNNS graph:
[CreateBnnsGraphProgramFromMIL]: Failed to determine convolution kernel at location at /private/var/containers/Bundle/Application/2E3C4AFF-1FA4-4C95-AAE4-ECEBC0FB0BF9/mymss.app/mymss.mlmodelc/model.mil:2453:12
@ CreateBnnsGraphProgramFromMIL`
Before going back hammering the model in Python, are there any tips/strategies I could try in CoreMLTools export phase or in configuring the model for prediction on iOS?
My export toolchain is currently Linux with CoreMLTools v8.1, export target iOS16.
This particular issue was caused by the upsampling path of the model having torch.nn.ConvTranspose2d
operations with stride (16,1) and kernel (16,1) - for some reason ANE-compiler did not like them. Resolved by changing 16x to 4x+4x operation instead.
Furthermore, ANE is 16-bit only. CoreMLTools may compile your 32-bit model without issues but loading the runtime model on iOS/macOS will not be assigned on ANE.