Style transfer models using much more memory on iPhone Xs

I'm using Core ML models for image style transfer. An initialized model takes ~60 MB of memory on an iPhone X running iOS 12. However, the same model loaded on an iPhone Xs (Max) consumes more than 700 MB of RAM.


In Instruments I can see that the runtime allocates 38 IOSurfaces with a memory footprint of up to 54 MB each, alongside numerous other Core ML (Espresso) related objects. Those are not there on the iPhone X.


My guess is that the Core ML runtime does something different in order to utilize the power of the A12. However, my app crashes due to the memory pressure.


I already tried re-converting my models with the newest version of coremltools, but the results are identical.


Did I miss something?


Thanks in advance!

Replies

Hi Frank, please file a bug report and update your post to mention the bug number. Thanks.

Hey, thanks for the fast response!

I filed a radar (44821525) and also attached a model for reproducibility.

Here are some findings and a work-around I found:


From what I can see in Instruments, I conclude that the Core ML runtime pre-allocates all buffers required to execute the neural network (hence the many IOSurfaces) when initializing the model, via the method Espresso::ANERuntimeEngine::blob_container::force_allocate().

Interestingly, this only happens for models with a relatively large input size (1792 x 1792) and not for smaller ones (1024 x 1024).


Since this only happens on the Xs, I assumed it has something to do with the A12's Neural Engine. So I configured the model to use CPU and GPU only as compute units (MLComputeUnitsCPUAndGPU instead of MLComputeUnitsAll), and that did the trick: no more pre-allocated buffers.

So I’m using this as a work-around for now.
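For anyone looking for the concrete setup: the restriction to CPU and GPU is done through MLModelConfiguration (available since iOS 12). A minimal sketch, assuming a compiled model file named "StyleTransfer.mlmodelc" in the app bundle (the model name is illustrative, not from the post above):

```swift
import CoreML

// Configure Core ML to skip the A12's Neural Engine.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndGPU  // instead of the default .all

// "StyleTransfer" is a placeholder name for your compiled model.
guard let modelURL = Bundle.main.url(forResource: "StyleTransfer",
                                     withExtension: "mlmodelc") else {
    fatalError("Model not found in bundle")
}

// Loading with this configuration avoids the large pre-allocated
// IOSurface buffers observed on the iPhone Xs.
let model = try MLModel(contentsOf: modelURL, configuration: config)
```

If you use the Xcode-generated model class instead, you can pass the same configuration to its init(configuration:) initializer.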

Hi CoreOSTechE,


Are there any updates on this issue? I just tested on iOS 12.1 and the issue remains the same.


Thanks!

I have also noticed that models on the XS require more memory than the same model on older devices, and that this goes away when not using the Neural Engine. The difference wasn't as extreme as what you found, but it was still noticeable nonetheless.