MLModel prediction - threading

Hi


We are executing an MLModel prediction on the GPU from a background thread. While the model is executing, something seems to happen on the main thread, or at least something briefly affects it.

This shows up in the UI as small, annoying moments of unresponsiveness.


Any suggestions on how to overcome or mitigate this?

Replies

On what devices does this occur?


As far as I could determine by experimentation, devices with A7 and A8 chips don't have GPU acceleration for CoreML. That means CoreML processing is done on the CPU, which naturally degrades UI performance.

It happens on the iPhone X, 8, and 7.

Note that the network we run is quite heavy. I think the issue might be related to GPU pipeline occupancy.

UI rendering is probably affected, and some synchronization with the main thread might be happening. Not sure, though.

Nevertheless, it would help if CoreML offered a way to run custom code, and to cancel, within the execution scope of a prediction.

You also have to keep in mind that Core Animation (UI) is also GPU-accelerated. When the GPU is under heavy load, the UI will always suffer.


Have you tried executing CoreML on a low-priority queue? I would be curious whether that affects GPU execution priority as well.
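A minimal sketch of what that could look like, assuming a Grand Central Dispatch queue with `.utility` quality of service; `runInference` is a hypothetical stand-in for the actual Core ML call (e.g. your generated model class's `prediction(...)` method). Note that QoS governs CPU scheduling of the submitting thread; whether it influences GPU command priority is exactly the open question here.

```swift
import Foundation
import Dispatch

// Low-priority serial queue so the inference work competes less
// with UI work for CPU time.
let inferenceQueue = DispatchQueue(label: "inference", qos: .utility)

// Hypothetical placeholder for `try model.prediction(from: input)`.
func runInference() -> String {
    return "prediction-done"
}

let done = DispatchSemaphore(value: 0)
var result: String?

inferenceQueue.async {
    result = runInference()
    done.signal()
}

// In a real app you would hop back to the main queue to update UI
// instead of blocking; the semaphore here just keeps the sketch linear.
done.wait()
print(result ?? "none")
```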

Correct, that is what I was referring to. I think the problem is really how CoreML is implemented. While it executes a prediction you don't have any control over the GPU, and if your model takes around 2 seconds with a huge number of GPU calls, this situation can arise. At least that seems to be what is happening... 🙂

If you had the opportunity to pause execution between CNN layers by running CPU code, like a thread sleep of 5 milliseconds, the prediction might take around 2.1 seconds instead, with no "visible" impact on the UI.
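The throttling pattern described above could be sketched like this. Core ML itself exposes no per-layer hook, so in practice this would mean splitting the network into several smaller models and running them back to back; `runChunk` is a hypothetical stand-in for one such partial model.

```swift
import Foundation

// Hypothetical placeholder for running one slice of the network
// (in reality, one of several smaller MLModels).
func runChunk(_ index: Int) -> Int {
    return index
}

var completed: [Int] = []
for i in 0..<4 {
    completed.append(runChunk(i))
    // ~5 ms pause between chunks so UI work can be scheduled
    // in the gaps of the GPU pipeline.
    Thread.sleep(forTimeInterval: 0.005)
}
print(completed)
```

The trade-off is exactly the one mentioned: total latency grows by the sum of the pauses, in exchange for leaving regular gaps in which Core Animation can render.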