Core ML Model Performance report shows prediction speed much faster than actual app runs

Hi all, I'm tuning my app's prediction speed with a Core ML model. I watched and tried the methods in the videos "Improve Core ML integration with async prediction" and "Optimize your Core ML usage". I also used Instruments to look for the bottleneck that keeps my predictions from being faster.

Below is the Instruments result for my app. Its prediction duration is 10.29 ms.

And below is the performance report, which shows an average prediction time of 5.55 ms. That is about half of my app's prediction time!

Below is part of my Instruments records. I think the predictions are already running quite frequently. Could they be faster?

How can I get the same prediction speed as the performance report? The prediction speed on a MacBook Pro M2 is nearly the same as on a MacBook Air M1!

It's hard to say without looking at the actual app, but here is my uneducated instinct.

The Instruments report suggests you ran the prediction more than 10K times. The Xcode performance tab ran far fewer predictions to get its mean and median. Maybe thermal throttling kicked in while running 10K times and skewed the statistics?
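One way to check the throttling guess is to time many calls yourself and watch the system thermal state. Below is a minimal sketch, assuming you wrap your own prediction call in the closure; `describe`, `measureLatencies`, and `median` are hypothetical helper names, while `ProcessInfo.processInfo.thermalState` is the Foundation property.

```swift
import Foundation

/// Human-readable name for a thermal state.
func describe(_ state: ProcessInfo.ThermalState) -> String {
    switch state {
    case .nominal: return "nominal"
    case .fair: return "fair"
    case .serious: return "serious"
    case .critical: return "critical"
    @unknown default: return "unknown"
    }
}

/// Runs `work` `iterations` times and returns each call's latency in milliseconds.
/// `work` stands in for your prediction call (an assumption of this sketch).
func measureLatencies(iterations: Int, work: () throws -> Void) rethrows -> [Double] {
    var latencies: [Double] = []
    latencies.reserveCapacity(iterations)
    for _ in 0..<iterations {
        let start = DispatchTime.now().uptimeNanoseconds
        try work()
        let end = DispatchTime.now().uptimeNanoseconds
        latencies.append(Double(end - start) / 1_000_000)
    }
    return latencies
}

/// Median of a non-empty array of samples.
func median(_ values: [Double]) -> Double {
    precondition(!values.isEmpty, "median needs at least one sample")
    let sorted = values.sorted()
    let mid = sorted.count / 2
    return sorted.count % 2 == 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid]
}
```

If you log `describe(ProcessInfo.processInfo.thermalState)` before and after a long run, and compare the median of the first few hundred samples with the median of the last few hundred, a rising median together with a state that moves away from "nominal" would support the throttling theory. If both stay flat, the gap is more likely coming from somewhere else in the app.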

Thank you for your insight. That's a good point about the potential thermal throttling issue. I'm curious about how we can maintain efficient execution while avoiding thermal throttling. Do you have any recommendations for optimizing the prediction runs to balance performance and thermal management?

Hi, I think I'm doing the async part wrong. My app captures the screen with ScreenCaptureKit, uses a Core ML model to convert its style, then draws the result on a Metal view. I think async prediction might not work in this situation because the screenshots have to stay in order. Is it still possible to speed up the prediction?

Do you have any recommendations for optimizing the prediction runs to balance performance and thermal management?

If you haven't watched it already, this video explains a few performance best practices: https://developer.apple.com/videos/play/wwdc2023/10049/

My app captures the screen with ScreenCaptureKit, uses a Core ML model to convert its style, then draws the result on a Metal view. I think async prediction might not work in this situation because the screenshots have to stay in order.

Do you have multiple concurrent async tasks and need to sort the results in a particular order? You might be looking for AsyncStream.
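To make the AsyncStream idea concrete, here is a rough sketch, not your actual pipeline. `Frame`, `stylize`, `makeFrameStream`, and `processInOrder` are hypothetical names; `stylize` stands in for the Core ML prediction, and a real frame would carry a pixel buffer. A single `for await` loop consumes frames one at a time, so results come out in the order the frames were yielded.

```swift
import Foundation

// Hypothetical stand-in for a captured frame.
struct Frame: Sendable {
    let index: Int
}

// Hypothetical stand-in for the async Core ML prediction.
func stylize(_ frame: Frame) async -> Frame {
    try? await Task.sleep(nanoseconds: 1_000_000) // simulate inference time
    return frame
}

/// Creates a stream plus the continuation that the capture callback would yield into.
/// `.bufferingNewest(1)` keeps only the latest pending frame, so stale frames are
/// dropped when prediction is slower than capture.
func makeFrameStream(
    policy: AsyncStream<Frame>.Continuation.BufferingPolicy = .bufferingNewest(1)
) -> (stream: AsyncStream<Frame>, continuation: AsyncStream<Frame>.Continuation) {
    var captured: AsyncStream<Frame>.Continuation!
    // The build closure runs synchronously during init, so `captured` is set on return.
    let stream = AsyncStream<Frame>(bufferingPolicy: policy) { captured = $0 }
    return (stream, captured)
}

/// Processes frames sequentially and returns the indices in the order they were handled.
func processInOrder(_ frames: AsyncStream<Frame>) async -> [Int] {
    var handled: [Int] = []
    for await frame in frames {
        let output = await stylize(frame)
        handled.append(output.index) // a real app would draw to the Metal view here
    }
    return handled
}
```

In this sketch the capture callback would call `continuation.yield(frame)` for each new screenshot and `continuation.finish()` when capture stops. This keeps the order and avoids a growing backlog, but it does not by itself make a single prediction faster.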
