VNCoreMLRequest output jitter

I implemented object detection using Vision + Core ML. It detects the object correctly, but the bounding boxes jitter. The output looks like this: https://drive.google.com/file/d/1y63iY7pMWRwjrs5LOrKuARaKUgzfTaLp/view?usp=sharing

Any suggestions on how to stop this?


I am using the sample buffer from AVCaptureOutput to get the image buffer, which is the input for the VNRequest.

I tried different frame rates, but the jitter is still there.

Replies

Is your code open source? I'd love to pull it down and give it a try. No guarantees I could help with the jitter.

I referred to this blog post: http://blog.ichibod.com/posts/2018/01/28/coreml-machine-learning-part-2/. You can find a link to the Git repository at the bottom of the page. My code is the same without much modification, since my main objective was preparing the data, training it into an mlmodel, and getting familiar with the nuances of that process.

The problem is that the prediction for the same object in one AVCapture frame and in the next may not come from the same reference in the model. The predicted bounding box changes from frame to frame even if the device is stationary. This shifts the bounding boxes, hence the jitter.


Thank you.

This is just how object detection models work. Tiny variations in the input image can cause slightly different predictions, and then the postprocessing step (non-max suppression) may choose different bounding boxes to keep.
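To illustrate the non-max suppression point, here is a minimal sketch in plain Swift. It is not the code Vision or the model runs internally; the `Box` and `Detection` types, the 0.5 IoU threshold, and the scores are all made up for the example. It only shows how a tiny change in confidence between two frames can flip which of two overlapping candidates survives.

```swift
// Hypothetical types for illustration; coordinates are normalized (0...1).
struct Box {
    var x: Double, y: Double, width: Double, height: Double
    var area: Double { width * height }
}

struct Detection {
    var box: Box
    var score: Double
}

// Intersection over union of two boxes.
func iou(_ a: Box, _ b: Box) -> Double {
    let ix = max(0, min(a.x + a.width, b.x + b.width) - max(a.x, b.x))
    let iy = max(0, min(a.y + a.height, b.y + b.height) - max(a.y, b.y))
    let inter = ix * iy
    let union = a.area + b.area - inter
    return union > 0 ? inter / union : 0
}

// Greedy NMS: visit candidates from highest to lowest score and keep one
// only if it does not overlap an already kept box too much.
func nonMaxSuppression(_ detections: [Detection],
                       iouThreshold: Double = 0.5) -> [Detection] {
    var kept: [Detection] = []
    for candidate in detections.sorted(by: { $0.score > $1.score }) {
        if kept.allSatisfy({ iou($0.box, candidate.box) < iouThreshold }) {
            kept.append(candidate)
        }
    }
    return kept
}

// Two overlapping candidates for the same object (made-up numbers).
let a = Box(x: 0.30, y: 0.30, width: 0.40, height: 0.40)
let b = Box(x: 0.33, y: 0.28, width: 0.42, height: 0.41)

// Frame 1: a scores slightly higher, so a is kept and b is suppressed.
let frame1 = nonMaxSuppression([Detection(box: a, score: 0.81),
                                Detection(box: b, score: 0.80)])
// Frame 2: a's score drops by 0.02, so now b is kept instead.
let frame2 = nonMaxSuppression([Detection(box: a, score: 0.79),
                                Detection(box: b, score: 0.80)])
```

Nothing moved between the two frames, yet the surviving box jumps from `a` to `b`, which is the kind of shift that shows up as jitter on screen.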


There is no such thing as temporal coherence in this kind of model: each frame is considered to be a completely new image with completely new objects.
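Since the model provides no temporal coherence, one common workaround is to add some yourself after the fact, for example by smoothing the box over time before drawing it. Below is a rough sketch using an exponential moving average. It is an assumption-laden example, not something from the blog post or from Vision: `Box`, `BoxSmoother`, and the `alpha` value are hypothetical, and it assumes you are tracking a single object so that each new box belongs to the same object as the previous one.

```swift
// Hypothetical box type; in a real app you would copy these values from
// the bounding box of the observation returned by your request.
struct Box {
    var x, y, width, height: Double
}

// Exponential moving average over successive boxes.
// alpha near 0 gives heavy smoothing (steadier but laggy);
// alpha near 1 gives almost no smoothing.
struct BoxSmoother {
    let alpha: Double
    private(set) var current: Box?

    init(alpha: Double = 0.3) {
        self.alpha = alpha
    }

    // Blend the new detection into the running estimate and return it.
    mutating func update(with new: Box) -> Box {
        guard let old = current else {
            current = new          // first frame: nothing to blend with
            return new
        }
        let a = alpha
        func mix(_ o: Double, _ n: Double) -> Double { o + a * (n - o) }
        let smoothed = Box(x: mix(old.x, new.x),
                           y: mix(old.y, new.y),
                           width: mix(old.width, new.width),
                           height: mix(old.height, new.height))
        current = smoothed
        return smoothed
    }

    // Call this when the object is lost so stale state is not blended in.
    mutating func reset() {
        current = nil
    }
}

// Usage: feed each frame's raw box through the smoother before drawing.
var smoother = BoxSmoother(alpha: 0.5)
let first = smoother.update(with: Box(x: 0.30, y: 0.30, width: 0.40, height: 0.40))
let second = smoother.update(with: Box(x: 0.34, y: 0.30, width: 0.40, height: 0.40))
// second.x lies halfway between 0.30 and 0.34 rather than jumping to 0.34.
```

The trade-off is lag: the drawn box trails a fast-moving object, so `alpha` has to be tuned for your use case. With several objects you would also need to match each new detection to the right previous box (for instance by overlap) before smoothing, which this sketch does not do.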