VNCoreMLRequest output jitter

Question

I implemented object detection using Vision and Core ML. It detects the object correctly, but the bounding boxes jitter. The output looks like this: https://drive.google.com/file/d/1y63iY7pMWRwjrs5LOrKuARaKUgzfTaLp/view?usp=sharing

Any suggestions on how to stop this?

I am using the AVCaptureOutput sample buffer to get the image buffer, which is the input for the VNRequest.

I tried different frame rates, but the jitter is still there.

Core ML

Posted by

manzu

Answer 1

Is your code open source? I'd love to pull it down and give it a try. No guarantees I could help with the jitter.

Posted by

gantman@gmail.com

Answer 2

I referred to this blog post: http://blog.ichibod.com/posts/2018/01/28/coreml-machine-learning-part-2/. You can find a link to the Git repository at the bottom of the page. My code is the same without much modification, since my main objective was preparing the data, training it into an mlmodel, and learning the nuances of that process.

The problem is that the prediction for the same object in one AVCapture frame and in the next may not come from the same reference in the model. The predicted bounding box changes from frame to frame even if the device is stationary. This shifts the bounding boxes, hence the jitter.

Thank you.

Posted by

manzu

Answer 3

This is just how object detection models work. Tiny variations in the input image can cause slightly different predictions, and then the postprocessing step (non-max suppression) may choose different bounding boxes to keep.
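
To make the non-max suppression point concrete, here is a minimal sketch in plain Swift. It is not the postprocessing that Vision or any particular model actually runs; `Box`, `Detection`, `iou`, and `nonMaxSuppression` are hypothetical names, and the coordinates and scores are made up for illustration. It shows two heavily overlapping candidates whose scores swap order between two frames, so a different box survives each time.

```swift
// Illustrative types only; these are not Vision or Core ML API.
struct Box {
    var x: Double, y: Double, w: Double, h: Double
    var area: Double { w * h }
}

struct Detection {
    var box: Box
    var score: Double
}

// Intersection over union of two axis-aligned boxes.
func iou(_ a: Box, _ b: Box) -> Double {
    let ix = max(0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    let iy = max(0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    let inter = ix * iy
    let union = a.area + b.area - inter
    return union > 0 ? inter / union : 0
}

// Greedy NMS: visit candidates from highest to lowest score and keep one
// only if it does not overlap an already kept box by more than the threshold.
func nonMaxSuppression(_ detections: [Detection],
                       iouThreshold: Double = 0.5) -> [Detection] {
    var kept: [Detection] = []
    for candidate in detections.sorted(by: { $0.score > $1.score }) {
        if kept.allSatisfy({ iou($0.box, candidate.box) <= iouThreshold }) {
            kept.append(candidate)
        }
    }
    return kept
}

// Two candidate boxes around the same object (assumed values).
let boxA = Box(x: 0.20, y: 0.20, w: 0.40, h: 0.40)
let boxB = Box(x: 0.24, y: 0.22, w: 0.42, h: 0.40)

// Frame 1: A scores marginally higher. Frame 2: B scores marginally higher.
let frame1 = nonMaxSuppression([Detection(box: boxA, score: 0.81),
                                Detection(box: boxB, score: 0.80)])
let frame2 = nonMaxSuppression([Detection(box: boxA, score: 0.80),
                                Detection(box: boxB, score: 0.81)])

print(frame1[0].box.x, frame2[0].box.x)
```

A score change of 0.01 is enough to move the reported box, even though nothing in the scene moved.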

There is no such thing as temporal coherence in this kind of model: each frame is considered to be a completely new image with completely new objects.
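
Since the model itself has no memory of earlier frames, one possible workaround is to add some temporal smoothing outside the model. The sketch below is only an assumption about how that could look, not something from the blog post or from Vision: `Rect` and `BoxSmoother` are hypothetical names, the smoothing factor is arbitrary, and it assumes a single tracked object so that each new box is known to belong to the same object as the previous one.

```swift
// Illustrative type only; in an app this could be a CGRect instead.
struct Rect {
    var x: Double, y: Double, w: Double, h: Double
}

// Exponential moving average over the box coordinates.
// alpha near 1 follows the detector closely; alpha near 0 smooths more
// but lags behind real motion.
struct BoxSmoother {
    let alpha: Double
    private(set) var current: Rect? = nil

    init(alpha: Double) { self.alpha = alpha }

    mutating func update(with new: Rect) -> Rect {
        // The first observation has nothing to blend with.
        guard let old = current else {
            current = new
            return new
        }
        let a = alpha
        let mix = { (o: Double, n: Double) -> Double in o + a * (n - o) }
        let blended = Rect(x: mix(old.x, new.x), y: mix(old.y, new.y),
                           w: mix(old.w, new.w), h: mix(old.h, new.h))
        current = blended
        return blended
    }

    // Call when the object is lost so stale state is not blended in later.
    mutating func reset() { current = nil }
}

// Simulated jitter: x alternates between 0.20 and 0.24 on a static scene.
let rawXs = [0.20, 0.24, 0.20, 0.24]
var smoother = BoxSmoother(alpha: 0.3)
var smoothedXs: [Double] = []
for x in rawXs {
    let box = smoother.update(with: Rect(x: x, y: 0.20, w: 0.40, h: 0.40))
    smoothedXs.append(box.x)
}
print(smoothedXs)
```

The trade-off is latency: the stronger the smoothing, the more the drawn box trails an object that is really moving. With several objects in the frame, each detection would first have to be matched to a box from the previous frame before it can be smoothed.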

Posted by

kerfuffle
