Record video and Classify using YOLO at the same time

I use AVCaptureSession to capture the video and AVCaptureVideoDataOutput / AVCaptureAudioDataOutput to collect the samples.
Upon receiving the samples, I pass them to an AVAssetWriter and to a Core ML YOLO class that makes predictions on the images.
When I do this, it looks like Core ML uses so many resources that frames are dropped and the video file becomes unusable.
I have tried all kinds of multi-threading but haven't been successful yet.
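
For illustration, here is a stripped-down sketch of the setup described above (the class name, `writerInput`, and `yoloVNModel` are placeholders, and the session / writer setup is omitted):

```swift
import AVFoundation
import Vision

// Placeholder sketch of the capture delegate: write the frame, then run YOLO inline.
final class CaptureHandler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let writerInput: AVAssetWriterInput   // attached to an already-started AVAssetWriter
    let yoloVNModel: VNCoreMLModel        // wraps the compiled YOLO Core ML model

    init(writerInput: AVAssetWriterInput, yoloVNModel: VNCoreMLModel) {
        self.writerInput = writerInput
        self.yoloVNModel = yoloVNModel
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // 1. Write the frame to the movie file.
        if writerInput.isReadyForMoreMediaData {
            writerInput.append(sampleBuffer)
        }

        // 2. Run YOLO on the same frame, on the same thread -
        //    this is the part that seems to starve the capture pipeline.
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let request = VNCoreMLRequest(model: yoloVNModel) { request, _ in
            // handle the detections here
        }
        try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer).perform([request])
    }
}
```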

Help, please!


In another setting I do the same thing for speech recognition, and it works like a charm.

It depends a little on the device you're using, but YOLO can be quite slow (especially the full version). If YOLO runs at 15 FPS, for example, and you block the AVCapture thread, it will automatically drop frames because your code can't keep up.

One solution would be to use a queue of size 1, and have Core ML read from this queue on a separate thread. The AVCapture thread simply appends each new frame to this queue, saves the frame to the movie file using AVAssetWriter, and then waits for the next frame to come in. (Because the queue has size 1, this effectively always overwrites the old frame.)

Now the AVCapture thread is never blocked for long periods of time, and you won't drop any frames in the video. (Of course, YOLO will not see every frame.)
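
A minimal sketch of that single-slot queue idea, assuming the YOLO model is wrapped in a VNCoreMLModel (the class and names below are placeholders, not a drop-in implementation):

```swift
import AVFoundation
import Vision

/// The capture thread only overwrites a slot and returns immediately;
/// a separate queue drains the slot and runs the Core ML request.
final class LatestFrameClassifier {
    private let lock = NSLock()
    private var pendingFrame: CVPixelBuffer?   // the single-slot "queue"
    private let semaphore = DispatchSemaphore(value: 0)
    private let inferenceQueue = DispatchQueue(label: "yolo.inference")
    private let model: VNCoreMLModel

    init(model: VNCoreMLModel) {
        self.model = model
    }

    /// Call this from captureOutput(_:didOutput:from:). It never blocks on inference.
    func submit(_ pixelBuffer: CVPixelBuffer) {
        lock.lock()
        let hadFrame = pendingFrame != nil
        pendingFrame = pixelBuffer                 // overwrite any stale frame
        lock.unlock()
        if !hadFrame { semaphore.signal() }        // wake the worker once per slot fill
    }

    /// Starts the inference loop. A real implementation also needs a way to stop it.
    func start(handler: @escaping ([VNObservation]) -> Void) {
        inferenceQueue.async {
            while true {
                self.semaphore.wait()              // sleep until a frame arrives
                self.lock.lock()
                let frame = self.pendingFrame
                self.pendingFrame = nil
                self.lock.unlock()
                guard let frame = frame else { continue }

                let request = VNCoreMLRequest(model: self.model) { request, _ in
                    handler(request.results ?? [])
                }
                try? VNImageRequestHandler(cvPixelBuffer: frame).perform([request])
            }
        }
    }
}
```

In the capture delegate you would then append the sample buffer to the AVAssetWriter input as usual and call something like `classifier.submit(pixelBuffer)`: the writer still gets every frame, while YOLO only ever works on the most recent one.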
