Dlib webcam capture with face detection and shape prediction is slow

I am working on a program in C++ which should detect faces in a webcam stream, then crop them using face landmarks and swap them.

I programmed face detection using OpenCV and Viola-Jones face detection. Works fine. Then I searched for how to segment just the face from the ROI. I tried a few skin detection implementations but none was successful.

Then I found dlib's face landmarks and decided to try them. Right at the beginning I ran into problems because I had to convert cv::Mat to cv_image, Rect to rectangle, etc. So I tried to do it with dlib alone. I just grab the stream using cv::VideoCapture and then wanted to show what is captured using dlib's image_window. But here was the problem: it was really slow. The code I used is below. The commented lines do the same thing using OpenCV. OpenCV is much faster, smoother and more continuous than the uncommented code, which runs at about 5 FPS. That's horrible. I can't imagine how slow it will be when I add face detection and face landmarks.

Am I doing something wrong? How can I make it faster? Or should I use OpenCV for video capture and showing?

cv::VideoCapture cap;
image_window output_frame;

if (!cap.open(0))
{
    cout << "ERROR: Opening video device 0 FAILED." << endl;
    return -1;
}

cv::Mat cap_frame;
//HWND hwnd;
do
{
    cap >> cap_frame;

    if (!cap_frame.empty())
    {
        cv_image<bgr_pixel> dlib_frame(cap_frame);
        output_frame.set_image(dlib_frame);
        //cv::imshow("output",dlib::toMat(dlib_frame));
    }

    //if (27 == char(cv::waitKey(10)))
    //{
    //  return 0;
    //}

    //hwnd = FindWindowA(NULL, "output");
} while (!output_frame.is_closed()); //while (hwnd != NULL);

EDIT: After switching to Release mode, showing captured frames is fine. But I went on and tried to do face detection and shape prediction with dlib, just like in the example here: http://dlib.net/face_landmark_detection_ex.cpp.html. It was quite laggy. So I turned off shape prediction. Still laggy.

So I assumed face detection was slowing it down. I tried face detection using OpenCV instead, because it performed significantly better than the dlib detector. I needed to convert the detected cv::Rect to dlib::rectangle. I used this:

std::vector<dlib::rectangle> dlib_rois;
long l, t, r, b;

for (int i = cv_rois.size() - 1; i >= 0; i--)
{
    l = cv_rois[i].x;
    t = cv_rois[i].y;
    // dlib::rectangle treats right and bottom as inclusive coordinates, hence the -1
    r = cv_rois[i].x + cv_rois[i].width - 1;
    b = cv_rois[i].y + cv_rois[i].height - 1;
    dlib_rois.push_back(dlib::rectangle(l, t, r, b));
}

But this combination of OpenCV face detection and dlib shape prediction became brutally laggy. It takes about 4 s to process a single frame.

I can't figure out why. OpenCV face detection was absolutely fine, and dlib shape prediction doesn't seem to be hard to process. Can somebody help me with this?

Commendatory answered 27/3, 2016 at 10:36 Comment(7)
dlib.net/faq.html#Whyisdlibslow - Enculturation
I'll try Release mode. Thanks for now. But I still wonder how it is possible that OpenCV with face detection is much smoother than just showing captured frames in dlib. - Commendatory
@DavisKing thanks, it seems to be comparable with OpenCV. - Commendatory
Anyway, shape prediction was really slow. I saw somewhere that it is fast, but in my implementation it was really slow. It takes a few seconds on small pictures. Nothing for real-time processing :/ - Commendatory
No, it's much faster than that. dlib.net/faq.html#Whyisdlibslow Make sure the compiler's optimizations are on. - Enculturation
As I said, I used Release and it still takes a few seconds to process one photo. But that is not only face detection, I also detect the face shape. And that is slow. But never mind, I'm finished with it now and I'm not going to work on it any more. Thanks anyway. - Commendatory
@Commendatory and DavisKing Yes, Dlib's face detector is much slower than OpenCV's face detector. But detection of face landmark points is fast and really takes only a few milliseconds. It is the face detection in your program that is taking most of the time. You can see this by taking timestamps at various points; you will then see that the face detection step is the one that takes so long. - Tiana
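
The timestamp suggestion in the last comment can be sketched roughly as follows. This is only a minimal sketch: `elapsed_ms` is a hypothetical helper, not part of dlib or OpenCV, and the names in the usage comment (`detector`, `sp`, `img`, `faces`, `shape`) are assumed to be the objects from your own program, set up as in the dlib face landmark example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `step` once and returns the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
double elapsed_ms(const std::function<void()>& step)
{
    assert(step && "step must be callable");
    const auto start = std::chrono::steady_clock::now();
    step();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Possible usage inside the capture loop (names are assumptions):
//   double t_detect = elapsed_ms([&] { faces = detector(img); });
//   double t_shape  = elapsed_ms([&] { shape = sp(img, faces[0]); });
//   std::cout << "detect: " << t_detect << " ms, shape: " << t_shape << " ms\n";
```

Timing each step separately like this should show which of the two calls dominates the per-frame cost, instead of guessing.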

There are several things you can do to make Dlib run faster before assuming that it is slow. You only have to read more of the documentation and try them.

  • Dlib is capable of detecting faces in very small areas (80x80 pixels). You are probably sending raw webcam frames at approximately 1280x720 resolution, which is not necessary. From my experience, I recommend reducing the frames to about a quarter of the original resolution in each dimension. Yes, 320x180 is fine for Dlib. As a consequence you will get about a 4x speedup.

  • As mentioned in the comments, by turning on compiler optimizations while building Dlib you will get a significant improvement in speed.

  • Dlib works faster with grayscale images. You do not need the color of the webcam frame. You can use OpenCV to convert the already downscaled frame to grayscale.

  • Dlib takes its time finding faces but is extremely fast at finding landmarks on faces. If your webcam provides a high frame rate (24-30 fps), you can skip face detection on some frames, because faces normally don't move much from one frame to the next.

Given those optimizations, I am confident you will get at least 12x faster detection.
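
The downscaling and frame-skipping advice can be sketched roughly like this. It is not taken from dlib: `Box` is a stand-in for `dlib::rectangle` so the snippet compiles on its own, and `to_full_resolution` and `should_run_detector` are hypothetical helper names. The sketch assumes you shrink the frame by an integer factor (for example with `cv::resize`, then `cv::cvtColor` for grayscale), run the detector on the small frame, and map the detected box back to the full-size frame before running landmark detection there.

```cpp
#include <cassert>

// Stand-in for dlib::rectangle. As in dlib, right and bottom are
// inclusive pixel coordinates.
struct Box
{
    long left, top, right, bottom;
};

// Map a box detected on a frame that was downscaled by an integer
// `factor` back to the coordinates of the full-resolution frame.
Box to_full_resolution(const Box& small, long factor)
{
    assert(factor >= 1);
    return Box{
        small.left * factor,
        small.top * factor,
        (small.right + 1) * factor - 1,   // keep the edge inclusive
        (small.bottom + 1) * factor - 1
    };
}

// Run the expensive face detector only on every `interval`-th frame;
// the frames in between can reuse the last detected boxes.
bool should_run_detector(long frame_index, long interval)
{
    assert(interval >= 1);
    return frame_index % interval == 0;
}
```

With a factor of 4, a 20x20 box on the 320x180 frame becomes an 80x80 box on the 1280x720 frame. How many frames you can skip depends on how fast the faces move, so the interval is something to tune rather than a fixed value.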

Narthex answered 7/4, 2017 at 1:39 Comment(3)
"320x180 is fine for Dlib" My camera captures at 1080p. If I let it record at 1920x1080 resolution, it will detect my face from far away (over 3 m from the camera). If I resize the captured frame and send it to dlib's detector, it no longer "sees" as far. With a quarter-sized frame I must stand close to the camera for it to detect my face. Is this expected? - Gaultiero
Yes, it makes complete sense, because Dlib can detect faces in areas as small as 80x80. It is highly recommended to resize in order to get faster detection. - Narthex
Thanks. And how do I change the minimum area, say to 100x100 for example? - Lorrin
