Face Detection Algorithms with minimal training time [closed]
I wanted to ask whether there is any face detection scheme suitable for video that requires minimal training time, ideally a few days rather than the weeks Viola-Jones can take. I have read about LBP, but it also requires a huge set of training samples, and I am not sure how long training takes. Does training an LBP cascade consume as much time as the Viola-Jones method on a similarly sized training set? I will be implementing this on a microprocessor such as a Raspberry Pi running Linux, and I want to write it in C for speed, so that it can detect faces in a 10-20 fps video stream.

Regurgitate asked 20/11/2013 at 1:46. Comments (7)
Will you be implementing the training yourself, or just the detection? – Coffin
Yes, I will be implementing it myself; it will give me a better understanding of the algorithm that way. Everything from the training stage to the detector will be implemented by me, without using any libraries, so that I can easily edit the code if I need to port it to a board. – Regurgitate
You should look at OpenCV's implementation of traincascade for inspiration. OpenCV is a large codebase, but the app itself is pretty small. – Coffin
Sure, I will have a look. Thanks. – Regurgitate
I'm expanding my answer with my understanding of OpenCV's traincascade. In the meantime, the traincascade app source code can be found online here: code.opencv.org/projects/opencv/repository/revisions/master/… – Coffin
Thanks again, I will be looking at that source code for a better understanding. Hopefully I will be able to implement it in C once I understand it. – Regurgitate
I have extended my discussion of LBP detection in OpenCV, which should greatly help you implement it. – Coffin

OpenCV ships with a tool called traincascade that trains LBP, Haar, and HOG cascades. Specifically for face detection, they even ship a 3000-image dataset of 24x24-pixel faces, in the format needed by traincascade.

In my experience, of the three feature types traincascade supports, LBP takes the least time to train: on the order of hours, rather than the days Haar requires.

A quick overview of its training process: for the given number of stages (a decent choice is 20), it attempts to find features that reject as many non-faces as possible while not rejecting the faces. The balance between rejecting non-faces and keeping faces is controlled by the minimum hit rate (OpenCV chose 99.5%) and the maximum false alarm rate (OpenCV chose 50%). The specific meta-algorithm used for crafting OpenCV's own LBP cascade is Gentle AdaBoost (GAB).
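
Those per-stage targets compound across the cascade: with 20 stages, the overall hit rate works out to about 0.995^20 ≈ 90.5%, while the overall false alarm rate falls to about 0.5^20 ≈ 9.5e-7, i.e. roughly one accepted non-face window in a million.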

The variant of LBP implemented in OpenCV is described here:

Shengcai Liao, Xiangxin Zhu, Zhen Lei, Lun Zhang and Stan Z. Li. Learning Multi-scale Block Local Binary Patterns for Face Recognition. International Conference on Biometrics (ICB), 2007, pp. 828-837.

In practice, with OpenCV's default parameters, it amounts to the following:

OpenCV LBP Cascade Runtime Overview

The detector examines 24x24 windows within the image, looking for a face. Stepping through stages 1 to 20 of the cascade classifier, if any stage can show that the current 24x24 window is likely not a face, it rejects the window and slides it over by one or two pixels to the next position; otherwise, it proceeds to the next stage.

During each stage, roughly 3-10 LBP features are examined. Every LBP feature has an offset within the window and a size, and the area it covers is fully contained within the current window. Evaluating an LBP feature at a given position results in either a pass or a fail. Depending on the outcome, a positive or negative weight particular to that feature is added to an accumulator.

Once all of a stage's LBP features are evaluated, the accumulator's value is compared to the stage threshold. A stage fails if the accumulator is below the threshold, and passes if it is above. Again, if a stage fails, the cascade is exited and the window moves to the next position.
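
As a rough illustration of that control flow, here is a minimal sketch in C-style code. The struct layout and field names are hypothetical, not OpenCV's actual internals, and evalLBPFeature is sketched further below:

    /* Defined below, where LBP feature evaluation is described. */
    typedef struct LBPFeature LBPFeature;
    int evalLBPFeature(const LBPFeature *f,
                       const unsigned *ii, int stride, int x, int y);

    typedef struct {
        const LBPFeature *feature;  /* which LBP feature to evaluate */
        float passWeight;           /* added to accumulator on pass  */
        float failWeight;           /* added to accumulator on fail  */
    } WeakClassifier;

    typedef struct {
        const WeakClassifier *weak;
        int numWeak;                /* roughly 3-10 per stage        */
        float threshold;            /* stage pass/fail threshold     */
    } Stage;

    /* Returns 1 if the 24x24 window at (x, y) passes every stage. */
    int windowIsFace(const Stage *stages, int numStages,
                     const unsigned *ii, int stride, int x, int y)
    {
        for (int s = 0; s < numStages; ++s) {
            float acc = 0.0f;
            for (int w = 0; w < stages[s].numWeak; ++w) {
                const WeakClassifier *wc = &stages[s].weak[w];
                acc += evalLBPFeature(wc->feature, ii, stride, x, y)
                     ? wc->passWeight : wc->failWeight;
            }
            if (acc < stages[s].threshold)
                return 0;  /* early reject: slide window onward  */
        }
        return 1;          /* survived all stages: likely a face */
    }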

LBP feature evaluation is relatively simple. At that feature's offset within the window, nine rectangles are laid out in a 3x3 configuration. These nine rectangles are all the same size for a particular LBP feature, ranging from 1x1 to 8x8.

The sum of the pixels in each of the nine rectangles is computed, in practice by looking up a precomputed integral image. Then, the central rectangle's sum is compared to that of each of its eight neighbours. The result of these eight comparisons is eight bits (1 or 0), which are assembled into an 8-bit LBP code.

This 8-bit code is used as an index into a 2^8 = 256-bit LUT, computed by the training process and particular to each LBP feature, which determines whether the LBP feature passes or fails.
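
Continuing the sketch above, this is roughly what evaluating one such feature against an integral image looks like. The neighbour ordering, comparison direction, and bit layout of the lookup table here are illustrative assumptions rather than OpenCV's exact encoding:

    /* Sum of pixels in the w-by-h rectangle with top-left corner (x, y),
       using an integral image ii with one extra row and column:
       ii[r * stride + c] = sum of all pixels above and left of (c, r). */
    static unsigned rectSum(const unsigned *ii, int stride,
                            int x, int y, int w, int h)
    {
        return ii[y * stride + x] + ii[(y + h) * stride + (x + w)]
             - ii[y * stride + (x + w)] - ii[(y + h) * stride + x];
    }

    struct LBPFeature {
        int dx, dy;            /* offset of the 3x3 block grid in window */
        int bw, bh;            /* size of one block, 1x1 up to 8x8       */
        unsigned char lut[32]; /* 256-bit pass/fail table from training  */
    };

    int evalLBPFeature(const struct LBPFeature *f, const unsigned *ii,
                       int stride, int winX, int winY)
    {
        /* Central block of the 3x3 grid. */
        unsigned center = rectSum(ii, stride, winX + f->dx + f->bw,
                                  winY + f->dy + f->bh, f->bw, f->bh);

        /* Compare the eight neighbouring blocks against the centre,
           clockwise from the top-left, building an 8-bit code. */
        static const int nx[8] = {0, 1, 2, 2, 2, 1, 0, 0};
        static const int ny[8] = {0, 0, 0, 1, 2, 2, 2, 1};
        unsigned code = 0;
        for (int i = 0; i < 8; ++i) {
            unsigned s = rectSum(ii, stride,
                                 winX + f->dx + nx[i] * f->bw,
                                 winY + f->dy + ny[i] * f->bh,
                                 f->bw, f->bh);
            code = (code << 1) | (s >= center);
        }

        /* Bit number 'code' of the feature's LUT decides pass/fail. */
        return (f->lut[code >> 3] >> (code & 7)) & 1;
    }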

That is all there is to it.

Coffin answered 20/11/2013 at 2:50. Comments (7)
Thanks for the great, simple explanation; it really sped up my understanding. Now I will just have to understand the code in OpenCV to make my own. – Regurgitate
I trained my model, but what should I do for detection? The detectMultiScale documentation mentions support only for Haar features, not LBP. Any input on that? – Pages
I have tried using the LBP cascade XML with the detectMultiScale function. The image size is 3000x2250 (pretty big, yes), and the samples were restricted to 30x50 during training. It has been exactly an hour and 10 minutes and detection is still running as I type this, even though minSize and maxSize were set from 10 to 50. Any reason for that? – Pages
@LakshmiNarayanan detectMultiScale() works for more than Haar. If you look closer, you'll notice that the cascade parameter is only for the C API. What you are meant to do in C++ is to declare a CascadeClassifier csc; object, then csc.read(const string& filename) the XML description of your cascade (LBP or Haar, doesn't matter) and then csc.detectMultiScale(...) with your arguments of choice. – Coffin
@LakshmiNarayanan You're feeding a huge image to the classifier already; if the detection window is 30x50 and the minimum size is 50x50, then there will be no downsizing at all at the biggest scale. That means you'll be scanning ~3000x2250/4 possible windows (>1 million windows!) at just that scale! What is your scale factor? – Coffin
1.3. Should I increase that and try? – Pages
@LakshmiNarayanan It really depends on your classifier's tolerance; however, as a rule, if you square the scale factor you halve the number of scales to search, and therefore the amount of work. But if your first scale is already far too expensive, it won't do much good. Doubling the minimum object size will do far more to speed up your image processing. – Coffin
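
To make the advice in this thread concrete, here is a minimal C++ sketch of loading a cascade and running detectMultiScale as described in the comments above; the file names and parameter values are placeholders to be tuned for your own data:

    #include <opencv2/objdetect/objdetect.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <vector>

    int main()
    {
        // Works for LBP and Haar cascades alike; the XML file name
        // here is a placeholder.
        cv::CascadeClassifier cascade;
        if (!cascade.load("lbpcascade_frontalface.xml"))
            return 1;

        // Detection runs on a grayscale image.
        cv::Mat gray;
        cv::cvtColor(cv::imread("scene.jpg"), gray, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(gray, gray);

        // A larger minSize and a coarser scale factor shrink the number
        // of windows searched, which dominates detection time on big
        // images.
        std::vector<cv::Rect> faces;
        cascade.detectMultiScale(gray, faces,
                                 1.2,                  // scaleFactor
                                 3,                    // minNeighbors
                                 0,                    // flags
                                 cv::Size(100, 100));  // minSize
        return 0;
    }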
