SVM classifier based on HOG features for "object detection" in OpenCV
Asked Answered

I have a project in which I want to detect objects in images; my aim is to use HOG features. Using the OpenCV SVM implementation, I could find the code for detecting people, and I read some papers about tuning the parameters in order to detect objects instead of people. Unfortunately, I couldn't do that, for a few reasons: first of all, I am probably tuning the parameters incorrectly; second, I am not a good programmer in C++, but I have to do it with C++/OpenCV... Here you can find the code for detecting HOG features for people using C++/OpenCV.

Let's say that I want to detect the object in this image. Now I will show you what I tried to change in the code, but it didn't work out for me.

The code that I tried to change:

HOGDescriptor hog;
hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());

I tried to replace getDefaultPeopleDetector() with the following parameters, but it didn't work:

(Size(64, 128), Size(16, 16), Size(8, 8), Size(8, 8), 9, 0,-1, 0, 0.2, true, cv::HOGDescriptor::DEFAULT_NLEVELS)

I then tried to make a vector, but when I wanted to print the results, they turned out to be empty.

vector<float> detector;

HOGDescriptor hog(Size(64, 128), Size(16, 16), Size(8, 8), Size(8, 8), 9, 0,-1, 0, 0.2, true, cv::HOGDescriptor::DEFAULT_NLEVELS);

hog.setSVMDetector(detector);

Please, I need help solving this problem.

Noncooperation answered 26/5, 2012 at 20:27 Comment(2)
I'm stuck here; I want to know what to do, or even to be shown an example.Noncooperation
I just want to know if I'm doing the coding wrong; I'm learning, and that is the aim of the questions here, to get benefit.Noncooperation

In order to detect arbitrary objects using OpenCV HOG descriptors and an SVM classifier, you first need to train the classifier. Playing with the parameters will not help here, sorry :(.

In broad terms, you will need to complete the following steps:

Step 1) Prepare some training images of the objects you want to detect (positive samples). You will also need to prepare some images with no objects of interest (negative samples).

Step 2) Detect HOG features of the training samples and use these features to train an SVM classifier (also provided in OpenCV).

Step 3) Use the coefficients of the trained SVM classifier in the HOGDescriptor::setSVMDetector() method.

Only then can you use the peopledetector.cpp sample code to detect the objects you want to detect. A rough sketch of steps 2 and 3 is given below.
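
For illustration only, here is a minimal sketch of steps 2 and 3, assuming a recent OpenCV that ships the cv::ml module; trainDetector, posImages and negImages are placeholder names, and every training image must already be exactly hog.winSize:

#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
using namespace cv;
using namespace cv::ml;

// Step 2: compute one HOG feature vector per fixed-size sample and train a linear SVM.
// Step 3: fold the trained SVM into the single w|b vector that setSVMDetector() expects.
static void trainDetector(const std::vector<Mat>& posImages,
                          const std::vector<Mat>& negImages,
                          HOGDescriptor& hog,
                          std::vector<float>& detector)
{
    Mat samples;                         // one row per training sample
    std::vector<int> labels;
    for (size_t i = 0; i < posImages.size() + negImages.size(); ++i) {
        bool positive = i < posImages.size();
        const Mat& img = positive ? posImages[i] : negImages[i - posImages.size()];
        std::vector<float> fv;
        hog.compute(img, fv);            // image size must equal hog.winSize
        samples.push_back(Mat(fv).clone().reshape(1, 1));
        labels.push_back(positive ? +1 : -1);
    }

    Ptr<SVM> svm = SVM::create();
    svm->setType(SVM::C_SVC);
    svm->setKernel(SVM::LINEAR);         // a linear kernel is required here
    svm->train(samples, ROW_SAMPLE, labels);

    // For a linear SVM the support vectors are compressed into a single row;
    // append -rho as the bias term expected by HOGDescriptor.
    Mat sv = svm->getSupportVectors();
    Mat alpha, svidx;
    double rho = svm->getDecisionFunction(0, alpha, svidx);
    detector.assign(sv.ptr<float>(), sv.ptr<float>() + sv.cols);
    detector.push_back((float)-rho);
}

// Usage (sketch): hog.winSize must match the training image size, e.g. 64x128.
// HOGDescriptor hog;
// std::vector<float> detector;
// trainDetector(posImages, negImages, hog, detector);
// hog.setSVMDetector(detector);
// hog.detectMultiScale(scene, foundRects);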

Jaquenette answered 29/5, 2012 at 20:23 Comment(11)
Thank you so much for your amazing answer, but I have some questions to ask, please... First, do I have to detect the features with the default parameters for people, or do I have to tune the parameters? Second, what coefficients do you mean? Can you give me a brief description, please?Noncooperation
Default parameters will do the job for a start. The coefficients I mean are the std::vector<float> you pass to the hog.setSVMDetector() method.Jaquenette
I'm having some issues, please; may I have your email so I can contact you further?Noncooperation
Please ask your questions here. Others may find the discussions useful as well.Jaquenette
I create the std::vector<float>, then after that HOGDescriptor hog(HOGDescriptor::getDefaultPeopleDetector());, which appears not to be correct and is what I'm suffering from originally; then I pass the coefficients to hog.setSVMDetector(detector);, but it's not working...Noncooperation
Another issue: in the peopledetector.cpp example I created a new list of images containing negative and positive examples, which I think is quite incorrect. So where do I train these images, please?Noncooperation
HOGDescriptor::getDefaultPeopleDetector() returns the coefficients of a classifier previously trained on people data by the OpenCV team. So, if you want to perform people detection, use it directly. Do not set another vector for this.Jaquenette
So you mean I have to put these parameters instead of getDefaultPeopleDetector(), right? (Size(64, 128), Size(16, 16), Size(8, 8), Size(8, 8), 9, 0, -1, 0, 0.2, true, cv::HOGDescriptor::DEFAULT_NLEVELS)Noncooperation
For an explanation of how to train the classifier, see here: szproxy.blogspot.com/2010/12/testtest.html Note that you do not need training for people detection; it is already available. Training is needed only when you want to detect other objects.Jaquenette
OK, I will take a look at how they train the classifier on that site, and after that, if I have something, I will write it to you here...Noncooperation
@HakanSerce Hey Hakan, can you check this question too, please? #30194787Disforest

I've been dealing with the same problem and, surprised by the lack of clean C++ solutions, I have created ~> this wrapper of SVMLight <~, a static library that provides the classes SVMTrainer and SVMClassifier, which simplify the training to something like this:

// we are going to use HOG to obtain feature vectors:
HOGDescriptor hog;
hog.winSize = Size(32,48);

// and feed SVM with them:
SVMLight::SVMTrainer svm("features.dat");

then for each training sample:

// obtain feature vector describing sample image:
vector<float> featureVector;
hog.compute(img, featureVector, Size(8, 8), Size(0, 0));

// and write feature vector to the file:
svm.writeFeatureVectorToFile(featureVector, true);      // true = positive sample

until the features.dat file contains feature vectors for all samples; at the end you just call:

std::string modelName("classifier.dat");
svm.trainAndSaveModel(modelName);
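
Putting the pieces above together, the training loop might look roughly like this (posFiles and negFiles are placeholder lists of fixed-size sample image paths; they are not part of the wrapper's API):

HOGDescriptor hog;
hog.winSize = Size(32, 48);
SVMLight::SVMTrainer svm("features.dat");

for (size_t i = 0; i < posFiles.size() + negFiles.size(); ++i) {
    bool positive = i < posFiles.size();
    // read each sample as grayscale; it must already match hog.winSize
    Mat img = imread(positive ? posFiles[i] : negFiles[i - posFiles.size()], 0);
    std::vector<float> featureVector;
    hog.compute(img, featureVector, Size(8, 8), Size(0, 0));
    svm.writeFeatureVectorToFile(featureVector, positive);   // true = positive sample
}
svm.trainAndSaveModel("classifier.dat");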

Once you have a file with the model (or a features.dat file that you can train the classifier with):

SVMLight::SVMClassifier c(classifierModelName);
vector<float> descriptorVector = c.getDescriptorVector();
hog.setSVMDetector(descriptorVector);
...
vector<Rect> found;
Size padding(Size(0, 0));
Size winStride(Size(8, 8));
hog.detectMultiScale(segment, found, 0.0, winStride, padding, 1.01, 0.1);

Just check the documentation of HOGDescriptor for more info :)
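
The detections come back in found as rectangles; a typical way to visualize them (my addition, not from the original snippets) is something like:

// draw every detected window on the image that was searched
for (size_t i = 0; i < found.size(); ++i)
    rectangle(segment, found[i], Scalar(0, 255, 0), 2);
imshow("detections", segment);
waitKey(0);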

Australasia answered 12/1, 2014 at 1:34 Comment(2)
@Liho, how was the final outcome? Was the detection good? Did you get a lot of false positives, as others have mentioned?Precipitate
@Precipitate The outcome was good enough for real-time detection of game pieces (domain: board games). That was part of a school project, though. If you plan to use this in a serious project, you might want to tune the internal parameters of the SVM. You might also want to check the bugs and comments that a few people left on GitHub for this :)Australasia

I have done similar things to what you did: collected positive and negative sample images, used HOG to extract car features, trained the feature set with a linear SVM (I use SVMlight), then used the model to detect cars with HOG's multi-scale detection (detectMultiScale).

I got a lot of false positives, so I retrained using the positive samples plus the false positives and negative samples. The resulting model was then tested again. The detection improved (fewer false positives), but the result is still not satisfying (on average a 50% hit rate and 50% false positives). Tuning the detectMultiScale parameters improved the result, but not by much (10% fewer false positives and an increase in hit rate).
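
For what it's worth, that retraining step (often called hard-negative mining) can be sketched roughly as follows; the names negativeImages and hardNegatives are placeholders and this is not part of the program below:

// Run the current detector over images that contain no cars; every window it
// reports is a false positive, so crop it, resize it to the training window
// size and keep it as an extra negative sample for the next training round.
std::vector<Mat> hardNegatives;
for (size_t i = 0; i < negativeImages.size(); ++i) {
    std::vector<Rect> falseDetections;
    hog.detectMultiScale(negativeImages[i], falseDetections);
    for (size_t j = 0; j < falseDetections.size(); ++j) {
        Rect r = falseDetections[j] &
                 Rect(0, 0, negativeImages[i].cols, negativeImages[i].rows);
        if (r.width <= 0 || r.height <= 0)
            continue;
        Mat patch;
        resize(negativeImages[i](r), patch, hog.winSize);
        hardNegatives.push_back(patch);
    }
}
// hardNegatives are then written out as additional "-1" samples and the SVM is retrained.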

Edit: I can share the source code with you if you'd like, and I am very open to discussion, as I have not gotten satisfactory results using HOG. Anyway, I think the code can be a good starting point for using HOG for training and detection.

Edit: adding code

static void calculateFeaturesFromInput(const string& imageFilename, vector<float>& featureVector, HOGDescriptor& hog) 
{
    Mat imageData = imread(imageFilename, 1);
    if (imageData.empty()) {
        featureVector.clear();
        printf("Error: HOG image '%s' is empty, features calculation skipped!\n", imageFilename.c_str());
        return;
    }
    // Check for mismatching dimensions
    if (imageData.cols != hog.winSize.width || imageData.rows != hog.winSize.height) {
        featureVector.clear();
        printf("Error: Image '%s' dimensions (%d x %d) do not match HOG window size (%d x %d)!\n", imageFilename.c_str(), imageData.cols, imageData.rows, hog.winSize.width, hog.winSize.height);
        return;
    }
    vector<Point> locations;
    hog.compute(imageData, featureVector, winStride, trainingPadding, locations);
    imageData.release(); // Release the image again after features are extracted
}

...

int main(int argc, char** argv) {

    // <editor-fold defaultstate="collapsed" desc="Init">
    HOGDescriptor hog; // Use standard parameters here
    hog.winSize.height = 128;
    hog.winSize.width = 64;

    // Get the files to train from somewhere
    static vector<string> tesImages;
    static vector<string> positiveTrainingImages;
    static vector<string> negativeTrainingImages;
    static vector<string> validExtensions;
    validExtensions.push_back("jpg");
    validExtensions.push_back("png");
    validExtensions.push_back("ppm");
    validExtensions.push_back("pgm");
    // </editor-fold>

    // <editor-fold defaultstate="collapsed" desc="Read image files">
    getFilesInDirectory(posSamplesDir, positiveTrainingImages, validExtensions);
    getFilesInDirectory(negSamplesDir, negativeTrainingImages, validExtensions);
    /// Retrieve the descriptor vectors from the samples
    unsigned long overallSamples = positiveTrainingImages.size() + negativeTrainingImages.size();
    // </editor-fold>

    // <editor-fold defaultstate="collapsed" desc="Calculate HOG features and save to file">
    // Make sure there are actually samples to train
    if (overallSamples == 0) {
        printf("No training sample files found, nothing to do!\n");
        return EXIT_SUCCESS;
    }

    /// @WARNING: This is really important: some libraries (e.g. ROS) seem to set the system locale to one that uses decimal commas instead of points, which causes the file input parsing to fail
    setlocale(LC_ALL, "C"); // Do not use the system locale
    setlocale(LC_NUMERIC,"C");
    setlocale(LC_ALL, "POSIX");

    printf("Reading files, generating HOG features and save them to file '%s':\n", featuresFile.c_str());
    float percent;
    /**
     * Save the calculated descriptor vectors to a file in a format that can be used by SVMlight for training
     * @NOTE: If you split these steps into separate steps: 
     * 1. calculating features into memory (e.g. into a cv::Mat or vector< vector<float> >), 
     * 2. saving features to file / directly inject from memory to machine learning algorithm,
     * the program may consume a considerable amount of main memory
     */ 
    fstream File;
    File.open(featuresFile.c_str(), ios::out);
    if (File.good() && File.is_open()) {
        File << "# Use this file to train, e.g. SVMlight by issuing $ svm_learn -i 1 -a weights.txt " << featuresFile.c_str() << endl; // Remove this line for libsvm which does not support comments
        // Iterate over sample images
        for (unsigned long currentFile = 0; currentFile < overallSamples; ++currentFile) {
            storeCursor();
            vector<float> featureVector;
            // Get positive or negative sample image file path
            const string currentImageFile = (currentFile < positiveTrainingImages.size() ? positiveTrainingImages.at(currentFile) : negativeTrainingImages.at(currentFile - positiveTrainingImages.size()));
            // Output progress
            if ( (currentFile+1) % 10 == 0 || (currentFile+1) == overallSamples ) {
                percent = ((currentFile+1) * 100.f / overallSamples);
                printf("%5lu (%3.0f%%):\tFile '%s'", (currentFile+1), percent, currentImageFile.c_str());
                fflush(stdout);
                resetCursor();
            }
            // Calculate feature vector from current image file
            calculateFeaturesFromInput(currentImageFile, featureVector, hog);
            if (!featureVector.empty()) {
                /* Put positive or negative sample class to file, 
                 * true=positive, false=negative, 
                 * and convert positive class to +1 and negative class to -1 for SVMlight
                 */
                File << ((currentFile < positiveTrainingImages.size()) ? "+1" : "-1");
                // Save feature vector components
                for (unsigned int feature = 0; feature < featureVector.size(); ++feature) {
                    File << " " << (feature + 1) << ":" << featureVector.at(feature);
                }
                File << endl;
            }
        }
        printf("\n");
        File.flush();
        File.close();
    } else {
        printf("Error opening file '%s'!\n", featuresFile.c_str());
        return EXIT_FAILURE;
    }
    // </editor-fold>

    // <editor-fold defaultstate="collapsed" desc="Pass features to machine learning algorithm">
    /// Read in and train the calculated feature vectors
    printf("Calling SVMlight\n");
    SVMlight::getInstance()->read_problem(const_cast<char*> (featuresFile.c_str()));
    SVMlight::getInstance()->train(); // Call the core libsvm training procedure
    printf("Training done, saving model file!\n");
    SVMlight::getInstance()->saveModelToFile(svmModelFile);
    // </editor-fold>

    // <editor-fold defaultstate="collapsed" desc="Generate single detecting feature vector from calculated SVM support vectors and SVM model">
    printf("Generating representative single HOG feature vector using svmlight!\n");
    vector<float> descriptorVector;
    vector<unsigned int> descriptorVectorIndices;
    // Generate a single detecting feature vector (v1 | b) from the trained support vectors, for use e.g. with the HOG algorithm
    SVMlight::getInstance()->getSingleDetectingVector(descriptorVector, descriptorVectorIndices);
    // And save the precious to file system
    saveDescriptorVectorToFile(descriptorVector, descriptorVectorIndices, descriptorVectorFile);
    // </editor-fold>

    // <editor-fold defaultstate="collapsed" desc="Test detecting vector">

    cout << "Test Detecting Vector" << endl;
    hog.setSVMDetector(descriptorVector); // Set our custom detecting vector
    cout << "descriptorVector size: " << sizeof(descriptorVector) << endl;

    getFilesInDirectory(tesSamplesDir, tesImages, validExtensions);
    namedWindow("Test Detector", 1);

    for( size_t it = 0; it < tesImages.size(); it++ )
    {
        cout << "Process image " << tesImages[it] << endl;
        Mat image = imread( tesImages[it], 1 );
        detectAndDrawObjects(image, hog);

        for(;;)
        {
            int c = waitKey();
            if( (char)c == 'n')
                break;
            else if( (char)c == '\x1b' )
                exit(0);
        }
    }
    // </editor-fold>
    return EXIT_SUCCESS;
}
Drennen answered 25/4, 2013 at 2:18 Comment(16)
I would ultimately suggest you post the code anyway and hope for feedback from anybody, not just the author.Bosquet
Good idea. Kinda busy right now; I'll do that tonight. Thanks!Drennen
Hey @Drennen, thank you for the answer. I have solved the problem, but it seems that you are suggesting a nice approach. Anyway, share your code and I will share mine as well. What about BOW, do you have any idea?Noncooperation
@bonchenko, where is your code? I posted a new question here #16215420; try to look at the questions, you might have an idea about them.Noncooperation
I added the code to my answer above. Basically, I use code from here. You can get the complete code from there and adapt it to your needs. Anyway, about your HOG result: is it satisfactory (e.g. high hit rate vs. low false positives)? By the way, I have read your new question. I haven't explored BOW, so I do not know how to do multiclass classification there. As for HOG, which uses a linear SVM, I believe you have to train the models separately, because what a linear SVM does is separate two classes (object and non-object) with a hyperplane.Drennen
@Drennen I have implemented it in MATLAB; for 3 classes the rate was over 65%, but when I did it for 5 classes it was much lower, only 47%. I left it for a while because I had to do the BOW as well. What do you think about these results so far? By the way, the classes that I classified are Coca-Cola, ketchup and washing liquid, and then I added Fanta and Head & Shoulders shampoo.Noncooperation
At least better than mine (50% for people detection using a custom dataset such as this picture). But what I need is better accuracy (80% or more) for people detection on this kind of image or this.Drennen
@Drennen To be honest, you made me a bit happier. I have used HOG for multi-class classification, not for people, which HOG was basically made for. I think both of us have to get our percentages higher, especially you, because you are dealing with a problem that lots of people have tackled before with a higher percentage.Noncooperation
@Mario, will this code take care of all the image preprocessing techniques required to identify an object / image, e.g. computing image gradients, cells, histogram normalization? I am a newbie and need help with this... please share your mail id.Photopia
@Drennen It's been a while since your last post. I'm hoping you managed to get a better detection percentage?Swoosh
@TimeManx I have not gotten a better detection percentage. Besides, I find that HOG is pretty slow; I need detection for a video stream (20 fps). Currently I am working on Haar and LBP features.Drennen
@Drennen Okay, so how are Haar and LBP working out? I believe the accuracy would be a little lower.Swoosh
I use Haar and LBP to detect the head. I decided it is easier to handle occlusion if I only detect the head. For my usage, it is faster and more accurate than HOG.Drennen
@Drennen About how much frame rate?Swoosh
10x faster. After all, the case is different, because my Haar/LBP app only detects the head while HOG detects the whole human body.Drennen
@Drennen Any chance you could share your cascades? The ones that OpenCV provides just don't work that well for me.Swoosh
