How to train a Support Vector Machine (SVM) classifier with OpenCV using facial features?

I want to use an SVM classifier for facial expression detection. I know OpenCV has an SVM API, but I have no clue what the input to train the classifier should be. I have read many papers so far; all of them say to train the classifier after facial feature detection.

So far, this is what I have done:

  1. Face detection,
  2. Calculation of 16 facial feature points in every frame (I have the detector's output image),
  3. A vector which holds the feature points' pixel coordinates.

Note: I know how to train the SVM with only positive and negative images (I have seen example code for that), but I don't know how to combine the facial feature information with it.

Can anybody please help me start the classification with the SVM?

a. What should the sample input to train the classifier be?

b. How do I train the classifier with these facial feature points?

Regards,

Appendicular answered 26/9, 2014 at 8:57 Comment(1)
Hey, bring back the dots on the face ;) (Which OpenCV version are you using?) – Zillion

The machine learning algorithms in OpenCV all come with a similar interface. To train one, you pass an NxM Mat of features (N rows, one feature vector per row, each of length M) and an Nx1 Mat with the class labels, like this:

//traindata      //trainlabels

f e a t u r e    1 
f e a t u r e    -1
f e a t u r e    1
f e a t u r e    1
f e a t u r e    -1

For the prediction, you fill a Mat with a single row in the same way, and predict() will return the predicted label.

So, let's say your 16 facial points are stored in a vector; you would do something like:

Mat trainData; // start empty
Mat labels;

for all facial_point_vecs: // pseudocode: loop over all your faces
{
    for( size_t i=0; i<16; i++ )
    {
        trainData.push_back(points[i]); // points: vector<Point> for this face
    }
    labels.push_back(label); // 1 or -1
}
// now here comes the magic:
// reshape it, so it has numFaces rows (one per face), each a flat
// x,y,x,y,x,y... array of 16*2 = 32 elements:
trainData = trainData.reshape(1, numFaces); // numFaces = number of training faces

// we have to convert to float:
trainData.convertTo(trainData, CV_32F);

SVM svm; // params omitted for simplicity (but that's where the *real* work starts..)
svm.train( trainData, labels );


// later, predict:
vector<Point> points;
Mat testData = Mat(points).reshape(1, 1); // flattened to 1 row of 32 elements
testData.convertTo(testData, CV_32F);
float p = svm.predict( testData );
Zillion answered 29/9, 2014 at 6:28 Comment(9)
Hi Break, thanks for your answer, but I have a question: how do I provide the image and the feature points together? Suppose I have 50 positive images and 20 negative images, and every image has 16 feature points; how do I record which features belong to which image? What should I push_back into trainData in that case? And why is 16 multiplied by 2 for the reshape? – Appendicular
Hmm, when I started typing here it looked like you wanted to do emotion detection, like happy/sad. Now you have edited it a couple of times and it seems more like you want face recognition / people identification, which is a different pair of shoes. Could you clarify? – Zillion
Oh! I want to do emotion detection only, for now only happy and sad. – Appendicular
Ah, OK. Note that there is no connection to the images (the SVM does not know about images, it only knows your landmark points); all it says in the end is happy or not. – Zillion
OK, I understand, but it is not entirely clear to me. Suppose image1 has features at (xi, yi) and image2 has features at (xi2, yi2); to train the SVM we only insert (xi, yi) and (xi2, yi2)? – Appendicular
Let us continue this discussion in chat. – Appendicular
Hi Break, I am having a problem with trainData, can you please help me? Suppose I have 50 positive images and every image has 16 2D feature points; how do I declare trainData: float trainData1[16][2]; or float trainData[50][16*2];? – Appendicular
In the line trainData.push_back(point[i]);, what is point? Is that the vector of keypoints? – Forespent
@Crash-ID, I think those points came from a facial-landmark detector like STASM, flandmark, or asmlib (not SIFT- or SURF-like keypoints, or at least not directly; landmarking involves fitting against a pre-trained shape model). – Zillion

Face gesture recognition is a widely researched problem, and the appropriate features you need to use can be found by a very thorough study of the existing literature. Once you have the feature descriptor you believe to be good, you go on to train the SVM with those. Once you have trained the SVM with optimal parameters (found through cross-validation), you start testing the SVM model on unseen data, and you report the accuracy. That, in general, is the pipeline.

Now the part about SVMs:

SVM is a binary classifier: it can differentiate between two classes (though it can be extended to multiple classes as well). OpenCV has an inbuilt module for SVM in the ML library. The SVM class has two functions to begin with: train(..) and predict(..). To train the classifier, you give as input a very large number of sample feature descriptors, along with their class labels (usually -1 and +1). Remember the format OpenCV supports: every training sample has to be a row vector, and each row has one corresponding class label in the labels vector. So if you have a descriptor of length n, and you have m such sample descriptors, your training matrix would be m x n (m rows, each of length n), and the labels vector would be of length m. There is also an SVMParams object that contains properties like the SVM type and values for parameters like C that you'll have to specify.

Once trained, you extract features from a test image, convert them into a single-row Mat, and give that to predict(); it will tell you which class the sample belongs to (+1 or -1).

There's also a train_auto() with similar arguments and the same data format that finds the optimum values of the SVM parameters for you.

Also check this detailed SO answer to see an example.

EDIT: Assuming you have a Feature Descriptor that returns a vector of features, the algorithm would be something like:

Mat trainingMat, labelsMat;
for each image in training database:
  feature = extractFeatures( image[i] );
  Mat feature_row = alignAsRow( feature );
  trainingMat.push_back( feature_row );
  labelsMat.push_back( -1 or 1 );  //depending upon class.
mySvmObject.train( trainingMat, labelsMat, Mat(), Mat(), mySvmParams );

I don't presume that extractFeatures() and alignAsRow() are existing functions; you might need to write them yourself.

Wolk answered 29/9, 2014 at 6:11 Comment(6)
Thanks for your reply. As I mentioned in my question, I know theoretically what I need to do: after feature extraction I have to train the SVM classifier, and after training I can use predict() to predict the facial expression. So my main question is: how do I use these feature points to train the SVM classifier? A code snippet would also help. – Appendicular
Thanks again, but do I only provide the features, not the related images? Then how will it relate which features belong to which image? – Appendicular
No, not images (unless the raw image itself is a feature, which is rarely so). You have to extract features from an image to train, and while testing you again extract the image features. It is not the images that you train and test with, but the corresponding features. – Wolk
How do I save the trained SVM "mySvmObject"? I did SvmObject.save("abc.xml");, which is not working :'( – Appendicular
Did it give you some error message? It is mySvmObject.save(..). – Wolk
No, something is wrong in loading the images, let me check. – Appendicular
