Multi-class classification in libsvm [closed]

I'm working with libsvm, and I need to implement multi-class classification using the one-versus-all approach.

How can I do it?
Does the 2011 version of libsvm support this?


I think my question was not very clear. If libsvm does not use one-versus-all automatically, I will train one SVM per class; otherwise, how do I set the corresponding parameters in the svmtrain function? I have already read the libsvm README.

Bridle asked 28/1/2012 at 0:09. Comments (3):
Tell us what you've tried. If you have not tried anything yet, read the README. PS: svmlib or libsvm? - Patrinapatriot
related question: #4977039 - Maladapted
k(k-1)/2 classifiers are formed; I don't think k(k/1)=2 are formed. Just for information. Besides that, Amro's answer is perfect... - Storiette

According to the official libsvm documentation (Section 7):

LIBSVM implements the "one-against-one" approach for multi-class classification. If k is the number of classes, then k(k-1)/2 classifiers are constructed and each one trains data from two classes.

In classification we use a voting strategy: each binary classification is considered to be a voting where votes can be cast for all data points x - in the end a point is designated to be in a class with the maximum number of votes.

In the one-against-all approach, we build as many binary classifiers as there are classes, each trained to separate one class from the rest. To predict a new instance, we choose the classifier with the largest decision function value.


As I mentioned before, the idea is to train k SVM models, each separating one class from the rest. Once we have those binary classifiers, we use the probability outputs (the -b 1 option) to predict new instances by picking the class with the highest probability.

Consider the following example:

%# Fisher Iris dataset
load fisheriris
[~,~,labels] = unique(species);   %# labels: 1/2/3
data = zscore(meas);              %# scale features
numInst = size(data,1);
numLabels = max(labels);

%# split training/testing
idx = randperm(numInst);
numTrain = 100; numTest = numInst - numTrain;
trainData = data(idx(1:numTrain),:);  testData = data(idx(numTrain+1:end),:);
trainLabel = labels(idx(1:numTrain)); testLabel = labels(idx(numTrain+1:end));
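
For comparison, libsvm's native one-against-one strategy needs no manual work: you just pass the multi-class labels directly. A minimal sketch reusing the split above, with the same -c and -g values used below (illustrative, not tuned; the ovo* variable names are mine):

%# one-against-one: libsvm trains the k(k-1)/2 binary problems internally
ovoModel = svmtrain(trainLabel, trainData, '-c 1 -g 0.2');
[ovoPred, ovoAcc, ~] = svmpredict(testLabel, testData, ovoModel);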

Here is my implementation of the one-against-all approach for multi-class SVM:

%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2 -b 1');
end

%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:,model{k}.Label==1);    %# probability of class==k
end

%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel)    %# accuracy
C = confusionmat(testLabel, pred)                   %# confusion matrix
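
As a variant, the decision-function rule quoted above can be used instead of probabilities: drop the -b option and vote on the raw decision values returned by svmpredict. A sketch under the same setup (note that libsvm ties positive decision values to whichever label it saw first in training, hence the sign flip):

%# alternative: one-against-all via decision values (no -b option)
dec = zeros(numTest,numLabels);
for k=1:numLabels
    mdl = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2');
    [~,~,d] = svmpredict(double(testLabel==k), testData, mdl);
    if mdl.Label(1) == 0, d = -d; end   %# make positive mean "class k"
    dec(:,k) = d;
end
[~,decPred] = max(dec,[],2);            %# class with largest decision value
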
Maladapted answered 29/1/2012 at 0:17. Comments (20):
Can you provide an example of one-against-all with libsvm? - Bridle
I tried to use the code provided by lakesh in his question (Multi-Class SVM (one versus all)); is it correct? - Bridle
@images: I've added a sample implementation. - Maladapted
@Amro: Thank you very much for your efforts. - Bridle
When I copy your code I get the following error at the line [~,~,labels] = unique(species); in MATLAB: Expression or statement is incorrect--possibly unbalanced (, {, or [. Could you help me please? - Holmes
@Ezati: the ~ syntax requires R2009b. Use a dummy variable instead if you are on an older version of MATLAB. - Maladapted
Thanks a lot; I replaced it with [dummy,dummy,labels] = unique(species); and it worked fine. - Holmes
May I ask you another question: https://mcmap.net/q/540121/-10-fold-cross-validation-in-one-against-all-svm-using-libsvm/1071703 ...if you have time, of course :) - Holmes
@Ezati: I have posted an answer there. - Maladapted
Can you explain what the parameters '-c 1 -g 0.2' stand for? - Hermosillo
@MVTC: you should probably read the libsvm guide first; c is the penalty parameter of the error term in C-SVC, and g is the RBF kernel gamma parameter. One usually uses cross-validation to find the best values for these parameters; see here for an example: https://mcmap.net/q/540122/-retraining-after-cross-validation-with-libsvm (a grid-search sketch also follows these comments). - Maladapted
How would I do this for a dataset of 50k samples with dimensionality 4000? MATLAB seems to be taking too long. *I added the -t 0 option for a linear kernel. - Unterwalden
What about one-class SVM? How do you define the labels vector in that case? Thank you. - Tshirt
@Amro: when I use your example, I get this error: Invalid MEX-file '...\libsvm-3.22\libsvm-3.22\matlab\svmtrain.mexw64': The specified module could not be found. - Binnie
@Amin: it sounds like a build problem; see this post for instructions on how to compile libsvm for MATLAB: https://mcmap.net/q/541376/-how-to-run-libsvm-in-matlab. If you are still having problems, consider using Dependency Walker to troubleshoot: mathworks.com/matlabcentral/answers/… - Maladapted
@Amro: perfect. - Binnie
@Amin: as explained in the post above, one-vs-one is the approach implemented in libsvm; just call svmtrain directly with multi-class labels... Here is another answer of mine that compares the two: https://mcmap.net/q/540121/-10-fold-cross-validation-in-one-against-all-svm-using-libsvm - Maladapted
@Amro: for your reference: mathworks.com/matlabcentral/answers/… - Binnie
@Amin: don't expect full answers in comments, but you can't just blindly apply machine learning algorithms and expect good results. You should read up on SVMs and their parameters (kernels, C, gamma, etc.), and how to do a grid search for good values using cross-validation. You should also look into preprocessing the data (normalizing the features at the least)... Good luck. - Maladapted
@Maladapted: please, can you help me here? Thank you: #65450434 - Huppert
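
For the parameter-tuning question raised in the comments above, here is a minimal grid-search sketch using libsvm's built-in cross-validation (the -v option makes svmtrain return the cross-validation accuracy instead of a model); the search ranges are illustrative only:

%# grid search over C and gamma with 5-fold cross-validation
bestAcc = 0; bestC = 1; bestG = 0.2;
for log2c = -1:3
    for log2g = -4:1
        opts = sprintf('-c %g -g %g -v 5', 2^log2c, 2^log2g);
        cvAcc = svmtrain(trainLabel, trainData, opts);  %# CV accuracy in %
        if cvAcc > bestAcc
            bestAcc = cvAcc; bestC = 2^log2c; bestG = 2^log2g;
        end
    end
end
fprintf('best: C=%g, gamma=%g (CV accuracy %.2f%%)\n', bestC, bestG, bestAcc)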
