SVC vs LinearSVC in scikit-learn: difference in loss functions

According to this post, SVC and LinearSVC in scikit-learn are very different. But when reading the official scikit-learn documentation, the difference is not that clear.

Especially for the loss functions, the formulas in the documentation seem equivalent. [Screenshot of the scikit-learn loss formulas omitted.]

And this post says that the loss functions are different:

  • SVC: 1/2 ||w||^2 + C Σ ξ_i
  • LinearSVC: 1/2 ||[w, b]||^2 + C Σ ξ_i

It seems that in the case of LinearSVC, the intercept is regularized, but the official documentation says otherwise.
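A quick way to probe this empirically (a sketch only; exact coefficients depend on solver settings and convergence, and the shifted toy data is my own construction to make the intercept matter):

```python
import numpy as np
from sklearn.svm import SVC, LinearSVC

# Toy data shifted away from the origin so the intercept is large.
rng = np.random.RandomState(0)
X = rng.randn(100, 2) + 5.0
y = (X[:, 0] + X[:, 1] > 10).astype(int)

svc = SVC(kernel="linear", C=1.0).fit(X, y)
lsvc = LinearSVC(C=1.0, loss="hinge", max_iter=100000).fit(X, y)

# If LinearSVC regularizes the intercept, its solution can differ
# from SVC's even with the same C and the same hinge loss.
print("SVC       coef:", svc.coef_, "intercept:", svc.intercept_)
print("LinearSVC coef:", lsvc.coef_, "intercept:", lsvc.intercept_)
```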

Does anyone have more information?

Postconsonantal answered 8/10, 2020 at 7:39 Comment(0)
P
4

SVC is a wrapper of the LIBSVM library, while LinearSVC is a wrapper of LIBLINEAR.

LinearSVC is generally faster than SVC and can work with much larger datasets, but it can only use a linear kernel, hence its name. So the difference lies not in the formulation but in the implementation approach.

Quoting LIBLINEAR FAQ:

When to use LIBLINEAR but not LIBSVM

There are some large data for which with/without nonlinear mappings gives similar performances. Without using kernels, one can quickly train a much larger set via a linear classifier. Document classification is one such application. In the following example (20,242 instances and 47,236 features; available on LIBSVM data sets), the cross-validation time is significantly reduced by using LIBLINEAR:

% time libsvm-2.85/svm-train -c 4 -t 0 -e 0.1 -m 800 -v 5 rcv1_train.binary
Cross Validation Accuracy = 96.8136%
345.569s

% time liblinear-1.21/train -c 4 -e 0.1 -v 5 rcv1_train.binary
Cross Validation Accuracy = 97.0161%
2.944s

Warning: While LIBLINEAR's default solver is very fast for document classification, it may be slow in other situations. See Appendix C of our SVM guide about using other solvers in LIBLINEAR.
Warning: If you are a beginner and your data sets are not large, you should consider LIBSVM first.
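The same speed gap can be reproduced in scikit-learn (a rough sketch; the dataset here is synthetic rather than the rcv1 text data above, and absolute timings depend on hardware):

```python
import time
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in for a large classification problem (sizes illustrative).
X, y = make_classification(n_samples=3000, n_features=300, random_state=0)

t0 = time.perf_counter()
LinearSVC(C=4.0, max_iter=10000).fit(X, y)
t_liblinear = time.perf_counter() - t0

t0 = time.perf_counter()
SVC(kernel="linear", C=4.0).fit(X, y)
t_libsvm = time.perf_counter() - t0

print(f"LinearSVC (LIBLINEAR): {t_liblinear:.2f}s")
print(f"SVC (LIBSVM, linear):  {t_libsvm:.2f}s")
```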
Putput answered 15/10, 2020 at 12:16 Comment(7)
The difference is not only the speed; they are different. I made a simple example here. And you can also read this. – Postconsonantal
My question is about the loss function of the two classifiers. Thank you. – Postconsonantal
You can find more implementation details in the Appendices of the original LIBLINEAR paper. – Putput
The answer in the post is correct. LIBLINEAR does include the bias term in the optimization, while LIBSVM does not. – Putput
SVC defaults to L1 loss and L2 penalty. This is why you can create conditions under which the results of both are almost equal, if you set loss="hinge" for LinearSVC and make intercept_scaling large enough. The bias term is included in LIBLINEAR because the weight vector is implicitly extended as w = [w; b]. If you center your data before optimizing, it should effectively set the bias to zero. – Putput
So, there is an error in the scikit-learn documentation? For LinearSVC, the math formula should include a penalty for the bias b, right? – Postconsonantal
It surely is confusing and incomplete. There is an issue discussing adding a warning in the code that the intercept is regularized (and eventually not doing it), but no mention whatsoever in the documentation. – Putput
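The recipe from the comments can be sketched as follows (my own toy data; centering the features and using a large intercept_scaling are the stated assumptions, and how closely the two solutions agree depends on convergence):

```python
import numpy as np
from sklearn.svm import SVC, LinearSVC

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(int)
X -= X.mean(axis=0)  # center: the (regularized) bias should end up near zero

svc = SVC(kernel="linear", C=1.0).fit(X, y)
lsvc = LinearSVC(loss="hinge", C=1.0, intercept_scaling=100.0,
                 max_iter=200000).fit(X, y)

print("max |coef diff|:", np.abs(svc.coef_ - lsvc.coef_).max())
print("intercepts:", svc.intercept_, lsvc.intercept_)
```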

© 2022 - 2024 — McMap. All rights reserved.