I have trained a set of RBF SVMs using scikits.learn in Python and then pickled the results. These are for image processing tasks, and one thing I want to do for testing is run each classifier on every pixel of some test images. That is, extract the feature vector from a window centered on pixel (i,j), run each classifier on that feature vector, and then move on to the next pixel and repeat. This is far too slow to do in Python.
Clarification: When I say "this is far too slow..." I mean that even the LIBSVM code that scikits.learn uses under the hood is too slow. I'm actually writing a manual decision function for the GPU so that classification at each pixel happens in parallel.
Is it possible for me to load the classifiers with Pickle, and then grab some kind of attribute that describes how the decision is computed from the feature vector, and then pass that info to my own C code? In the case of linear SVMs, I could just extract the weight vector and bias vector and add those as inputs to a C function. But what is the equivalent thing to do for RBF classifiers, and how do I get that info from the scikits.learn object?
Added: First attempts at a solution.
It looks like the classifier object has the attribute support_vectors_, which contains the support vectors as the rows of an array. There is also the attribute dual_coef_, which is a 1 by len(support_vectors_) array of coefficients. From the standard tutorials on non-linear SVMs, it appears that one should do the following:
- Compute the feature vector v from your data point under test. This will be a vector of the same length as the rows of support_vectors_.
- For each row i in support_vectors_, compute the squared Euclidean distance d[i] between that support vector and v.
- Compute t[i] as exp{-gamma * d[i]}, where gamma is the RBF parameter.
- Sum up dual_coef_[i] * t[i] over all i. Add the value of the intercept_ attribute of the scikits.learn classifier to this sum.
- If the sum is positive, classify as 1. Otherwise, classify as 0.
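As a rough reference for porting, the steps above can be sketched in plain Python with explicit loops that map one-to-one onto a C or GPU kernel. This is only a sketch under several assumptions: the classifier is binary, dual_coef is the single row of dual_coef_ flattened to a list, intercept is the single value in intercept_, and gamma is the value used at training time. The function names rbf_decision_value and classify are hypothetical, not part of scikits.learn.

```python
import math

def rbf_decision_value(v, support_vectors, dual_coef, intercept, gamma):
    """Sum of dual_coef[i] * exp(-gamma * ||sv_i - v||^2), plus the intercept.

    Assumes a binary classifier: dual_coef is one flat list of coefficients
    (one per support vector) and intercept is a single float.
    """
    total = intercept
    for sv, alpha in zip(support_vectors, dual_coef):
        # Squared Euclidean distance between this support vector and v.
        d = sum((s - x) ** 2 for s, x in zip(sv, v))
        # RBF kernel value, weighted by the dual coefficient.
        total += alpha * math.exp(-gamma * d)
    return total

def classify(v, support_vectors, dual_coef, intercept, gamma):
    """Return 1 if the decision value is positive, otherwise 0."""
    value = rbf_decision_value(v, support_vectors, dual_coef, intercept, gamma)
    return 1 if value > 0 else 0
```

Which class label a positive value corresponds to may depend on the library version, so it is worth comparing this function against the classifier's own predictions on a handful of feature vectors before porting it.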
Added: On numbered page 9 at this documentation link, it mentions that the intercept_ attribute of the classifier does indeed hold the bias term. I have updated the steps above to reflect this.