I need a Support Vector Machine library for Scala.
I guess I should look at both Scala and Java implementations. Would you recommend any of them in particular?
Here are two alternatives:
I ended up using the second one from Scala without problems. It is published to Maven Central, although the published artifact is not the latest source version.
This one looks promising; it appears to be updated regularly.
I would suggest libsvm. It is very well maintained and, as already mentioned, a Java implementation is available on Maven Central.
libsvm can be used for classification, regression and outlier detection (one-class SVM).
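As a rough illustration of how these three tasks map onto libsvm's model types, here is a minimal sketch. The `SvmTask` object and its member names are hypothetical; only the integer codes are meant to mirror the constants defined on libsvm's `svm_parameter` class (`C_SVC`, `NU_SVC`, `ONE_CLASS`, `EPSILON_SVR`, `NU_SVR`).

```scala
// Hypothetical helper: named codes that mirror the integer constants
// defined on libsvm's svm_parameter class.
object SvmTask {
  val CSvc = 0       // svm_parameter.C_SVC: classification
  val NuSvc = 1      // svm_parameter.NU_SVC: classification
  val OneClass = 2   // svm_parameter.ONE_CLASS: outlier detection
  val EpsilonSvr = 3 // svm_parameter.EPSILON_SVR: regression
  val NuSvr = 4      // svm_parameter.NU_SVR: regression

  // Short description of the task each code selects.
  def describe(code: Int): String = code match {
    case CSvc | NuSvc       => "classification"
    case OneClass           => "outlier detection"
    case EpsilonSvr | NuSvr => "regression"
    case other =>
      throw new IllegalArgumentException(s"unknown svm_type: $other")
  }
}
```

Under these assumptions, `SvmTask.describe(3)` identifies code 3 as a regression model, which is the type used in the regression example.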
In the following code snippets I will show an example of how to use libsvm for regression.
To set up the model (SVR, which comes in two flavors, epsilon-SVR and nu-SVR) you need to define an svm_problem and some parameters.
An svm_problem is an object with three properties: x (the input data), y (the labels) and l (the size of the problem, i.e. the number of training samples).
In libsvm an input data point is represented by an array of svm_node objects. An svm_node object has two properties: an index, which is the index of the feature, and a value, which is the value of that feature.
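import breeze.linalg.{DenseMatrix, DenseVector}
import java.io.IOException
import libsvm._

// Wraps a feature matrix X (one row per sample) and optional targets y.
class SVMProblem(val X: DenseMatrix[Double], val y: Option[DenseVector[Double]]) {
  private val n = X.rows // number of samples
  private val m = X.cols // number of features

  // Build the svm_problem that libsvm trains on.
  def define(): svm_problem = {
    val nodes = Array.ofDim[svm_node](n, m)
    for (i <- 0 until n; j <- 0 until m) {
      val node = new svm_node()
      node.index = j
      node.value = X(i, j)
      nodes(i)(j) = node
    }
    val problem = new svm_problem()
    problem.x = nodes
    // Targets are only set when they are provided.
    y.foreach(labels => problem.y = labels.toArray)
    problem.l = n
    problem
  }
}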
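Because each data point is a list of (index, value) pairs, a row does not have to contain a node for every feature. The following is a minimal sketch of a sparse conversion that skips zero entries. `Node` and `SparseRows` are hypothetical stand-ins (so the sketch runs without the libsvm jar), and the 1-based, ascending indices follow the convention of libsvm's text data format rather than the 0-based numbering used in the class above.

```scala
// Hypothetical stand-in for libsvm's svm_node (an index plus a value),
// used here so the sketch runs without the libsvm jar.
final case class Node(index: Int, value: Double)

object SparseRows {
  // Convert one dense feature row into a sparse row: zero entries are
  // skipped and the remaining nodes keep ascending, 1-based indices.
  def toSparse(row: Array[Double]): Array[Node] =
    row.zipWithIndex.collect {
      case (v, j) if v != 0.0 => Node(j + 1, v)
    }
}
```

For instance, the row (0.0, 2.5, 0.0, 1.0) becomes two nodes, one for feature 2 and one for feature 4. Whichever indexing scheme is chosen, training and prediction data should use the same one.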
The parameters control how the model is trained. For regression, for instance, we need to set the kernel function and its parameters, epsilon (if epsilon-SVR is used) and the penalty C.
// Each component sets one group of fields on a shared svm_parameter.
trait SVMParamComponent {
  def update(param: svm_parameter): Unit
}

// Penalty C. Note that this also sets svm_type to C_SVC, so list an
// SVMType component after this one to select a different model type.
class CSVCComponent(val C: Double = 1.0) extends SVMParamComponent {
  override def update(param: svm_parameter): Unit = {
    param.svm_type = svm_parameter.C_SVC
    param.C = C
  }
}

// RBF kernel with width parameter gamma.
class RBFKernelComponent(val gamma: Double = 0.5) extends SVMParamComponent {
  override def update(param: svm_parameter): Unit = {
    param.kernel_type = svm_parameter.RBF
    param.gamma = gamma
  }
}

// Epsilon of the epsilon-SVR loss function (the field p in libsvm).
class EpsilonComponent(val epsilon: Double = 0.1) extends SVMParamComponent {
  override def update(param: svm_parameter): Unit = {
    param.p = epsilon
  }
}

// Model type: 0 = C_SVC, 1 = NU_SVC, 2 = ONE_CLASS, 3 = EPSILON_SVR, 4 = NU_SVR.
class SVMType(val t: Int = 0) extends SVMParamComponent {
  override def update(param: svm_parameter): Unit = {
    require((0 to 4).contains(t), s"svm_type must be between 0 and 4, got $t")
    param.svm_type = t
  }
}

// Runtime settings: kernel cache size (in MB), stopping tolerance eps,
// probability estimates, and whether libsvm prints training output.
class SVMRuntimeConfigComponent(val cacheSize: Int = 1000, val eps: Double = 0.001,
                                val returnProb: Int = 1, val debug: Boolean = false)
    extends SVMParamComponent {
  override def update(param: svm_parameter): Unit = {
    param.cache_size = cacheSize
    param.eps = eps
    param.probability = returnProb
    if (!debug) {
      // Silence libsvm's training output.
      svm.svm_set_print_string_function(new libsvm.svm_print_interface() {
        override def print(s: String): Unit = {}
      })
    }
  }
}
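To give some intuition for what gamma and epsilon control, here is a small sketch of the standard RBF kernel and the epsilon-insensitive loss, written in plain Scala. It is not libsvm's internal code; `SvrIntuition`, `rbf` and `epsInsensitiveLoss` are hypothetical names used only for illustration. C, roughly speaking, weights how strongly errors outside the epsilon tube are penalised during training.

```scala
object SvrIntuition {
  // RBF kernel: exp(-gamma * ||a - b||^2). A larger gamma makes the
  // similarity fall off faster as the two points move apart.
  def rbf(a: Array[Double], b: Array[Double], gamma: Double): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val sqDist = a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
    math.exp(-gamma * sqDist)
  }

  // Epsilon-insensitive loss: errors smaller than epsilon cost nothing,
  // larger errors are penalised linearly.
  def epsInsensitiveLoss(yTrue: Double, yPred: Double, epsilon: Double): Double =
    math.max(0.0, math.abs(yTrue - yPred) - epsilon)
}
```

With epsilon = 0.1, a prediction of 1.05 for a target of 1.0 incurs no loss, while a prediction of 1.5 does. Note that in libsvm the loss epsilon is the field p, whereas eps is the stopping tolerance of the solver.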
Finally, we can write a wrapper to train a model:
// Thin wrapper around libsvm training, prediction and model persistence.
class SVR(val definition: SVMProblem, val paramComponents: Seq[SVMParamComponent]) {
  var model: svm_model = _

  // Apply every parameter component in order, then train.
  def fit(): Unit = {
    val problem = definition.define()
    val param = new svm_parameter()
    paramComponents.foreach(c => c.update(param))
    model = svm.svm_train(problem, param)
  }

  // Predict one value per row of X_new; call fit() or load() first.
  def predict(X_new: DenseMatrix[Double]): DenseVector[Double] = {
    val n = X_new.rows
    val n_features = X_new.cols
    val nodes = Array.ofDim[svm_node](n, n_features)
    for (i <- 0 until n; j <- 0 until n_features) {
      val node = new svm_node()
      node.index = j
      node.value = X_new(i, j)
      nodes(i)(j) = node
    }
    val predictions = Array.ofDim[Double](n)
    for (i <- 0 until n) {
      predictions(i) = svm.svm_predict(model, nodes(i))
    }
    new DenseVector(predictions)
  }

  // Write the trained model to a file; I/O errors are only printed.
  def save(filename: String): Unit = {
    try {
      svm.svm_save_model(filename, model)
    } catch {
      case ex: java.io.IOException => println(ex.getMessage())
    }
  }

  // Read a previously saved model; errors are only printed.
  def load(filename: String): Unit = {
    try {
      model = svm.svm_load_model(filename)
    } catch {
      case ex: Exception => println(ex.getMessage())
    }
  }
}
Now, assuming we have some training and test data, we can train and test a model in the following way:
val problem = new SVMProblem(X_train_scaled, Some(y_train.toDenseVector))
val paramComponentSeq = Seq(
  new CSVCComponent(),
  new RBFKernelComponent(),
  new EpsilonComponent(),
  new SVMType(3), // 3 = svm_parameter.EPSILON_SVR; must come after CSVCComponent
  new SVMRuntimeConfigComponent()
)
val svr = new SVR(problem, paramComponentSeq)
svr.fit()
svr.predict(X_test_scaled)
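The snippet above takes X_train_scaled and X_test_scaled as given and does not show how they were produced. One common option is min-max scaling, with the ranges learned on the training data only and then reused for the test data. The sketch below is an assumption about what such a step could look like, not the scaling actually used here. It works on plain arrays so that it runs without Breeze, and `MinMaxScaling`, `Scaler`, `fit` and `transform` are hypothetical names.

```scala
object MinMaxScaling {
  // Per-column minimum and maximum learned from the training rows.
  final case class Scaler(mins: Array[Double], maxs: Array[Double]) {
    // Map each feature to [0, 1] using the training range; a constant
    // column (max == min) is mapped to 0.0 to avoid division by zero.
    def transform(rows: Array[Array[Double]]): Array[Array[Double]] =
      rows.map { row =>
        row.indices.map { j =>
          val range = maxs(j) - mins(j)
          if (range == 0.0) 0.0 else (row(j) - mins(j)) / range
        }.toArray
      }
  }

  // Learn the column ranges from the training data only.
  def fit(train: Array[Array[Double]]): Scaler = {
    require(train.nonEmpty, "training data must not be empty")
    val cols = train.head.indices
    Scaler(
      cols.map(j => train.map(_(j)).min).toArray,
      cols.map(j => train.map(_(j)).max).toArray
    )
  }
}
```

For example, fitting on the rows (0, 10) and (4, 30) and then transforming the row (2, 20) yields (0.5, 0.5). Test values outside the training range will fall outside [0, 1], which is expected with this approach.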