How to perform 10 fold cross validation with LibSVM in R?
I know that in MATLAB this is really easy ('-v 10').

But I need to do it in R. I did find a comment saying that adding cross = 10 as a parameter would do it, but this is not confirmed in the help file, so I am sceptical about it.

svm(Outcome ~. , data= source, cost = 100, gamma =1, cross=10)
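As a sanity check, the call can be tried on built-in data (iris and Species standing in for your source and Outcome); in e1071, a model fitted with cross > 0 carries the per-fold and total cross-validation accuracies:

```r
# Sketch: e1071's svm() accepts a 'cross' argument for k-fold cross validation.
# iris / Species are stand-ins for the asker's source / Outcome.
library(e1071)

model <- svm(Species ~ ., data = iris, cost = 100, gamma = 1, cross = 10)

summary(model)        # prints total and per-fold CV accuracies
model$tot.accuracy    # overall 10-fold cross-validation accuracy (percent)
model$accuracies      # accuracy of each of the 10 folds
```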

Any examples of a successful SVM script for R would also be appreciated, as I am still running into some dead ends.

Edit: I forgot to mention outside of the tags that I use the libsvm package for this.

Trilateration answered 12/11, 2012 at 16:26 Comment(2)
the caret package may prove useful for you. It has extensive vignettes and the ability to fit many different models through a common interface (the train function). – Twandatwang
tune in the e1071 package seems to be similar, and I try to minimize the number of packages I use, so I will try this, but I am still hoping for more replies. – Trilateration
I am also trying to perform a 10-fold cross validation. I don't think tune is the right way to do it, since that function is used to optimize the parameters, not to train and test the model.

I have the following code to perform a leave-one-out (LOO) cross validation. Suppose dataset is a data.frame with your data stored. In each LOO step, the observed vs. predicted table is accumulated, so that at the end result contains the global observed vs. predicted matrix.

#LOO validation
library(e1071)   # provides svm() and classAgreement()
result <- 0      # accumulates the observed vs. predicted table
for (i in 1:nrow(dataset)){   # nrow(), not length(): length() of a data.frame counts columns
    fit <- svm(classes ~ ., data=dataset[-i,], type='C-classification', kernel='linear')
    pred <- predict(fit, dataset[i,])
    result <- result + table(true=dataset[i,]$classes, pred=pred)
}
classAgreement(result)

So in order to perform a 10-fold cross validation, I guess we should manually partition the dataset and use the folds to train and test the model.

# 10-fold cross validation; getFoldTrainSet and getFoldTestSet are
# placeholders for your own fold-partitioning functions
results <- list()
for (i in 1:10){
    train <- getFoldTrainSet(dataset, i)
    test <- getFoldTestSet(dataset, i)
    fit <- svm(classes ~ ., data=train, type='C-classification', kernel='linear')
    pred <- predict(fit, test)
    results[[i]] <- table(true=test$classes, pred=pred)
}
# compute mean accuracies and kappas using results, which stores the result of each fold

I hope this helps you.

Celebrity answered 13/11, 2012 at 16:43 Comment(0)
Here is a simple way to create 10 test and training folds using no packages:

#Randomly shuffle the data
yourData <- yourData[sample(nrow(yourData)), ]

#Create 10 equally sized folds
folds <- cut(seq(1, nrow(yourData)), breaks=10, labels=FALSE)

#Perform 10-fold cross validation
for(i in 1:10){
    #Segment your data by fold using the which() function
    testIndexes <- which(folds == i)
    testData <- yourData[testIndexes, ]
    trainData <- yourData[-testIndexes, ]
    #Use the test and train data however you desire...
}
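To make the loop body concrete, here is a hedged sketch that fills it in with an e1071 svm() fit and averages the per-fold held-out accuracy; iris and its Species column are stand-ins for yourData and your class column:

```r
# Sketch: 10-fold CV on top of the fold-creation idiom above.
# Assumes e1071 is installed; iris / Species stand in for yourData / your class column.
library(e1071)

yourData <- iris[sample(nrow(iris)), ]                           # shuffle rows
folds <- cut(seq(1, nrow(yourData)), breaks=10, labels=FALSE)    # fold labels 1..10

accuracies <- numeric(10)
for (i in 1:10){
    testIndexes <- which(folds == i)
    testData <- yourData[testIndexes, ]
    trainData <- yourData[-testIndexes, ]
    fit <- svm(Species ~ ., data=trainData, type='C-classification', kernel='linear')
    accuracies[i] <- mean(predict(fit, testData) == testData$Species)
}
mean(accuracies)  # average held-out accuracy across the 10 folds
```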
Gopak answered 4/7, 2014 at 20:18 Comment(0)
Here is my generic code to run a k-fold cross validation, using cvsegments from the pls package to generate the index folds.

# k-fold cross validation
library(e1071)   # provides svm() and classAgreement()
library(pls)     # provides cvsegments()
set.seed(1)
k <- 80
result <- 0
folds <- cvsegments(nrow(imDF), k)
for (fold in 1:k){
    currentFold <- folds[[fold]]
    fit <- svm(classes ~ ., data=imDF[-currentFold,], type='C-classification', kernel='linear')
    pred <- predict(fit, imDF[currentFold,])
    result <- result + table(true=imDF[currentFold,]$classes, pred=pred)
}
classAgreement(result)
Celebrity answered 14/11, 2012 at 18:6 Comment(0)
