R - Calculate Test MSE given a trained model from a training set and a test set

Given two simple sets of data:

    head(training_set)
      x         y
    1 1  2.167512
    2 2  4.684017
    3 3  3.702477
    4 4  9.417312
    5 5  9.424831
    6 6 13.090983

    head(test_set)
      x        y
    1 1 2.068663
    2 2 4.162103
    3 3 5.080583
    4 4 8.366680
    5 5 8.344651

I want to fit a linear regression line on the training data, and then use that line (or its coefficients) to compute the "test MSE", i.e. the mean squared error of the residuals when the fitted line is applied to the test data.

    model <- lm(y ~ x, data = training_set)
    train_MSE <- mean(model$residuals ^ 2)
    test_MSE <- ?
Moneymaker answered 1/10, 2016 at 21:33

In this case, it is more precise to call it MSPE (mean squared prediction error):

    mean((test_set$y - predict(model, newdata = test_set)) ^ 2)

This is a more useful measure, since the point of fitting a model is usually prediction: we want the model with the smallest MSPE.
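
For a complete picture, here is a self-contained sketch. The simulated training_set and test_set below are stand-ins (an assumption), since the full original data are not shown:

    ## Simulated stand-ins for the original data: y roughly linear in x
    ## with Gaussian noise, as the head() output above suggests
    set.seed(1)
    training_set <- data.frame(x = 1:20)
    training_set$y <- 2 * training_set$x + rnorm(20, sd = 2)
    test_set <- data.frame(x = 1:20)
    test_set$y <- 2 * test_set$x + rnorm(20, sd = 2)

    model <- lm(y ~ x, data = training_set)

    ## training MSE: average squared residual on the fitting data
    train_MSE <- mean(residuals(model) ^ 2)

    ## test MSPE: average squared prediction error on unseen data
    test_MSPE <- mean((test_set$y - predict(model, newdata = test_set)) ^ 2)

The test MSPE will typically come out somewhat larger than the training MSE, because the coefficients were tuned to the training sample.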

In practice, if we have a held-out test data set, we can compute the MSPE directly as above. Very often, however, there is no spare data; in that case, leave-one-out cross-validation provides an estimate of the MSPE from the training dataset alone.
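
For an lm fit, leave-one-out cross-validation does not require refitting the model n times: the i-th leave-one-out residual is the ordinary residual divided by 1 - h_i, where h_i is the i-th leverage (hat value). A minimal sketch, reusing the model fitted above:

    ## LOOCV estimate of MSPE from the training data alone,
    ## via the closed-form hat-value shortcut for linear models
    loocv_MSPE <- mean((residuals(model) / (1 - hatvalues(model))) ^ 2)

This quantity is the PRESS statistic divided by n, and it avoids n separate refits.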

There are also several other statistics for assessing prediction error, such as Mallows's Cp and AIC.
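
As a small illustration of the AIC route (the quadratic competitor below is purely hypothetical, just to have a second model to compare):

    ## Base R computes AIC directly; lower is better when comparing
    ## models fit to the same response
    model2 <- lm(y ~ poly(x, 2), data = training_set)  # hypothetical rival model
    AIC(model, model2)

AIC trades goodness of fit against the number of parameters, so the extra coefficient in model2 is only rewarded if it buys enough likelihood.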

Midrash answered 1/10, 2016 at 21:36 Comment(3)
Tranquillity: (+1) But what's the point about APSE? I've never heard of that (though I can guess the reason for calling it average instead of mean).
Moneymaker: So MSPE is analogous to the mean of the squared residuals?
Excrescency: @李哲源 Could you point me to a reference that explains how to compute the expected value of the MSPE?
