How to obtain RMSE out of lm result?
Asked Answered
E

5

24

I know there is a small difference between $sigma and the concept of root mean squared error. So, i am wondering what is the easiest way to obtain RMSE out of lm function in R?

res<-lm(randomData$price ~randomData$carat+
                     randomData$cut+randomData$color+
                     randomData$clarity+randomData$depth+
                     randomData$table+randomData$x+
                     randomData$y+randomData$z)

length(coefficients(res))

contains 24 coefficient, and I cannot make my model manually anymore. So, how can I evaluate the RMSE based on coefficients derived from lm?

Experiment answered 30/3, 2017 at 16:27 Comment(1)
strongly recommend lm(price ~ carat + color + ..., data=randomData): easier to read, works with downstream methods such as predict().Porush
S
38

Residual sum of squares:

RSS <- c(crossprod(res$residuals))

Mean squared error:

MSE <- RSS / length(res$residuals)

Root MSE:

RMSE <- sqrt(MSE)

Pearson estimated residual variance (as returned by summary.lm):

sig2 <- RSS / res$df.residual

Statistically, MSE is the maximum likelihood estimator of residual variance, but is biased (downward). The Pearson one is the restricted maximum likelihood estimator of residual variance, which is unbiased.


Remark

  • Given two vectors x and y, c(crossprod(x, y)) is equivalent to sum(x * y) but much faster. c(crossprod(x)) is likewise faster than sum(x ^ 2).
  • sum(x) / length(x) is also faster than mean(x).
Stockjobber answered 30/3, 2017 at 16:35 Comment(5)
The code as it is in the first line cannot be run. There is 3 ')' but only 2 '('. I have tried to run without the last ')' but didn't work. it gets the following error: "unused argument (crossprod(fit2$residuals))"Kerrin
@AndréCosta Just see that you have another issue. You must have defined your own c function in your R session (which is a bad idea), masking base::c for vector concatenation. Here is how I reproduce your error: c <- function () NULL and then c(crossprod(1:10))Stockjobber
My c function was fine. Maybe something as change in some version of R. For me the way worked was crossprod(1:10)[1]. (btw, it wasn't me who upvote, hahaha)Kerrin
@AndréCosta There are other alternatives like drop(crossprd(1:10)). Ah, sorry for that. Maybe that person also voted on the other answer here :)Stockjobber
You can also calculate MSE=mean(res$residuals^2)Ind
V
32

To get the RMSE in one line, with just functions from base, I would use:

sqrt(mean(res$residuals^2))
Vasyuta answered 26/7, 2018 at 2:28 Comment(0)
G
6

I think the other answers might be incorrect. The MSE of regression is the SSE divided by (n - k - 1), where n is the number of data points and k is the number of model parameters.

Simply taking the mean of the residuals squared (as other answers have suggested) is the equivalent of dividing by n instead of (n - k - 1).

I would calculate RMSE by sqrt(sum(res$residuals^2) / res$df).

The quantity in the denominator res$df gives you the degrees of freedom, which is the same as (n - k - 1). Take a look at this for reference: https://www3.nd.edu/~rwilliam/stats2/l02.pdf

Gastroscope answered 7/11, 2018 at 19:33 Comment(2)
If you look, you'll find plenty of other sources defining RMSE as exactly what it sounds like, the root mean squared error. Square the errors, take the mean, take the square root. The wikipedia page, for example, doesn't mention anything about degrees of freedom or model parameters.Lonnielonny
Wikipedia covers it here: en.wikipedia.org/wiki/Mean_squared_error#Regression. The MSE definition with the degrees of freedom is also in my stats textbook by Hayter, "Probability and Statistics for Engineers and Scientists."Gastroscope
T
0

Just do

sigma(res) 

An you got it

Timisoara answered 21/10, 2019 at 11:43 Comment(1)
sigma() returns the "the estimated standard deviation of the errors (“residual standard deviation”) for Gaussian models" (help(sigma), not the root mean squared error.Dubuffet
N
0

Checkout the rmse() function in the Metrics package

Numbers answered 24/4, 2022 at 21:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.