Creating a loop through a list of variables for an LM model in R

I am trying to fit multiple linear regression models from a list of variable combinations (I also have them separately as a data frame, if that is more useful).

The list of variables looks like this:

Vars
x1+x2+x3
x1+x2+x4
x1+x2+x5
x1+x2+x6
x1+x2+x7

The loop I'm using looks like this:

for (i in 1:length(var_list)){
  lm(independent_variable ~ var_list[i],data = training_data)
  i+1
}

However, lm does not recognize the string in var_list[i] (x1+x2+x3, etc.) as model input.

Does anyone know how to fix it?

Thanks for your help.

Nomarch answered 3/12, 2019 at 16:12 Comment(1)

You don't even need a loop; lapply should work nicely.

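# Simulated data: 8 rows, one response column and predictors x1..x7
training_data <- as.data.frame(matrix(sample(1:64), nrow = 8))
colnames(training_data) <- c("independent_variable", paste0("x", 1:7))

# Right-hand sides of the formulas, one string per model
Vars <- as.list(c("x1+x2+x3",
                  "x1+x2+x4",
                  "x1+x2+x5",
                  "x1+x2+x6",
                  "x1+x2+x7"))

# Build one formula per string, then fit one model per formula
allModelsList <- lapply(paste("independent_variable ~", Vars), as.formula)
allModelsResults <- lapply(allModelsList, function(x) lm(x, data = training_data))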

If you need the model summaries, you can add:

allModelsSummaries <- lapply(allModelsResults, summary)

For example, you can access the coefficient of determination R² of the model lm(independent_variable ~ x1+x2+x3) like this:

allModelsSummaries[[1]]$r.squared
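
To compare all the fits at a glance, the R² values can also be pulled out in one pass. This is a minimal sketch, assuming the same kind of simulated training_data as above; the set.seed call and the names rhs, models and r_squared are illustrative additions, not part of the original code.

```r
# Assumed setup mirroring the data above (seed added only for reproducibility)
set.seed(1)
training_data <- as.data.frame(matrix(sample(1:64), nrow = 8))
colnames(training_data) <- c("independent_variable", paste0("x", 1:7))

# Hypothetical name for the right-hand sides of the formulas
rhs <- c("x1+x2+x3", "x1+x2+x4", "x1+x2+x5", "x1+x2+x6", "x1+x2+x7")

# Fit one model per right-hand side and label each model by its string
models <- lapply(rhs, function(r) {
  lm(as.formula(paste("independent_variable ~", r)), data = training_data)
})
names(models) <- rhs

# Named numeric vector holding one R-squared per model
r_squared <- sapply(models, function(m) summary(m)$r.squared)
print(r_squared)
```

Naming the list by formula string keeps each R² value attached to the model it came from, which makes sorting or filtering the results straightforward.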

I hope it helps.

Douzepers answered 3/12, 2019 at 17:8 Comment(0)

We can create the formula with paste:

# Pre-allocate a list to store one fitted model per formula
out <- vector('list', length(var_list))

for (i in seq_along(var_list)) {
  out[[i]] <- lm(paste('independent_variable', '~', var_list[i]),
                 data = training_data)
}

Alternatively, inside the same loop, the formula can be built with reformulate:

lm(reformulate(var_list[i], 'independent_variable'), data = training_data)
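
For completeness, here is a sketch of the whole reformulate loop. It assumes var_list is a plain character vector and reuses simulated data; the data setup and the names out_ref and var_as_list are illustrative assumptions. reformulate stops with "'termlabels' must be a character vector of length at least one" whenever its first argument is not a character vector, so one plausible cause of that error is a var_list stored as a list (or data frame) and indexed with single brackets.

```r
# Assumed setup (seed added only for reproducibility)
set.seed(1)
training_data <- as.data.frame(matrix(sample(1:64), nrow = 8))
colnames(training_data) <- c("independent_variable", paste0("x", 1:7))

# Assumption: var_list is a character vector of right-hand sides
var_list <- c("x1+x2+x3", "x1+x2+x4", "x1+x2+x5")

out_ref <- vector("list", length(var_list))
for (i in seq_along(var_list)) {
  # Builds a formula such as independent_variable ~ x1 + x2 + x3
  f <- reformulate(var_list[i], response = "independent_variable")
  out_ref[[i]] <- lm(f, data = training_data)
}

# With a list, single brackets return a list, not a character vector;
# double brackets extract the string itself
var_as_list <- as.list(var_list)
is.character(var_as_list[1])    # FALSE
is.character(var_as_list[[1]])  # TRUE
```

If the combinations live in a list, indexing with var_list[[i]] (or calling unlist on it first) gives reformulate the character input it expects.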
Triolein answered 3/12, 2019 at 16:13 Comment(7)
Thank you for the quick answer. For the first option it says: object independent_variable not found. And for the second option it says: 'termlabels' must be a character vector of length at least one. - Nomarch
@CL It is inside the for loop. Also, I didn't get the last part of your message. - Triolein
I'm running the code in the same for loop as previously: ` for (i in 1:length(var_list)){ lm(paste(independent_variable, "~", var_list[i]),data = training_data) i+1 } ` - Nomarch
@CL I thought it was an object name. I guess it should be quoted if it is a column name. Updated the post. - Triolein
Perfect, it runs now but doesn't seem to produce any visible output. Is there something I'm missing? Thanks again for your help with this. - Nomarch
@CL You are not storing the output. Please check the update. - Triolein
Brilliant, it works really well. Thanks for the quick reply and helpful updates. - Nomarch
