number of rows in use has changed: remove missing values?
I have been trying to do stepwise selection on my variables with R. This is my code:

library(lattice) #to get the matrix plot, assuming this package is already installed
library(ftsa)    #to get the out-of-sample performance metrics, assuming this package is already installed
library(car) 

mydata=read.csv("C:/Users/jgozal1/Desktop/Multivariate Project/Raw data/FINAL_alldata_norowsunder90_subgroups.csv")

names(mydata)
str(mydata)

mydata$country_name=NULL
mydata$country_code=NULL
mydata$year=NULL
mydata$Unemployment.female....of.female.labor.force...modeled.ILO.estimate.=NULL
mydata$Unemployment.male....of.male.labor.force...modeled.ILO.estimate.=NULL
mydata$Life.expectancy.at.birth.male..years.= NULL
mydata$Life.expectancy.at.birth.female..years. = NULL

str(mydata)

Full_model = lm(mydata$Fertility.rate.total..births.per.woman. +
                mydata$Immunization.DPT....of.children.ages.12.23.months. +
                mydata$Immunization.measles....of.children.ages.12.23.months. +
                mydata$Life.expectancy.at.birth.total..years. +
                mydata$Mortality.rate.under.5..per.1000.live.births. +
                mydata$Improved.sanitation.facilities....of.population.with.access. ~
                mydata$Primary.completion.rate.female....of.relevant.age.group. +
                mydata$School.enrollment.primary....gross. +
                mydata$School.enrollment.secondary....gross. +
                mydata$School.enrollment.tertiary....gross. +
                mydata$Internet.users..per.100.people. +
                mydata$Primary.completion.rate.male....of.relevant.age.group. +
                mydata$Mobile.cellular.subscriptions..per.100.people. +
                mydata$Foreign.direct.investment.net.inflows..BoP.current.US.. +
                mydata$Unemployment.total....of.total.labor.force...modeled.ILO.estimate.,
                data = mydata)

summary(Full_model) #this provides the summary of the model

Reduced_model = lm(mydata$Fertility.rate.total..births.per.woman. +
                   mydata$Immunization.DPT....of.children.ages.12.23.months. +
                   mydata$Immunization.measles....of.children.ages.12.23.months. +
                   mydata$Life.expectancy.at.birth.total..years. +
                   mydata$Mortality.rate.under.5..per.1000.live.births. +
                   mydata$Improved.sanitation.facilities....of.population.with.access. ~ 1,
                   data = mydata)

step(Reduced_model,scope=list(lower=Reduced_model, upper=Full_model), direction="forward", data=mydata)

step(Full_model, direction="backward", data=mydata)

step(Reduced_model,scope=list(lower=Reduced_model, upper=Full_model), direction="both", data=mydata)

This is the link to the dataset that I am using: http://speedy.sh/YNXxj/FINAL-alldata-norowsunder90-subgroups.csv

After setting the scope for my stepwise selection I get this error:

Error in step(Reduced_model, scope = list(lower = Reduced_model, upper = Full_model), :
  number of rows in use has changed: remove missing values?
In addition: Warning messages:
1: In add1.lm(fit, scope$add, scale = scale, trace = trace, k = k, :
  using the 548/734 rows from a combined fit
2: In add1.lm(fit, scope$add, scale = scale, trace = trace, k = k, :
  using the 548/734 rows from a combined fit

I have looked at other posts with the same error, and the solution is usually to omit the NAs from the data, but that hasn't solved my problem and I am still getting exactly the same error.
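For context, the error arises because lm() silently drops rows with NAs in whichever columns a given model uses, so different candidate models in the stepwise search can end up fitted to different numbers of rows, and their AICs are then not comparable. A minimal sketch with synthetic data (not the question's dataset) reproduces the situation and the usual fix of dropping incomplete rows once, before fitting either model:

```r
# Synthetic illustration: NAs in one predictor cause the full and reduced
# models to be fitted on different row counts, which breaks step().
set.seed(1)
df <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100))
df$x2[sample(100, 30)] <- NA          # NAs in one predictor only

reduced <- lm(y ~ 1, data = df)       # fitted on all 100 rows
full    <- lm(y ~ x1 + x2, data = df) # fitted on the 70 complete rows

# step(reduced, scope = list(lower = reduced, upper = full))
# -> warns/errors: "number of rows in use has changed"

# Fix: drop incomplete rows once, up front, so every candidate model
# sees exactly the same rows.
df_cc   <- na.omit(df)                # 70 complete rows
reduced <- lm(y ~ 1, data = df_cc)
full    <- lm(y ~ x1 + x2, data = df_cc)
step(reduced, scope = list(lower = reduced, upper = full),
     direction = "forward")
```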

Verjuice answered 18/4, 2015 at 18:11 Comment(2)
I'm not sure what you did, but doing mydata <- na.omit(mydata) before fitting the full model makes the error go away in my test. (Though it does reduce the number of observations from 825 to 548.) – Principality
Thank you MrFlick. It worked for me too. I put the na.omit() when reading the data, and also when referring to the data in the full and reduced models. It's interesting that my earlier ways of omitting NAs also reduced the number of observations, yet still produced the error. – Verjuice

What worked for me was to use MrFlick's suggestion in the data parameter of the reduced model, i.e.:

model_reduced <- lm(y ~ ., data = na.omit(data_subset))

It also retained far more observations than wrapping the entire data frame in na.omit() would have.
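The reason more observations survive is that na.omit() drops a row for an NA in any column of whatever you pass it. Applied to a subset containing only the model's variables, it only drops rows that are incomplete for those variables. A small synthetic sketch (the column names here are made up for illustration):

```r
# Rows are dropped only for NAs in the columns actually passed to na.omit(),
# so subsetting to the model's variables first keeps more observations.
set.seed(2)
df <- data.frame(y = rnorm(50), x = rnorm(50), unused = rnorm(50))
df$unused[1:20] <- NA                 # NAs only in a column the model ignores

nrow(na.omit(df))                     # 30: drops rows for the unused column too

data_subset <- df[, c("y", "x")]
nrow(na.omit(data_subset))            # 50: every (y, x) pair is complete

model_reduced <- lm(y ~ ., data = na.omit(data_subset))
```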

Conchita answered 25/2, 2023 at 1:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.