I am handling missing data using imputation. I am exploring the Amelia and rms packages for imputation. I have some queries regarding these packages.
I want to combine the imputed data sets from Amelia. I see that Amelia has a function, `mi.meld`, which combines the results from multiple imputation. However, I want to combine the data sets first and then train different models. I am not sure whether combining the data sets and then training a model on the combined data is the right approach. I want to do this because my test data also has missing values, and I want to impute them so that I can use the test data for prediction. This is what I have so far:

    library(Amelia)
    library(rpart)
    # impute is the output of amelia() on the training data
    b.out <- NULL
    se.out <- NULL
    for (i in 1:impute$m) {
      model <- rpart(Y ~ X1 + X2 + X3 + X4 + X5, data = impute$imputations[[i]],
                     method = "anova", control = rpart.control(cp = 0.001))
      # Note: rpart objects have no coefficient table, so these two lines
      # only work for models such as lm() that provide one
      b.out <- rbind(b.out, model$coef)
      se.out <- rbind(se.out, coef(summary(model))[, 2])
    }
    combined.results <- mi.meld(q = b.out, se = se.out)
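I am also using the rms package for this purpose. I wanted to confirm: does the `aregImpute` function (which comes from Hmisc, a dependency of rms) combine the imputed data sets?

    library(rms)  # also loads Hmisc, which provides aregImpute
    impute <- aregImpute(Y ~ X1 + X2 + X3 + X4 + X5, data = train_data, n.impute = 5, nk = 0)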
Does anyone have suggestions on how I can combine multiple imputed data sets into one data set?
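
To make the question concrete, here is a minimal sketch of two things "combining" could mean, using only base R. The list `imputations` is a hypothetical stand-in for `impute$imputations` with made-up values, and the `.imp` column name is just an illustrative choice. I am not sure either option is statistically sound; in particular, averaging discards the between-imputation variability that `mi.meld` is meant to account for.

```r
# Hypothetical stand-in for impute$imputations: three completed data sets
# that differ only in the imputed value of X1 in row 2 (made-up numbers)
imputations <- list(
  data.frame(Y = c(1, 2, 3), X1 = c(10, 11, 30)),
  data.frame(Y = c(1, 2, 3), X1 = c(10, 13, 30)),
  data.frame(Y = c(1, 2, 3), X1 = c(10, 15, 30))
)

# Option 1: stack all completed data sets into one long data frame,
# keeping track of which imputation each row came from
stacked <- do.call(rbind, imputations)
stacked$.imp <- rep(seq_along(imputations), each = nrow(imputations[[1]]))

# Option 2: average the completed data sets cell by cell
# (assumes every column is numeric and rows are in the same order)
averaged <- Reduce(`+`, imputations) / length(imputations)
```

With real Amelia output, `imputations` would be replaced by `impute$imputations`; Option 2 would need extra handling for factor columns.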