Test for Multicollinearity in Panel Data R

I am running a panel data regression using the plm package in R and want to check for multicollinearity between the explanatory variables.
I know there is the vif() function in the car package; however, as far as I know, it cannot deal with panel data model output.
The plm package offers other diagnostics, such as unit root tests, but I found no method for assessing multicollinearity.

Is there a way to compute something similar to the VIF, or can I simply treat each variable as a time series, ignore the panel structure, and run the tests from the car package?

I cannot disclose the data, but the problem should be relevant to all panel data models.
The data set has roughly 1,000 observations over 50 time periods.
The code I use looks like this:

pdata <- pdata.frame(RegData, index=c("id","time"))
fixed <- plm(Y~X, data=pdata, model="within")

and then

vif(fixed) 

returns an error.

Medication answered 29/11, 2013 at 8:12 Comment(3)
I don't know an R function for the VIF in panel data, but you can always look at the correlations between the explanatory variables to get a good idea. Probably the more balanced the design, the better the picture you get. - Lordling
Thank you for the idea, @Lordling. But is it valid to use correlations between panel data variables without considering their panel nature? Wouldn't this create some distortion in the result? - Medication
This is my gut feeling, but I would say it is valid when you correlate the variables measured at the same time point. At least to get a general impression of whether you have multicollinearity issues. - Lordling

This question has also been asked for other statistical packages, such as SAS (https://communities.sas.com/thread/47675) and Stata (http://www.stata.com/statalist/archive/2005-08/msg00018.html), and the common answer is to fit a pooled model and compute the VIF from it. The logic is that multicollinearity concerns only the independent variables, so there is no need to control for individual effects using panel methods.

Here's some code extracted from another site:

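library(plm)  # pdata.frame(), plm()
library(car)  # vif()

mydata <- read.csv("US Panel Data.csv")
pdata <- pdata.frame(mydata, index = c("id", "t"))

# Pooled model: no individual effects, so the VIF reflects only the regressors.
# attach() and Y = cbind(Return) from the original snippet are not needed;
# the response column is referenced directly in the formula.
model <- plm(Return ~ ESG + Beta + Market.Cap + PTBV + Momentum +
               Dummy1 + Dummy2 + Dummy3 + Dummy4 + Dummy5 +
               Dummy6 + Dummy7 + Dummy8 + Dummy9,
             data = pdata, model = "pooling")
vif(model)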
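
Since the VIF depends only on the regressors, it can also be sketched by hand in base R, without plm or car. The example below is an illustration on simulated data, not the poster's data: the data frame `panel`, the columns `x1`, `x2`, `x3`, and the helper `manual_vif()` are all hypothetical names. It uses the textbook definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing regressor j on the remaining regressors.

```r
# Simulated panel regressors; every name and value here is made up for illustration.
set.seed(1)
n_id <- 20
n_t <- 5
panel <- data.frame(
  id = rep(seq_len(n_id), each = n_t),
  t  = rep(seq_len(n_t), times = n_id),
  x1 = rnorm(n_id * n_t)
)
panel$x2 <- 0.9 * panel$x1 + rnorm(n_id * n_t, sd = 0.3)  # built to correlate with x1
panel$x3 <- rnorm(n_id * n_t)                              # independent of x1 and x2

# Hypothetical helper: VIF_j = 1 / (1 - R_j^2), with R_j^2 taken from an
# auxiliary pooled regression of column j on all other columns of X.
manual_vif <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    aux <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
    1 / (1 - summary(aux)$r.squared)
  })
}

vifs <- manual_vif(panel[, c("x1", "x2", "x3")])
print(round(vifs, 2))
```

The response variable never enters the calculation, which is the point made above: the panel structure of the outcome does not matter for this diagnostic. The id and t columns are left out of the regressor set on purpose.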
Selinaselinda answered 29/11, 2013 at 17:48 Comment(1)
Thanks @Rfan! This is exactly the answer to my question. - Medication
