Hausman type test in R

I have been using the "plm" package in R to analyze panel data. One of the important tests in this package for choosing between a "fixed effects" and a "random effects" model is the Hausman-type test. A similar test is also available in Stata. Stata requires the fixed effects model to be estimated first, followed by the random effects model. However, I didn't see any such restriction in the "plm" package, so I was wondering whether "plm" assumes by default that the first model is "fixed effects" and the second is "random effects". For reference, the steps I followed in Stata and R are given below.

Stata steps (data = mydata, y = dependent variable, X1 to X4 = explanatory variables):
    * step 1: estimate the FE model
    xtreg y X1 X2 X3 X4, fe
    * step 2: store the estimates
    est store fixed
    * step 3: estimate the RE model
    xtreg y X1 X2 X3 X4, re
    * step 4: store the estimates
    est store random
    * step 5: run the Hausman test
    hausman fixed random

# R steps (data = mydata, y = dependent variable, X1 to X4 = explanatory variables)
library(plm)
# step 1: estimate the FE model
fe <- plm(y ~ X1 + X2 + X3 + X4, data = mydata, model = "within")
summary(fe)
# step 2: estimate the RE model
re <- pggls(y ~ X1 + X2 + X3 + X4, data = mydata, model = "random")
summary(re)
# step 3: run the Hausman test
phtest(fe, re)
Grassland answered 20/10, 2012 at 15:32 Comment(2)
RoyalTS seems to have answered your question. Do you really want to use the test, though? It's not the most reliable indicator of whether to use FE or RE (ref).Salzman
Thanks for the paper. However, we still have robust Hausman tests (xtoverid and Wooldridge 2002) in Stata. The paper you mentioned doesn't discuss these tests, and I am not sure whether they are available in the plm package for R.Grassland

Update: Be sure to read the comments. Original answer below.

Trial-and-error way of finding this out:

> library(plm)
> data("Gasoline", package = "plm")
> form <- lgaspcar ~ lincomep + lrpmg + lcarpcap
> wi <- plm(form, data = Gasoline, model = "within")
> re <- plm(form, data = Gasoline, model = "random")
> phtest(wi, re)

    Hausman Test

data:  form 
chisq = 302.8037, df = 3, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent 

> phtest(re, wi)

    Hausman Test

data:  form 
chisq = 302.8037, df = 3, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent

As you can see, the test yields the same result no matter which of the models you feed it as the first and which as the second argument.

Aryl answered 20/10, 2012 at 15:51 Comment(14)
Thanks for the clarification. However, the order does matter in Stata. In one of my data sets, I got a negative chi-square value when I specified fe first and re second. Reversing the order gives a positive value (the magnitudes are, of course, the same). Stata has the sigmamore option to resolve the negative value (though the magnitude then differs). I didn't find such an option in the "plm" package. Fortunately, neither using sigmamore nor reversing the order changes the qualitative result; I found support for "fe" either way. The point is that the "plm" package may be ruling out the negative value.Grassland
Sorry, my earlier explanation was just wrong. The test statistic is not just a sum of squares but a weighted sum of squares, and since some of these weights can be negative, so can the test statistic. Check out the source code of the phtest function (just type plm:::phtest.panelmodel into the R console) and you'll see that the statistic R computes will always be positive simply because it takes the absolute value at the end (stat <- abs(t(dbeta) %*% solve(dvcov) %*% dbeta)). So, as far as the R implementation is concerned, the order of the arguments doesn't matter.Aryl
As to whether what the R routine is doing is correct, I don't know. It does seem slightly dodgy to me to ignore the fact that the matrix isn't positive definite and take an absolute value at the end.Aryl
Wow. Taking the absolute value in R package code is just plain wrong, and sweeps the problem under the carpet. Stata's hausman is too generic, and is coded to be agnostic of the specific estimation situation you are in -- you may be comparing OLS and IV, or OLS and GLS, or something like that, and hausman does not need or want to know about this. Hence it is your responsibility to specify the results in the order assumed (and documented) by hausman. There used to be xthaus that was specific to panel data, but it is considered obsolete now.Ancestress
FWIW, I've written the package maintainers an email about this and suggested they at the very least throw a warning about the non-positive definite matrix.Aryl
Stask: Stata does advise the user to pass the models to hausman in the order "always consistent" first and "efficient under H0" second. For panel data, this means "fe" first and "re" second.Grassland
RoyalTS: Did you also suggest that the author include an option (like sigmamore) to deal with the non-positive definite matrix?Grassland
@user1493368: I did. Let's wait and see what comes of this.Aryl
FYI, I never got an answer from the package maintainer and the current version of plm still does this wrong.Aryl
Update: The package maintainer has gotten back to me and promised a fix for the next version.Aryl
Hello. Any update on the absolute value issue? I am comparing an FE model with a GMM model, and it looks like the code still takes the absolute value.Emarie
@Bob: May I suggest you write the package maintainer another email?Aryl
There is some justification for using the absolute value, given the analysis in Schreiber (2008), "The Hausman Test Statistic can be Negative even Asymptotically".Econah
-1: If you realize that your answer was wrong, then you should delete it and then create a new answer with your correct solution. It does not help users to leave an answer on the site that you realize is wrong. (You might lose some reputation points for doing that, but I think it would be the more honourable thing to do.) Sending readers to the comments is not an efficient way to work with Stack Overflow.Noisette
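
The quadratic form quoted in the comments above can be reproduced in a few lines of base R. The following is only a rough sketch: hausman_stat is a hypothetical helper (it is not part of plm), the coefficients and covariance matrices are made-up numbers, and the abs() call is deliberately left out so that the sign of the statistic stays visible.

```r
# Hypothetical helper (not part of plm): Hausman statistic without abs().
hausman_stat <- function(b1, V1, b2, V2) {
  common <- intersect(names(b1), names(b2))   # coefficients present in both models
  dbeta  <- b1[common] - b2[common]           # difference of the estimates
  dvcov  <- V1[common, common, drop = FALSE] -
            V2[common, common, drop = FALSE]  # difference of the covariance matrices
  stat   <- as.numeric(t(dbeta) %*% solve(dvcov) %*% dbeta)
  eig    <- eigen(dvcov, symmetric = TRUE, only.values = TRUE)$values
  list(statistic = stat, df = length(dbeta), positive_definite = all(eig > 0))
}

# Made-up numbers, for illustration only.
nm   <- c("X1", "X2")
b_fe <- c(X1 = 1.0, X2 = 0.5)
b_re <- c(X1 = 0.8, X2 = 0.6)
V_fe <- matrix(c(0.04, 0, 0, 0.09), 2, dimnames = list(nm, nm))
V_re <- matrix(c(0.01, 0, 0, 0.04), 2, dimnames = list(nm, nm))

fe_first <- hausman_stat(b_fe, V_fe, b_re, V_re)
re_first <- hausman_stat(b_re, V_re, b_fe, V_fe)

fe_first$statistic          # positive in this toy case
re_first$statistic          # same magnitude, opposite sign
re_first$positive_definite  # FALSE: the covariance difference is not positive definite
```

In this toy case the negative value comes purely from swapping the arguments, which is the behaviour described for Stata above. With real data, the difference of the covariance matrices can fail to be positive definite in either order. For fitted plm models, the inputs would presumably come from coef() and vcov(), for example hausman_stat(coef(fe), vcov(fe), coef(re), vcov(re)).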
