Using and interpreting output from gvlma [closed]

Asked 6/4, 2017 at 10:26 Answered 10/3, 2020 at 21:24

I want to test whether all assumptions for my linear regression model hold. I did this manually and it seems to be fine. However, I want to double check with the function gvlma. The output I get is:

 gvlma(x = m_lag) 

                Value p-value                   Decision
 Global Stat        82.475 0.00000 Assumptions NOT satisfied!
 Skewness           72.378 0.00000 Assumptions NOT satisfied!
 Kurtosis            1.040 0.30778    Assumptions acceptable.
 Link Function       6.029 0.01407 Assumptions NOT satisfied!
 Heteroscedasticity  3.027 0.08187    Assumptions acceptable.

My question is:

How do I interpret Global Stat
Since the assumption is violated, what can I do about it now? (Same with the other 2 assumptions which were not accepted)

Latchet answered 6/4, 2017 at 10:26 Comment(0)

Global Stat- Are the relationships between your X predictors and Y roughly linear?. Rejection of the null (p < .05) indicates a non-linear relationship between one or more of your X’s and Y
Skewness - Is your distribution skewed positively or negatively, necessitating a transformation to meet the assumption of normality? Rejection of the null (p < .05) indicates that you should likely transform your data.
Kurtosis- Is your distribution kurtotic (highly peaked or very shallowly peaked), necessitating a transformation to meet the assumption of normality? Rejection of the null (p < .05) indicates that you should likely transform your data.
Link Function- Is your dependent variable truly continuous, or categorical? Rejection of the null (p < .05) indicates that you should use an alternative form of the generalized linear model (e.g. logistic or binomial regression).
Heteroscedasticity- Is the variance of your model residuals constant across the range of X (assumption of homoscedastiity)? Rejection of the null (p < .05) indicates that your residuals are heteroscedastic, and thus non-constant across the range of X. Your model is better/worse at predicting for certain ranges of your X scales.

Polyploid answered 8/12, 2017 at 13:4 Comment(1)

Can you please point me to a source for the above info? I want to read about it comprehensively. – Claxton 3/11, 2020 at 7:34

I know the question was written a long time ago, but the only answer is not really accurate.

Based on Pena and Slate (2006), the four assumptions in linear regression are normality, heteroscedasticity, and linearity, and what the authors refer to as uncorrelatedness.

For the assumption 'uncorrelatedness', I usually call it independence. The authors refer to independence as a measurement that is validated by an assessment of uncorrelatedness and normality combined. The author also refers to other scholars whom indicate this is the independence of the residuals (on the left side pg 342).

Global Stat This is the overall metric; this states whether the model, as a whole, passes or fails.
Skewness <- measuring the distribution
Kurtosis <- measuring the distribution, outliers, influential data, etc
Link function <- misspecified model, how you linked the elements in the model assignment
Heteroscedasticity <- looking for equal variance in the residuals

The measurements are not specifically skew, kurtosis, etc; if you look closely at the math behind the measures. These metrics are mathematical derivations from multiple statistical analysis methods. In their research, the authors found that when they combined these four measurements, it not only accurately assessed the four assumptions of linear regression, but also the interaction of the assumptions on the residuals.

In order to determine what to do first for correcting the issues, it would be necessary to know what data you are using, sample size, and the model you have established. The high value in skew could be from distribution, variance, etc. There are things to look for, based on the original work by Pena and Slate, but it seems like if you have a large or small sample size, it could drastically change where you start. I have not worked through all of the conclusions in the article, to know for sure.

Pena, E. A., & Slate, E. H. (2006). Global validation of linear model assumptions. Journal of the American Statistical Association, 101(473), 341-354. https://doi.org/10.1198/016214505000000637

Pittman answered 10/3, 2020 at 21:24 Comment(0)

Recommended topics

Hot tags