Standard errors discrepancies between SAS and R for GLM gamma distribution
Asked Answered
M

1

6

I am comparing GLM output from R and SAS of a model with a Gamma distribution. The point estimations are identical, but they have different estimation of the standard error and therefore different p-values.

Does anyone know why? I am wondering if R and SAS uses different methods to estimate the standard errors? Maybe MLE vs. method of moments?

R sample code

set.seed(2)
test = data.table(y = rnorm(100, 1000, 100), x1 = rnorm(100, 50, 20), x2 = rgamma(100, 0.01))
model = summary(glm(formula = y ~ x1+x2 , family = Gamma(link = "log"), data = test))

Using the same data generated here I used the following code to run a model in SAS:

proc genmod data= test_data;
                model y =  x1 x2 /link= log dist= gamma;
    run;

The output from R is as following:

Call:
glm(formula = y ~ x1 + x2, family = Gamma(link = "log"), data = test)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.26213  -0.08456  -0.01033   0.08364   0.20878  

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  6.9210757  0.0324674 213.170   <2e-16 ***
x1          -0.0003371  0.0005985  -0.563    0.575    
x2           0.0234097  0.0627251   0.373    0.710    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.01376099)

    Null deviance: 1.3498  on 99  degrees of freedom
Residual deviance: 1.3436  on 97  degrees of freedom
AIC: 1240.6

Number of Fisher Scoring iterations: 4

The output from SAS:enter image description here

Misdeed answered 15/6, 2017 at 22:28 Comment(5)
The SCALE parameter used in PROC GENMOD is the inverse of the gamma dispersion parameter , so from 1/model$dispersion , there is a difference there, hence in se's. In r it is calculated by sum((object$weights * object$residuals^2)[object$weights > 0])/ object$df.residual (look at summary.glm) Maybe try and find how SAS calculates it.Hellas
SAS manual : The GENMOD Procedure - SAS Support - suggets there may be an option on how the scale is used. Try using SCALE=PEARSON in the MODEL statement (untested)Hellas
I'm not seeing a material difference in "standard errors" or the deviance estimates. The p-values only differ in the range of ~0.001 which seems trivial. The code is available in R. From SAS, well you know how that goes, right? SAS is infamous for using "type III" calculations while R uses sequential estimates.Gauldin
I understand that the difference in SEs and p-values are very small, but I am just trying to understand why there is a difference at all if they are both using MLE to estimate the results. I have looked into the different ways that they calculate dispersion parameters, and it seems to be the main issue.Misdeed
Looks like SAS is using a normal approximation to calculate the Chi-square values. What does R do?Monotonous
S
2

By default R computes the scale=1/dispersion parameter in the same way as the sas/genmod/model option scale=pearson. The choice of scale parameter affects the SEs. See the documentation here: https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect022.htm

By default SAS/genmod gives MLEs of the shape parameter. Suppose the fitted gamma model is stored in the list "fit". To get this in R load the MASS library then type

gamma.shape(fit)

This gives the MLEs of shape parameter alpha. If you then type

summary(fit, dispersion=1/gamma.shape(fit)$alpha))

the summary function will use the MLE of alpha when computing the SEs, and they will match SAS/genmod exactly.

I will make a separate post on this. While summary.glm gives the correct SEs (using the specified value of dispersion), it does not print the correct value of AIC (It does not use the specified dispersion, instead it uses the value computed with the Pearson residuals). The difference is small, but I would call it a bug.

Srinagar answered 21/11, 2017 at 18:6 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.