Difference between glm and LogitModelFit
I have a problem with the glm function in R.

Specifically, I am not sure how to include nominal variables.

The results that I get in R after running the glm function are the following:

> df

   x1 x2 y
1  a  2  0
2  b  4  1
3  a  4  0
4  b  2  1
5  a  4  1
6  b  2  0

> str(df)
'data.frame':   6 obs. of  3 variables:
 $ x1: Factor w/ 2 levels "a","b": 1 2 1 2 1 2
 $ x2: num  2 4 4 2 4 2
 $ y: Factor w/ 2 levels "0","1": 1 2 1 2 2 1

Call:
glm(formula = y ~ x1 + x2, family = "binomial", data = df)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   -39.132  15208.471  -0.003    0.998
x1b            19.566   7604.236   0.003    0.998
x2              9.783   3802.118   0.003    0.998

However, when I run the LogitModelFit function in Wolfram Mathematica, I get different parameters.

The code in Wolfram is provided below:

data = {{a, 2, 0}, {b, 4, 1}, {a, 4, 0}, {b, 2, 1}, {a, 4, 1}, {b, 2, 0}};

model = LogitModelFit[data, {x, y}, {x, y}, NominalVariables -> x]

model["BestFitParameters"]

And these are my estimated parameters:

{-18.5661, -18.5661, 9.28303}

model // Normal

1/(1 + E^(18.5661 - 9.28303 y + 18.5661 DiscreteIndicator[x, a, {a, b}]))

So, what is different here? Why do the results differ so much?

Am I doing something wrong in R or in Wolfram?

Twopence answered 3/1, 2018 at 13:26 Comment(1)

It was, this was just a simple example. (Twopence)

You have effectively 4 groups, for which you are trying to estimate 3 parameters:

library(dplyr)
# y is a factor, so compare against the level "1" before averaging
df %>% group_by(x1, x2) %>% summarise(n = n(), y = mean(y == "1"))

As you can see from the huge standard errors, the parameter estimates are not stable. The standard errors in Wolfram should also be very large (if they are reported).

Second, Wolfram seems to use a different reference group for x1:

> df$x1 <- relevel(df$x1, "b")
> m <- glm(y ~ x1 + x2, family = binomial(), data = df, control = list(maxit = 100))
> summary(m)

Call:
glm(formula = y ~ x1 + x2, family = binomial(), data = df, control = list(maxit = 100))

Deviance Residuals: 
       1         2         3         4         5         6  
-0.00008   0.00008  -1.17741   1.17741   1.17741  -1.17741  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -19.566   7604.236  -0.003    0.998
x1a          -19.566   7604.236  -0.003    0.998
x2             9.783   3802.118   0.003    0.998

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 8.3178  on 5  degrees of freedom
Residual deviance: 5.5452  on 3  degrees of freedom
AIC: 11.545

Number of Fisher Scoring iterations: 18

This is much closer to Wolfram's result (it is actually the same model as you found; I just chose another reference group).

The predictions from both models (glm and Wolfram) will be practically equal. In fact, any fit in which the first two parameters are very large and negative (in the limit they diverge to -Inf) and the third parameter is positive with half their magnitude (9.783 * 2 = 19.566) will give almost the same predictions.

The factor of two arises because x2 only takes the values 2 and 4, which differ by two.
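To see this numerically, here is a quick check (written in Python purely for illustration) that the glm coefficients and the LogitModelFit coefficients quoted above produce essentially the same fitted probability for all six rows:

```python
import math

# The six (x1, x2) pairs from the question's data frame
rows = [("a", 2), ("b", 4), ("a", 4), ("b", 2), ("a", 4), ("b", 2)]

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# glm: intercept -39.132, coefficient 19.566 for the x1 == "b"
# indicator, slope 9.783 for x2
def p_glm(x1, x2):
    return logistic(-39.132 + 19.566 * (x1 == "b") + 9.783 * x2)

# LogitModelFit: 1/(1 + E^(18.5661 - 9.28303 x2 + 18.5661 [x1 == "a"]))
def p_wolfram(x1, x2):
    return 1.0 / (1.0 + math.exp(18.5661 - 9.28303 * x2
                                 + 18.5661 * (x1 == "a")))

for x1, x2 in rows:
    pg, pw = p_glm(x1, x2), p_wolfram(x1, x2)
    print(f"x1={x1} x2={x2}  glm={pg:.6f}  wolfram={pw:.6f}")
    # the two parameterizations agree to roughly four decimal places
    assert abs(pg - pw) < 1e-4
```

The fitted probabilities are (numerically) 0, 0.5, or 1 in every cell; the tiny discrepancies between the two columns come only from the optimizers stopping at different points on a diverging likelihood.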

Beluga answered 3/1, 2018 at 14:28 Comment(0)

It seems that in your LogitModelFit result

1/(1 + E^(18.5661 - 9.28303 y + 18.5661 DiscreteIndicator[x, a, {a, b}]))

the DiscreteIndicator term is an indicator for the condition x1 == 'a',

whereas your glm fit instead contains an indicator variable x1b for the condition x1 == 'b':

> str(df)
'data.frame':   6 obs. of  3 variables:
 $ x1: Factor w/ 2 levels "a","b": 1 2 1 2 1 2
 $ x2: num  2 4 4 2 4 2
 $ y: Factor w/ 2 levels "0","1": 1 2 1 2 2 1

Call:
glm(formula = y ~ x1 + x2, family = "binomial", data = df)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   -39.132  15208.471  -0.003    0.998
x1b            19.566   7604.236   0.003    0.998
x2              9.783   3802.118   0.003    0.998

So the difference is due to the different reference level that LogitModelFit and glm choose for the nominal variable: LogitModelFit takes x == 'b' as the reference (and fits an indicator for x == 'a'), whereas glm takes x1 == 'a' as the reference (and fits the indicator x1b).
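The two parameter vectors are just re-expressions of the same model. As a sketch of the algebra: if glm reports (b0, b_b, b2) with 'a' as the reference level, the equivalent parameterization with 'b' as the reference is (b0 + b_b, -b_b, b2). A quick check (in Python, for illustration) with the glm numbers from above:

```python
# glm with reference level "a": intercept, coefficient of the
# x1 == "b" indicator, and the x2 slope
b0, b_b, b2 = -39.132, 19.566, 9.783

# switching the reference level to "b": the new intercept absorbs b_b,
# and the indicator for "a" gets coefficient -b_b
b0_new, b_a, b2_new = b0 + b_b, -b_b, b2

print(f"{b0_new:.3f} {b_a:.3f} {b2_new:.3f}")  # -> -19.566 -19.566 9.783
```

This has the same structure as Wolfram's {-18.5661, -18.5661, 9.28303}; the magnitudes differ only because the estimates are unstable (as the huge standard errors show), so each optimizer stops at a slightly different point on a diverging likelihood.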

Oxazine answered 3/1, 2018 at 14:28 Comment(0)