Syntax for binomial formula in geom_smooth

I have computed a binomial regression in R:

Call:
glm(formula = cbind(success, failure) ~ x * f, family = "binomial", 
    data = tb1)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.6195  -0.9399  -0.0493   0.5698   2.0677  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.3170182  0.0622600 -37.215  < 2e-16 ***
x            0.0138201  0.0009892  13.972  < 2e-16 ***
fTRUE        0.6466238  0.1976115   3.272  0.00107 ** 
x:fTRUE     -0.0035741  0.0032587  -1.097  0.27273    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 479.88  on 147  degrees of freedom
Residual deviance: 201.53  on 144  degrees of freedom
  (7 observations deleted due to missingness)
AIC: 870.72

Number of Fisher Scoring iterations: 4    

and I'd like to visualize it by plotting the data together with the regression curve. I can easily get the linear smoother to work:

ggplot(tb1, aes(x, success/(success+failure), colour=f)) +
  geom_point() +
  geom_smooth(method="lm")

[Figure: Linear Regression]

but what I really want is to draw a logistic curve through the data. When I try:

ggplot(tb1, aes(x, success/(success+failure), colour=f)) +
  geom_point() +
  geom_smooth(
    method="glm",
    method.args=list(family="binomial")
  )

I get this graph:

[Figure: Suspicious binomial regression]

which doesn't seem right. The standard errors shouldn't be so large. I thought I'd need to specify the formula explicitly in geom_smooth, but I can't get the syntax right. When I try

ggplot(tb1, aes(x, success/(success+failure), colour=f)) +
  geom_point() +
  geom_smooth(
    method="glm",
    method.args=list(
      family="binomial",
      formula = cbind(success, failure) ~ x
    )
  )

I get

Warning message:
Computation failed in stat_smooth():
object 'success' not found

How do I specify the formula correctly?

Supper answered 25/10, 2019 at 14:14 Comment(3)
Seems the formula only knows about x and y, which are presumably inferred from the aesthetics? - Mellar
@NelsonGon: Thanks, you seem to be right. However, now I get the warning: "In eval(family$initialize) : non-integer #successes in a binomial glm!" - Supper
This is statistical, I guess. Take a look at this issue. - Mellar

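As in the binomial regression itself, the formula in geom_smooth needs a two-column matrix of successes and failures as the response. The formula goes in the formula argument of geom_smooth (not inside method.args), and it can only refer to variables that are mapped in the aesthetics, so the counts need to be mapped there: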

ggplot(tb1, aes(x, y=success/(success+failure), colour=f, succ=success, fail=failure)) +
  geom_point() +
  geom_smooth(
    method="glm",
    method.args=list(family="binomial"),
    formula = cbind(succ, fail) ~ x
  )

Now it works:

[Figure: Correct binomial smoothing]
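
For what it's worth, here is a small base-R sketch of why the response format matters. It uses simulated data (the data frame toy and its columns prop and total are made up for illustration, not taken from tb1) and compares three fits: the cbind(success, failure) response, a proportion response with the number of trials as weights, and a proportion response with no weights at all. The last one is what a plain y ~ x smoother on success/(success+failure) amounts to, and it is probably where both the "non-integer #successes" warning and the overly wide bands come from.

```r
# Simulated stand-in for tb1 (hypothetical data, not the asker's).
set.seed(1)
n_trials <- sample(20:60, 40, replace = TRUE)
x        <- seq(0, 200, length.out = 40)
success  <- rbinom(40, size = n_trials, prob = plogis(-2.3 + 0.014 * x))
toy      <- data.frame(x = x, success = success, failure = n_trials - success)

# Form 1: two-column response, as in the original glm() call.
fit_cbind <- glm(cbind(success, failure) ~ x, family = binomial, data = toy)

# Form 2: proportion response, with the number of trials as prior weights.
toy$total <- toy$success + toy$failure
toy$prop  <- toy$success / toy$total
fit_wt    <- glm(prop ~ x, family = binomial, weights = total, data = toy)

# Helper: standard errors of the coefficients.
se <- function(fit) sqrt(diag(vcov(fit)))

# Forms 1 and 2 should agree on coefficients and standard errors.
same_coef <- all.equal(coef(fit_cbind), coef(fit_wt), tolerance = 1e-6)
same_se   <- all.equal(se(fit_cbind), se(fit_wt), tolerance = 1e-6)

# Form 3: proportion response without weights. Each row then counts as a
# single trial, so R warns about non-integer successes and the standard
# errors grow. The handler records that warning and silences it.
warned   <- FALSE
fit_nowt <- withCallingHandlers(
  glm(prop ~ x, family = binomial, data = toy),
  warning = function(w) {
    if (grepl("non-integer", conditionMessage(w))) warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

print(rbind(cbind_form = se(fit_cbind), weighted = se(fit_wt), unweighted = se(fit_nowt)))
```

If the weighted form is preferred, I would expect the ggplot2 equivalent to be mapping weight = success + failure in aes() and keeping the default y ~ x formula, since geom_smooth hands the weight aesthetic to the fitting function as weights. I have not run that against tb1, so treat it as an assumption.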

Thanks to NelsonGon for pointing this out.
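
One more note on how the plot relates to the model in the question. With colour=f, geom_smooth fits a separate glm for each level of f, whereas the original model is a single fit with an x * f interaction. For a two-level f these should give the same curves, because the interaction model has its own intercept and slope per group. Below is a minimal sketch that checks this on simulated data; the data frame toy, the grid and the true coefficients are all invented for the example.

```r
# Hypothetical stand-in for tb1: success/failure counts, numeric x, logical f.
set.seed(2)
toy <- data.frame(
  x = rep(seq(0, 200, length.out = 30), times = 2),
  f = rep(c(FALSE, TRUE), each = 30)
)
n_trials    <- sample(20:60, nrow(toy), replace = TRUE)
p_true      <- plogis(-2.3 + 0.014 * toy$x + 0.6 * toy$f)
toy$success <- rbinom(nrow(toy), size = n_trials, prob = p_true)
toy$failure <- n_trials - toy$success

# One joint model with an interaction, as in the question.
fit_joint <- glm(cbind(success, failure) ~ x * f, family = binomial, data = toy)

# One model per level of f, which is what a grouped smoother fits.
fit_by_group <- lapply(split(toy, toy$f), function(d) {
  glm(cbind(success, failure) ~ x, family = binomial, data = d)
})

# Predicted probabilities on a common grid, from both approaches.
grid <- expand.grid(x = seq(0, 200, by = 10), f = c(FALSE, TRUE))
grid$p_joint <- predict(fit_joint, newdata = grid, type = "response")
grid$p_group <- NA_real_
for (lvl in names(fit_by_group)) {
  rows <- as.character(grid$f) == lvl
  grid$p_group[rows] <- predict(fit_by_group[[lvl]],
                                newdata = grid[rows, ], type = "response")
}

# Largest disagreement between the two sets of fitted curves.
max_diff <- max(abs(grid$p_joint - grid$p_group))
print(max_diff)
```

If f had more structure (say, a model without the interaction term), the per-group curves drawn by geom_smooth would no longer match the fitted model, and plotting predict() output directly would be the safer route.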

Supper answered 28/10, 2019 at 8:41 Comment(0)
