Odds ratios instead of logits in stargazer() LaTeX output
Asked Answered
A

4

20

When using stargazer to create a LaTeX table on a logistic regression object the standard behaviour is to output logit-values of each model. Is it possible to get exp(logit) instead? That is, can i get the Odds-ratios instead?

In the stargazer documentation the following mentions the "Coef"-argument, but I dont understand if this can enable the exp(logits).

Coef: a list of numerical vectors that will replace the default coefficient values for each model. Element names will be used to match coefficients to individual covariates, and should therefore match covariate names. A NULL vector indicates that, for a given model, the default set of coefficients should be used. By contrast, an NA vector means that all of the model’s coefficients should be left blank.

Angiology answered 26/4, 2013 at 12:32 Comment(0)
I
15

As per symbiotic comment in 2014, more recent versions of ''stargazer'' have the options ''apply.*'' for ''coef'' ''se'' ''t'' ''p'' and ''ci'' allowing the direct transformation of these statistics.

apply.coef a function that will be applied to the coefficients.
apply.se a function that will be applied to the standard errors.
apply.t a function that will be applied to the test statistics.
apply.p a function that will be applied to the p-values.
apply.ci a function that will be applied to the lower and upper bounds of the confidence intervals.

Meaning you can directly use...

stargazer(model, 
          apply.coef = exp,
          apply.se   = exp)

EDIT : I have noticed however that simply exponentiating the CIs does not give what you would expect.

EDIT : You can obtain the correct CIs using the method described here.

Irrecusable answered 13/10, 2015 at 8:59 Comment(2)
note that the standard error of the odds ratio is not the exponent of the s.e. of the coeff. Rather, it is se(OR)=exp(coeff)*se(coeff). For reference see, e.g. stata.com/statalist/archive/2005-09/msg00829.htmlParlay
Thanks for the pointer to Andrew Heiss's blog for establishing CIs on odds ratios. Can you add to your answer to explain how you would do that within stargazer, please?Offense
T
8

stargazer allows you to substitute a lot of things, dependent variable labels, covariate labels and so forth. To substitute those you need to supply a vector of variable labels, this is done to have publishable row names, instead of variable names from R by default.

So in order to have odds ratios, you need to supply a vector of odds ratios to stargazer. How do you obtain that vector? Very easily, actually. Let's say that your model is called model, then your code is:

coef.vector <- exp(model$coef)
stargazer(model,coef=list(coef.vector))

If you have multiple models in your table, then the list should be expanded, e.g. coef=list(coef.vector1,coef.vector2,...), where all vectors in the list would be derived from similar exponentiation as above.

Tracy answered 20/6, 2013 at 18:58 Comment(2)
Great, this works fine for coefficients but how to get the correct Standard Errors and Confidence Intervals.Hagler
To get the right Standard Errors you can use the apply.se argument to specify that you want all standard errors to be exponentiated. Similarly you can use apply.ci for confidence intervalsPtolemaist
B
5

So, the issue is that you want to display the (non-log) odds ratio, but keep the test statistics based on the underlying linear model. By default, when you use one of the "apply" methods, such as apply.coef = exp, stargazer will recalculate the t statistics and p values. We don't want that. Also, the standard errors are in the log basis, but we can't just exponentiate them. My preferred approach is to:

  1. exponentiate the coefs in stargazer
  2. turn off the auto p and auto t
  3. report (untransformed) t-statistics in the table instead of standard errors

In code, this is:

stargazer(model, apply.coef=exp, t.auto=F, p.auto=F, report = "vct*")
Burnsed answered 18/3, 2016 at 20:49 Comment(0)
B
5

There are pieces of the right answer across the various posts, but none of them seem to put it all together. Assuming the following:

glm_out <- glm(Y ~ X, data=DT, family = "binomial")

Getting the Odds-Ratio

For a logistic regression, the regression coefficient (b1) is the estimated increase in the log odds of Y per unit increase in X. So, to get the odds-ratio, we just use the exp function:

OR <- exp(coef(glm_out))

# pass in coef directly
stargazer(glm_out, coef = list(OR), t.auto=F, p.auto=F)

# or, use the apply.coef option
stargazer(glm_out, apply.coef = exp, t.auto=F, p.auto=F)

Getting the Standard Error of the Odds-Ratio

You cannot simply use apply.se = exp to get the Std. Error for the Odds Ratio

Instead, you have to use the function: Std.Error.OR = OR * SE(coef)

# define a helper function to extract SE from glm output
se.coef <- function(glm.output){sqrt(diag(vcov(glm.output)))}

# or, you can use the arm package
se.coef <- arm::se.coef

#Get the odds ratio
OR <- exp(coef(glm_out))

# Then, we can get the `StdErr.OR` by multiplying the two:
Std.Error.OR <-  OR * se.coef(glm_out)

So, to get it into stargazer, we use the following:

# using Std Errors
stargazer(glm_out, coef=list(OR), se = list(Std.Error.OR), t.auto=F, p.auto=F)

Computing CIs for the Odds-Ratio

Confidence intervals in an odds-ratio setting are not symmetric. So, we cannot just do ±1.96*SE(OR) to get the CI. Instead, we can compute it from the original log odds exp(coef ± 1.96*SE).

# Based on normal distribution to compute Wald CIs:
# we use confint.default to obtain the conventional confidence intervals
# then, use the exp function to get the confidence intervals

CI.OR <- as.matrix(exp(confint.default(glm_out)))

So, to get it into stargazer, we use the following:

# using ci.custom
stargazer(glm_out, coef=list(OR), ci.custom = list(CI.OR), t.auto=F, p.auto=F, ci = T)

# using apply.ci
stargazer(glm_out, apply.coef = exp, apply.ci = exp, t.auto=F, p.auto=F, ci = T)

NOTE ABOUT USING CONFIDENCE INTERVALS FOR SIGNIFICANCE TESTS:

Do not use the Confidence Intervals of Odds Ratios to compute significance (see note and reference at the bottom). Instead, you can do it using the log odds:

z <- coef(glm_out)/se.coef(glm_out)

And, use that to get the p.values for significance tests:

pvalue <- 2*pnorm(abs(coef(glm_out)/se.coef(glm_out)), lower.tail = F)

(source: https://data.princeton.edu/wws509/r/c3s1)

See this link for more detailed discussion on statistical testing: https://stats.stackexchange.com/questions/144603/why-do-my-p-values-differ-between-logistic-regression-output-chi-squared-test

It is important to note however, that unlike the p value, the 95% CI does not report a measure’s statistical significance. In practice, the 95% CI is often used as a proxy for the presence of statistical significance if it does not overlap the null value (e.g. OR=1). Nevertheless, it would be inappropriate to interpret an OR with 95% CI that spans the null value as indicating evidence for lack of association between the exposure and outcome. source: Explaining Odds Ratios

Backstitch answered 8/9, 2020 at 7:41 Comment(3)
Thank you for your post! ... However, I still get weird conf. levels indicated by the various thresholds (,,) for my OR output that does not completely correspond to the conf. levels of my estimates. This is only a matter of degree, though. In the OR output there's nothing that suddenly turns out to be sig. that wasn't before (default estimates) or vice versa. There are just some Xi that turn out *** (OR) that were sig. on a lower level before: ** (estimate). Does that make any sense?Intersidereal
There is an error in the Std.Error.OR calculation here. Notice that OR <- exp(coef(glm_out)) returns a matrix (multiple lists to be precise), depending on the number of categories of the dependent variable. Whereas, the function to calculate se.coef uses sqrt(diag(vcov(glm.output))) equation which returns an array (1D). When you multiply OR * se.coef(glm_out) the result for Std..Error.OR comes out wrong because the linear algebra does not work out correctly (Matrix * Array). I solved the problem by getting se.coef from the model directly using summary(glm_out)$standard.errors.Tuner
To be more precise, use Std.Error.OR = exp(coef(glm_out)) * summary(glm_out)$standard.errors) and inside stargazer function use se= list( Std.Error.OR) as shown above.Tuner

© 2022 - 2024 — McMap. All rights reserved.