Extract pvalue from glm

Asked 23/5, 2014 at 22:2 Answered 16/6, 2021 at 20:51

I'm running many regressions and am only interested in the effect on the coefficient and p-value of one particular variable. So, in my script, I'd like to be able to just extract the p-value from the glm summary (getting the coefficient itself is easy). The only way I know of to view the p-value is using summary(myReg). Is there some other way?

e.g.:

fit <- glm(y ~ x1 + x2, myData)
x1Coeff <- fit$coefficients[2] # only returns coefficient, of course
x1pValue <- ???

I've tried treating fit$coefficients as a matrix, but am still unable to simply extract the p-value.

Is it possible to do this?

Thanks!

Indispose answered 23/5, 2014 at 22:2 Comment(0)

You want

coef(summary(fit))[,4]

which extracts the column vector of p values from the tabular output shown by summary(fit). The p-values aren't actually computed until you run summary() on the model fit.

By the way, use extractor functions rather than delve into objects if you can:

fit$coefficients[2]

should be

coef(fit)[2]

If there aren't extractor functions, str() is your friend. It allows you to look at the structure of any object, which allows you to see what the object contains and how to extract it:

summ <- summary(fit)

> str(summ, max = 1)
List of 17
 $ call          : language glm(formula = counts ~ outcome + treatment, family = poisson())
 $ terms         :Classes 'terms', 'formula' length 3 counts ~ outcome + treatment
  .. ..- attr(*, "variables")= language list(counts, outcome, treatment)
  .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
  .. .. ..- attr(*, "dimnames")=List of 2
  .. ..- attr(*, "term.labels")= chr [1:2] "outcome" "treatment"
  .. ..- attr(*, "order")= int [1:2] 1 1
  .. ..- attr(*, "intercept")= int 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. ..- attr(*, "predvars")= language list(counts, outcome, treatment)
  .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "factor" "factor"
  .. .. ..- attr(*, "names")= chr [1:3] "counts" "outcome" "treatment"
 $ family        :List of 12
  ..- attr(*, "class")= chr "family"
 $ deviance      : num 5.13
 $ aic           : num 56.8
 $ contrasts     :List of 2
 $ df.residual   : int 4
 $ null.deviance : num 10.6
 $ df.null       : int 8
 $ iter          : int 4
 $ deviance.resid: Named num [1:9] -0.671 0.963 -0.17 -0.22 -0.956 ...
  ..- attr(*, "names")= chr [1:9] "1" "2" "3" "4" ...
 $ coefficients  : num [1:5, 1:4] 3.04 -4.54e-01 -2.93e-01 1.34e-15 1.42e-15 ...
  ..- attr(*, "dimnames")=List of 2
 $ aliased       : Named logi [1:5] FALSE FALSE FALSE FALSE FALSE
  ..- attr(*, "names")= chr [1:5] "(Intercept)" "outcome2" "outcome3" "treatment2" ...
 $ dispersion    : num 1
 $ df            : int [1:3] 5 4 5
 $ cov.unscaled  : num [1:5, 1:5] 0.0292 -0.0159 -0.0159 -0.02 -0.02 ...
  ..- attr(*, "dimnames")=List of 2
 $ cov.scaled    : num [1:5, 1:5] 0.0292 -0.0159 -0.0159 -0.02 -0.02 ...
  ..- attr(*, "dimnames")=List of 2
 - attr(*, "class")= chr "summary.glm"

Hence we note the coefficients component which we can extract using coef(), but other components don't have extractors, like null.deviance, which you can extract as summ$null.deviance.

Voluntaryism answered 23/5, 2014 at 22:5 Comment(13)

you beat me while I was looking for dupes (there aren't any perfect duplicates but there are a lot of posts about extracting stuff from [g]lm fits: e.g. #12496868 ) – Aeroscope 23/5, 2014 at 22:8

might be worth adding a comment on str() when you don't know what accessors are available and have to try to dig stuff out of the object for yourself. – Aeroscope 23/5, 2014 at 22:13

Thanks @BenBolker. Have added something on str. – Voluntaryism 23/5, 2014 at 22:31

Actually, I was using str() to try to figure out how to get the data, but I don't see where in the str() I can deduce that I need the coef() function to get what I'm looking for. I'm reading you're update and I still don't see that either... – Indispose 23/5, 2014 at 22:40

@Ben I looked a few times for other posts but couldn't find any. – Indispose 23/5, 2014 at 22:41

the way to find out about coef is to do methods(class="lm") or methods(class="summary.lm"). I agree that you can't figure out from looking at str() that you could use coef(). – Aeroscope 23/5, 2014 at 23:11

Actually, I can't even see in the methods(class="summary.glm") or methods(class="glm") output that I should be able to use coef() on the object. Could you please elaborate? Thanks! – Indispose 23/5, 2014 at 23:59

I tend to use names(summ) or attributes(summ) to get at the attributes of an object; although this does not tell me the methods it gives me more compact info than str. Sometimes I use the frowned upon summ$coefficients if I'm not sure of the extractor function; I have not yet received an excommunication notice from the Church of R. – Lentamente 23/5, 2014 at 23:59

@Clark Look at class(fit) and you'll see that a glm` fit inherits from class "lm" so you would need to look for methods for that class. – Voluntaryism 24/5, 2014 at 0:18

I've been using this: cbind( coef(summary(model))[, 4], exp(coef(model)), exp(confint(model))) , but it takes a long time "Waiting for profiling to be done..." any suggestions as to how to fix this? – Ionization 13/4, 2023 at 19:0

@Ionization don't use profile likelihood to compute confidence intervals for parameters (which is what the confint() method for GLMs is doing. – Voluntaryism 13/4, 2023 at 22:35

thanks! @GavinSimpson, what should I use instead? – Ionization 14/4, 2023 at 14:29

@Ionization well, there's a reason confint() is using profile likelihood - the usual Gaussian approximation can often give poor confidence intervals in GLMs. Unfortunately profiling the likelihood of the model takes time. You could form intervals on the link scale as beta_i +/- (2 * std_err_i) and then back transform those intervals to the response scale using the inverse of the link function, but that's using the gaussian approximation which can be problematic in some situations. Bootstrapping could also be used. That won't be quick either. – Voluntaryism 14/4, 2023 at 14:49

Instead of the number you can put directly the name

coef(summary(fit))[,'Pr(>|z|)']

the other ones available from the coefficient summary:

Estimate Std. Error z value Pr(>|z|)

Astrid answered 24/1, 2018 at 9:22 Comment(0)

I've used this technique in the past to pull out predictor data from summary or from a fitted model object:

coef(summary(m))[grepl("var_i_want$",row.names(coef(summary(m)))), 4]

which lets me easily edit which variable I want to get data on.

Or as pointed out be @Ben, use match or %in%, somewhat cleaner than grepl:

coef(summary(m))[row.names(coef(summary(m))) %in% "var_i_want" , 4]

Lentamente answered 23/5, 2014 at 22:18 Comment(1)

or use match() ... or just index the row appropriately – Aeroscope 23/5, 2014 at 22:19

The tidy function from the broom package (part of the Tidyverse, available on CRAN) provides a quick and easy way to convert your GLM summaries into a data frame, which might be useful in a number of situations other than the one you described above.

In this case, your desired output could be obtained with the code:

x1pValue <- broom::tidy(fit)$p.value[2]

Moot answered 16/6, 2021 at 20:51 Comment(0)

Well, this would be another way, however not the most efficient way to perform it :

a = coeftable(model).cols[4]
pVals = [ a[i].v for i in 1:length(a) ]

This ensures that the extracted values from the glm is not in StatsBase. Therein, you can play around with pVals as per your requirement. Hope it helps, Ebby

Gabrila answered 1/5, 2018 at 8:46 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags