texreg on panelmodel (plm) object; additional gof information
Asked Answered
C

3

6

How do I customize the goodness of fit (gof) in a call?

I have some models I want to display, but I only get "Num. obs.", "Adj. R^2", and , "R^2" (see working example below). I would howver like to all display small n, T, F-statistic, and p-value, all stuff I get in the default summary() call.

A example of what I got. First some data and needed packages,

# install.packages(c("wooldridge", "plm", "texreg"), dependencies = TRUE)
library(wooldridge) 
data(wagepan)
library(plm) 

Second, some models,

POLS <- plm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + union +
            factor(year), data = wagepan, index=c("nr","year") , model="pooling")

RE <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + union +
          factor(year), data = wagepan, index = c("nr","year") , model = "random") 

FE <- plm(lwage ~ I(exper^2) + married + union + factor(year), 
          data = wagepan, index = c("nr","year"), model="within")

Third, my current call and its output,

# library(texreg)                      
texreg::screenreg(list(POLS, RE, FE), custom.coef.map = list('married' = 'Marrtied', 'union' = 'Union'))
#> ================================================
#>            Model 1      Model 2      Model 3    
#> ------------------------------------------------
#> Marrtied      0.11 ***     0.06 ***     0.05 *  
#>              (0.02)       (0.02)       (0.02)   
#> Union         0.18 ***     0.11 ***     0.08 ***
#>              (0.02)       (0.02)       (0.02)   
#> ------------------------------------------------
#> R^2           0.19         0.18         0.18    
#> Adj. R^2      0.19         0.18         0.06    
#> Num. obs.  4360         4360         4360       
#> ================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05

I did try adding , include.fstatistic = TRUE, but it seems as I can't get it that way. It is as I need some additional customization.

I am aiming for something like this,

#> ------------------------------------------------
#> Obs.  (N)   4360         4360         4360       
#> Indiv.(n)    545          545          545       
#> Time  (T)      8            8            8       
#> R^2           0.19         0.18         0.18    
#> Adj. R^2      0.19         0.18         0.06    
#> F-stat       72.458       68.4124      83.8515  
#> P-value        (2.22e-16)   (2.22e-16)   (2.22e-16) 
#> ================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05
Cupp answered 24/5, 2018 at 11:0 Comment(0)
H
2

You could hack it using texreg::extract().

For getting the "small n" we need a small function first.

getIndex <- function(fit){
  # extracts number of factor levels of index variables
  # from raw data used in models
  index.names <- as.character(as.list(summary(fit)$call)$index)[-1]
  if (length(index.names == 1)){
    df.name <- as.character(as.list(summary(fit)$call)$data)
    index.df <- get(df.name)[, index.names]
    length(unique(index.df))
  }
  if (length(index.names == 2)){
    df.name <- as.character(as.list(summary(fit)$call)$data)
    index.df <- get(df.name)[, index.names]
    cbind(length(unique(index.df[, 1])), 
          length(unique(index.df[, 2]))) 
  } else {
    stop("no index variables specified in model")
  }

}

Then go on extracting.

fv.1 <- summary(POLS)$fstatistic$statistic  # get F statistic
pv.1 <- summary(POLS)$fstatistic$p.value  # get p value
ns.1 <- getIndex(POLS)[1]  # get small n
tm.1 <- getIndex(POLS)[2]  # get times

library(texreg)
ex.1 <- extract(POLS)  # extract coefficients and GOF measures
[email protected] <- c([email protected][1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")  # the GOF names
ex.1@gof <- c(ex.1@gof[1:3], ns.1, tm.1, fv.1, pv.1)  # the GOF values
[email protected] <- c([email protected][1:3], FALSE, FALSE, TRUE, TRUE)  # numeric or integer

fv.2 <- summary(RE)$fstatistic$statistic
pv.2 <- summary(RE)$fstatistic$p.value
ns.2 <- getIndex(RE)[1]
tm.2 <- getIndex(RE)[2]

ex.2 <- extract(RE)
[email protected] <- c([email protected][1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")
ex.2@gof <- c(ex.2@gof[1:3], ns.2, tm.2, fv.2, pv.2)
[email protected] <- c([email protected][1:3], FALSE, FALSE, TRUE, TRUE)

fv.3 <- summary(FE)$fstatistic$statistic
pv.3 <- summary(FE)$fstatistic$p.value
ns.3 <- getIndex(FE)[1]
tm.3 <- getIndex(FE)[2]

ex.3 <- extract(FE)
[email protected] <- c([email protected][1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")
ex.3@gof <- c(ex.3@gof[1:3], ns.3, tm.3, fv.3, pv.3)
[email protected] <- c([email protected][1:3], FALSE, FALSE, TRUE, TRUE)

Yielding

> screenreg(list(ex.1, ex.2, ex.3))

=======================================================
                  Model 1      Model 2      Model 3    
-------------------------------------------------------
[TRUNCATED...]
-------------------------------------------------------
R^2                  0.19         0.18         0.18    
Adj. R^2             0.19         0.18         0.06    
Num. obs.         4360         4360         4360       
Indiv.(n)          545          545          545       
Time (T)             8            8            8       
F-stat              72.46        68.41        83.85    
P-value              0.00         0.00         0.00    
=======================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

Take a look into e.g. str(extract(FE)) to apply this for additional GOFs.

To wrap it into a function take a look at code in @Neal Fultz's answer.

Humdinger answered 24/5, 2018 at 11:55 Comment(5)
Waw, it really does require quite a lot of hacking to get there. I had imagined there would be some simple called or command I had overlooked. Thanks for demonstrating this approach!Cupp
Welcome! Hope I didn't reinvent the wheel, I also couldn't find a simple command, though.Humdinger
I'll hold out for some hours and see if someone can point to a wheel. Thanks again.Cupp
Would you be up for editing your answer to include the 545 Indiv.(n)? If so I'll be happy to award you the bounty.Cupp
Sure, I'd be honored. It was quite a hack, the difficulty was, that small n and time variable nowhere seems to be saved within summary(model), so we have to consult the raw data to obtain them. I've tested getIndex() with some other data, too, and should be working in most cases. Test the function with e.g. plm(Sepal.Length ~ Sepal.Width, iris) vs. plm(Sepal.Length ~ Sepal.Width, iris, index="Species").Humdinger
D
4

Following up on @jaySF answer, create your own extract function which wraps the default, and register it:

custom_extract_plm <- function(model, ...) {
  s <- summary(model)
  ex.1 <- texreg:::extract.plm(model, ...)

  fv.1 <- s$fstatistic$statistic
  pv.1 <- s$fstatistic$p.value

  [email protected] <- c([email protected], "F-stat", "P-value")
  ex.1@gof <- c(ex.1@gof, fv.1, pv.1)
  [email protected] <- c([email protected], TRUE, TRUE)

  ex.1
}

setMethod(texreg:::extract, signature = className("plm", "plm"), custom_extract_plm)

Now you get an F-statistic:

> texreg::screenreg(list(POLS, RE, FE), custom.coef.map = list('married' = 'Marrtied', 'union' = 'Union'))

================================================
           Model 1      Model 2      Model 3    
------------------------------------------------
Marrtied      0.11 ***     0.06 ***     0.05 *  
             (0.02)       (0.02)       (0.02)   
Union         0.18 ***     0.11 ***     0.08 ***
             (0.02)       (0.02)       (0.02)   
------------------------------------------------
R^2           0.19         0.18         0.18    
Adj. R^2      0.19         0.18         0.06    
Num. obs.  4360         4360         4360       
F-stat       72.46        68.41        83.85    
P-value       0.00         0.00         0.00    
================================================
*** p < 0.001, ** p < 0.01, * p < 0.05
Duppy answered 30/5, 2018 at 1:14 Comment(1)
Thank you for contribution to the discussion. I wonder if you would have time to include small n, what is Indiv. (n) above, in your answer?Cupp
H
2

You could hack it using texreg::extract().

For getting the "small n" we need a small function first.

getIndex <- function(fit){
  # extracts number of factor levels of index variables
  # from raw data used in models
  index.names <- as.character(as.list(summary(fit)$call)$index)[-1]
  if (length(index.names == 1)){
    df.name <- as.character(as.list(summary(fit)$call)$data)
    index.df <- get(df.name)[, index.names]
    length(unique(index.df))
  }
  if (length(index.names == 2)){
    df.name <- as.character(as.list(summary(fit)$call)$data)
    index.df <- get(df.name)[, index.names]
    cbind(length(unique(index.df[, 1])), 
          length(unique(index.df[, 2]))) 
  } else {
    stop("no index variables specified in model")
  }

}

Then go on extracting.

fv.1 <- summary(POLS)$fstatistic$statistic  # get F statistic
pv.1 <- summary(POLS)$fstatistic$p.value  # get p value
ns.1 <- getIndex(POLS)[1]  # get small n
tm.1 <- getIndex(POLS)[2]  # get times

library(texreg)
ex.1 <- extract(POLS)  # extract coefficients and GOF measures
[email protected] <- c([email protected][1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")  # the GOF names
ex.1@gof <- c(ex.1@gof[1:3], ns.1, tm.1, fv.1, pv.1)  # the GOF values
[email protected] <- c([email protected][1:3], FALSE, FALSE, TRUE, TRUE)  # numeric or integer

fv.2 <- summary(RE)$fstatistic$statistic
pv.2 <- summary(RE)$fstatistic$p.value
ns.2 <- getIndex(RE)[1]
tm.2 <- getIndex(RE)[2]

ex.2 <- extract(RE)
[email protected] <- c([email protected][1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")
ex.2@gof <- c(ex.2@gof[1:3], ns.2, tm.2, fv.2, pv.2)
[email protected] <- c([email protected][1:3], FALSE, FALSE, TRUE, TRUE)

fv.3 <- summary(FE)$fstatistic$statistic
pv.3 <- summary(FE)$fstatistic$p.value
ns.3 <- getIndex(FE)[1]
tm.3 <- getIndex(FE)[2]

ex.3 <- extract(FE)
[email protected] <- c([email protected][1:3],"Indiv.(n)", "Time (T)", 
                    "F-stat", "P-value")
ex.3@gof <- c(ex.3@gof[1:3], ns.3, tm.3, fv.3, pv.3)
[email protected] <- c([email protected][1:3], FALSE, FALSE, TRUE, TRUE)

Yielding

> screenreg(list(ex.1, ex.2, ex.3))

=======================================================
                  Model 1      Model 2      Model 3    
-------------------------------------------------------
[TRUNCATED...]
-------------------------------------------------------
R^2                  0.19         0.18         0.18    
Adj. R^2             0.19         0.18         0.06    
Num. obs.         4360         4360         4360       
Indiv.(n)          545          545          545       
Time (T)             8            8            8       
F-stat              72.46        68.41        83.85    
P-value              0.00         0.00         0.00    
=======================================================
*** p < 0.001, ** p < 0.01, * p < 0.05

Take a look into e.g. str(extract(FE)) to apply this for additional GOFs.

To wrap it into a function take a look at code in @Neal Fultz's answer.

Humdinger answered 24/5, 2018 at 11:55 Comment(5)
Waw, it really does require quite a lot of hacking to get there. I had imagined there would be some simple called or command I had overlooked. Thanks for demonstrating this approach!Cupp
Welcome! Hope I didn't reinvent the wheel, I also couldn't find a simple command, though.Humdinger
I'll hold out for some hours and see if someone can point to a wheel. Thanks again.Cupp
Would you be up for editing your answer to include the 545 Indiv.(n)? If so I'll be happy to award you the bounty.Cupp
Sure, I'd be honored. It was quite a hack, the difficulty was, that small n and time variable nowhere seems to be saved within summary(model), so we have to consult the raw data to obtain them. I've tested getIndex() with some other data, too, and should be working in most cases. Test the function with e.g. plm(Sepal.Length ~ Sepal.Width, iris) vs. plm(Sepal.Length ~ Sepal.Width, iris, index="Species").Humdinger
S
1

Here is one possibility using huxreg from my huxtable package. You will also need the broom package installed.

library(huxtable)

ht <- huxreg(POLS, RE, FE, 
      coefs = c("Married" = "married", "Union" = "union"), 
      statistics = c("Obs. (N)" = "nobs", "Adj. R^2" = "adj.r.squared", 
        "F statistic" = "statistic", "P value" = "p.value"))

The number of time units is not in broom::glance, so we add it manually - here is a simple way:

Ts <- purrr::map_int(list(POLS, RE, FE), list(pdim, "nT", "T"))
ns <- purrr::map_int(list(POLS, RE, FE), list(pdim, "nT", "n"))
ht <- insert_row(ht, c("Time (T)", Ts), after = 6)
ht <- insert_row(ht, c("Indiv. (n)", ns), after = 6)
ht

───────────────────────────────────────────────────────────────
                     (1)             (2)             (3)       
              ─────────────────────────────────────────────────
  Married          0.108 ***       0.064 ***       0.047 *     
                  (0.016)         (0.017)         (0.018)      
  Union            0.182 ***       0.106 ***       0.080 ***   
                  (0.017)         (0.018)         (0.019)      
              ─────────────────────────────────────────────────
  Obs. (N)      4360            4360            4360           
  Indiv. (n)     545             545             545           
  Time (T)         8               8               8           
  Adj. R^2         0.187           0.178           0.061       
  F statistic     72.459          68.412          83.851       
  P value          7.250e-186      5.813e-176      1.655e-156  
───────────────────────────────────────────────────────────────
  *** p < 0.001; ** p < 0.01; * p < 0.05. 

Column names: names, model1, model2, model3

This will automatically print out as TeX/HTML within a Rmarkdown document. You can edit it to add rows, columns and further formatting.

Stepmother answered 24/5, 2018 at 12:27 Comment(4)
Waw, thanks for demonstrating how your huxtable package can answer my question. Maybe I should used that! I am exporting to LaTeX. Two quick questions. How did you find the names statistic and p.value, asking as I still to identify small n, what I called Indiv.(n). Second, what was your motivation for writing huxtable? I take you have a link? :)Cupp
Those names are from broom::glance. The small n isn't available there... you can find it in pdim(model)$nT$n, and I learned that by looking through the code of plm:::print.summary.plm!Stepmother
Updated my answer to add the individualsStepmother
Very nice. Thanks! I am however a bit reluctant to reward the bouncy to this answer as I, in my initial question, asked for a texreg solution. That is not to say that you haven't answered my question!Cupp

© 2022 - 2024 — McMap. All rights reserved.