How to display coefficients in scientific notation with stargazer

I want to compare the results of different models (lm, glm, plm, pglm) in a table in R using stargazer or a similar tool. However, I can't find a way to display the coefficients in scientific notation. This is a problem because the intercept is rather large (about a million) while other coefficients are small (about 1e-7), which results in lots of useless zeros that make the table harder to read.

I found a similar question here: Format model display in texreg or stargazer R as scientific. But the answers there require rescaling the variables, and since I use count data I wouldn't want to rescale it.

I am grateful for any suggestions.

Volatilize answered 22/7, 2015 at 0:20 Comment(1)
This question is somewhat relevant but not directly applicable here: #36584923Hell

Here's a reproducible example:

m1 <- lm(Sepal.Length ~ Petal.Length*Sepal.Width,
         transform(iris, Sepal.Length = Sepal.Length+1e6,
                   Petal.Length=Petal.Length*10, Sepal.Width=Sepal.Width*100))
# Coefficients:
#              (Intercept)              Petal.Length               Sepal.Width  Petal.Length:Sepal.Width  
#                1.000e+06                 7.185e-02                 8.500e-03                -7.701e-05  

I don't believe stargazer has easy support for this. You could try alternatives such as xtable, or any of the many options here (I have not tried them all).

library(xtable)
xtable(m1, display=rep('g', 5)) # or there's `digits` too; see `?xtable`

Or if you're using knitr or pandoc I quite like pander, which has automagic scientific notation already (note: this is pandoc output which looks like markdown, not tex output, and then you knit or pandoc to latex/pdf):

library(pander)
pander(m1)
Binaural answered 22/7, 2015 at 5:8 Comment(2)
Okay thanks. Do you know a package that supports pglm models? I tried out several from the list you posted but wasn't lucky so far.Haphtarah
pglm is quite specific. Not even your original choice stargazer supports it. I think you will have to convert it to a nice format yourself (e.g. convert to dataframe then use one of the packages which will generate the latex for it).Binaural
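
A minimal sketch of the "convert it to a data frame yourself" route suggested in the comment above, using base R only. It reuses the lm example from this answer because it runs anywhere; that coef(summary(fit)) also returns a numeric matrix for a pglm fit is an assumption you would need to check for your model class.

```r
# Same toy model as in the answer above.
m1 <- lm(Sepal.Length ~ Petal.Length * Sepal.Width,
         transform(iris, Sepal.Length = Sepal.Length + 1e6,
                   Petal.Length = Petal.Length * 10,
                   Sepal.Width = Sepal.Width * 100))

# For lm, coef(summary(fit)) is a numeric matrix with one row per term.
# ASSUMPTION: your pglm fit exposes a matrix of the same shape.
coefs <- as.data.frame(coef(summary(m1)))

# Replace every column by strings with a fixed two-decimal mantissa;
# assigning into sci[] keeps the row and column names.
sci <- coefs
sci[] <- lapply(coefs, function(col) sprintf("%.2e", col))
print(sci)
```

The resulting data frame holds plain strings, so a table package such as xtable should print it without reformatting the numbers again.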

It's probably worth making a feature request to the package maintainer to include this option.

In the meantime, you can replace numbers in the output with scientific notation auto-magically. There are a few things to be careful about when replacing numbers. It is important not to reformat numbers that are part of the LaTeX markup. Also, be careful not to replace characters that are part of variable names. For example, the . in Sepal.Width could easily be mistaken for part of a number by a regex. The following code should deal with most common situations. But if someone, for example, calls their variable X_123456789, the code might rewrite it as X_1.23e+09, depending on the scipen setting. So some caution is needed, and a more robust solution will probably need to be implemented within the stargazer package.
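
To make these pitfalls concrete, here is a small base R sketch showing what a broad pattern such as [0-9.]+ picks up. The table row is made up for illustration and is not real stargazer output.

```r
# A simulated LaTeX table row (hypothetical content, for illustration only).
row <- "Sepal.Width & 0.0000085 \\\\[-1.8ex] X_123456789"

# Collect every run of digits and dots that the broad pattern matches.
hits <- regmatches(row, gregexpr("[0-9.]+", row))[[1]]
print(hits)
# [1] "."         "0.0000085" "1.8"       "123456789"
```

Only the second match is a genuine coefficient. The others come from a variable name, from LaTeX spacing markup, and from digits inside another variable name, which is why the helper below skips short matches.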

Here's an example stargazer table to demonstrate on (shamelessly copied from @mathematical.coffee):

library(stargazer)
library(gsubfn)
m1 <- lm(Sepal.Length ~ Petal.Length*Sepal.Width,
  transform(iris, Sepal.Length = Sepal.Length+1e6,
    Petal.Length=Petal.Length*10, Sepal.Width=Sepal.Width*100))
# keep the LaTeX lines in a character vector; no thousands separator
star = stargazer(m1, header = FALSE, digit.separator = '')

Now a helper function to reformat the numbers. You can play around with the digits and scipen parameters to control the output format. If you want to force scientific format more often, use a smaller (more negative) scipen. With a larger scipen, scientific format is used automatically only for very small or very large numbers. The cutoff parameter is there to prevent reformatting of numbers represented by only a few characters.

replace_numbers = function(x, cutoff=4, digits=3, scipen=-7) {
  ifelse(nchar(x) < cutoff, x, prettyNum(as.numeric(x), digits=digits, scientific=scipen))
}

And apply that to the stargazer output using gsubfn::gsubfn:

gsubfn("([0-9.]+)", ~replace_numbers(x), star)
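
As a rough illustration of how the scipen value steers the choice of notation, the sketch below calls base R's format() directly, which is what prettyNum hands numeric input to. The value 123.456 is an arbitrary example.

```r
x <- 123.456

# A negative penalty makes the fixed-width form "cost" more,
# so scientific notation is chosen even for a moderate number.
forced <- format(x, digits = 3, scientific = -7)

# With a penalty of 0 the shorter representation wins,
# which for this value is the fixed form.
auto <- format(x, digits = 3, scientific = 0)

print(c(forced = forced, auto = auto))
```

Passing a number rather than TRUE or FALSE to scientific applies it as a local scipen penalty for that call only, so the global option is left untouched.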


Charlton answered 4/7, 2019 at 17:18 Comment(4)
The problem also though with @Charlton and the current answer is that it doesn't exactly let you do proper scientific notation, which would be X.Ye+(Z) . The above output shows this issue, do you know what I mean?Madelle
Not sure I quite follow. Do you mean that you want something like e.g. 1.5 x 10$^3$ rather than 1.5e+03?Charlton
No, I mean that 1e+06 should be 1.2e+06 or whatever. There should be conformity in the decimals. You wouldn't want, for example, 8.44e-01 and then 8.4e-01 on the next line. You'd want 8.44e-01 and then 8.40e-01 (if that's what it is). (R^2 example above)Madelle
Ok - I'm with you. I edited the other answer (the one with the bounty) with a version that uses sprintf instead of prettyNum. This allows one to have always the same number of digits in the scientific notation (retaining any trailing zeroes). Same change could be applied to this version too if needed.Charlton

Another robust way to get scientific notation using stargazer is to hack the decimal.mark parameter. This option lets the user specify the character that separates the integer part of a number from its decimals (a period . in most locales). We can usurp this parameter to insert a uniquely identifiable string into any number that we want to be able to find using regex. The advantage of searching for numbers this way is that we only find numbers that correspond to numeric values in the stargazer output. That is, there is no possibility of also matching numbers that are part of variable names (e.g. X_12345) or part of the LaTeX formatting code (e.g. \hline \\[-1.8ex]). In the following I use the string ::::, but any unique character string (such as a hash) that does not occur elsewhere in the table will do. It's probably best to avoid special regex characters in the identifier mark, as these would complicate things slightly.

Using the example model m1 from this other answer:

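library(stargazer)
library(gsubfn)

mark = '::::'
# decimal.mark puts the marker into every numeric value in the table
star = stargazer(m1, header = FALSE, decimal.mark = mark, digit.separator = '')

replace_numbers = function(x, low=0.01, high=1e3, digits = 3, scipen=-7, ...) {
  x = gsub(mark, '.', x, fixed = TRUE)  # restore the real decimal point
  x.num = as.numeric(x)
  ifelse(
    # compare magnitudes so that negative values are treated like positive ones
    (abs(x.num) >= low) & (abs(x.num) < high),
    round(x.num, digits = digits),
    prettyNum(x.num, digits=digits, scientific = scipen, ...)
  )
}

# a number is: digits, the marker, more digits
reg = paste0("([0-9.\\-]+", mark, "[0-9.\\-]+)")
cat(gsubfn(reg, ~replace_numbers(x), star), sep='\n')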
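
The marker idea can be tried without stargazer. The following base R sketch uses a made-up table row in which only the real numeric value carries the marker, and a simplified pattern (an optional minus sign, digits, the marker, digits) rather than the exact regex above.

```r
mark <- "::::"

# Simulated row (hypothetical): the variable name and the LaTeX spacing
# code contain digits, but only the coefficient contains the marker.
row <- "X_12345 & 1000005::::800 \\\\[-1.8ex]"

# Simplified pattern, an assumption made for this sketch.
pattern <- paste0("-?[0-9]+", mark, "[0-9]+")
hits <- regmatches(row, gregexpr(pattern, row))[[1]]

# Put the real decimal point back before converting to numeric.
restored <- sub(mark, ".", hits, fixed = TRUE)
print(hits)
print(as.numeric(restored))
```

Neither X_12345 nor the 1.8 in the spacing code is matched, because neither contains the marker.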


Update: If you want to ensure that trailing zeros are retained in the scientific notation, you can use sprintf instead of prettyNum, like this:

replace_numbers = function(x, low=0.01, high=1e3, digits = 3) {
  x = gsub(mark,'.',x)
  x.num = as.numeric(x)
  form = paste0('%.', digits, 'e')
  ifelse(
    (abs(x.num) >= low) & (abs(x.num) < high), 
    round(x.num, digits = digits), 
    sprintf(form, x.num) 
  )
}
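
A quick base R comparison of the two formatting routes on a single value, to show why sprintf keeps the decimals aligned. The value 0.84 is just an example.

```r
x <- 0.84

# format() (used by prettyNum) drops trailing zeros in the mantissa.
via_format <- format(x, digits = 3, scientific = TRUE)

# sprintf() with "%.2e" always prints exactly two decimals.
via_sprintf <- sprintf("%.2e", x)

print(c(via_format, via_sprintf))
# [1] "8.4e-01"  "8.40e-01"
```

Note that "%.2e" gives three significant digits in total, so the digits argument in the sprintf version of replace_numbers counts decimals of the mantissa, not significant digits.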


Charlton answered 7/7, 2019 at 17:13 Comment(5)
I like this answer because it is explicit about how to not convert moderate values, but only extreme ones. One problem I am having though is that the regressions are coming with coefficients around 10^-8, so I need to use a lot of digits so that everything doesn't become 0. That's fine, but then the R^2 at the end looks like .0131575939. Is there a way to round all non-converted decimals to the nearest thousandth?Madelle
Good point @Madelle - I edited it to use round on the moderate values, so that you can specify digits also on these values. Does this do what you're looking for?Charlton
I gave you bounty to be safe, let me check these now, great answer. I predict this will be a more viewed question over time as academics move to integrated markdown for documents!Madelle
Could you explain what the scipen = -7 does more specifically? For some reason also the rounding works, but numbers like .003 are not being converted to scientific notationMadelle
Sorry - forgot to pass scipen through to prettyNum in this version. Corrected now. Basically, scipen is a penalty on the width of scientific notation: fixed notation is preferred unless it is more than scipen characters wider than the scientific form. Smaller (more negative) values force scientific notation more aggressively.Charlton
