Analysing a data frame that contains a time series using stargazer

I have a panel data set of 10 observations (countries) and 3 variables: two migration parameters plus one column for the respective year. Over three years this gives 30 rows (10 countries * 3 years), so my data frame consists of 3 annual data frames stacked on top of each other, so to say.

How can I apply stargazer to the whole period while taking into account that it is a panel data set (so max N=10)? That is, R should start over after every 10th row. I'd like to get the pretty table of descriptive statistics.

The data set for the first three years:

structure(list(Population = c(21759420, 8696916, 1946351, 14689726, 
8212264, 491723, 18907008, 4345386, 11133861, 657229, 22549547, 
8944706, 1979882, 15141099, 8489031, 496963, 19432541, 4404230, 
11502786, 673252, 23369131, 9199259, 2014866, 15605217, 8766930, 
502384, 19970495, 4448525, 11887202, 689692), Distance..km. = c(7243L, 
4290L, 9500L, 3789L, 6452L, 2211L, 4667L, 5036L, 4047L, 9140L, 
7243L, 4290L, 9500L, 3789L, 6452L, 2211L, 4667L, 5036L, 4047L, 
9140L, 7243L, 4290L, 9500L, 3789L, 6452L, 2211L, 4667L, 5036L, 
4047L, 9140L), year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008, 
2008, 2008, 2008, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 
2009, 2009, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 
2010)), .Names = c("Population", "Distance..km.", "year"), row.names = c(1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 50L, 51L, 52L, 53L, 54L, 
55L, 56L, 57L, 58L, 59L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 
106L, 107L, 108L), class = "data.frame")

I still get descriptive statistics with N=30, but it should be N=10, since I'm looking for the descriptive statistics over the whole three-year period and each yearly data frame needs to be considered in isolation. I hope I have expressed the problem comprehensibly.

Bays answered 10/11, 2017 at 15:56 Comment(4)
By time series, do you mean panel data? A time series is univariate, whereas panel data is multivariate and can have more than one entity. Also, stargazer is a package for printing well-formatted tables, not an analysis tool, so your question of "R should start over to analyse after every 49th row." does not make any sense. - Negrete
What exactly are you trying to do here? stargazer just makes pretty tables and doesn't really do an analysis. You should provide some sort of minimal reproducible example with data that can be used for testing and a clear description of the desired output. - Subtle
Your sample data only has one row... Please provide panel data by copying and pasting the output of dput(my_data) into your question. - Negrete
Please read my comment again, and provide the dput(my_data) version instead of what you have here. Also read MrFlick's link on how to provide a minimal reproducible example. - Negrete

You can either use split + lapply from base R:

library(stargazer)

lapply(split(df, df$year), stargazer, type = "text")

or by:

by(df, df$year, stargazer, type = 'text')

Result (as printed by the by call):

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,083,988.000 7,541,970.000 491,723 21,759,420
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,008.000       0.000      2,008    2,008   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,361,404.000 7,798,880.000 496,963 22,549,547
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,009.000       0.000      2,009    2,009   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,645,370.000 8,065,676.000 502,384 23,369,131
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,010.000       0.000      2,010    2,010   
---------------------------------------------------------------
df$year: 2008
[1] ""                                                               
[2] "==============================================================="
[3] "Statistic     N      Mean        St. Dev.      Min      Max    "
[4] "---------------------------------------------------------------"
[5] "Population    10 9,083,988.000 7,541,970.000 491,723 21,759,420"
[6] "Distance..km. 10   5,637.500     2,385.941    2,211    9,500   "
[7] "year          10   2,008.000       0.000      2,008    2,008   "
[8] "---------------------------------------------------------------"
-------------------------------------------------------------------------- 
df$year: 2009
[1] ""                                                               
[2] "==============================================================="
[3] "Statistic     N      Mean        St. Dev.      Min      Max    "
[4] "---------------------------------------------------------------"
[5] "Population    10 9,361,404.000 7,798,880.000 496,963 22,549,547"
[6] "Distance..km. 10   5,637.500     2,385.941    2,211    9,500   "
[7] "year          10   2,009.000       0.000      2,009    2,009   "
[8] "---------------------------------------------------------------"
-------------------------------------------------------------------------- 
df$year: 2010
[1] ""                                                               
[2] "==============================================================="
[3] "Statistic     N      Mean        St. Dev.      Min      Max    "
[4] "---------------------------------------------------------------"
[5] "Population    10 9,645,370.000 8,065,676.000 502,384 23,369,131"
[6] "Distance..km. 10   5,637.500     2,385.941    2,211    9,500   "
[7] "year          10   2,010.000       0.000      2,010    2,010   "
[8] "---------------------------------------------------------------"
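
To see why each table reports N = 10, it can help to look at what split produces before stargazer is involved. The sketch below rebuilds the question's data frame in a compact form (an assumption for illustration: the values are copied from the dput output, but the original row names are not preserved) and checks the group sizes using base R only. The name `yearly` is introduced here and is not part of the original answer.

```r
# Rebuild the question's data frame compactly (same values as the dput output;
# row names are not preserved, which does not affect the summary statistics).
df <- data.frame(
  Population = c(21759420, 8696916, 1946351, 14689726, 8212264,
                 491723, 18907008, 4345386, 11133861, 657229,
                 22549547, 8944706, 1979882, 15141099, 8489031,
                 496963, 19432541, 4404230, 11502786, 673252,
                 23369131, 9199259, 2014866, 15605217, 8766930,
                 502384, 19970495, 4448525, 11887202, 689692),
  Distance..km. = rep(c(7243L, 4290L, 9500L, 3789L, 6452L,
                        2211L, 4667L, 5036L, 4047L, 9140L), times = 3),
  year = rep(c(2008, 2009, 2010), each = 10)
)

# split() returns a named list with one data frame per distinct year.
yearly <- split(df, df$year)

names(yearly)         # "2008" "2009" "2010"
sapply(yearly, nrow)  # 10 rows in each piece, hence N = 10 per table

# Cross-check the Population mean that stargazer reports for each year.
sapply(yearly, function(x) mean(x$Population))
```

Each element of `yearly` is an ordinary data frame, which is why passing the pieces to stargazer one at a time yields one summary table per year.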

The disadvantage of these two methods is that they print out the tables twice: once when stargazer itself writes the table to the console, and once more when the console prints the value returned by lapply/by. To get around this, you can use walk from purrr to call stargazer only for its side effects:

library(dplyr)
library(purrr)

df %>%
  split(.$year) %>%
  walk(~ stargazer(., type = "text"))

Result:

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,083,988.000 7,541,970.000 491,723 21,759,420
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,008.000       0.000      2,008    2,008   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,361,404.000 7,798,880.000 496,963 22,549,547
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,009.000       0.000      2,009    2,009   
---------------------------------------------------------------

===============================================================
Statistic     N      Mean        St. Dev.      Min      Max    
---------------------------------------------------------------
Population    10 9,645,370.000 8,065,676.000 502,384 23,369,131
Distance..km. 10   5,637.500     2,385.941    2,211    9,500   
year          10   2,010.000       0.000      2,010    2,010   
---------------------------------------------------------------
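
If adding purrr is not desirable, the same effect can probably be had in base R, since a for loop does not auto-print return values. The following is a minimal sketch, not part of the original answer: `print_tables_by_group` and its parameters are hypothetical names introduced here, and it assumes stargazer's `title` argument is used to label each table with its year.

```r
library(stargazer)

# Hypothetical helper: print one summary table per distinct value of
# `group_col`. The for loop discards stargazer's return value, so each
# table appears only once.
print_tables_by_group <- function(data, group_col = "year", type = "text") {
  for (value in unique(data[[group_col]])) {
    rows <- data[data[[group_col]] == value, , drop = FALSE]
    # `title` labels the table so the years can be told apart.
    stargazer(rows, type = type, title = paste(group_col, value))
  }
  invisible(NULL)
}
```

With the question's data in `df`, this would be called as `print_tables_by_group(df)`. A shorter variant without titles is `invisible(lapply(split(df, df$year), stargazer, type = "text"))`, which suppresses the second printout by hiding the list that lapply returns.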

Note:

All methods above work for LaTeX output (type = "latex"). I only set type = "text" for demonstration purposes.

Negrete answered 13/11, 2017 at 15:25 Comment(2)
Thanks a lot, that works! I'd need to solve two more things. 1. When I open the result in my browser as an .htm file, it only shows me the last table (I may not use LaTeX). 2. I now have the tables for each year. How can I summarize those tables into one, and thereby get the statistics for the years 2008-2010? - Bays
@Bays Not sure about No. 1, as I can't reproduce your issue. For No. 2, just write stargazer(df, type = 'text')? - Negrete
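
Regarding No. 1, the cause was never reproduced in this thread, so the following is only a guess: if every call writes to the same file, later tables would overwrite earlier ones. One way to sidestep that is to collect the HTML of all yearly tables first and write them to the file in one go. This is a sketch under that assumption; `write_tables_by_group` and its parameters are hypothetical names, and it relies on capture.output to grab what stargazer prints.

```r
library(stargazer)

# Hypothetical helper: write one HTML summary table per distinct value of
# `group_col` into a single file.
write_tables_by_group <- function(data, file, group_col = "year") {
  pieces <- split(data, data[[group_col]])
  html <- lapply(names(pieces), function(value) {
    # capture.output() collects the HTML that stargazer prints to the console.
    capture.output(
      stargazer(pieces[[value]], type = "html",
                title = paste(group_col, value))
    )
  })
  # Concatenate all tables and write them once, so nothing is overwritten.
  writeLines(unlist(html), file)
  invisible(file)
}
```

With the question's data in `df`, a call such as `write_tables_by_group(df, "summary_by_year.html")` should produce a single file containing the 2008, 2009 and 2010 tables one after another.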