table with data (its a data.table object) that looks like the following :
date stock_id logret
1: 2011-01-01 1 0.001
2: 2011-01-02 1 0.003
3: 2011-01-03 1 0.005
4: 2011-01-04 1 0.007
5: 2011-01-05 1 0.009
6: 2011-01-06 1 0.011
7: 2011-01-01 2 0.013
8: 2011-01-02 2 0.015
9: 2011-01-03 2 0.017
10: 2011-01-04 2 0.019
11: 2011-01-05 2 0.021
12: 2011-01-06 2 0.023
13: 2011-01-01 3 0.025
14: 2011-01-02 3 0.027
15: 2011-01-03 3 0.029
16: 2011-01-04 3 0.031
17: 2011-01-05 3 0.033
18: 2011-01-06 3 0.035
The above can be created as :
DT = data.table(
date=rep(as.Date('2011-01-01')+0:5,3) ,
stock_id=c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3),
logret=seq(0.001, by=0.002, len=18));
setkeyv(DT,c('stock_id','date'))
Of course the real table is larger with many more stock_ids and dates. The aim to to reshape this data table such that I can run a regression of all stockid log_returns with their corresponding log_returns with a lag of 1 day (or prior traded day in case of weekends).
The final results would look like :
date stock_id logret lagret
1: 2011-01-01 1 0.001 NA
2: 2011-01-02 1 0.003 0.001
3: 2011-01-03 1 0.005 0.003
....
16: 2011-01-04 3 0.031 0.029
17: 2011-01-05 3 0.033 0.031
18: 2011-01-06 3 0.035 0.033
I'm finding this data structure really tricky to build without mixing up my stockid.
2011-04-01
and2011-04-04
is not 1 day. – Atalayahsetkey(stockid,date)
. Then add the lagged column using:=
androll
ondate-1
. Then do the regression by stock. – Gaspard?data.table
forroll
. – GaspardDT[DT,lagret:=log_return,roll=TRUE]
fills the lagret variable with dates close to epoch. I've eliminated the obvious stuff (i.e. Date are actually dates not strings, the keys are set correctly) – BackdateDT[list(id,date-1),lagret:=log_return,roll=TRUE]
. – GaspardDT[,lagret:=DT[list(id,date-1),logret,roll=TRUE][[3L]]]
. Which version ofdata.table
are you using? A full reproducible example would be nice - something pastable into the R session. – Gaspard