R XTS to.minutes5(), is not converting as "I" expected
Asked Answered
F

1

4

Hi i'm converting some 1 min data to 5 min data, and i'm finding it does 4 mins for the first increment, then goes on to do 5 min increments after that.

I've tried messing around with all the "indexAt" parameters but none give me what i want, which is starting from 5, then 10, 15, 20 etc.

i've tried

x5 <- to.minutes5(x)

AND

x <- to.period(x,
          period = 'minutes', 
          k = 5, 
          OHLC = TRUE)

1 min data

                     Open  High   Low Close Volume
2013-01-16 00:01:00 93.55 93.60 93.54 93.58      5
2013-01-16 00:02:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:03:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:04:00 93.58 93.58 93.57 93.57     12
2013-01-16 00:05:00 93.57 93.57 93.55 93.70     21
2013-01-16 00:06:00 93.56 93.56 93.56 93.56      5
2013-01-16 00:07:00 93.56 93.56 93.55 93.55      3
2013-01-16 00:08:00 93.55 93.55 93.55 93.55      2
2013-01-16 00:09:00 93.55 93.56 93.55 93.56      2
2013-01-16 00:10:00 93.56 93.56 93.56 93.56      1
2013-01-16 00:11:00 93.57 93.57 93.57 93.57      3

after converting to 5 min

                    clemtest.Open clemtest.High clemtest.Low clemtest.Close clemtest.Volume
2013-01-16 00:04:00         93.55         93.60        93.54          93.57              27
2013-01-16 00:09:00         93.57         93.57        93.55          93.56              33
2013-01-16 00:14:00         93.56         93.57        93.56          93.57               8
2013-01-16 00:19:00         93.56         93.58        93.51          93.53              77
2013-01-16 00:24:00         93.53         93.55        93.49          93.49             121
2013-01-16 00:29:00         93.49         93.51        93.49          93.51             121

the calculations are correct, its just not starting with the first 5 mins of data, it start with 4 mins, then goes onto 5 mins after that.

(using indexAt='startof' gives me the correct, 5, 10, 15... but when inspecting the bar the 5 min data represents the start of that 5 min (eg min 5 - 10) not min 0 - 5)

Here is the tail of the 1min data for reference.

                         Open  High   Low Close Volume
2013-01-17 23:53:00 95.52 95.52 95.52 95.52      2
2013-01-17 23:55:00 95.51 95.52 95.51 95.52      2
2013-01-17 23:56:00 95.51 95.51 95.51 95.51      1
2013-01-17 23:57:00 95.52 95.52 95.52 95.52      1
2013-01-17 23:59:00 95.52 95.52 95.51 95.51      4
2013-01-18 00:00:00 95.51 95.51 95.51 95.51      8
Fond answered 15/8, 2015 at 0:45 Comment(10)
to.period (and therefore to.minutes, to.minutes5, etc) use the endpoints of each interval to aggregate the data. The last observation in the first 5 minutes of your data is at 00:04:00, which is what you see. 00:05:00 is the beginning of the second 5-minute interval in the zero hour. Can you add some detail about why you want to do this?Mulligan
Take a look at library(highfrequency) package, they have a function called aggregatets() which might be what you are looking for. @user1736644Bolshevist
Hi Joshua, Why i want this is: I'm working with multiple time frames, tick, 5min and 60min all in the same strategy. I would like the end of the 12th 5 min bar to match up with the 60min bar. But if i'm getting 4,9,14 bars, it won't match. And I would like the close of the first 5min bar to equal the close of the 5th 1 min bar, In this case the close of the 5th 1 min bar is 93.70, therefore i would like the close of the first 5min bar to be 93.70 and have a time stamp of 00:05:00. (if i look a the tick data from 00:00:00 - 00:04:59 it starts from 93.55 and finish at 93.70). ThanksFond
Also if you notice the volume, The first 5 mins of the day are from 00:00:00 - 00:05:00 therefore the volume of the first 5 mins should be 5+5+5+12+21(48)Fond
If it helps I also added the tail of the 1min data to the original questionFond
If "In this case the close of the 5th 1 min bar is 93.70" is true, then your timestamps are misleading; they contain data that occur after the time they are aligned with (the end of the 5th minute of the day occurs at 00:05:59.9, but your data says it occurs at 00:05:00). I would encourage you not to continue this practice in your analysis, because it can lead to "peeking" bias. Also, the first 5 minutes of the day are from 00:00:00-04:59:59. 00:05:00 is the beginning of the second 5 minutes of the day. So (if your timestamps are correct) the volume of the first 5 minutes should be 27.Mulligan
I was thinking the end of the 5th minute would be 00:05:00. The first minute would be from 00:00:01 - 00.01.00, the second minute 00:01:01 - 00:02:00, Third min 00:02:01 - 00:03:00. Therefore if i was to look at the tick data for 00:05:00, i would expect the 1 min and 5 min bar data at 00:05:00 to be all the same price (also the volume would equal the same).Fond
What i was using this for: I use tick data for Bid/Ask as my primary data for entry and exit (Quantstrat), but i'm also using a SAR indicator on the 5 min bars time frame. I calculate the indicator value and then merge the XTS object together with the tick data, the result is at every 5 min time stamp i have the SAR value, which i then fill UP rows (n/a's), therefore the SAR at 00:10:00 gets up filled for the previous 5 mins to 00:05:01. This way when i'm running my strategy and the time is 00:05:01 I know what the SAR value is for the previous 5 min bar.Fond
Maybe this is due to the data i have exported from my trading application, it assumes that the minute 00:01:00 is the OHLC V from 00:00:01 - 00:01:00. Are you suggesting that the data should be from 00:01:00 - 00:01:59 ? thanks again for your help, appreciated.Fond
Ok after some much playing around and reading and downloading other sources of tick data your first comment has sunk in!! As 00:04:00 is the last time stamp of the first 5 mins, it uses that as the time stamp! What i really need is that time stamp to be 00:04:59:999, Because the first 5 mins of my tick data goes up until to 00:04:59:999, and and when i reference what was happening in the 5 min bar time frame i use the same time index to pull the value out eg mktdata$bid[timestamp] > mktdata$5minclose[timestamp] . Maybe i'm doing this the wrong way (multi time frame indicators)Fond
D
2

Is this what you're looking for (using just 12 minutes of data, as that is all you posted)?

x <- read.table(text ="
2013-01-16 00:01:00 93.55 93.60 93.54 93.58      5
2013-01-16 00:02:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:03:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:04:00 93.58 93.58 93.57 93.57     12
2013-01-16 00:05:00 93.57 93.57 93.55 93.70     21
2013-01-16 00:06:00 93.56 93.56 93.56 93.56      5
2013-01-16 00:07:00 93.56 93.56 93.55 93.55      3
2013-01-16 00:08:00 93.55 93.55 93.55 93.55      2
2013-01-16 00:09:00 93.55 93.56 93.55 93.56      2
2013-01-16 00:10:00 93.56 93.56 93.56 93.56      1
2013-01-16 00:11:00 93.57 93.57 93.57 93.57      3")

colnames(x) <- c("Date", "time",  "Open",  "High",   "Low", "Close", "Volume")


xt <- xts(x[, 3:7], order.by = as.POSIXct(paste0(x$Date, x$time, " ")) - 0.000001)



xt5 <- to.period(xt, period = "minutes", k =5)
xt5 <- align.time(xt5, n = 300)
xt5

#                     xt.Open xt.High xt.Low xt.Close xt.Volume
# 2013-01-16 00:05:00   93.55   93.60  93.54    93.70        48
# 2013-01-16 00:10:00   93.56   93.56  93.55    93.56        13
# 2013-01-16 00:15:00   93.57   93.57  93.57    93.57         3

To include the bar with stamp "2013-01-16 00:05:00" in the 5 minute bar that includes the interval ["2013-01-16 00:00:00", "2013-01-16 00:04:59.99999"], you could reduce the underlying time by a tiny amount of a second (a slightly negative quantity, here say -0.000001), so that it is included in the first 5 minute interval.

I think the confusion in your comments is avoided if you make the decision of stating whether the timestamp on the OHLC bar data is at the start of the bar or the end of the bar. i.e. does "2013-01-16 00:01:00" mean the OHLC for the interval (2013-01-16 00:00:00 to 2013-01-16 00:00:59.999) or (2013-01-16 00:01:00, 2013-01-16 00:01:59.999). In your case, it is at the end of the bar (the former case).

And the timestamp being the start of the bar for OHLC data isn't a good idea as it introduces look forward bias when you merge xts objects on different bar intervals together.

Diluent answered 10/11, 2017 at 5:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.