R: count 15 minutes interval in time
Asked Answered
H

2

6

I would like to count the amount of sessions started at each 15 minutes intervals for businessdays within a large dataset.

My data looks like:

df <- 

Start_datetime       End_datetime       Duration    Volume
2016-04-01 06:20:55 2016-04-01 14:41:22  08:20:27   8.360
2016-04-01 08:22:27 2016-04-01 08:22:40  00:00:13   0.000
2016-04-01 08:38:53 2016-04-01 09:31:58  00:53:05   12.570
2016-04-01 09:33:57 2016-04-01 12:37:43  03:03:46   7.320
2016-04-01 10:05:03 2016-04-01 16:41:16  06:36:13   9.520
2016-04-01 12:07:57 2016-04-02 22:22:32  34:14:35   7.230
2016-04-01 16:56:55 2016-04-02 10:40:17  17:43:22   5.300
2016-04-01 17:29:18 2016-04-01 19:50:29  02:21:11   7.020
2016-04-01 17:42:39 2016-04-01 19:45:38  02:02:59   2.430
2016-04-01 17:47:57 2016-04-01 20:26:35  02:38:38   8.090
2016-04-01 22:00:15 2016-04-04 08:22:21  58:22:06   4.710
2016-04-02 01:12:38 2016-04-02 09:49:00  08:36:22   3.150
2016-04-02 01:32:00 2016-04-02 12:49:47  11:17:47   5.760
2016-04-02 07:28:48 2016-04-04 06:58:56  47:30:08   0.000
2016-04-02 07:55:18 2016-04-05 07:55:15  71:59:57   0.240

I would like to count all the starting sessions per 15minutes starting, where:

For business days
  Time                PTU    Count
  00:00:00 - 00:15:00  1       10     #(where count is the amount of sessions started between 00:00:00 and 00:15:00)
  00:15:00 - 00:30:00  2       6
  00:30:00 - 00:45:00  3       5
  00:45:00 - 01:00:00  3       3

And so on and the same data for the weekend.

I have tried the cut function:

df$PTU <- table (cut(df$Start_datetime, breaks="15 minutes"))
data.frame(PTU)

EDIT: When I run this i receive the following error:

Error in cut.default(df$Start_datetime, breaks = "15 minutes") :'x' must be numeric

And some functions with lubridate, but I can't seem to make it work. My final goal is to create a table like the following, but then with 15 minutes interval.
enter image description here

Hyland answered 13/6, 2017 at 10:6 Comment(6)
Could you explain why the cut approach is not workingCorolla
Could you dput a bit of data?Kovno
If you looking for business days, please check hereCorolla
@akrun: I receive an error which I have added in my question. I would like to have weekdays (mo, tu, wed, th, fri) and not perse business days, but thanks anyway for the tip.Hyland
@Kovno I'm really sorry but I don't quite get what you mean with {dput} ? (I'm new to R)Hyland
@Hyland When you want to share data in R, the dput command will turn your dataframe into a few lines of code which you can copy-paste into your questions. Try ?dput to find out moreHold
K
2

Here is a sort of complete process from datetime "strings" to the format you want. The start is a string vector:

Start_time <- 
c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53", 
  "2016-04-01 09:33:57", "2016-04-01 10:05:03", "2016-04-01 12:07:57", 
  "2016-04-01 16:56:55", "2016-04-01 17:29:18", "2016-04-01 17:42:39", 
  "2016-04-01 17:47:57", "2016-04-01 22:00:15", "2016-04-02 01:12:38", 
  "2016-04-02 01:32:00", "2016-04-02 07:28:48", "2016-04-02 07:55:18"
)
df <- data.frame(Start_time)

And this is an actual processing

## We will use two packages
library(lubridate)
library(data.table)

# convert df to data.table, parse the datetime string
setDT(df)[, Start_time := ymd_hms(Start_time)] 
# floor time by 15 min to assign the appropriate slot (new variable Start_time_slot)
df[, Start_time_slot := floor_date(Start_time, "15 min")]

# aggregate by wday and time in a date
start_time_data_frame <- df[, .N, by = .(wday(Start_time_slot), format(Start_time_slot, format="%H:%M:%S") )]

# output looks like this 
start_time_data_frame
##     wday     time N
##  1:    6 06:15:00 1
##  2:    6 08:15:00 1
##  3:    6 08:30:00 1
##  4:    6 09:30:00 1
##  5:    6 10:00:00 1
##  6:    6 12:00:00 1
##  7:    6 16:45:00 1
##  8:    6 17:15:00 1
##  9:    6 17:30:00 1
## 10:    6 17:45:00 1
## 11:    6 22:00:00 1
## 12:    7 01:00:00 1
## 13:    7 01:30:00 1
## 14:    7 07:15:00 1
## 15:    7 07:45:00 1
Kovno answered 13/6, 2017 at 13:59 Comment(0)
C
1

There's two things you have to keep in mind when using cut on datetimes:

  1. Make sure your data is actually a POSIXt class. I'm quite sure yours isn't, or R wouldn't be using cut.default but cut.POSIXt as a method.
  2. "15 minutes" should be "15 min". See ?cut.POSIXt

So this works:

Start_datetime <- as.POSIXct(
  c("2016-04-01 06:20:55",
    "2016-04-01 06:22:12",
    "2016-04-01 05:30:12")
)

table(cut(Start_datetime, breaks = "15 min"))
# 2016-04-01 05:30:00 2016-04-01 05:45:00 2016-04-01 06:00:00 2016-04-01 06:15:00 
#                   1                   0                   0                   2 

Note that the output gives you the start of the 15 minute interval as names of the table.

Coddle answered 13/6, 2017 at 12:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.