Group dates by week in R
Asked Answered
S

3

9

I wish to label my data based on their week. This is my data:

df2 <- structure(list(Order_Date = structure(c(16735, 16805, 16753, 
16830, 17075, 17009, 17085, 16740, 16891, 16750, 16820, 16849, 
16906, 16929, 16746, 16731, 16786, 16873, 16895, 16931), class = "Date")), .Names = "Order_Date", row.names = c(NA, 
-20L), class = "data.frame")

and I tried to label them based on the week (0th week, 1th week and ....) and I wish to group my data by the week later

And I tried this:

# order by data
library (dplyr)
df2<- arrange(df2, Order_Date)


# label them by week
$df2$week <- cumsum(weekdays(df2$Order_Date) == "Friday")

It doesn't give me the correct result and I have the following output, which is weird

   Order_Date week
1  2015-10-27    0
2  2016-01-05    0
3  2015-11-14    0
4  2016-01-30    0
5  2016-10-01    0
6  2016-07-27    0
7  2016-10-11    0
8  2015-11-01    0
9  2016-03-31    0
10 2015-11-11    0
11 2016-01-20    0
12 2016-02-18    0
13 2016-04-15    1
14 2016-05-08    1
15 2015-11-07    1
16 2015-10-23    2
17 2015-12-17    2
18 2016-03-13    2
19 2016-04-04    2
20 2016-05-10    2

Ideally, I'd like to have this output:

  Order_Date   label
1  2015-10-23   0
2  2015-10-27   0
3  2015-11-01   1
4  2015-11-07   2
5  2015-11-11   2
6  2015-11-14   3
7  2015-12-17   8
8  2016-01-05   10

since row number 8 occurs 10 week after row number 1 but also a solution that generates the following is my second alternative that shows these data are not in the same week:

  Order_Date   label
1  2015-10-23   0
2  2015-10-27   0
3  2015-11-01   1
4  2015-11-07   2
5  2015-11-11   2
6  2015-11-14   3
7  2015-12-17   4
8  2016-01-05   5
Squirt answered 14/11, 2016 at 4:1 Comment(5)
What is your expected output?Uric
@Uric I update the question, please let me know if it is not still clear.Squirt
Why would you expect cumsum(weekdays(df2$Order_Date) == "Friday") to give you week-number? It will only tell you the cumulative number of dates in that column which happened to be Fridays (but they're unordered, and not every day or week is covered; what if you skip a Friday or two?). That has nothing to do with the week-number.Jasminjasmina
@Jasminjasmina that's correct, I hoped to at least find a solution to find the dates which are in the same week, Look at my alternative desirable output at the bottom of the question please.Squirt
Then use the week-of-year functions from either zoo or lubridate.Jasminjasmina
E
6

The code below calculates the current week relative to the minimum week in the data. week2 uses modular arithmetic to make the code more concise, although the week numbers don't always line up exactly with the direct calculation of years and week numbers using lubridate functions.

library(dplyr)
library(lubridate)

df2 %>% mutate(week = (year(Order_Date) - year(min(Order_Date)))*52 + 
                 week(Order_Date) - week(min(Order_Date)),
               week2 = (as.numeric(Order_Date) %/% 7) - (as.numeric(min(Order_Date)) %/% 7)) %>%
  arrange(Order_Date)
   Order_Date week week2
1  2015-10-23    0     0
2  2015-10-27    0     0
3  2015-11-01    1     1
4  2015-11-07    2     2
5  2015-11-11    2     2
6  2015-11-14    3     3
7  2015-12-17    8     8
8  2016-01-05   10    10
9  2016-01-20   12    12
10 2016-01-30   14    14
11 2016-02-18   16    17
12 2016-03-13   20    20
13 2016-03-31   22    23
14 2016-04-04   23    23
15 2016-04-15   25    25
16 2016-05-08   28    28
17 2016-05-10   28    28
18 2016-07-27   39    39
19 2016-10-01   49    49
20 2016-10-11   50    50
Escalator answered 14/11, 2016 at 4:29 Comment(2)
I guess we can also use week() from lubridate packageShana
I did use week() from the lubridate package.Escalator
C
10

cut.Date takes an interval specification (see ?cut.Date).

Your data covers a whole year, so unless you want to re-name those weeks, this will count up the actual number of weeks:

library(dplyr)
df2 %>% 
  mutate(week = cut.Date(Order_Date, breaks = "1 week", labels = FALSE)) %>% 
  arrange(Order_Date)

#>    Order_Date week
#> 1  2015-10-23    1
#> 2  2015-10-27    2
#> 3  2015-11-01    2
#> 4  2015-11-07    3
#> 5  2015-11-11    4
#> 6  2015-11-14    4
#> 7  2015-12-17    9
#> 8  2016-01-05   12
#> 9  2016-01-20   14
#> 10 2016-01-30   15
#> 11 2016-02-18   18
#> 12 2016-03-13   21
#> 13 2016-03-31   24
#> 14 2016-04-04   25
#> 15 2016-04-15   26
#> 16 2016-05-08   29
#> 17 2016-05-10   30
#> 18 2016-07-27   41
#> 19 2016-10-01   50
#> 20 2016-10-11   52
Claud answered 14/11, 2016 at 4:31 Comment(1)
Thank you so much. Both solution worked for me perfectly.Squirt
E
6

The code below calculates the current week relative to the minimum week in the data. week2 uses modular arithmetic to make the code more concise, although the week numbers don't always line up exactly with the direct calculation of years and week numbers using lubridate functions.

library(dplyr)
library(lubridate)

df2 %>% mutate(week = (year(Order_Date) - year(min(Order_Date)))*52 + 
                 week(Order_Date) - week(min(Order_Date)),
               week2 = (as.numeric(Order_Date) %/% 7) - (as.numeric(min(Order_Date)) %/% 7)) %>%
  arrange(Order_Date)
   Order_Date week week2
1  2015-10-23    0     0
2  2015-10-27    0     0
3  2015-11-01    1     1
4  2015-11-07    2     2
5  2015-11-11    2     2
6  2015-11-14    3     3
7  2015-12-17    8     8
8  2016-01-05   10    10
9  2016-01-20   12    12
10 2016-01-30   14    14
11 2016-02-18   16    17
12 2016-03-13   20    20
13 2016-03-31   22    23
14 2016-04-04   23    23
15 2016-04-15   25    25
16 2016-05-08   28    28
17 2016-05-10   28    28
18 2016-07-27   39    39
19 2016-10-01   49    49
20 2016-10-11   50    50
Escalator answered 14/11, 2016 at 4:29 Comment(2)
I guess we can also use week() from lubridate packageShana
I did use week() from the lubridate package.Escalator
S
2

Alternatively you could use the ISOweek package to convert the dates to ISOweek format and then use that for filtering your output.

Example code using ISOweek package:

library(ISOweek)
x <- paste0(2000:2017, "-01-01")
x <- as.Date(x)
y <- ISOweek(x)
print(y)
Scoundrel answered 15/6, 2017 at 7:9 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.