sequence of monthly dates making sure it's the same day, or the last day of month in case of invalid
Asked Answered
C

2

7

Given an initial date, I want to generate a sequence of dates with monthly intervals, ensuring every element has the same day as the initial date or the last day of the month in case the same day would yield an invalid date.

Sounds pretty standard, right?

Using difftime is not possible. Here's what the help file of difftime says:

Units such as "months" are not possible as they are not of constant length. To create intervals of months, quarters or years use seq.Date or seq.POSIXt.

But then looking at the help file of seq.POSIXt I find that:

Using "month" first advances the month without changing the day: if this results in an invalid day of the month, it is counted forward into the next month: see the examples.

This is the example in the help file.

seq(ISOdate(2000,1,31), by = "month", length.out = 4)
> seq(ISOdate(2000,1,31), by = "month", length.out = 4)
[1] "2000-01-31 12:00:00 GMT" "2000-03-02 12:00:00 GMT" 
"2000-03-31 12:00:00 GMT" "2000-05-01 12:00:00 GMT"

So, given that the initial date is on day 31, this would yield invalid dates on February, April, etc. So, the sequence end up actually skipping those months because it "counts forward" and end up with March-02, instead of February-29.

If I start on 2000-01-31, I would like the sequence as follows:

  • 2000-01-31
  • 2000-02-29
  • 2000-03-31
  • 2000-04-30
  • ...

And it should properly handle leap-years, so if the initial date is 2015-01-31 the sequence should be:

  • 2015-01-31
  • 2015-02-28
  • 2015-03-31
  • 2015-04-30
  • ...

These are just examples to illustrate the problem and I do not know the initial date in advance, nor can I assume anything about it. The initial date may well be in the middle of the month (2015-01-15) in which case seq works fine. But it can also be, as in the examples, towards the end of the month on dates that using seq alone would be problematic (days 29, 30 and 31). I cannot assume either that the initial date is the last day of the month.

I have looked around trying to find a solution. In some questions here in SO (e.g. here) there is a "trick" to get the last day of a month, by getting the first day of the next month and simply subtract 1. And finding the first day is "easy" because it is just day 1.

So my solution so far is:

# Given an initial date for my sequence
initial_date <- as.Date("2015-01-31")

# Find the first day of the month
library(magrittr) # to use pipes and make the code more readable
firs_day_of_month <- initial_date %>% 
    format("%Y-%m") %>% 
    paste0("-01") %>% 
    as.Date()

# Generate a sequence from initial date, using seq  
# This is the sequence that will have incorrect values in months that would
# have invalid dates
given_dat_seq <- seq(initial_date, by = "month", length.out = 4)

# And then generate an auxiliary sequence for the last day of the month
# I do this generating a sequence that starts the first day of the 
# same month as initial date and it goes one month further 
# (lenght 5 instead of 4) and substract 1 to all the elements
last_day_seq <- seq(firs_day_of_month, by = "month", length.out = 5)-1

# And finally, for each pair of elements, I take the min date of both
pmin(given_dat_seq, last_day_seq[2:5])

It works, but it is, at the same time, kinda dumb, hacky and convoluted. So I do not like it. And most importantly, I cannot believe there is no easier way to do this in R.

Can someone please point me to a simpler solution? (I guess it should have been as simple as seq(initial_date, "month", 4), but apparently it is not). I've googled it and looked here in SO and R mailing lists, but apart from the tricks I mentioned above, I couldn't find a solution.

Collusive answered 7/4, 2016 at 14:23 Comment(2)
Couldn't you just do seq(as.Date("2015-01-31") + 1, length=4, by="month") - 1 ?Glottalized
No, because the initial date is arbitrary (in this particular example I used Jan-31 to illustrate the problem, but I do not know the initial date in advance and it is not always the last day of the month). If I do as you say, and the initial date is "2015-01-15", my sequence will be always on day 16, when it is supposed to be on day 15 every month. Thanks, you made me realize that was not clear in my rather long question. I'll edit it.Collusive
B
7

The simplest solution is %m+% from lubridate, which solves this exact problem. So:

seq_monthly <- function(from,length.out) {
  return(from %m+% months(c(0:(length.out-1))))
}

Output:

> seq_monthly(as.Date("2015-01-31"),length.out=4)
[1] "2015-01-31" "2015-02-28" "2015-03-31" "2015-04-30"
Boyla answered 9/7, 2017 at 19:55 Comment(1)
This works great until you have to create thousands of sequence, it is just very slow.Cowboy
S
3

Similar to the lubridate answer, here is one using RcppBDT (which wraps the Boost Date.Time library from C++)

R> dt <- new(bdtDt, 2010, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2010-02-28"
[1] "2010-04-30"
[1] "2010-07-31"
[1] "2010-11-30"
[1] "2011-04-30"
R> dt <- new(bdtDt, 2000, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2000-02-29"
[1] "2000-04-30"
[1] "2000-07-31"
[1] "2000-11-30"
[1] "2001-04-30"
R> 
Sackcloth answered 10/7, 2017 at 10:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.