How to account for leap years?

I have some doubts about leap years. How can I be sure that, by using a function like this,

add.years= function(x,y){    
if(!isTRUE(all.equal(y,round(y)))) stop("Argument \"y\" must be an integer.\n")
x <- as.POSIXlt(x)
x$year <- x$year+y
as.Date(x)
}

it will take into account leap years, when adding for example 100 years to my observation dataset? How can I control this?

I have a time series dataset with 50 years of observations:

   date    obs
1995-01-01 1.0
1995-01-02 2.0
1995-01-03 2.5
...
2045-12-30 0.2
2045-12-31 0.1

The dataset shifted by 100 years:

   date    obs
2095-01-01 1.0
2095-01-02 2.0
2095-01-03 2.5
...
2145-12-30 0.2
2145-12-31 0.1

After a basic check, I've noticed that the number of rows is the same for both the original dataset and the one shifted by 100 years. I am not sure whether the obs value that belonged to 29th February in a leap year will now be assigned to 1st March in a non-leap year, and so on.
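A quick way to see why identical row counts are suspicious is to count the calendar days in both ranges with base R. This is only a sketch using the date ranges shown in the question; the variable names are made up for illustration.

```r
# Count calendar days in the original and the shifted ranges from the question.
n_orig  <- length(seq(as.Date("1995-01-01"), as.Date("2045-12-31"), by = "day"))
n_shift <- length(seq(as.Date("2095-01-01"), as.Date("2145-12-31"), by = "day"))

n_orig   # 18628 (13 leap days, including 2000-02-29)
n_shift  # 18627 (12 leap days, because 2100 is not a leap year)
```

The two calendars differ by one day, so a shifted dataset with exactly the same number of rows cannot line up with a complete daily calendar.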

I can check leap years with the leap.year function from the chron library. However, I would like to know if there is a simpler way to make sure that rows for past 29th February dates that do not exist 100 years later are deleted, and that new 29th February dates are added with NA values.

Distracted answered 13/12, 2011 at 14:24 Comment(2)
Mixing POSIXlt and Date formats is only going to end in obscure bugs and tears. – Iraq
I confirm! Better spend some time to clean up my code. Thanks! – Distracted

You can check if a year is a leap year with leap_year from lubridate.

library(lubridate)

years <- 1895:2005
years[leap_year(years)]

This package will also handle not generating impossible 29ths of February.

ymd("2000-2-29") + years(1)    # NA
ymd("2000-2-29") %m+% years(1) # "2001-02-28"

The %m+% "add months" operator, as mentioned by @VitoshKa, rolls the date back to the end of the previous month if the actual day doesn't exist.
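For readers who prefer to stay in base R, the same roll-back idea can be sketched without lubridate. This is only an illustration: add_years_rollback is a hypothetical helper name, not a lubridate or base R function, and it relies on as.Date() normalizing an out-of-range POSIXlt day (such as 29 February 2001) to 1 March.

```r
# Hypothetical helper: add n whole years, rolling an impossible
# 29 February back to 28 February (similar in spirit to %m+%).
add_years_rollback <- function(dates, n) {
  dates <- as.Date(dates)
  lt <- as.POSIXlt(dates)
  lt$year <- lt$year + n
  shifted <- as.Date(lt)  # 29 Feb in a non-leap year becomes 1 Mar here

  # The day of month only changes when the target date did not exist.
  rolled <- which(format(shifted, "%d") != format(dates, "%d"))
  shifted[rolled] <- shifted[rolled] - 1
  shifted
}

add_years_rollback("2000-02-29", 1)  # "2001-02-28"
add_years_rollback("2000-02-29", 4)  # "2004-02-29"
```

Whether rolling back is appropriate depends on the data; for a daily series it creates a second value for 28 February, so returning NA may be the safer choice.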

Iraq answered 13/12, 2011 at 17:3 Comment(2)
lubridate doesn't seem able to handle this anymore. I get NA when I try to add one year to a leap day. – Unlisted
The behavior in lubridate was changed quite a while ago. You will get NA on invalid dates. See the documentation of %m+% if you want rolling behavior, and also the docs for period and Period-class. – Moncton

Following the suggestion of DarkDust and Dirk Eddelbuettel, you can easily roll your own leap_year function:

leap_year <- function(year) {
  # The logical expression already yields TRUE/FALSE, so ifelse() is not needed
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

and apply it to vector data:

years = 2000:2050
years[leap_year(years)]

[1] 2000 2004 2008 2012 2016 2020 2024 2028 2032 2036 2040 2044 2048
Investigate answered 28/8, 2018 at 22:51 Comment(0)

A year is a leap year if it is divisible by 4, with two refinements:

  • If it is also divisible by 100, it is not a leap year.
  • Unless it is also divisible by 400, in which case it is a leap year after all.

That is why 2000 was a leap year (although it's divisible by 100, it's also divisible by 400).

But generally, if you have a library that can take care of date/time calculations, use it. These calculations are complicated and easy to get wrong, especially when ancient dates (calendar reforms) and time zones are involved.
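As a small illustration of leaning on the library, the hand-written rule can be cross-checked against base R's own Date arithmetic. This is a sketch only; is_leap_rule and is_leap_calendar are hypothetical names introduced here.

```r
# The divisibility rule, written by hand.
is_leap_rule <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

# Ask the calendar instead: in a leap year, 1 March is two days after 28 February.
is_leap_calendar <- function(year) {
  feb28 <- as.Date(sprintf("%04d-02-28", year))
  mar01 <- as.Date(sprintf("%04d-03-01", year))
  as.numeric(mar01 - feb28) == 2
}

years <- 1900:2200
all(is_leap_rule(years) == is_leap_calendar(years))  # TRUE
```

Both approaches agree over this range, including the century years 1900, 2000, 2100 and 2200.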

Fortunate answered 13/12, 2011 at 14:31 Comment(3)
I still didn't find any function that will allow me to make this type of modification to a dataset; I guess I have to create one myself. – Distracted
No you don't. There is existing functionality in base R, as well as in CRAN packages. – Timepiece
Toby Marthews-3 provides a neat ifelse statement for handling leap years here: r.789695.n4.nabble.com/… – Kirakiran

Your suspicions are indeed correct:

x <- as.POSIXlt("2000-02-29")
y <- x
y$year <- y$year+100
y
#[1] "2100-03-01"

The strange thing is that other parts of y remain unchanged so you can't use these for comparison:

y$mday
#[1] 29
y$mon
#[1] 1

But you can use strftime:

strftime(x,"%d")
#[1] "29"
strftime(y,"%d")
#[1] "01"

So how about:

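add.years <- function(x, y) {
   if (!isTRUE(all.equal(y, round(y)))) stop("Argument \"y\" must be an integer.\n")
   x.out <- as.POSIXlt(x)
   x.out$year <- x.out$year + y
   out <- as.Date(x.out)
   # ifelse() would strip the Date class and return plain numbers,
   # so set the rolled-over dates to NA by assignment instead
   out[strftime(x, "%d") != strftime(x.out, "%d")] <- NA
   out
   }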

You can then subset your data using [ and is.na to get rid of the otherwise duplicate 1st March dates. However, as these dates seem to be consecutive, you might want to consider a solution that uses seq.Date and avoids dropping data.
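One possible shape for such a seq-based solution is sketched below. It assumes a data frame with columns date and obs as in the question; shift_series is a hypothetical helper name. The sketch drops 29th February rows that have no counterpart in the target year and then merges onto a complete daily calendar, so new 29th February dates appear with NA.

```r
# Hypothetical helper: shift a daily series by n years onto a complete calendar.
shift_series <- function(df, n) {
  old <- as.Date(df$date)
  new_year <- as.integer(format(old, "%Y")) + as.integer(n)

  # With an explicit format, an impossible date such as 2001-02-29 parses to NA.
  target <- as.Date(sprintf("%04d-%s", new_year, format(old, "%m-%d")),
                    format = "%Y-%m-%d")

  shifted <- data.frame(date = target, obs = df$obs)
  shifted <- shifted[!is.na(shifted$date), ]  # drop vanished 29 Feb rows

  # Complete daily calendar for the target period; unmatched days get NA.
  calendar <- data.frame(date = seq(min(shifted$date), max(shifted$date), by = "day"))
  merge(calendar, shifted, by = "date", all.x = TRUE)
}

# 2003 is not a leap year, 2004 is: a new 29 Feb row appears with obs = NA.
shift_series(data.frame(date = c("2003-02-28", "2003-03-01"), obs = c(1, 2)), 1)
```

Dropping the old leap-day observation loses one data point per vanished 29th February; whether that is acceptable depends on how the series is used afterwards.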

Tenishatenn answered 13/12, 2011 at 16:11 Comment(1)
2100 is not a leap year, so the calculation in your first example seems correct to me. – Timepiece

I tried these three approaches on webR and got the following results:

(as.Date("2020-02-29") + lubridate::years(1)) # NA
stats::update(as.Date("2020-02-29"), year = lubridate::year(as.Date("2020-02-29")) + 1) # "2021-03-01"
lubridate::`%m+%`(as.Date("2020-02-29"), lubridate::years(1)) # "2021-02-28"

Choose whichever behavior suits your data.

For details on the %m+% and %m-% operators, see the lubridate documentation or ops-m+.r in the source code.

Doughy answered 24/5 at 3:19 Comment(0)
