guess_formats + R + lubridate

I'm having trouble understanding how to use the guess_formats function in lubridate. I have a vector of dates in some unknown set/order of formats. I'd like to convert them to a Date object (or at least convert as many as possible). The following code is what I've tried:

library(lubridate)
sampleDates <- c("4/6/2004","4/6/2004","4/6/2004","4/7/2004",
        "4/6/2004","4/7/2004","2014-06-28","2014-06-30","2014-07-12",
        "2014-07-29","2014-07-29","2014-08-12")
formats <- guess_formats(sampleDates, c("Ymd", "mdY"))
dates <- as.Date(sampleDates, format=formats)

This returns all NAs.

This is obviously just a short example. In the real case, I wouldn't know where the various formats are scattered about, and I wouldn't be 100% sure that %m/%d/%Y and %Y-%m-%d are the only ones present. Could someone tell me either (A) how guess_formats should be used in this example, or (B) whether there is something more appropriate in lubridate or base R, hopefully without a lot of regex'ing? Thanks!

Edit: I've also tried parse_date_time. What I don't understand is why the following works for this example:

parse_date_time(sampleDates,
            orders = c("Ymd", "mdY"),
            locale = "eng")

But this does not:

parse_date_time(sampleDates,
            orders = c("mdY", "Ydm"),
            locale = "eng")

In my actual set of data, I will not know the order of the formatting, which seems to be important for this function.

Double Edit: Dur, OK, I see I had Ymd in the first parse_date_time example and Ydm in the second...carry on.

Demello answered 26/9, 2014 at 16:37

There is no need to call guess_formats; just use parse_date_time:

 parse_date_time(sampleDates, c("Ymd", "mdY"))

 [1] "2004-04-06 UTC" "2004-04-06 UTC" "2004-04-06 UTC" "2004-04-07 UTC" "2004-04-06 UTC"
 [6] "2004-04-07 UTC" "2014-06-28 UTC" "2014-06-30 UTC" "2014-07-12 UTC" "2014-07-29 UTC"
[11] "2014-07-29 UTC" "2014-08-12 UTC"

Internally it will call guess_formats.
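
Since the question asks for Date objects and for converting "as many as possible", here is a minimal sketch of how that might look. The input vector `x` and the "not a date" entry are made up for illustration; the orders are the ones from the question. It also prints what guess_formats returns, so you can see the formats lubridate is working with.

```r
library(lubridate)

# Hypothetical input: the question's two formats plus one unparseable string
x <- c("4/6/2004", "4/7/2004", "2014-06-28", "2014-08-12", "not a date")

# Inspect the strptime-style formats lubridate guesses for these orders
fmts <- guess_formats(x, orders = c("Ymd", "mdY"))
print(unique(fmts))

# Parse everything in one call; quiet = TRUE silences the parse-failure warning
parsed <- parse_date_time(x, orders = c("Ymd", "mdY"), quiet = TRUE)

# parse_date_time returns POSIXct (UTC by default); convert to Date
dates <- as.Date(parsed)

# Strings that matched none of the orders come back as NA
failed <- x[is.na(dates)]
print(dates)
print(failed)
```

Checking `failed` afterwards is a cheap way to find out whether your real data contains formats you did not list. Keep in mind that a string like "4/6/2004" is valid under both mdY and dmY, so the orders you pass encode your own assumption about which one the data uses.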

Cosper answered 26/9, 2014 at 16:48

A general purpose option that does a good job at matching date formats is the anytime package:

library(anytime)

anydate(sampleDates)
[1] "2004-04-06" "2004-04-06" "2004-04-06" "2004-04-07" "2004-04-06" "2004-04-07" "2014-06-28"
[8] "2014-06-30" "2014-07-12" "2014-07-29" "2014-07-29" "2014-08-12"

Bhili answered 4/2, 2022 at 18:2
