Why is Date being returned as type 'double'?

I'm having some trouble working with the as.Date function in R. I have a vector of dates that I'm reading in from a .csv file. They come in either as a factor of integers or as character (depending on how I read in the file, though this doesn't seem to have anything to do with the issue), formatted as %m/%d/%Y.

I'm going through the file row by row, pulling out the date field and trying to convert it for use elsewhere using the following code:

tmpDtm <- as.Date(as.character(tempDF$myDate), "%m/%d/%Y")

This seems to give me what I want. For example, if I apply it to a starting value of 12/30/2014, I get the value "2014-12-30" returned. However, if I examine this value using typeof(), R tells me that its data type is 'double'. Additionally, if I try to bind it to other values using c() or cbind() and store the result in a data frame, it winds up being stored as 16434, which looks to me like some sort of internal storage value for a date. I'm pretty sure that's what it is, because if I try to convert that value again using as.Date(), it throws an error asking for an origin.

So, two questions: Is this as expected? If so, is there a more appropriate way to convert a date so that I actually end up with a date-typed object?

Thank you

Deckle answered 12/9, 2016 at 20:41 Comment(5)
Please make a reproducible example. What type is the data that you c() or cbind() it to? Are you aware that all entries of a vector must be of the same type?Libb
I am aware of that. My apologies for not being clear, but the data type is double before I even attempt the bind (i.e., the tmpDtm object above is type: double). The binding issue is less of a concern for me - if I can make sure I have a date object before the bind, I can figure out how to get it bound to other data the way I need - I just thought the extra information might help in identifying what is happening with the conversion before I do the bind.Deckle
I don't personally know the subtle differences between the two functions, but try class instead of typeof. The former seems to return Date while the latter returns double.Rihana
The answer below is incomplete. This question should be migrated to Stack Overflow. There are many explanations there. Including, why c() and cbind() coerce the value? What does the integer date represent exactly? What is the difference between typeof and class? Are there other date objects not subject to this coercion?Martica
help(Date) says Dates are represented as the number of days since 1970-01-01, with negative values for earlier dates.Libb

Dates are internally represented as double, as you can see in the following example:

> typeof(as.Date("09/12/16", "%m/%d/%y"))
[1] "double"

It is still marked as class Date, as in

> class(as.Date("09/12/16", "%m/%d/%y"))
[1] "Date"

and because it is a double, you can do computations with it. But because it is of class Date, these computations lead to Dates:

> as.Date("09/12/16", "%m/%d/%y") + 1
[1] "2016-09-13"
> as.Date("09/12/16", "%m/%d/%y") + 31
[1] "2016-10-13"
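
To connect this with the 16434 from the question, here is a minimal sketch (base R only, using the question's example date) that strips the class to expose the stored number and then restores it. The variable names d and n are just for illustration.

```r
# Parse the example date from the question
d <- as.Date("12/30/2014", format = "%m/%d/%Y")

typeof(d)            # storage type: "double"
class(d)             # S3 class: "Date"
inherits(d, "Date")  # TRUE, a safer test than comparing typeof()

# Dropping the class attribute reveals the stored day count
n <- unclass(d)
n                    # 16434 days since 1970-01-01

# Supplying the origin turns the bare number back into a Date
as.Date(n, origin = "1970-01-01")   # "2014-12-30"
```

So the object is a real Date; typeof() only reports the storage underneath the class.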

EDIT I asked about c() and cbind() because they can be associated with some strange behaviour. See the following example, where switching the order of the arguments to c() changes not the type but the class of the result:

> c(as.Date("09/12/16", "%m/%d/%y"), 1)
[1] "2016-09-12" "1970-01-02"
> c(1, as.Date("09/12/16", "%m/%d/%y"))
[1]     1 17056

> class(c(as.Date("09/12/16", "%m/%d/%y"), 1))
[1] "Date"
> class(c(1, as.Date("09/12/16", "%m/%d/%y")))
[1] "numeric"

EDIT 2 - c() and cbind() force objects to be of one type. The first edit shows an anomaly of coercion, but generally, the vector must be of one shared type. cbind() shares this behavior because it coerces to a matrix, which in turn coerces to a single type.
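
As a rough illustration of the cbind() point, the sketch below (base R only; the column names id and when are made up for the example) contrasts cbind(), which produces a numeric matrix, with data.frame(), which keeps each column as its own vector and so keeps the Date class.

```r
d <- as.Date(c("12/30/2014", "09/12/2016"), format = "%m/%d/%Y")

# cbind() on plain vectors builds a matrix, so the Date class is dropped
m <- cbind(id = 1:2, when = d)
class(m[, "when"])    # "numeric"
m[, "when"]           # 16434 17056

# data.frame() stores columns separately, so the class survives
df <- data.frame(id = 1:2, when = d)
class(df$when)        # "Date"

# A bare day count can be restored by giving the origin explicitly
as.Date(m[, "when"], origin = "1970-01-01")
```

If the goal is a data frame with a date column, building it with data.frame() (or assigning the column with df$when <- d) avoids the detour through a matrix.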

For more help on typeof and class see this link

Libb answered 12/9, 2016 at 20:59 Comment(1)
Check out methods("c"). In the S3 system, the class of the first argument determines which method is used. And both c and cbind are S3 generics. If you want to have the same result regardless of order, you'd need to use S4 classes, which can do method dispatch depending on the signatures of all arguments.Villatoro

This is as expected. You used typeof(); you probably should have used class():

R> Sys.Date()
[1] "2016-09-12"
R> typeof(Sys.Date())       # this more or less gives you how it is stored
[1] "double"
R> class(Sys.Date())        # whereas this gives you _behaviour_
[1] "Date"
R> 

Minor advertisement: I have a new package, anytime, currently in incoming at CRAN, which deals with this as it converts "anything" to POSIXct (via anytime()) or Date (via anydate()).

E.g.:

R> anydate("12/30/2014")             # no format needed
[1] "2014-12-30"
R> anydate(as.factor("12/30/2014"))  # converts from factor too
[1] "2014-12-30"
R> 
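
For comparison, a base R sketch of the same conversion without the package, assuming the input is a factor in the question's %m/%d/%Y format (the sample values are invented for illustration):

```r
# A factor column, as read from a .csv with stringsAsFactors = TRUE
f <- factor(c("12/30/2014", "01/05/2015"))

# Convert through character so the labels are parsed,
# not the factor's internal integer level codes
d <- as.Date(as.character(f), format = "%m/%d/%Y")

class(d)      # "Date"
d[2] - d[1]   # Time difference of 6 days
```

The conversion is vectorised, so it can be applied to the whole column at once rather than row by row.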
Mythomania answered 12/9, 2016 at 21:32 Comment(5)
looking forward to anytime !Tedmund
It is on CRAN now :)Mythomania
Had a look. Can't see where it documents to handle origin. E.g. anydate(43930) returns "2090-04-11" instead of "2020-04-11" which it would with origin 1900-01-01.Ow
Sorry, but you are commenting on a comment from over four years ago. I have made changes in anytime since. Did you check its NEWS and/or documentation? If you did find an issue, please report it (ideally with a reproducible example) as a GitHub issue. Thanks!Mythomania
In the first paragraph of the help page, under Description: "... objects represent dates and time as (possibly fractional) seconds since the ‘epoch’ of January 1, 1970." So anydate(0) is Jan 1, 1970, and anydate(2) is Jan 3. Hence anydate(43930) is almost 44k days after that epoch.Mythomania
