I believe that R incorrectly formats POSIXct types with fractional seconds. I submitted this via R-bugs as an enhancement request and got brushed off with "we think the current behavior is correct -- bug deleted." While I am very appreciative of the work they have done and continue to do, I wanted to get other people's take on this particular issue, and perhaps advice on how to make the point more effectively.
Here is an example:
> tt <- as.POSIXct('2011-10-11 07:49:36.3')
> strftime(tt,'%Y-%m-%d %H:%M:%OS1')
[1] "2011-10-11 07:49:36.2"
That is, tt is created as a POSIXct time with fractional part .3 seconds. When it is printed with one decimal digit, the value shown is .2. I work a lot with timestamps of millisecond precision and it causes me a lot of headaches that times are often printed one notch lower than the actual value.
Here is what is happening: POSIXct is a floating-point number of seconds since the epoch. All integer values are handled precisely, but in base-2 floating point, the closest value to .3 is very slightly smaller than .3. The stated behavior of strftime() for format %OSn is to round down to the requested number of decimal digits, so the displayed result is .2. For other fractional parts the floating-point value is slightly above the value entered and the display gives the expected result:
> tt <- as.POSIXct('2011-10-11 07:49:36.4')
> strftime(tt,'%Y-%m-%d %H:%M:%OS1')
[1] "2011-10-11 07:49:36.4"
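To make the round-off visible, here is a minimal sketch that inspects the stored number directly. It assumes the string is parsed with tz = 'UTC' (an assumption added here so the example does not depend on the local time zone); the variable names x and frac are only for illustration.

```r
# Parse the timestamp; tz = 'UTC' is an assumption made for reproducibility.
tt <- as.POSIXct('2011-10-11 07:49:36.3', tz = 'UTC')

# POSIXct is just a double: seconds since the epoch.
x <- as.numeric(tt)

# Fractional part of the stored value (this subtraction is exact).
frac <- x - floor(x)

# Show more digits than the default print method does.
sprintf('%.10f', frac)

# The stored fraction sits just below .3, which is why truncating
# to one decimal digit yields .2.
frac < 0.3
```

At this magnitude (about 1.3e9 seconds) adjacent doubles are roughly 2.4e-7 apart, so the stored fraction can differ from the typed one by up to about 1.2e-7 in either direction.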
The developers' argument is that for time types we should always round down to the requested precision. For example, if the time is 11:59:59.8 then printing it with format %H:%M should give "11:59" not "12:00", and %H:%M:%S should give "11:59:59" not "12:00:00". I agree with this for integer numbers of seconds and for format flag %S, but I think the behavior should be different for format flags that are designed for fractional parts of seconds. I would like to see %OSn use round-to-nearest behavior even for n = 0 while %S uses round-down, so that printing 11:59:59.8 with format %H:%M:%OS0 would give "12:00:00". This would not affect anything for integer numbers of seconds because those are always represented precisely, but it would more naturally handle round-off errors for fractional seconds.
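As a rough sketch of the round-to-nearest behavior I have in mind, the helper below rounds the underlying number first and then builds the string by hand. The name format_nearest is hypothetical (it is not part of base R), the output format is hard-coded, and it is only meant for a small number of digits; it deliberately avoids %OSn so that it does not depend on how strftime() truncates.

```r
# Hypothetical helper (not part of base R): format a POSIXct value with
# round-to-nearest fractional seconds instead of truncation.
format_nearest <- function(x, digits = 1) {
  # Round the seconds-since-epoch value to the requested precision.
  secs  <- round(as.numeric(x), digits)
  whole <- floor(secs)
  frac  <- secs - whole

  # Whole seconds are represented exactly, so %S is safe here.
  # .POSIXct() rebuilds a POSIXct with the original time zone attribute.
  base <- format(.POSIXct(whole, tz = attr(x, "tzone")), "%Y-%m-%d %H:%M:%S")
  if (digits == 0) return(base)

  # sprintf() rounds to nearest; drop the leading "0" of e.g. "0.3".
  paste0(base, sub("^0", "", sprintf(paste0("%.", digits, "f"), frac)))
}

# tz = "UTC" is an assumption made so the example is reproducible.
tt <- as.POSIXct("2011-10-11 07:49:36.3", tz = "UTC")
format_nearest(tt, 1)   # "2011-10-11 07:49:36.3"

t2 <- as.POSIXct("2011-10-11 11:59:59.8", tz = "UTC")
format_nearest(t2, 0)   # "2011-10-11 12:00:00"
```

Rounding before splitting into whole and fractional seconds is what lets 11:59:59.8 carry over into 12:00:00 when digits = 0.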
This is how printing of fractional parts is handled in C, for example: printf rounds to the nearest value at the requested precision, whereas integer casting truncates:
double x = 9.97;
printf("%d\n",(int) x); // 9
printf("%.0f\n",x); // 10
printf("%.1f\n",x); // 10.0
printf("%.2f\n",x); // 9.97
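The same contrast can be reproduced in R itself, which is what makes the %OSn behavior feel inconsistent to me. This is just a small illustration using base R's as.integer() and sprintf(); the variable names are only for the example.

```r
x <- 9.97

# Integer conversion truncates, like the C cast.
as_int <- as.integer(x)      # 9

# sprintf() rounds to nearest at the requested precision, like C's printf.
s0 <- sprintf("%.0f", x)     # "10"
s1 <- sprintf("%.1f", x)     # "10.0"
s2 <- sprintf("%.2f", x)     # "9.97"
```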
I did a quick survey of how fractional seconds are handled in other languages and environments, and there really doesn't seem to be a consensus. Most constructs are designed for integer numbers of seconds and the fractional parts are an afterthought. It seems to me that in this case the R developers made a choice that is not completely unreasonable but is in fact not the best one, and is not consistent with the conventions elsewhere for displaying floating-point numbers.
What are people's thoughts? Is the R behavior correct? Is it the way you yourself would design it?
Rgames: tt <- as.POSIXct('2011-10-11 07:49:36.32842')
leads to
Rgames: strftime(tt,'%Y-%m-%d %H:%M:%OS4')
[1] "2011-10-11 07:49:36.3284"
Rgames: strftime(tt,'%Y-%m-%d %H:%M:%OS1')
[1] "2011-10-11 07:49:36.3"
I strongly recommend you simply carry a couple extra places and subsequently round off whatever way works best for you. – Caphaitien