Parsing a string with date and time into a particular point in time (Java calls it an "Instant") is quite complicated. Java has been tackling this in several iterations. The latest one, java.time and java.time.chrono, covers almost all needs (except time dilation :) ).
However, that complexity brings a lot of confusion.
The key to understanding date parsing is:
Why does Java have so many ways to parse a date?
- There are several systems to measure time. For instance, the historical Japanese calendars were derived from the time ranges of the reign of the respective emperor or dynasty. Then there is, e.g., the Unix timestamp. Fortunately, the whole (business) world managed to use the same one.
- Historically, the systems were switched from/to, for various reasons. E.g., from the Julian calendar to the Gregorian calendar in 1582; so, the 'western' dates before that need to be treated differently.
- And, of course, the change did not happen at once. Because the calendar came from the headquarters of some religion and other parts of Europe believed in other deities, parts of Germany, for instance, did not switch until the year 1700.
...and why are LocalDateTime, ZonedDateTime et al. so complicated?
There are time zones.
A time zone is basically a "stripe"*[3] of the Earth's surface whose authorities follow the same rules for which time offset applies when. This includes summer time rules.
The time zones change over time for various areas, mostly based on who conquers whom. And one time zone's rules change over time as well.
There are time offsets. These are not the same as time zones, because a time zone may be, e.g., "Prague", but that has a summer time offset and a winter time offset.
If you get a timestamp with a time zone, the offset may vary depending on what part of the year it is in. During the leap hour (the hour that repeats when the clocks are set back), the timestamp may mean two different times, so without additional information, it can't be reliably converted.
Note: By timestamp I mean "a string that contains a date and/or time, optionally with a time zone and/or time offset."
Several time zones may share the same time offset for certain periods. For instance, the GMT/UTC time zone has the same offset as the "London" time zone when the summer time offset is not in effect.
To make it a bit more complicated (but that's not too important for your use case):
Scientists observe the Earth's rotation, which changes over time; based on that, they occasionally add a leap second at the end of a month, usually December or June. (So 2040-12-31 23:59:60 may be a valid date-time.) This needs regular updates of the metadata that systems use to get the date conversions right. E.g., on Linux, you get regular updates to the Java packages including these new data.
The updates do not always keep the previous behavior for both historical and future timestamps. So it may happen that parsing two timestamps around some time zone's change and comparing them gives different results when running on different versions of the software. That also applies to comparing between the affected time zone and another time zone.
Should this cause a bug in your software, consider using a timestamp that does not have such complicated rules, like the Unix timestamp.
Because of these updates, we can't convert future dates exactly with certainty. So, for instance, the current parsing of 8524-02-17 12:00:00 may be off by a couple of seconds from a future parsing.
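The difference between a zone and an offset, and the ambiguity of the leap hour, can be seen in a small sketch. It assumes that the JDK's bundled time zone data knows the "Europe/Prague" zone ID (standing in for "Prague" above); the class and method names are made up for illustration.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.List;

class ZoneVsOffset {
    // One zone, several offsets: ask the zone's rules which offset applies at a local time.
    static ZoneOffset offsetAt(ZoneId zone, LocalDateTime local) {
        return zone.getRules().getOffset(local);
    }

    // When the clocks are set back, a local time occurs twice, so two offsets are valid.
    static List<ZoneOffset> validOffsets(ZoneId zone, LocalDateTime local) {
        return zone.getRules().getValidOffsets(local);
    }

    public static void main(String[] args) {
        ZoneId prague = ZoneId.of("Europe/Prague");
        System.out.println(offsetAt(prague, LocalDateTime.of(2021, 1, 15, 12, 0)));  // winter time
        System.out.println(offsetAt(prague, LocalDateTime.of(2021, 7, 15, 12, 0)));  // summer time
        // 2021-10-31 02:30 happened twice in Prague; both candidate offsets are listed.
        System.out.println(validOffsets(prague, LocalDateTime.of(2021, 10, 31, 2, 30)));
    }
}
```

A timestamp that carries an explicit offset avoids this ambiguity, which is one reason to prefer OffsetDateTime when the offset is available.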
JDK's APIs for this evolved with the contemporary needs
- The early Java releases had just java.util.Date, which had a somewhat naive approach, assuming that there's just the year, month, day, and time. This quickly did not suffice.
- Also, the needs of the databases were different, so quite early, java.sql.Date was introduced, with its own limitations.
- Because neither covered different calendars and time zones well, the Calendar API was introduced.
- This still did not cover the complexity of the time zones. And the mix of the above APIs was really a pain to work with. So as Java developers started working on global web applications, libraries that targeted most use cases, like JodaTime, quickly became popular. JodaTime was the de facto standard for about a decade.
- But the JDK did not integrate with JodaTime, so working with it was a bit cumbersome. So, after a very long discussion on how to approach the matter, JSR-310 was created, mainly based on JodaTime.
How to deal with it in Java's java.time
Determine what type to parse a timestamp to
When you are consuming a timestamp string, you need to know what information it contains. This is the crucial point. If you don't get this right, you end up with cryptic exceptions like "Can't create Instant", "Zone offset missing", "unknown zone id", etc.
Does it contain the date and the time?
Does it have a time offset?
A time offset is the +hh:mm part. Sometimes, +00:00 may be substituted with Z as 'Zulu time', UTC as Coordinated Universal Time, or GMT as Greenwich Mean Time. These also set the time zone.
For these timestamps, you use OffsetDateTime.
Does it have a time zone?
For these timestamps, you use ZonedDateTime.
A zone is specified either by
- name ("Prague", "Pacific Standard Time", "PST"), or
- "zone ID" ("America/Los_Angeles", "Europe/London"), represented by java.time.ZoneId.
The list of time zones is compiled in the "TZ database", which is maintained under IANA and backed by ICANN.
According to ZoneId's javadoc, the zone IDs can also somehow be specified as Z and offset. I'm not sure how this maps to real zones.
If the timestamp, which only has a TZ, falls into a leap hour of a time offset change, then it is ambiguous, and the interpretation is subject to ResolverStyle, see below.
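As a minimal sketch of the two questions above: a string with only an offset goes to OffsetDateTime, a string that also names a zone goes to ZonedDateTime, and asking for information the string does not contain fails. The sample strings and the class name are illustrative assumptions; the parse(String) methods use the JDK's default ISO formats.

```java
import java.time.OffsetDateTime;
import java.time.ZonedDateTime;
import java.time.format.DateTimeParseException;

class PickTheType {
    // Reports whether the text carries enough information for an OffsetDateTime.
    static boolean parsesAsOffsetDateTime(String text) {
        try {
            OffsetDateTime.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false; // e.g. the offset is missing
        }
    }

    public static void main(String[] args) {
        // Date, time and offset: OffsetDateTime.
        OffsetDateTime odt = OffsetDateTime.parse("2011-12-03T10:15:30+01:00");
        System.out.println(odt.getOffset());

        // Date, time, offset and zone ID: ZonedDateTime.
        ZonedDateTime zdt = ZonedDateTime.parse("2011-12-03T10:15:30+01:00[Europe/Prague]");
        System.out.println(zdt.getZone());

        // No offset in the string, so OffsetDateTime cannot be built from it.
        System.out.println(parsesAsOffsetDateTime("2011-12-03T10:15:30"));
    }
}
```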
If it has neither, then the missing context is assumed or neglected, and the consumer has to decide. So it needs to be parsed as LocalDateTime and converted to OffsetDateTime by adding the missing info:
- You can assume that it is a UTC time. Add the UTC offset of 0 hours.
- You can assume that it is a time of the place where the conversion is happening. Convert it by adding the system's time zone.
- You can neglect it and just use the value as is. That is useful e.g. to compare or subtract two times (see Duration), or when you don't know and it doesn't really matter (e.g., local bus schedule).
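The three options can be sketched roughly as follows. The helper names are hypothetical, and the zone is passed in as a parameter so the example does not depend on the machine it runs on; ZoneId.systemDefault() would supply the system's zone.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

class AddMissingContext {
    // Option 1: assume the local value is a UTC time.
    static OffsetDateTime assumeUtc(LocalDateTime local) {
        return local.atOffset(ZoneOffset.UTC);
    }

    // Option 2: assume the local value belongs to a given zone (e.g. the system's).
    static OffsetDateTime assumeZone(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone).toOffsetDateTime();
    }

    // Option 3: ignore the context and just compare two local values.
    static Duration elapsed(LocalDateTime from, LocalDateTime to) {
        return Duration.between(from, to);
    }

    public static void main(String[] args) {
        LocalDateTime local = LocalDateTime.parse("2011-12-03T10:15:30");
        System.out.println(assumeUtc(local));
        System.out.println(assumeZone(local, ZoneId.systemDefault()));
        System.out.println(elapsed(local, LocalDateTime.parse("2011-12-03T12:00:00")));
    }
}
```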
Partial time information
- Based on what the timestamp contains, you can take LocalDate, LocalTime, OffsetTime, MonthDay, Year, or YearMonth out of it.
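For illustration, here is one way to take the partial types out of a fully specified value. The sample timestamp and the class name are assumptions; the accessor and from(...) methods are standard java.time calls.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.MonthDay;
import java.time.OffsetDateTime;
import java.time.OffsetTime;
import java.time.Year;
import java.time.YearMonth;

class PartialInfo {
    public static void main(String[] args) {
        OffsetDateTime full = OffsetDateTime.parse("2011-12-03T10:15:30+01:00");

        LocalDate date = full.toLocalDate();        // date only
        LocalTime time = full.toLocalTime();        // time only, no offset
        OffsetTime offsetTime = full.toOffsetTime(); // time with its offset
        MonthDay monthDay = MonthDay.from(full);    // e.g. for birthdays
        Year year = Year.from(full);
        YearMonth yearMonth = YearMonth.from(full); // e.g. for card expiry

        System.out.println(date + " " + time + " " + offsetTime);
        System.out.println(monthDay + " " + year + " " + yearMonth);
    }
}
```

Each of these types also has its own parse(String) method for strings that contain only that part, such as LocalDate.parse("2011-12-03").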
If you have the full information, you can get a java.time.Instant. This is also internally used to convert between OffsetDateTime and ZonedDateTime.
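A short sketch of going through Instant, assuming a sample timestamp and the "America/Los_Angeles" zone ID mentioned earlier. The explicit route via Instant and the atZoneSameInstant shortcut should denote the same moment; the class and method names are illustrative.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class ViaInstant {
    // Explicit route: full timestamp -> Instant -> the same moment seen in another zone.
    static ZonedDateTime viaInstant(OffsetDateTime odt, ZoneId zone) {
        Instant instant = odt.toInstant();
        return ZonedDateTime.ofInstant(instant, zone);
    }

    public static void main(String[] args) {
        OffsetDateTime odt = OffsetDateTime.parse("2011-12-03T10:15:30+01:00");
        ZoneId la = ZoneId.of("America/Los_Angeles");

        System.out.println(odt.toInstant());           // the point on the time line
        System.out.println(viaInstant(odt, la));       // explicit conversion
        System.out.println(odt.atZoneSameInstant(la)); // shortcut for the same conversion
    }
}
```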
Figure out how to parse it
There is extensive documentation on DateTimeFormatter, which can both parse a timestamp string and format to a string.
The pre-created DateTimeFormatters should cover more or less all standard timestamp formats. For instance, ISO_INSTANT can parse 2011-12-03T10:15:30.123457Z.
If you have some special format, then you can create your own DateTimeFormatter (which is also a parser).
private static final DateTimeFormatter TIMESTAMP_PARSER = new DateTimeFormatterBuilder()
    .parseCaseInsensitive()
    // Six 'S' letters match six fraction digits, as in 2011-12-03T10:15:30.123457Z;
    // a single 'S' would accept exactly one fraction digit.
    .append(DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSSSSX"))
    .toFormatter();
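If the number of fraction digits varies, a pattern string becomes awkward. One possible alternative, sketched here with a hypothetical FLEXIBLE constant, is to assemble the formatter from builder calls so that the fraction is optional and may have 1 to 9 digits.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

class FlexibleTimestampParser {
    static final DateTimeFormatter FLEXIBLE = new DateTimeFormatterBuilder()
        .parseCaseInsensitive()
        .appendPattern("uuuu-MM-dd'T'HH:mm:ss")
        .optionalStart()                                         // the fraction may be absent
        .appendFraction(ChronoField.NANO_OF_SECOND, 1, 9, true)  // ".1" up to ".123456789"
        .optionalEnd()
        .appendOffset("+HH:MM", "Z")                             // "+01:00", or "Z" for zero
        .toFormatter();

    public static void main(String[] args) {
        System.out.println(OffsetDateTime.parse("2011-12-03T10:15:30.123457Z", FLEXIBLE));
        System.out.println(OffsetDateTime.parse("2011-12-03T10:15:30+01:00", FLEXIBLE));
    }
}
```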
I recommend looking at the source code of DateTimeFormatter to get inspired on how to build one using DateTimeFormatterBuilder. While you're there, also have a look at ResolverStyle, which controls whether the parser is LENIENT, SMART or STRICT for the formats and ambiguous information.
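To get a feel for the three resolver styles, here is a small sketch that feeds the invalid date 2021-02-30 to the same pattern under each style. The class and method names are made up for this example; the pattern uses "uuuu" rather than "yyyy" because STRICT mode would otherwise also require an era.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

class ResolverStyleDemo {
    static final DateTimeFormatter BASE = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    static LocalDate parse(String text, ResolverStyle style) {
        return LocalDate.parse(text, BASE.withResolverStyle(style));
    }

    public static void main(String[] args) {
        String text = "2021-02-30"; // February has no 30th day

        System.out.println(parse(text, ResolverStyle.SMART));   // clamps to the last valid day
        System.out.println(parse(text, ResolverStyle.LENIENT)); // rolls over into March
        try {
            parse(text, ResolverStyle.STRICT);
        } catch (DateTimeParseException e) {
            System.out.println("STRICT rejects it: " + e.getMessage());
        }
    }
}
```

SMART is the default for formatters created with ofPattern, so silently adjusted dates can be a surprise; STRICT is worth considering when the input must be validated.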
TemporalAccessor
Now, the frequent mistake is to go into the complexity of TemporalAccessor. This comes from how developers were used to working with SimpleDateFormat.parse(String). Right, DateTimeFormatter.parse("...") gives you a TemporalAccessor.
// No need for this!
TemporalAccessor ta = TIMESTAMP_PARSER.parse("2011-... etc");
But, equipped with the knowledge from the previous section, you can conveniently parse into the type you need:
OffsetDateTime myTimestamp = OffsetDateTime.parse("2011-12-03T10:15:30.123457Z", TIMESTAMP_PARSER);
You do not actually need the DateTimeFormatter either. The types you want to parse into have parse(String) methods.
OffsetDateTime myTimestamp = OffsetDateTime.parse("2011-12-03T10:15:30.123457Z");
Regarding TemporalAccessor, you can use it if you have a vague idea of what information there is in the string, and want to decide at runtime.
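One way to decide at runtime, sketched under the assumption that the input is ISO-like with an optional time and an optional offset, is DateTimeFormatter.parseBest, which returns the first type in the given list that the parsed fields can satisfy. The formatter constant and the method name are hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

class DecideAtRuntime {
    // Square brackets mark optional sections: the time, and within it the offset.
    static final DateTimeFormatter OPTIONAL_PARTS =
        DateTimeFormatter.ofPattern("uuuu-MM-dd['T'HH:mm:ss[XXX]]");

    // Tries the richest type first and falls back to the poorer ones.
    static TemporalAccessor parseRichest(String text) {
        return OPTIONAL_PARTS.parseBest(text,
            OffsetDateTime::from, LocalDateTime::from, LocalDate::from);
    }

    public static void main(String[] args) {
        for (String text : new String[] {
                "2011-12-03T10:15:30+01:00", "2011-12-03T10:15:30", "2011-12-03"}) {
            TemporalAccessor parsed = parseRichest(text);
            System.out.println(parsed.getClass().getSimpleName() + ": " + parsed);
        }
    }
}
```

The caller can then branch with instanceof on the returned value.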
I hope I shed some light of understanding onto your soul :)
Note: There's a backport of java.time to Java 6 and 7: ThreeTen-Backport. For Android, there is ThreeTenABP.
[3] Not only are they not stripes, but there are also some weird extremes. For instance, some neighboring Pacific islands have +14:00 and -11:00 time zones. That means that while on one island it is 1st May 3 PM, on another island not so far away, it is still 30 April 2 PM (if I counted correctly :) )
ZonedDateTime rather than a LocalDateTime. The name is counter-intuitive; the Local means any locality in general rather than a specific time zone. As such, a LocalDateTime object is not tied to the time line. To have meaning, to get a specific moment on the time line, you must apply a time zone. – Virgenvirgie
LocalDateTime vs. ZonedDateTime vs. OffsetDateTime vs. Instant vs. LocalDate vs. LocalTime, how to keep calm about why it's so complicated and how to do it right at the first shot. – Glycine
LocalDateTime would probably have been named ZonelessOffsetlessDateTime. – Glycine