Java 8 Date and Time: parse ISO 8601 string without colon in offset [duplicate]
B

4

42

We try to parse the following ISO 8601 DateTime String with timezone offset:

final String input = "2022-03-17T23:00:00.000+0000";

OffsetDateTime.parse(input);
LocalDateTime.parse(input, DateTimeFormatter.ISO_OFFSET_DATE_TIME);

Both approaches fail because the colon is missing from the timezone offset (which makes sense, as OffsetDateTime.parse also uses DateTimeFormatter.ISO_OFFSET_DATE_TIME):

java.time.format.DateTimeParseException: Text '2022-03-17T23:00:00.000+0000' could not be parsed at index 23

But according to Wikipedia there are 4 valid formats for a timezone offset:

<time>Z 
<time>±hh:mm 
<time>±hhmm 
<time>±hh

Other frameworks/languages can parse this string without any issues, e.g. the JavaScript Date() or Jackson's ISO8601Utils (they discuss this issue here).

We could write our own DateTimeFormatter with a complex regex, but in my opinion the java.time library should be able to parse this valid ISO 8601 string by default.

For now we use Jackson's ISO8601DateFormat, but we would prefer to work with the official java.time library. What would be your approach to tackle this issue?

Backbone answered 29/9, 2017 at 10:57 Comment(7)
It's worth noting that ISO_OFFSET_DATE_TIME requires the time colons and the date separators too, which aren't required by ISO-8601. Basically, it's pretty strict, unfortunately.T
If the problem is the missing colon in the timezone offset, then call input = input.replaceFirst("\\+\\d\\d", "$0:") before parsing the date.Stoat
@JonSkeet - ISO 8601 allows offsets without a colon in the basic format, and offsets with a colon in the extended format. However, it doesn't allow one to mix and match. Here we see the date and time portions in extended format (having hyphens and colons) but the offset in basic format, which is not strictly compliant.Tolbooth
@MattJohnson: True, but ISO_OFFSET_DATE_TIME doesn't accept "just basic" either - the example of "19850412T101530+04" in ISO-8601 fails to parse. There's a BASIC_ISO_DATE formatter, but no BASIC_ISO_OFFSET_DATE_TIME or a "just give me either valid format" formatter.T
This problem is a known bug that bites when the optional colon is missing from between the hours and minutes in the offset-from-UTC. For a workaround, specify the formatting pattern explicitly as shown in my Answer on the original of this duplicate Question. DateTimeFormatter.ofPattern( "uuuu-MM-dd'T'HH:mm:ss.SSSX" )Batik
Seems that the bug is still not fixed. It is now possible to parse a String of the form 2018-08-26T15:00:00+01, but not 2018-08-26T15:00:00+0100. Tested with OpenJDK 11.Abnormality
As 2018-08-26T15:00:00+0100 is not valid ISO-8601, this would not be a bug. You probably mean 2018-08-26T15:00:00+01:00 (extended) or 20180826T150000+0100 (basic)Villarreal
D
69

If you want to parse all valid formats of offsets (Z, ±hh:mm, ±hhmm and ±hh), one alternative is to use a java.time.format.DateTimeFormatterBuilder with optional patterns (unfortunately, it seems that there's no single pattern letter to match them all):

DateTimeFormatter formatter = new DateTimeFormatterBuilder()
    // date/time
    .append(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
    // offset (hh:mm - "+00:00" when it's zero)
    .optionalStart().appendOffset("+HH:MM", "+00:00").optionalEnd()
    // offset (hhmm - "+0000" when it's zero)
    .optionalStart().appendOffset("+HHMM", "+0000").optionalEnd()
    // offset (hh - "Z" when it's zero)
    .optionalStart().appendOffset("+HH", "Z").optionalEnd()
    // create formatter
    .toFormatter();
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000+0000", formatter));
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000+00", formatter));
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000+00:00", formatter));
System.out.println(OffsetDateTime.parse("2022-03-17T23:00:00.000Z", formatter));

All four cases above parse to 2022-03-17T23:00Z.


You can also define a single string pattern if you want, using [] to delimit the optional sections:

// formatter with all possible offset patterns
DateTimeFormatter formatter = DateTimeFormatter
    .ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS[xxx][xx][X]");

This formatter also works for all cases, just like the previous formatter above. Check the javadoc to get more details about each pattern.


Notes:

  • A formatter with optional sections like the above is good for parsing, but not for formatting. When formatting, it'll print all the optional sections, which means it'll print the offset many times. So, to format the date, just use another formatter.
  • The second formatter accepts exactly 3 digits after the decimal point (because of .SSS). On the other hand, ISO_LOCAL_DATE_TIME is more flexible: the seconds and nanoseconds are optional, and it also accepts from 0 to 9 digits after the decimal point. Choose the one that works best for your input data.
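The first note above can be sketched as follows: parse with the optional-section pattern, then format with a separate, strict formatter. This is only a minimal sketch; the class name ParseThenFormat and the helper normalize are hypothetical names chosen for illustration, and Locale.ROOT is an assumption added to keep the result independent of the default locale.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class ParseThenFormat {
    // Parsing formatter: the optional sections accept +hh:mm, +hhmm, +hh and Z
    static final DateTimeFormatter PARSER = DateTimeFormatter.ofPattern(
        "yyyy-MM-dd'T'HH:mm:ss.SSS[xxx][xx][X]", Locale.ROOT);

    // Separate formatter for output, so the offset is printed only once
    static final DateTimeFormatter PRINTER = DateTimeFormatter.ISO_OFFSET_DATE_TIME;

    // Hypothetical helper: read any accepted offset style, write the extended form
    static String normalize(String input) {
        return OffsetDateTime.parse(input, PARSER).format(PRINTER);
    }

    public static void main(String[] args) {
        OffsetDateTime odt = OffsetDateTime.parse("2022-03-17T23:00:00.000+0100", PARSER);
        // Formatting with PRINTER gives a single offset
        System.out.println(odt.format(PRINTER));
        // Formatting with PARSER repeats the offset once per optional section
        System.out.println(odt.format(PARSER));
    }
}
```

Comparing the two printed lines shows why the parsing formatter should not be reused for output.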
Damaging answered 29/9, 2017 at 12:2 Comment(2)
Thanks for your input, it's currently maybe the best solution for my question. Though I hope the java.time library will in the future provide a built-in DateTimeFormatter for this valid ISO format, instead of my having to create one myself.Backbone
Does it support the UTC−06:00 time format?Ganymede
T
9

You don't need to write a complex regex - you can build a DateTimeFormatter that will work with that format easily:

DateTimeFormatter formatter =
    DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSX", Locale.ROOT);

OffsetDateTime odt = OffsetDateTime.parse(input, formatter);

That will also accept "Z" instead of "0000". It will not accept "+00:00" (with the colon) or similar. That's surprising given the documentation, but if your value always has the UTC offset without the colon, it should be okay.
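To check which inputs the single-X pattern accepts, a small probe like the one below can help. It is a sketch only; the class name SingleXPattern and the helper parses are hypothetical, introduced here for illustration.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

class SingleXPattern {
    static final DateTimeFormatter FORMATTER =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSX", Locale.ROOT);

    // Hypothetical helper: true if the input parses completely with FORMATTER
    static boolean parses(String input) {
        try {
            OffsetDateTime.parse(input, FORMATTER);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {
            "2022-03-17T23:00:00.000+0000",  // offset without colon
            "2022-03-17T23:00:00.000Z",      // Z for a zero offset
            "2022-03-17T23:00:00.000+00:00"  // offset with colon
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + parses(input));
        }
    }
}
```

If the colon form has to be accepted as well, a pattern with several optional offset sections is one option.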

T answered 29/9, 2017 at 11:8 Comment(0)
K
-2

I wouldn't call it a solution but a workaround. SimpleDateFormat's Z template supports the timezone-syntax you showed, so you can do something like this:

final String input = "2022-03-17T23:00:00.000+0000";

try {
    OffsetDateTime.parse(input);
    LocalDateTime.parse(input, DateTimeFormatter.ISO_OFFSET_DATE_TIME);
}
catch (DateTimeParseException e) {
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SZ", Locale.GERMANY);
    sdf.parse(input);
}

You're still using official libraries shipped with the JVM. One isn't part of the java.time library, but still ;-)

Kearns answered 29/9, 2017 at 11:8 Comment(0)
C
-3

Since the offset has no colon, you can use your own format string:

final String input = "2022-03-17T23:00:00.000+0000";

DateFormat df = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ");
Date parsed = df.parse(input);
System.out.println(parsed);
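The same pattern string can also be handed to java.time's DateTimeFormatter, which avoids java.util.Date altogether. A minimal sketch, assuming the input always carries exactly three fraction digits and an offset without a colon; the class name SamePatternJavaTime and the helper parse are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class SamePatternJavaTime {
    // Same pattern letters as the SimpleDateFormat above; in java.time a
    // single Z matches an offset written without a colon, such as +0000
    static final DateTimeFormatter FORMATTER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.ROOT);

    // Hypothetical helper wrapping the parse call
    static OffsetDateTime parse(String input) {
        return OffsetDateTime.parse(input, FORMATTER);
    }

    public static void main(String[] args) {
        final String input = "2022-03-17T23:00:00.000+0000";
        System.out.println(parse(input));
    }
}
```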
Candis answered 29/9, 2017 at 11:8 Comment(2)
The OP is using java.time rather than SimpleDateFormat. The pattern still works with DateTimeFormatter, but it's worth providing the code using the API they're using - particularly as it's vastly superior to java.util.Date etc.T
@JonSkeet Point noted.Candis
