Generic support for ISO 8601 format in Java 6

Java 7 introduced support in the SimpleDateFormat class for the ISO 8601 time zone format, via the pattern letter X (instead of lower- or upper-case Z). Supporting such formats in Java 6 requires preprocessing, so the question is which approach is best.

This new format is a superset of Z (uppercase Z), with 2 additional variations:

  1. The "minutes" field is optional (i.e., 2-digit time zone offsets are valid in addition to 4-digit ones)
  2. A colon character (':') can be used to separate the 2-digit "hours" field from the 2-digit "minutes" field.

So, as one can observe from the Java 7 documentation of SimpleDateFormat, the following 3 formats are now valid (instead of only the second one covered by Z in Java 6) and, of course, equivalent:

  1. -08
  2. -0800
  3. -08:00
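
For reference, a minimal sketch of how the three offset forms can be parsed on Java 7 or later (it will not run on Java 6, which rejects X as a pattern letter). As far as I read the SimpleDateFormat documentation, the number of X letters selects the form: X for hours only, XX for hours and minutes, XXX for the colon-separated form. The class name IsoOffsetPatterns and the helper parseOrNull are made up for this illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Illustration only: class and method names are hypothetical.
final class IsoOffsetPatterns {

    // Parses text strictly with the given pattern; returns null on failure.
    static Date parseOrNull(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false);
        try {
            return format.parse(text);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Date a = parseOrNull("yyyy-MM-dd'T'HH:mm:ssX",   "2012-10-01T19:30:00-08");
        Date b = parseOrNull("yyyy-MM-dd'T'HH:mm:ssXX",  "2012-10-01T19:30:00-0800");
        Date c = parseOrNull("yyyy-MM-dd'T'HH:mm:ssXXX", "2012-10-01T19:30:00-08:00");
        // The three texts denote the same instant, so the Dates compare equal.
        System.out.println(a.equals(b) && b.equals(c));
    }
}
```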

As discussed in an earlier question about a special case of supporting such an "expanded" timezone format, always with ':' as a separator, the best approach for backporting the Java 7 functionality into Java 6 is to subclass the SimpleDateFormat class and override its parse() method, i.e.:

public Date parse(String date, ParsePosition pos)
{
    String iso = ... // Replace the X with a Z timezone string, using a regex

    if (iso.length() == date.length())
    {
        return null; // Not an ISO 8601 date
    }

    Date parsed = super.parse(iso, pos);

    if (parsed != null)
    {
        pos.setIndex(pos.getIndex()+1); // Adjust for ':'
    }

    return parsed;
}

Note that the subclassed SimpleDateFormat objects above must be initialized with the corresponding Z-based pattern, i.e., if the subclass is ExtendedSimpleDateFormat and you want to parse dates complying with the pattern yyyy-MM-dd'T'HH:mm:ssX, then you should use objects instantiated as

new ExtendedSimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");

In the aforementioned earlier question the regex :(?=[0-9]{2}$) has been suggested for getting rid of the ':' and in a similar question the regex (?<=[+-]\d{2})$ has been suggested for appending the "minute" field as 00, if needed.

Obviously, running the 2 replacements successively achieves full functionality. So, the iso local variable in the overridden parse() method would be set as

iso = date.replaceFirst(":(?=[0-9]{2}$)","");

and then

iso = iso.replaceFirst("(?<=[+-]\\d{2})$", "00");

with an if check in between to make sure that the pos value is also set properly later on and also for the length() comparison earlier.
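
Spelled out, the two passes with the check in between might look like the sketch below. The names TwoPassOffset and normalize are hypothetical. The method returns the Z-compatible string, and the caller can derive the pos correction from date.length() - iso.length(): +1 if a ':' was removed, -2 if "00" was appended, 0 if the offset was already in -0800 form.

```java
// Illustration only: class and method names are hypothetical.
final class TwoPassOffset {

    static String normalize(String date) {
        // Pass 1: drop a ':' that sits right before the last two digits,
        // turning "-08:00" into "-0800".
        String iso = date.replaceFirst(":(?=[0-9]{2}$)", "");
        if (iso.length() == date.length()) {
            // Nothing was removed, so try pass 2: append "00" when the
            // string ends in a sign plus two digits ("-08" becomes "-0800").
            iso = iso.replaceFirst("(?<=[+-]\\d{2})$", "00");
        }
        // Caveat: a string with no offset at all, such as "...T19:30:00",
        // loses its last ':' in pass 1; the later Z-based parse rejects it.
        return iso;
    }

    public static void main(String[] args) {
        System.out.println(normalize("2012-10-01T19:30:00-08"));    // 2012-10-01T19:30:00-0800
        System.out.println(normalize("2012-10-01T19:30:00-0800"));  // 2012-10-01T19:30:00-0800
        System.out.println(normalize("2012-10-01T19:30:00-08:00")); // 2012-10-01T19:30:00-0800
    }
}
```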

The question is: can we use a single regular expression to achieve the same effect, including the information needed for not unnecessarily checking the length and for correctly setting pos a few lines later?

The implementation is intended for code that reads very large numbers of string fields that can be in any format (even totally non-date), selects only those which comply with the format, and returns the parsed Java Date object.

So, both accuracy and speed are of paramount importance (i.e., if using the 2 passes is faster, this approach is preferable).
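
One possible single-regex variant, offered as a sketch rather than a benchmarked answer: capture the offset hours and the optional minutes in one pattern, rebuild the offset in -0800 form, and fix up pos once the rewritten string has been parsed. ExtendedSimpleDateFormat is the subclass name used above; the OFFSET pattern and the index handling are my own assumptions and have not been timed against the two-pass version.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only. Construct it with a Z-based pattern such as
// "yyyy-MM-dd'T'HH:mm:ssZ".
class ExtendedSimpleDateFormat extends SimpleDateFormat {
    private static final long serialVersionUID = 1L;

    // Group 1: sign plus two-digit hours. Group 2: two-digit minutes, if any.
    // (Assumed pattern; it merges the two regexes quoted in the question.)
    private static final Pattern OFFSET =
            Pattern.compile("([+-]\\d{2})(?::?(\\d{2}))?$");

    ExtendedSimpleDateFormat(String pattern) {
        super(pattern);
    }

    @Override
    public Date parse(String date, ParsePosition pos) {
        Matcher m = OFFSET.matcher(date);
        if (!m.find()) {
            pos.setErrorIndex(pos.getIndex());
            return null; // no trailing numeric offset, so not a candidate
        }
        String minutes = (m.group(2) != null) ? m.group(2) : "00";
        String iso = date.substring(0, m.start()) + m.group(1) + minutes;

        Date parsed = super.parse(iso, pos);
        if (parsed != null && pos.getIndex() == iso.length()) {
            // The rewritten string was consumed to its end, so the original
            // was too; this covers both the dropped ':' and the added "00".
            pos.setIndex(date.length());
        }
        return parsed;
    }
}
```

Since the pattern is compiled once and each field needs only one find(), no length comparison is required: a failed find() already means "not a candidate". Whether that beats two replaceFirst() calls in practice would have to be measured on real data.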

Wallraff answered 23/10, 2012 at 22:11 Comment(6)
Have you checked the corresponding code in JDK 7?Meiny
Not yet, because I am not using it, but probably this will not provide much help, since inside the SimpleDateFormat class the patterns are compiled into a grammar before processing, so there is no correspondence to any regex. Thanks and +1 anyway. :-)Wallraff
Have you considered javax.xml.datatype.DatatypeFactory? It supports 8601 format date strings. see download.java.net/jdk7/archive/b123/docs/api/javax/xml/datatype/…Alleviate
Why don't you try Joda Time and have it return a Date object?Homeopathic
JodaTime does not support all date format patterns that Java does. Otherwise, it would always have been a first choice.Wallraff
Much much easier to use the java.time classes that supplant the old legacy date-time classes (Date, Calendar, etc.) and also supplant Joda-Time. Much of the java.time functionality is back-ported to Java 6 & 7 in ThreeTen-Backport and further adapted to Android in ThreeTenABP.Abstain

Seems that you can use this:

import java.util.Calendar;
import java.util.Date;
import javax.xml.bind.DatatypeConverter;

public class TestISO8601 {
    public static void main(String[] args) {
        parse("2012-10-01T19:30:00+02:00"); // UTC+2
        parse("2012-10-01T19:30:00Z");      // UTC
        parse("2012-10-01T19:30:00");       // Local
    }
    public static Date parse(final String str) {
        Calendar c = DatatypeConverter.parseDateTime(str);
        System.out.println(str + "\t" + (c.getTime().getTime()/1000));
        return c.getTime();
    }
}
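
A caveat and a variation, as a sketch: the javax.xml.bind package was removed from the JDK in Java 11, so on newer runtimes the class above needs a separate JAXB dependency. The javax.xml.datatype.DatatypeFactory route mentioned in the comments is still part of the JDK and can be used in much the same way. Note that the XML Schema dateTime syntax only allows Z or the +hh:mm form as an offset, so -08 and -0800 would still need rewriting first. XmlDateTimeParser and parseOrNull are names invented for this example.

```java
import java.util.Date;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;

// Illustration only: class and method names are hypothetical.
final class XmlDateTimeParser {

    private static DatatypeFactory newFactory() {
        try {
            return DatatypeFactory.newInstance();
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the parsed Date, or null if the text is not a valid
    // XML Schema date/time lexical form.
    static Date parseOrNull(String text) {
        try {
            return newFactory()
                    .newXMLGregorianCalendar(text)
                    .toGregorianCalendar()
                    .getTime();
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("2012-10-01T19:30:00+02:00"));
        System.out.println(parseOrNull("2012-10-01T17:30:00Z"));
    }
}
```

Be aware that newXMLGregorianCalendar also accepts the other XML Schema date types (a bare date or a bare year, for instance), so a stricter pre-check may be needed when only full date-times should pass.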
Irina answered 5/1, 2013 at 22:33 Comment(0)

You can use java.time, the modern Java date and time API, in Java 6 through its backport. This seems to me the nicest and most future-proof solution. It has good support for ISO 8601.

import org.threeten.bp.OffsetDateTime;
import org.threeten.bp.format.DateTimeFormatter;

public class DemoIso8601Offsets {
    public static void main(String[] args) {
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+0200", 
                DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssXX")));
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02", 
                DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssX")));
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02:00"));
        System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00Z"));
    }
}

Output from this program is:

2012-10-01T19:30+02:00
2012-10-01T19:30+02:00
2012-10-01T19:30+02:00
2012-10-01T19:30Z

It requires that you add the ThreeTen Backport library to your project setup.

  • In Java 8 and later and on newer Android devices (from API level 26) the modern API comes built-in.
  • In Java 6 and 7 get the ThreeTen Backport, the backport of the new classes (ThreeTen for JSR 310).
  • On (older) Android use the Android edition of ThreeTen Backport. It’s called ThreeTenABP. And make sure you import the date and time classes from org.threeten.bp with subpackages.

As you can see from the code, +02 and +0200 require a formatter in which you specify the format of the offset, while +02:00 (and Z too) conform to the default format, so no formatter needs to be specified.

Can we parse all the offset formats using the same formatter?

When reading mixed data, you don’t want to handle each offset format specially. It’s better to use optional parts in the format pattern string:

    DateTimeFormatter allInOne 
            = DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss[XXX][XX][X]");
    System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+0200", allInOne));
    System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02", allInOne));
    System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00+02:00", allInOne));
    System.out.println(OffsetDateTime.parse("2012-10-01T19:30:00Z", allInOne));

Output is the same as above. The square brackets in [XXX][XX][X] mark optional parts, so any one of the offset formats +02:00, +0200 or +02 may be present.

Celka answered 25/8, 2018 at 13:29 Comment(2)
"As you can see from the code, +02 and +0200 require a formatter": +02 does not require a formatter.Fotheringhay
@ArvindKumarAvinash The docs seem self contradictory to me as to whether the minutes of the offset are optional. From OffsetDateTime.parse("2012-10-01T19:30:00+02") I got org.threeten.bp.format.DateTimeParseException: Text '2012-10-01T19:30:00+02' could not be parsed at index 19, so apparently with ThreeTen Backport, used in this answer because Java 6 was asked about, they are not (index 19 is where +02 is). The same code works without exception with java.time.OffsetDateTime.parse(CharSequence) on Java 11, in case that is what you meant.Celka

The same approach works for varying numbers of fractional-second digits as well as for different offsets:

String DATE_TIME_PATTERN = "yyyy-MM-dd'T'HH:mm:ss[.SSS][.SS][.S][XXX][XX][X]";
DateTimeFormatter formatter = DateTimeFormatter.ofPattern(DATE_TIME_PATTERN);

Date convertDate(String dateString) {
    return Date.from(OffsetDateTime.parse(dateString, formatter).toInstant());
}

Sometimes you need two different patterns, one for parsing (the setter) and one for formatting (the getter):

String DATE_TIME_PATTERN_SET = "yyyy-MM-dd'T'HH:mm:ss[.SSS][.SS][.S][XXX][XX][X]";
String DATE_TIME_PATTERN_GET = "yyyy-MM-dd'T'HH:mm:ssXXX";
DateTimeFormatter formatterSet = DateTimeFormatter.ofPattern(DATE_TIME_PATTERN_SET);
DateFormat dateFormat = new SimpleDateFormat(DATE_TIME_PATTERN_GET);

Date convertToDate(String dateString) {
    return Date.from(OffsetDateTime.parse(dateString, formatterSet).toInstant());
}

String convertToString(Date date) {
    dateFormat.setTimeZone(TimeZone.getDefault());
    return dateFormat.format(date).replaceAll("Z$", "+00:00");
}
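
If enumerating [.SSS][.SS][.S] feels clumsy, a DateTimeFormatterBuilder can express "up to nine fraction digits" in one step. This is only a sketch using the Java 8+ java.time classes (ThreeTen Backport should offer the same builder under org.threeten.bp.format); FractionTolerantParser and toInstantOrNull are made-up names.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

// Illustration only: class and method names are hypothetical.
final class FractionTolerantParser {

    private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
            .appendPattern("uuuu-MM-dd'T'HH:mm:ss")
            // Optional fraction: a decimal point plus up to nine digits,
            // or no fraction at all.
            .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
            // Offset as +02:00, +0200, +02 or Z.
            .appendPattern("[XXX][XX][X]")
            .toFormatter();

    // Returns the parsed instant, or null if the text does not match.
    static Instant toInstantOrNull(String text) {
        try {
            return OffsetDateTime.parse(text, FORMATTER).toInstant();
        } catch (DateTimeParseException e) {
            return null;
        }
    }
}
```

The result can be turned into a legacy object with Date.from(instant), keeping in mind that java.util.Date only holds millisecond precision.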
Quodlibet answered 30/11, 2023 at 8:49 Comment(0)
