Java 7 introduced support in the SimpleDateFormat class for the ISO 8601 time zone format, via the pattern letter X (instead of lower or upper case Z). Supporting such formats in Java 6 requires preprocessing, so the question is what the best approach is.
This new format is a superset of Z (uppercase Z), with 2 additional variations:
- The "minutes" field is optional (i.e., 2-digit instead of 4-digit timezones are valid)
- A colon character (':') can be used for separating the 2-digit "hours" field from the 2-digit "minutes" field.
So, as one can observe from the Java 7 documentation of SimpleDateFormat, the following 3 formats are now valid, matching the patterns X, XX and XXX respectively (instead of only the second one, covered by Z in Java 6) and, of course, equivalent:
- -08
- -0800
- -08:00
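To make the equivalence concrete, here is a minimal sketch for Java 7 or later. The class name IsoZoneLetterDemo, the helper parseFully and the sample timestamp are made up for illustration. It assumes, as the SimpleDateFormat documentation states, that the number of X letters selects which of the three offset forms is parsed.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

class IsoZoneLetterDemo {
    // Hypothetical helper: returns the parsed Date only if the pattern
    // consumed the whole input, otherwise null.
    static Date parseFully(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        ParsePosition pos = new ParsePosition(0);
        Date parsed = format.parse(text, pos);
        return (parsed != null && pos.getIndex() == text.length()) ? parsed : null;
    }

    public static void main(String[] args) {
        // One pattern per offset form: X = hours only, XX = hhmm, XXX = hh:mm
        Date hoursOnly = parseFully("yyyy-MM-dd'T'HH:mm:ssX", "2012-11-07T10:15:30-08");
        Date basic = parseFully("yyyy-MM-dd'T'HH:mm:ssXX", "2012-11-07T10:15:30-0800");
        Date extended = parseFully("yyyy-MM-dd'T'HH:mm:ssXXX", "2012-11-07T10:15:30-08:00");

        // All three inputs should denote the same instant.
        System.out.println(hoursOnly.equals(basic) && basic.equals(extended));
    }
}
```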
As discussed in an earlier question about a special case of supporting such an "expanded" timezone format, always with ':' as a separator, the best approach for backporting the Java 7 functionality into Java 6 is to subclass the SimpleDateFormat class and override its parse() method, i.e.:
    public Date parse(String date, ParsePosition pos)
    {
        String iso = ... // Replace the X with a Z timezone string, using a regex
        if (iso.length() == date.length())
        {
            return null; // Not an ISO 8601 date
        }
        Date parsed = super.parse(iso, pos);
        if (parsed != null)
        {
            pos.setIndex(pos.getIndex()+1); // Adjust for ':'
        }
        return parsed;
    }
Note that the subclassed SimpleDateFormat objects above must be initialized with the corresponding Z-based pattern, i.e. if the subclass is ExtendedSimpleDateFormat and you want to parse dates complying with the pattern yyyy-MM-dd'T'HH:mm:ssX, then you should use objects instantiated as
new ExtendedSimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
In the aforementioned earlier question the regex :(?=[0-9]{2}$) has been suggested for getting rid of the ':', and in a similar question the regex (?<=[+-]\d{2})$ has been suggested for appending the "minute" field as 00, if needed.
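The effect of the two regexes can be checked in isolation. The sketch below is illustrative only: the class name OffsetNormalizer, its two helper methods and the sample strings are made up. It precompiles both patterns, which avoids recompiling them on every call as String.replaceFirst() does.

```java
import java.util.regex.Pattern;

class OffsetNormalizer {
    // ':' followed by exactly two digits at the end of the input
    private static final Pattern OFFSET_COLON = Pattern.compile(":(?=[0-9]{2}$)");
    // End of input, directly preceded by a sign and two digits
    private static final Pattern HOURS_ONLY_END = Pattern.compile("(?<=[+-]\\d{2})$");

    static String dropColon(String s) {
        return OFFSET_COLON.matcher(s).replaceFirst("");
    }

    static String appendMinutes(String s) {
        return HOURS_ONLY_END.matcher(s).replaceFirst("00");
    }

    public static void main(String[] args) {
        System.out.println(dropColon("2012-11-07T10:15:30-08:00")); // 2012-11-07T10:15:30-0800
        System.out.println(appendMinutes("2012-11-07T10:15:30-08")); // 2012-11-07T10:15:30-0800
        // Caveat: with no offset present, the seconds separator is removed instead.
        System.out.println(dropColon("2012-11-07T10:15:30")); // 2012-11-07T10:1530
    }
}
```

Neither regex verifies that the matched text really is a timezone offset: the first also matches the seconds separator of a zone-less timestamp, and the second also matches a string ending in a date such as 2012-11-07. Rejecting such inputs is left to the subsequent parse with the Z-based pattern.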
Obviously, running the 2 replacements successively can be used for achieving full functionality. So, the iso local variable in the overridden parse() method would be set as
iso = date.replaceFirst(":(?=[0-9]{2}$)","");
or
iso = iso.replaceFirst("(?<=[+-]\\d{2})$", "00");
with an if check in between to make sure that the pos value is also set properly later on and also for the length() comparison earlier.
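Put together, the two-pass variant might look like the sketch below. It is one possible reading of the description above, not a tested backport. The field names and the delta bookkeeping are assumptions. Unlike the special-case snippet earlier, an input that neither regex changes (for example one already ending in -0800) is passed to super.parse() unchanged rather than rejected. The sketch also assumes parsing starts before the offset, so that the index correction is valid.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Pattern;

class ExtendedSimpleDateFormat extends SimpleDateFormat {
    private static final long serialVersionUID = 1L;

    private static final Pattern OFFSET_COLON = Pattern.compile(":(?=[0-9]{2}$)");
    private static final Pattern HOURS_ONLY_END = Pattern.compile("(?<=[+-]\\d{2})$");

    // The pattern passed in must be the Z-based one, e.g. yyyy-MM-dd'T'HH:mm:ssZ
    ExtendedSimpleDateFormat(String zBasedPattern) {
        super(zBasedPattern);
    }

    @Override
    public Date parse(String date, ParsePosition pos) {
        // Pass 1: turn a trailing "hh:mm" offset into "hhmm"
        String iso = OFFSET_COLON.matcher(date).replaceFirst("");
        // delta = characters removed from the input (+1), or added to it (-2)
        int delta = date.length() - iso.length();
        if (delta == 0) {
            // Pass 2: only if no colon was removed, append "00" to a trailing "hh" offset
            iso = HOURS_ONLY_END.matcher(date).replaceFirst("00");
            delta = date.length() - iso.length();
        }
        Date parsed = super.parse(iso, pos);
        if (parsed != null) {
            // Map the index in the rewritten string back onto the original input
            pos.setIndex(pos.getIndex() + delta);
        }
        return parsed;
    }
}
```

With new ExtendedSimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ"), the inputs 2012-11-07T10:15:30-08, 2012-11-07T10:15:30-0800 and 2012-11-07T10:15:30-08:00 should all yield the same Date, and a non-date string should yield null with pos left at its start index.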
The question is: can we use a single regular expression to achieve the same effect, including the information needed for not unnecessarily checking the length and for correctly setting pos a few lines later?
The implementation is intended for code that reads very large numbers of string fields that can be in any format (even totally non-date), selects only those which comply with the format and returns the parsed Java Date object.
So, both accuracy and speed are of paramount importance (i.e., if using the 2 passes is faster, that approach is preferable).
[...] Date, Calendar, etc.) and also supplants Joda-Time. Much of the java.time functionality is back-ported to Java 6 & 7 in ThreeTen-Backport and further adapted to Android in ThreeTenABP. – Abstain