IST mapped to wrong ZoneId in java.time library

I am trying to parse a ZonedDateTime from a String of the format (yyyy-MM-dd HH:mm z). The input String (2019-08-29 00:00 IST) generates a UTC timestamp.

Debugging led me to a point where the ZoneId for IST was mapped to Atlantic/Reykjavik, which doesn't make sense. It should be mapped to a zone in Asia.

timestamp = ZonedDateTime.parse(timeInput, DATE_WITH_TIMEZONE_FORMATTER)
            .toInstant().toEpochMilli();

where

DateTimeFormatter DATE_WITH_TIMEZONE_FORMATTER = DateTimeFormatter
      .ofPattern(DATE_TIME_WITH_TIMEZONE_PATTERN).withChronology(IsoChronology.INSTANCE);

Am I missing something here?

Rubble answered 30/8, 2019 at 22:35 Comment(4)
Curious... I tried this and get the same result (with or without adding IsoChronology). Investigating further... (I presume you meant HH:mm in the time since HH:MM isn't valid). – Apical
Yes, it is HH:mm in my code. Thanks for pointing out the typo in the post. I have edited it. – Rubble
What version and vendor of Java? This behavior varies accordingly. – Task
This is a good example of why you should not exchange textual date-time types using 2-4 character pseudo-zones such as IST. Use proper time zone names such as Asia/Kolkata, Atlantic/Reykjavik, and Europe/Dublin. – Task

First, if there’s any way you can avoid it, don’t rely on three-letter time zone abbreviations. They are ambiguous, probably more often than not. When you say that IST should be mapped to a zone in Asia, it still leaves the choice between Asia/Tel_Aviv and Asia/Kolkata open (plus the aliases for those two, Asia/Jerusalem and Asia/Calcutta). In other places in the world IST may mean Irish Summer Time and apparently also Iceland Standard Time (or something like it; it certainly makes sense). It’s not the first time I have seen IST recognized as Atlantic/Reykjavik.

If you can’t avoid having to parse IST, control the interpretation through the two-arg appendZoneText method of a DateTimeFormatterBuilder. It accepts a set of preferred zones:

DateTimeFormatter DATE_WITH_TIMEZONE_FORMATTER = new DateTimeFormatterBuilder()
        .appendPattern("yyyy-MM-dd HH:mm ")
        .appendZoneText(TextStyle.SHORT, Collections.singleton(ZoneId.of("Asia/Kolkata")))
        .toFormatter(Locale.ENGLISH);

Substitute your preferred zone where I put Asia/Kolkata. The rest shouldn’t cause trouble:

    String timeInput = "2019-08-29 00:00 IST";
    ZonedDateTime zdt = ZonedDateTime.parse(timeInput, DATE_WITH_TIMEZONE_FORMATTER);
    System.out.println("Parsed ZonedDateTime: "+ zdt);
    long timestamp = zdt
            .toInstant().toEpochMilli();
    System.out.println("timestamp: " + timestamp);

Output:

Parsed ZonedDateTime: 2019-08-29T00:00+05:30[Asia/Kolkata]
timestamp: 1567017000000

Link: Time Zone Abbreviations – Worldwide List (you will notice that IST comes three times in the list and that many other abbreviations are ambiguous too).

Witwatersrand answered 31/8, 2019 at 15:12 Comment(3)
Thanks. It seems counterintuitive to me that I would need to specify a ZoneId and even then could specify the TimeZone Name. For me there is no way to know the ZoneId beforehand. I would make a mapping from TimeZone String to Id and set up a bunch of rules that are not ambiguous. – Rubble
@Rubble You need to understand that IST is not a time zone. The time zones defined in tzdb and maintained by IANA are indeed unique and unambiguous. IST is not in that list. Avoid using these 2-4 character pseudo-zones. Learn to exchange date-time values textually using only standard ISO 8601 formats to avoid these problems. – Task
Actually, those pseudo zones can be 2-4 characters long, not just 3. Examples: CT and AEST. – Task
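
To illustrate the ISO 8601 suggestion from the comments, here is a minimal sketch, assuming both sides of the exchange use java.time. The class name IsoExchangeDemo and its two helper methods are made up for illustration. ZonedDateTime.toString() emits ISO 8601 text followed by the zone ID in square brackets (the bracketed part is a java.time extension to the standard), and the one-arg ZonedDateTime.parse reads it back without any abbreviation lookup.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical demo class; the names are illustrative only.
class IsoExchangeDemo {

    // Sender side: produce unambiguous text from a date-time with a real zone ID.
    static String toText() {
        ZonedDateTime zdt = ZonedDateTime.of(2019, 8, 29, 0, 0, 0, 0, ZoneId.of("Asia/Kolkata"));
        return zdt.toString();
    }

    // Receiver side: parse the text with the default ISO formatter
    // and convert it to epoch milliseconds.
    static long toEpochMilli(String text) {
        return ZonedDateTime.parse(text).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        String text = toText();
        System.out.println(text);               // 2019-08-29T00:00+05:30[Asia/Kolkata]
        System.out.println(toEpochMilli(text)); // 1567017000000
    }
}
```

Because the text carries both the offset and the full zone ID, the result does not depend on the default locale.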

I do not know why you're getting Iceland. Even in the master TZDB, IST appears only in reference to Irish Standard Time (and, prior to 1968, erroneously Irish Summer Time), Israel Standard Time and India Standard Time. The appearance of Iceland may be an error in Java's time zone database.

After further investigation I have found that the problem seems to occur only if the current Locale's language is set to some non-null value. If you create a Locale without a specified language you get Asia/Kolkata, but if the language is present (any language) it returns Atlantic/Reykjavik. This is highly likely to be a bug in Java's implementation.

    String input = "2019-08-29 00:00 IST";
    Locale loc = new Locale.Builder().setRegion("US").build(); // Note no language
    System.out.println(loc.toString());
    DateTimeFormatter DATE_WITH_TIMEZONE_FORMATTER = 
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm z").withLocale(loc);
    ZonedDateTime zdt = ZonedDateTime.parse(input, DATE_WITH_TIMEZONE_FORMATTER);
    System.out.println(zdt);

This produces

_US
2019-08-29T00:00+05:30[Asia/Kolkata]

But changing

    Locale loc = new Locale.Builder().setLanguage("ta").build();

produces

ta
2019-08-29T00:00Z[Atlantic/Reykjavik]

Regardless, the bare timezone IST is ambiguous out of context. To avoid confusion, if you want IST to always be Asia/Kolkata you may have to modify the incoming data prior to parsing.
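
One way to do that preprocessing is sketched below, assuming the abbreviation is always the last space-separated token of the input. The class name AbbreviationMapping and the contents of the map are my own illustrative choices, not a standard mapping; you would fill the map with whatever your data source actually means by each abbreviation. Map.of needs Java 9 or later.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;

// Hypothetical helper; the mapping below is an application-specific assumption.
class AbbreviationMapping {

    // Explicit decisions about what each abbreviation means for this data source.
    private static final Map<String, ZoneId> ZONE_BY_ABBREVIATION = Map.of(
            "IST", ZoneId.of("Asia/Kolkata"),
            "PST", ZoneId.of("America/Los_Angeles"));

    // Parses only the date and time; the zone is handled separately.
    private static final DateTimeFormatter LOCAL_PART =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm");

    static ZonedDateTime parse(String input) {
        int lastSpace = input.lastIndexOf(' ');
        if (lastSpace < 0) {
            throw new IllegalArgumentException("No zone abbreviation in: " + input);
        }
        String abbreviation = input.substring(lastSpace + 1);
        ZoneId zone = ZONE_BY_ABBREVIATION.get(abbreviation);
        if (zone == null) {
            // Fail loudly instead of guessing when the abbreviation is not mapped.
            throw new IllegalArgumentException("Unmapped zone abbreviation: " + abbreviation);
        }
        LocalDateTime local = LocalDateTime.parse(input.substring(0, lastSpace), LOCAL_PART);
        return local.atZone(zone);
    }

    public static void main(String[] args) {
        ZonedDateTime zdt = parse("2019-08-29 00:00 IST");
        System.out.println(zdt);                            // 2019-08-29T00:00+05:30[Asia/Kolkata]
        System.out.println(zdt.toInstant().toEpochMilli()); // 1567017000000
    }
}
```

Since the formatter never sees the abbreviation, the result does not depend on the locale, and an unmapped abbreviation is rejected rather than silently resolved.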

Apical answered 31/8, 2019 at 0:3 Comment(3)
Thanks, but it introduces a dependency on Locale. For me there is no way to know the locale beforehand, and the only information I have is the input String. I could personally create a mapping from the time zone string to Locale and use that in parsing the input. – Rubble
Also, I would have assumed that in case of ambiguity, the library should have indicated the ambiguity by failing to parse. Maybe the reason it didn't fail is that the default Locale is X (e.g. en_US) and in that locale IST is not ambiguous. Am I right in assuming so? – Rubble
Just a guess: This behavior may vary with recent versions of Java (OpenJDK at least) having switched to using Unicode CLDR by default as the source of locale definitions. See JEP 252. – Task

Avoid using three-letter time zone IDs. The following extract comes from documentation as old as Java 6:

Three-letter time zone IDs

For compatibility with JDK 1.1.x, some other three-letter time zone IDs (such as "PST", "CTT", "AST") are also supported. However, their use is deprecated because the same abbreviation is often used for multiple time zones (for example, "CST" could be U.S. "Central Standard Time" and "China Standard Time"), and the Java platform can then only recognize one of them.
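
The "only one of them" point can be seen in java.time itself. This is a small sketch using ZoneId.SHORT_IDS, the alias map that java.time ships for these legacy IDs; the class name ShortIdsDemo and the resolve helper are made up for illustration. The map is only consulted when you pass it to the two-arg ZoneId.of, and I am not claiming that the formatter's z pattern uses it.

```java
import java.time.ZoneId;

// Hypothetical demo class; the names are illustrative only.
class ShortIdsDemo {

    // Resolves a legacy three-letter ID through the JDK's built-in alias map.
    static String resolve(String shortId) {
        return ZoneId.of(shortId, ZoneId.SHORT_IDS).getId();
    }

    public static void main(String[] args) {
        // Each abbreviation has exactly one meaning in the alias map,
        // whatever else it may stand for in the real world.
        System.out.println(resolve("IST")); // Asia/Kolkata
        System.out.println(resolve("CST")); // America/Chicago
    }
}
```

In that map IST stands for India only and CST for US Central time only, so Irish, Israeli or Chinese readings of the same letters cannot be expressed this way.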

For India Standard Time, which has an offset of UTC+05:30, you can build a custom formatter using .appendZoneText(TextStyle.SHORT, Set.of(ZoneId.of("Asia/Kolkata"))) with DateTimeFormatterBuilder as shown below:

DateTimeFormatter formatter = new DateTimeFormatterBuilder()
        .appendPattern("uuuu-MM-dd HH:mm")
        .appendLiteral(' ')
        .appendZoneText(TextStyle.SHORT, Set.of(ZoneId.of("Asia/Kolkata")))
        .toFormatter(Locale.ENGLISH);

Now, let's use this custom formatter:

Instant instant = ZonedDateTime.parse("2019-08-29 00:00 IST", formatter).toInstant();
System.out.println(instant);
System.out.println(instant.toEpochMilli());

Output:

2019-08-28T18:30:00Z
1567017000000

Learn more about the modern date-time API from Trail: Date Time.

Rachmaninoff answered 2/10, 2022 at 7:57 Comment(0)
