OBVIOUS ANSWER: Get rid of this obsolete crud that never worked properly and do it right.
Let's first explain this result
You wanted to know why this happens, so I dove into the source of SimpleDateFormat for you.
Given that the yy part is the last, it takes all remaining digits. Thus, the "2022000000000" part of the string is parsed into a Long of value 2022000000000. This is then immediately converted to an int, and that's quite problematic; 2022000000000 overflows and turns into int value -929596416. A standard java.text.CalendarBuilder instance is then told to set its YEAR field to that value (-929596416). Which is fine.
When parsing is done, that builder is asked to produce a GregorianCalendar value. This doesn't work: the GregorianCalendar accepts -929596416 as YEAR value just fine, but SimpleDateFormat then asks this GregCal instance to calculate the time in millis since the epoch, and that fails; it throws an IllegalArgumentException indicating this. This exception is caught by the SimpleDateFormat code and results in the Unparseable date exception that you are getting.
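To make that path concrete, here is a minimal reproduction sketch. The pattern ddMMyy and the two input strings are assumptions pieced together from the digits quoted in this answer; the class and method names are made up for illustration. Swap in your real pattern and input.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class YearOverflowRepro {
    // Parses non-leniently with the assumed pattern and returns the resulting
    // year, or null when SimpleDateFormat reports an unparseable date.
    static Integer parsedYearOrNull(String text) {
        SimpleDateFormat sdf = new SimpleDateFormat("ddMMyy", Locale.ROOT);
        sdf.setLenient(false);
        try {
            Date date = sdf.parse(text);
            Calendar cal = new GregorianCalendar();
            cal.setTime(date);
            return cal.get(Calendar.YEAR);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Assumed inputs: "0901" followed by the 13 digits that yy swallows.
        System.out.println("2022 input -> " + parsedYearOrNull("09012022000000000"));
        System.out.println("2023 input -> " + parsedYearOrNull("09012023000000000"));
    }
}
```

The trailing yy is the only field that is not followed by another numeric field, which is why it is free to consume every remaining digit.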
With 2023, you get the same effect: That is turned into an int without checking if it overflows; that overflows just the same, and results in int value 70403584. GregorianCalendar DOES accept this year. This then results in what you saw: Year 70403584 - which is explained as follows:
long y = 2023000000000L;
int i = (int) y;
System.out.println(i); // prints 70403584
A deeper dive then is: Why is 70403584 fine, and -929596416 isn't?
Mostly, 'because'. The GregCal methods getMinimum(field) and getMaximum(field), when passed the YEAR field (constant value 1), return 1 and 292278994 respectively. That means 70403584 is accepted, and -929596416 is not. You told it to be non-lenient. "Lenient" here (the old j.u.Calendar stuff) is mostly a silly concept: trying to define what is acceptable in non-lenient mode is virtually impossible, and various utterly ridiculous dates are nevertheless acceptable even in non-lenient mode.
We can verify this:
GregorianCalendar cal = new GregorianCalendar();
cal.setLenient(false);
cal.set(Calendar.YEAR, -5);
System.out.println(cal.getTime());
gives you:
Exception in thread "main" java.lang.IllegalArgumentException: YEAR
at java.base/java.util.GregorianCalendar.computeTime(GregorianCalendar.java:2609)
at java.base/java.util.Calendar.updateTime(Calendar.java:3411)
at java.base/java.util.Calendar.getTimeInMillis(Calendar.java:1805)
at java.base/java.util.Calendar.getTime(Calendar.java:1776)
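The bounds themselves can be inspected directly as well. The sketch below only calls the public getMinimum and getMaximum methods; the class name and the inYearRange helper are made up for illustration and are not part of the JDK.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class YearBounds {
    // Hypothetical helper: mirrors the range test described above for YEAR.
    static boolean inYearRange(int year) {
        GregorianCalendar cal = new GregorianCalendar();
        return year >= cal.getMinimum(Calendar.YEAR)
            && year <= cal.getMaximum(Calendar.YEAR);
    }

    public static void main(String[] args) {
        GregorianCalendar cal = new GregorianCalendar();
        System.out.println("min YEAR = " + cal.getMinimum(Calendar.YEAR));
        System.out.println("max YEAR = " + cal.getMaximum(Calendar.YEAR));
        System.out.println("70403584 in range?   " + inYearRange(70403584));
        System.out.println("-929596416 in range? " + inYearRange(-929596416));
    }
}
```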
THE EXECUTIVE CONCLUSION: If you were expecting non-lenient mode to reject these patterns, I have some nasty news for you: non-lenient mode does not work and never did, and you should not be relying on it. Specifically here, overflows are not checked (you'd think that in non-lenient mode, any overflow of any value means the value is rejected, but, alas), and 2023000000000 so happens to overflow into a ridiculous but nevertheless acceptable (even in non-lenient mode) year, whereas 2022000000000 does not.
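For contrast, this is roughly what a checked narrowing would look like. To be clear, this is not what SimpleDateFormat does; it is a sketch of the kind of check that is missing, using the standard Math.toIntExact. The class and helper names are hypothetical.

```java
class CheckedNarrowing {
    // Hypothetical helper: true only if the long survives conversion to int unchanged.
    static boolean fitsInInt(long value) {
        try {
            Math.toIntExact(value); // throws ArithmeticException on overflow
            return true;
        } catch (ArithmeticException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        long y2022 = 2022000000000L;
        long y2023 = 2023000000000L;
        // The plain cast silently wraps around.
        System.out.println((int) y2022); // -929596416
        System.out.println((int) y2023); // 70403584
        // The checked conversion rejects both values.
        System.out.println(fitsInInt(y2022));
        System.out.println(fitsInInt(y2023));
    }
}
```

With a check like this, both inputs would be rejected for the same reason, instead of one failing and the other quietly producing a nonsense year.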
So how do you fix this?
You can't. SimpleDateFormat and GregorianCalendar are horrible APIs with broken implementations. The only fix is to ditch them. Use java.time. Make a new formatter using java.time.DateTimeFormatter, parse this value into a LocalDate, and go from there. You'll solve a whole host of timezone related craziness on the fly, too! (Because java.util.Date is lying and doesn't represent dates. It represents instants, which is why .getYear() and company are deprecated: you can't ask an instant for a year without a timezone, and Date doesn't have one. Calendar is intricately interwoven with it all, so storing dates in one timezone and reading them in another causes wonkiness. LocalDate avoids all that.)
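A minimal sketch of the java.time route, assuming the date portion is exactly two-digit day, two-digit month, four-digit year (pattern ddMMuuuu). That layout, the strict resolver style, and the class and method names are my assumptions, not something taken from your code.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

class StrictLocalDateParse {
    // Assumed layout: ddMMuuuu and nothing else. STRICT also rejects
    // impossible dates such as 31 February.
    static final DateTimeFormatter DDMMUUUU =
        DateTimeFormatter.ofPattern("ddMMuuuu").withResolverStyle(ResolverStyle.STRICT);

    // Returns the parsed date, or null when the text does not match.
    static LocalDate parseOrNull(String text) {
        try {
            return LocalDate.parse(text, DDMMUUUU);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("09012023"));          // 2023-01-09
        System.out.println(parseOrNull("09012023000000000")); // trailing digits are rejected
        System.out.println(parseOrNull("31022023"));          // no such date
    }
}
```

If the trailing digits in your input are a time of day, they need their own pattern letters rather than being left for the year field to absorb.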
EDIT: As a fellow Dutchie, note that the most recent JDKs break the Europe/Amsterdam timezone (grumble grumble OpenJDK team doesn't understand what damage they are causing), which means any conversion between epoch-millis and base dates is extra problematic for software running in Dutch locales. For example, if you are storing birthdates and you dip through conversion like this, everybody born before 1940 will break and their birthday will shift by a day. LocalDate avoids this by never storing anything as epoch-millis in the first place.
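As a small illustration of that last point, here is a sketch that stores a birthdate as its ISO-8601 text and reads it back under a different default timezone. The birthdate, the zone IDs, and the class and method names are purely illustrative.

```java
import java.time.LocalDate;
import java.util.TimeZone;

class BirthdateRoundTrip {
    // Writes the date as text under one default zone and reads it back under
    // another. No epoch-millis are involved, so zone rules never enter into it.
    static LocalDate roundTrip(LocalDate date, String writerZone, String readerZone) {
        TimeZone original = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(writerZone));
            String stored = date.toString(); // e.g. "1935-06-15"
            TimeZone.setDefault(TimeZone.getTimeZone(readerZone));
            return LocalDate.parse(stored);
        } finally {
            TimeZone.setDefault(original); // restore the JVM default
        }
    }

    public static void main(String[] args) {
        LocalDate birth = LocalDate.of(1935, 6, 15);
        System.out.println(roundTrip(birth, "Europe/Amsterdam", "America/Los_Angeles"));
    }
}
```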
ddMMyy doesn't really match your input - but you really shouldn't be using SimpleDateFormat anymore and instead prefer DateTimeFormatter – Indices
[...] ddMMyyyy and 09012022 and it works – Indices
[...] yyyy and 2023000000000 it generates a year of 70403584 so I'm "assuming" that it's trying to consume all the trailing 0s – Indices
[...] SimpleDateFormat here? Certainly an option I would seriously consider. – Statuette
[...] 2023 in your string is the year, then you need yyyy (not yy) to parse it. – Statuette
[...] Europe/Amsterdam is broken, and any birthdates pre-1940 will shift by a day. Perhaps this disaster will let you claim the resources to fully refactor away from the legacy here. – Custody
[...] 090120, which appears to me to be correct for 9 January 2020. Are you wanting the exception or do you want parsing without exception? In the latter case, do you need to get the year correct (2023 and 2020 in your two examples)? – Statuette
[...] LocalDateTime.parse("09012020000000000", DateTimeFormatter.ofPattern("ddMMuuuuHHmmssSSS")). Same if year is 2020 instead of 2023. In Java 8 it's just a little more complicated. – Statuette
[...] SimpleDateFormat and replace line 1940 in SimpleDateFormat, which reads: value = number.intValue(); with value = number.longValue() > Integer.MAX_VALUE ? -1 : number.intValue();, and use this cloned copy... – Custody