The basic assumption that NumberFormat validates its input is wrong. A modern developer might expect validation, especially because the parse method declares ParseException as a checked exception. But thanks to open source I can read the code and see that I was very wrong: this Java 1.1 code was written with different design principles than I am used to.
The critical code in the concrete class used here (for one implementation) is in openjdk > DecimalFormat.java > int subparseNumber, where the input string gets converted into a "DigitList". The digit list for "3.14" with a German locale is indeed [3, 1, 4], because the thousands separator is simply skipped, as @GiacomoCatenazzi pointed out in his comment 1, so subsequent code has to interpret it as 314. Also, when an invalid character is encountered, parsing just stops, so for example "0x134" -> 0 with no error.
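A minimal sketch that reproduces both behaviours described above. The class name LenientParseDemo and the helper parse are made up for illustration; only NumberFormat.getNumberInstance(Locale) and parse(String) are real JDK calls.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class LenientParseDemo {
    // Hypothetical helper: parse with a fresh NumberFormat for the given locale.
    static Number parse(Locale locale, String input) throws ParseException {
        return NumberFormat.getNumberInstance(locale).parse(input);
    }

    public static void main(String[] args) throws ParseException {
        // '.' is the German grouping separator, so it is skipped rather than rejected.
        System.out.println(parse(Locale.GERMANY, "3.14"));  // prints 314
        // Parsing stops silently at 'x'; only the leading "0" is used.
        System.out.println(parse(Locale.GERMANY, "0x134")); // prints 0
    }
}
```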
There is more to learn from the source code: NumberFormat is not thread-safe, so you must not use the same instance from multiple threads at the same time. The modern assumption that a call like format.parse(input) -> obj would be trivially safe because input and format are only read does not hold: parsing changes internal state of the NumberFormat instance. You can only reuse the instance after parse completes.
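One common way to live with this, sketched here under the assumption that a German-locale format is wanted, is to give every thread its own instance through ThreadLocal. The class name PerThreadFormat is hypothetical; creating a new NumberFormat per call would work just as well.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class PerThreadFormat {
    // One NumberFormat per thread, so no instance is ever shared between threads.
    private static final ThreadLocal<NumberFormat> GERMAN =
            ThreadLocal.withInitial(() -> NumberFormat.getNumberInstance(Locale.GERMANY));

    static Number parse(String input) throws ParseException {
        return GERMAN.get().parse(input);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            try {
                System.out.println(parse("3,14"));
            } catch (ParseException e) {
                throw new IllegalStateException(e);
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

This only removes the sharing problem; the parsing itself stays as lenient as before.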
So how do I make a failfast conversion of Strings to numbers in Java?
(1) If you know the target type and the number is in the standard decimal format, this works:
Float.valueOf("3,14"); // NumberFormatException
Float.valueOf("3.14"); // 3.14f
Note that NumberFormat.getNumberInstance().parse("3,14") will return 314, not an error, when the default locale uses ',' as its grouping separator (as English locales do), so this lack of validation is in no way exclusive to the German locale.
(2) If I have to read numbers from German-locale strings, I must check that the input string matches my expectations. NumberFormat does not provide any way to do that, nor does there seem to be a satisfying fail-fast/non-GIGO answer to this 12-year-old question about the problem: Convert String with Dot or Comma to Float Number
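The best idea I have is to validate the input myself and restrict it that way. Here is a solution that is stricter than necessary, banning thousands separators completely, but for my use case this is fine:
if (inputString.contains(".")) {
    // Reject the German grouping separator outright instead of guessing what it means.
    throw new NumberFormatException("Unexpected '.' in: " + inputString);
}
// Swap the German decimal comma for a dot; Float.valueOf rejects other malformed input.
return Float.valueOf(inputString.replace(',', '.'));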
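Float.valueOf is itself somewhat permissive: it trims whitespace and also accepts exponents, a trailing f or d, hex floating-point literals, and "NaN". If that is too loose, a regular expression can pin the input down first. The following is only a sketch under an assumed grammar (optional minus, digits, optional comma plus digits); the class name GermanFloatParser and the pattern are illustrative, not part of any library.

```java
import java.util.regex.Pattern;

class GermanFloatParser {
    // Assumed grammar: optional minus, digits, optional ",digits". No grouping, no exponent.
    private static final Pattern GERMAN_DECIMAL = Pattern.compile("-?\\d+(,\\d+)?");

    static float parse(String input) {
        if (!GERMAN_DECIMAL.matcher(input).matches()) {
            throw new NumberFormatException("Not a plain German decimal: " + input);
        }
        // Safe to convert now: the string contains only digits, one comma at most, and a sign.
        return Float.parseFloat(input.replace(',', '.'));
    }

    public static void main(String[] args) {
        System.out.println(parse("3,14")); // prints 3.14
        System.out.println(parse("3.14")); // throws NumberFormatException
    }
}
```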
1 You can actually call format.setGroupingUsed(false), and then "3.14" parses as 3 instead of 314, so it is not entirely true that grouping separators are always ignored. But no code uses the grouping character to judge the correctness of the input string, even though there is format.setGroupingSize (and a matching getter), which controls how many digits are grouped together.
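Building on this footnote, one possible way to get closer to fail-fast behaviour with NumberFormat itself is to switch grouping off and use the parse(String, ParsePosition) overload to check that the whole input was consumed. This is a sketch, not something the answer above relies on; the class name StrictGermanNumber is hypothetical, and like the manual check it bans thousands separators completely.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

class StrictGermanNumber {
    static Number parse(String input) {
        // Fresh instance per call, since NumberFormat instances are not thread-safe.
        NumberFormat format = NumberFormat.getNumberInstance(Locale.GERMANY);
        // With grouping off, '.' stops the parse instead of being skipped.
        format.setGroupingUsed(false);
        ParsePosition pos = new ParsePosition(0);
        Number result = format.parse(input, pos);
        // Reject when nothing was parsed or when characters were left unconsumed.
        if (result == null || pos.getIndex() != input.length()) {
            throw new NumberFormatException("Not a strict German number: " + input);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("3,14")); // prints 3.14
        System.out.println(parse("3.14")); // throws NumberFormatException
    }
}
```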
. as thousand separator (and so just ignored). Parsing numbers (and dates) is still not yet a solved problem (all or most locale formats just expect formatting in the other direction, so they fail to handle the common formats, but one) – Hoyt
3.14f or an error. Neither is the case, see mine and the other answer. – Megacycle