Consider the following Java code, which compares a short string containing the German grapheme ß:
String a = "ß";
String b = a.toUpperCase();
assertTrue(a.equalsIgnoreCase(b));
The assertion fails because "ß".toUpperCase() returns "SS", and that ends up failing a check in equalsIgnoreCase(). The Javadocs for toUpperCase() do mention this case explicitly, but I don't understand why the result is not ẞ, the capital variant of ß.
More generally, how should we do case-insensitive comparisons, potentially across different locales? Should we just always use either toUpperCase() or equalsIgnoreCase(), but never both?
It seems that the problem is that the implementation of equalsIgnoreCase() includes the following check: anotherString.value.length == value.length, which seems incompatible with the Javadocs for toUpperCase(), which state:
Since case mappings are not always 1:1 char mappings, the resulting String may be a different length than the original String.
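To make the mismatch concrete, here is a minimal sketch of the behaviour described above, plus one possible workaround: running both operands through the same full case mappings before comparing them. The class name FoldedCompare and the helpers fold and equalsFolded are hypothetical names chosen for illustration, and Locale.ROOT is an assumption made to keep the result independent of the default locale.

```java
import java.util.Locale;

public class FoldedCompare {

    // Hypothetical helper: apply the full upper-case mapping and then the
    // lower-case mapping, so that "ß" and "SS" both end up as "ss".
    static String fold(String s) {
        return s.toUpperCase(Locale.ROOT).toLowerCase(Locale.ROOT);
    }

    // Hypothetical helper: compare two strings after folding both sides.
    static boolean equalsFolded(String a, String b) {
        return fold(a).equals(fold(b));
    }

    public static void main(String[] args) {
        String a = "\u00DF";                  // ß
        String b = a.toUpperCase(Locale.ROOT); // "SS", two chars instead of one

        System.out.println(b.length());            // prints 2
        System.out.println(a.equalsIgnoreCase(b)); // prints false (lengths differ)
        System.out.println(equalsFolded(a, b));    // prints true
    }
}
```

This only approximates Unicode case folding using the mappings built into String, and it is not locale-aware, so it may not be what you want for language-specific rules.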
Collator instead of the built-in methods of String. – Garnes
SS is the uppercase because it's defined to be in Unicode. – Denicedenie
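The Collator suggestion from the comments could look roughly like the sketch below. The class name CollatorCompare and the helper equalsLoosely are hypothetical; the choice of Collator.PRIMARY strength is an assumption, and at that strength the collator ignores accent differences as well as case, which may be looser than intended.

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorCompare {

    // Hypothetical helper: locale-aware comparison that ignores case
    // (and, at PRIMARY strength, accents too).
    static boolean equalsLoosely(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // Plain case difference: treated as equal at PRIMARY strength.
        System.out.println(equalsLoosely("abc", "ABC", Locale.GERMAN));

        // Whether ß and SS compare equal depends on the collation rules
        // shipped with the runtime, so check this on your own JDK.
        System.out.println(equalsLoosely("\u00DF", "SS", Locale.GERMAN));
    }
}
```

Unlike equalsIgnoreCase(), a Collator does not require the two strings to have the same length, so a one-to-many mapping is not ruled out up front.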