I want to match the lowercase of the English "I" (i) with the lowercase of the Turkish "İ" (i). They are the same glyph, but they don't match. When I run System.out.println("İ".toLowerCase());
the character i followed by a dot is printed (this site does not display it properly).
Is there a way to match them (preferably without hard-coding it)? I want the program to match identical glyphs regardless of the language and the UTF code. Is this possible?
I've tried normalization with no success.
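import java.text.Normalizer;

public class Main {
    public static void main(String... a) {
        String iTurkish = "\u0130"; // "İ" (LATIN CAPITAL LETTER I WITH DOT ABOVE)
        String iEnglish = "I";
        prin(iTurkish);
        prin(iEnglish);
    }

    // Prints the string, its NFD form, its lower case, and the NFD form of its lower case.
    private static void prin(String s) {
        System.out.print(s);
        System.out.print(" - Normalized : " + Normalizer.normalize(s, Normalizer.Form.NFD));
        // toLowerCase() without arguments uses the JVM's default locale.
        System.out.print(" - lower case: " + s.toLowerCase());
        System.out.print(" - Lower case Normalized : " + Normalizer.normalize(s.toLowerCase(), Normalizer.Form.NFD));
        System.out.println();
    }
}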
The result is not shown properly on the site, but the first line (iTurkish) still has the ̇ next to the lowercase i.
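Since the dot does not render reliably here, printing the code points makes the difference visible. This is only a small sketch: the class name CodePoints and the helper hex are made up for illustration, and Locale.ROOT is used on the assumption that the default locale is not Turkish (with a Turkish locale, "İ" lower-cases to a plain i).

```java
import java.util.Locale;
import java.util.stream.Collectors;

class CodePoints {
    // Hypothetical helper: renders each code point of s as U+XXXX.
    public static String hex(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Locale.ROOT keeps the result independent of the machine's default locale.
        System.out.println(hex("\u0130".toLowerCase(Locale.ROOT))); // U+0069 U+0307
        System.out.println(hex("I".toLowerCase(Locale.ROOT)));      // U+0069
    }
}
```

The extra U+0307 (COMBINING DOT ABOVE) is what keeps the two lowercase strings from being equal, even though they look alike.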
Purpose and Problem
This will be a multilingual dictionary. I want the program to recognize that "İFEL" starts with "if". To make the comparison case-insensitive, I first convert both texts to lowercase. "İFEL" becomes "i(dot)fel", and "if" is not recognized as a prefix of it.
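One possible direction, sketched here under assumptions rather than as a definitive fix: lower-case with Locale.ROOT, decompose with NFD, and drop the combining marks before comparing. The class PrefixMatch and the helpers fold and startsWithFolded are hypothetical names. Note that this also makes letters such as "ö" match "o", which may or may not be acceptable for a dictionary, and it leaves the Turkish dotless ı (U+0131) distinct from i.

```java
import java.text.Normalizer;
import java.util.Locale;

class PrefixMatch {
    // Hypothetical helper: lower-case, decompose, then drop combining marks.
    public static String fold(String s) {
        String lower = s.toLowerCase(Locale.ROOT);
        String decomposed = Normalizer.normalize(lower, Normalizer.Form.NFD);
        // \p{M} matches Unicode mark characters such as U+0307.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Compares the folded forms, so the check ignores case and combining marks.
    public static boolean startsWithFolded(String word, String prefix) {
        return fold(word).startsWith(fold(prefix));
    }

    public static void main(String[] args) {
        System.out.println(startsWithFolded("\u0130FEL", "if")); // true
        System.out.println(startsWithFolded("IFEL", "if"));      // true
    }
}
```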
if (turkNorm.equals(engNorm)) returns false. – Lysander
"if". – Periscope