There is a class called RuleBasedCollator, but it is limited to single chars, so Collator isn't the way to go in general.
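To illustrate the limitation, here is a minimal sketch (the class name CollatorDemo and the sample strings are made up for this example) that sorts with the JDK's stock java.text.Collator. It assumes the default collator for Locale.ENGLISH, which compares digit characters one by one instead of reading a run of digits as a number:

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CollatorDemo {
    // Sorts a copy of the input with the JDK's locale-aware Collator.
    static List<String> sortWithCollator(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        // Collator implements Comparator<Object>, so it can be passed to sort directly.
        copy.sort(Collator.getInstance(Locale.ENGLISH));
        return copy;
    }

    public static void main(String[] args) {
        // "Test11" ends up before "Test2" because '1' sorts before '2'.
        System.out.println(sortWithCollator(List.of("Test2", "Test11", "Test1")));
    }
}
```

So the collator gives a locale-aware ordering, but not the "2 before 11" ordering asked for here.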
A pretty good solution to this problem would be to copy and adapt the C++ implementation of the compare function found in open-source browser engines. I tried to go that route (by reading WebKit's implementation), but since I don't know C++, I couldn't work out exactly what it does.
Therefore, I decided to implement a couple of basic methods in Java that do this. The current solution works on the UTF-8 bytes of the strings, but with some tweaks it can be adapted to the encoding of your choice. Here it is:
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AlphaCompare {

    public static void main(String[] args) {
        var strs = Arrays.asList(
                "Test1.txt",
                "Test2.txt",
                "Test11.txt",
                "Test22.txt",
                "123",
                "a123",
                "a123a",
                "a1231"
        );
        strs.sort(String::compareTo);
        System.out.println("Standard String::compareTo sorting:");
        strs.forEach(System.out::println);
        System.out.println("-----------------------------------");
        strs.sort(AlphaCompare::compareTo);
        System.out.println("Custom sorting:");
        strs.forEach(System.out::println);
    }

    public static int compareTo(String thisString, String anotherString) {
        byte[] v1 = thisString.getBytes(StandardCharsets.UTF_8);
        byte[] v2 = anotherString.getBytes(StandardCharsets.UTF_8);
        return compareChars(v1, v2, v1.length, v2.length);
    }

    private static int compareChars(byte[] value, byte[] other, int len1, int len2) {
        int lim = Math.min(len1, len2);
        for (int k = 0; k < lim; k++) {
            // In UTF-8 the bytes 0x30-0x39 only ever encode the ASCII digits 0-9,
            // so testing single bytes for digits is safe.
            char c1 = (char) (value[k] & 0xFF);
            char c2 = (char) (other[k] & 0xFF);
            if (Character.isDigit(c1) && Character.isDigit(c2)) {
                StringBuilder d1Sb = new StringBuilder();
                StringBuilder d2Sb = new StringBuilder();
                // Each digit run is bounded by its own array length, not by lim;
                // otherwise "a2" vs "a10" would compare 2 against a truncated 1.
                appendIfDigit(value, len1, k, d1Sb);
                appendIfDigit(other, len2, k, d2Sb);
                // BigInteger avoids a NumberFormatException on runs too long for an int.
                int cmp = new java.math.BigInteger(d1Sb.toString())
                        .compareTo(new java.math.BigInteger(d2Sb.toString()));
                if (cmp != 0) {
                    return cmp;
                }
            }
            if (c1 != c2) {
                return c1 - c2;
            }
        }
        return len1 - len2;
    }

    // Appends the run of digits that starts at index k; lim is the length of value itself.
    private static void appendIfDigit(byte[] value, int lim, int k, StringBuilder d1Sb) {
        for (int l = k; l < lim; l++) {
            char cd1 = (char) (value[l] & 0xFF);
            if (Character.isDigit(cd1)) {
                d1Sb.append(cd1);
            } else {
                break;
            }
        }
    }
}
This outputs the following:
Standard String::compareTo sorting:
123
Test1.txt
Test11.txt
Test2.txt
Test22.txt
a123
a1231
a123a
-----------------------------------
Custom sorting:
123
Test1.txt
Test2.txt
Test11.txt
Test22.txt
a123
a123a
a1231
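As a rough sketch of the "adapt it to another encoding" idea, the same comparison can be written directly against the String's chars, so no byte encoding is involved at all. The class and helper names here (NaturalOrder, runEnd, compareRuns) are made up for illustration, and only ASCII digits 0-9 are treated as numbers. Digit runs are compared by length after stripping leading zeros, which sidesteps integer overflow on very long runs:

```java
import java.util.ArrayList;
import java.util.List;

public class NaturalOrder {

    // Digit runs compare by numeric value, everything else by char code.
    static int compare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isAsciiDigit(ca) && isAsciiDigit(cb)) {
                int endA = runEnd(a, i), endB = runEnd(b, j);
                int cmp = compareRuns(a, i, endA, b, j, endB);
                if (cmp != 0) return cmp;
                i = endA; // numerically equal runs: skip past both
                j = endB;
            } else {
                if (ca != cb) return ca - cb;
                i++;
                j++;
            }
        }
        // The string with characters left over sorts later.
        int rest = (a.length() - i) - (b.length() - j);
        // Numerically equal but textually different ("a01" vs "a1"): use plain order.
        return rest != 0 ? rest : a.compareTo(b);
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Index just past the digit run that starts at 'start'.
    private static int runEnd(String s, int start) {
        int end = start;
        while (end < s.length() && isAsciiDigit(s.charAt(end))) end++;
        return end;
    }

    // Compares two digit runs numerically without parsing them into an int.
    private static int compareRuns(String a, int startA, int endA,
                                   String b, int startB, int endB) {
        // Strip leading zeros, keeping at least one digit.
        while (startA < endA - 1 && a.charAt(startA) == '0') startA++;
        while (startB < endB - 1 && b.charAt(startB) == '0') startB++;
        // More significant digits means a larger number.
        int lenDiff = (endA - startA) - (endB - startB);
        if (lenDiff != 0) return lenDiff;
        for (int k = 0; k < endA - startA; k++) {
            int d = a.charAt(startA + k) - b.charAt(startB + k);
            if (d != 0) return d;
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> strs = new ArrayList<>(List.of(
                "Test1.txt", "Test2.txt", "Test11.txt", "Test22.txt",
                "123", "a123", "a123a", "a1231"));
        strs.sort(NaturalOrder::compare);
        strs.forEach(System.out::println);
    }
}
```

On the sample list used in this answer it should produce the same order as the custom sorting shown above.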
Happy hacking! =)