Java string comparison using bitwise xor
Asked Answered
T

1

10

I came across the below code snippet in a product's code. It is using bitwise XOR for string comparison. Is this better than the String.equals(Object o) method? What is the author trying to achieve here?

private static boolean compareSecure(String a, String b)
  {
    if ((a == null) || (b == null)) {
      return (a == null) && (b == null);
    }
    int len = a.length();
    if (len != b.length()) {
      return false;
    }
    if (len == 0) {
      return true;
    }
    int bits = 0;
    for (int i = 0; i < len; i++) {
      bits |= a.charAt(i) ^ b.charAt(i);
    }
    return bits == 0;
  }

For context, the strings being equated are authentication tokens.

Ty answered 14/11, 2017 at 22:46 Comment(11)
It is a slower way of comparing strings, since it has to always compare all characters. It is no more secure than String.equals().Delectable
Also it will throw an exception if a != null and b==null.Fairchild
@Fairchild Not true. Re-read the first if statement. That might actually be the only improvement over String.equals(): It is null-safe, and maybe that's the "secure" part. In Java 7+ you can get the same null-safe behavior using Objects.equals()Delectable
This is not secure than standard equals.Enmesh
@Zong No collision issue. It is using bits |=, not bits ^=. If any two characters are not equal, one or more bits will be set, and will stay set until the end.Delectable
@Delectable Thanks for pointing that out, my bad :)Disembarrass
This comparison is intended to not be affected by the contents of the strings as much, to prevent timing attacks (normal string comparison leaks information about how long the equal-prefix is).Gormless
@harold I wouldn't expect it to short-circuit on unequal lengths and null if that was the point.Sylvanus
thanks @harold. your comment led me to run a search and came across this post security.stackexchange.com/questions/83660/… . Looks like yours is the best answer.Ty
I wonder if the JIT compiler could optimize it into a short-circuiting loop anyway.Sylvanus
|= (equal or) If set once, while iterating through charAt(i), with a num greater than 0, it will stay, and the bits == 0 will evaluate to false.Farad
U
13

This is a common implementation of string comparison function that is invulnerable to timing attacks.

In short, the idea is to compare all the characters every time you compare strings, even if you find any of them are not equal. In "standard" implementation you just break on the first difference and return false.

This is not secure because it gives away the information about the compared strings. Specifically if the left-side string is a secret you want to keep (e.g. password), and the right-side string is something provided by the user, an unsafe method allows the hacker to uncover your password with a relative ease, by repeatedly trying out different strings and measuring the response time. The more characters in the two strings are identical, the more the 'unsecure' function would take to compare them.

For instance, comparing "1234567890" and "0987654321" using a standard method would result in doing just a single comparison of the first character and returning false, since 1!=0. On the other hand comparing "1234567890" to "1098765432", would result in executing 2 comparison operations, because the first ones are equal, you have to compare the second ones to find they are different. This would take a bit more time and it is measurable, even when we are talking about remote calls.

If you do N attacks with N different strings, each starting with a different character, you should see one of the of the results taking a fraction of a milisecond more then the rest. This means the first character is the same, so the function has to take more time to compare the second one. Rinse and repeat for each position in the string and you can crack the secret orders of magnitude faster then brute force.

Preventing such attack is the point of such implementation.

Edit: As diligently pointed out in comment by Mark Rotteveel, this implementation is still vulnerable to timing attack that is aimed at revealing the length of the string. Still this is not a problem in many cases (either you don't care about attacker knowing the length or you deal with data that is standard and anyone can know the length anyway, for instance some kind of known-length hash)

Unset answered 10/5, 2018 at 12:37 Comment(1)
The code in the question is still vulnerable to a timing attack to find out the length of the string though.Plumbum

© 2022 - 2024 — McMap. All rights reserved.