Finding collisions in hash table
Asked Answered
S

2

5

I was reviewing for my data structures final exam, and I came across a question in past year's final. Having worked on it for the past three hours, I still couldn't figure out a way to solve it, except by trial and error. Here's the question:

"Suppose that the size of your hash table is 31, the constant g is also 31, and that you use the following hash code

int hash = 0;
int n = s.length();
for (int i = 0; i < n; i++)
   hash = g * hash + s.charAt(i);

and that you use separate chaining to resolve collisions. List five different names that would hash to the same location in the table."

I think there must be some kind of tricks, possibly involving the modulo operator, to solve this problem, considering the size of the hash table is 31, which is the same as the constant g. Sorry if I sound like it, but I'm not asking for code or anything, just some thoughts/hints on the question. Any comment is greatly appreciated. Thanks

Snapdragon answered 17/5, 2012 at 2:59 Comment(0)
P
5

According to the Wikipedia article on Java's string hashing algorithm (which is the same as the algorithm you present):

As with any general hashing function, collisions are possible. For example, the strings "FB" and "Ea" have the same hash value. The hashCode() implementation of String uses the prime number 31 and the difference between 'a' and 'B' is just 31, so the calculation is 70 × 31 + 66 = 69 × 31 + 97.

Pursuer answered 17/5, 2012 at 3:7 Comment(3)
Btw, this implementation of hashing leaves Java open to a DoS attack! See ocert.org/advisories/ocert-2011-003.html, cryptanalysis.eu/blog/2011/12/28/… or google for DoS attacks using hash maps.Tonl
@yshavit: Interesting. I had not considered that before, but it makes perfect sense.Pursuer
More ways to generate collisions for String.hashCode : david-soroko.blogspot.co.uk/2015/06/…Troytroyer
U
6

java strings can contain a zero character ("\0"), so all the following would hash to the same value:

"a"
"\0a"
"\0\0a"
"\0\0\0a"
"\0\0\0\0a"

here's proof (thanks to Eric for the ref to this being the hash used):

> cat Foo.java
class Foo {
    public static void main(String[] args) {                                    
        System.out.println("a".hashCode());                                     
        System.out.println("\0a".hashCode());                                   
        System.out.println("\0a".length());
        System.out.println("\0a".equals("a"));
    }                                                                           
}           
> javac Foo.java                                        
> java Foo                                                       
97
97
2
false

i doubt this is the expected answer, though.

also, if this were an exam, i doubt i could remember ASCII codes. so an alternative approach to using sequences of the style in the other answer would be:

"\002\000"
"\001\037"

etc (these are octal triplets - the above both hash to 62). but is it easy to generate five examples (all same hash) for this style?

Ursi answered 17/5, 2012 at 3:51 Comment(1)
yeah, i don't think this is the expected answer haha, but still thanks a lot! I learned one new thing about the zero character, so that's pretty awesome.Snapdragon
P
5

According to the Wikipedia article on Java's string hashing algorithm (which is the same as the algorithm you present):

As with any general hashing function, collisions are possible. For example, the strings "FB" and "Ea" have the same hash value. The hashCode() implementation of String uses the prime number 31 and the difference between 'a' and 'B' is just 31, so the calculation is 70 × 31 + 66 = 69 × 31 + 97.

Pursuer answered 17/5, 2012 at 3:7 Comment(3)
Btw, this implementation of hashing leaves Java open to a DoS attack! See ocert.org/advisories/ocert-2011-003.html, cryptanalysis.eu/blog/2011/12/28/… or google for DoS attacks using hash maps.Tonl
@yshavit: Interesting. I had not considered that before, but it makes perfect sense.Pursuer
More ways to generate collisions for String.hashCode : david-soroko.blogspot.co.uk/2015/06/…Troytroyer

© 2022 - 2024 — McMap. All rights reserved.