How to hash lists?

Lists are not hashable. However, I am implementing LSH and am looking for a hash function that maps a list of positive integers (each in [1, 29,000]) to one of k buckets. The number of lists is D = 40,000, with D > k (I think); k is not yet fixed, and I am open to suggestions.


Example (D = 4, k = 2):

118 | 27 | 1002 | 225
128 | 85 | 2000 | 8700
512 | 88 | 2500 | 10000
600 | 97 | 6500 | 24000
800 | 99 | 7024 | 25874

The first column, for example, should be passed as input to the hash function, which should return the number of a bucket.


What confuses me is that we are not looking for a function that hashes a single number, but one that hashes a column, i.e. a list of positive integers.

Any ideas please?

I am using Python, if that matters.

Intoxication answered 9/5, 2016 at 21:11 Comment(4)
How about just converting it to a hashable type, such as a tuple (e.g. hash(tuple([1, 2, 3])))? - Empedocles
@Empedocles you mean something like print hash(tuple([1,2,3,4,5]))? That is what @lejlot suggested, but he deleted his answer. - Intoxication
Just to clarify, do you mean you want to take a list and produce a single bucket index, or do you want to take a list of length n and produce n bucket indices? - Messier
@Messier the first. Input is a list of positive integers -> h -> index of bucket. - Intoxication

You can simply convert it to a hashable type first, such as a tuple:

In [4]: hash(l)
TypeError: unhashable type: 'list'

In [5]: hash(tuple(l)) % k  # k = number of buckets; the result lies in [0, k)
Out[5]: 70846
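
As a rough sketch of how this could be applied to the example in the question: the helper name bucket_index and the columns variable below are made up for illustration, and k = 2 is just the value from the example, not a recommendation.

```python
def bucket_index(values, k):
    """Map a list of integers to a bucket index in range(k)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A tuple is immutable and therefore hashable, unlike a list.
    # In Python, % with a positive k always yields a value in [0, k),
    # even when hash() returns a negative number.
    return hash(tuple(values)) % k


# The four columns from the example in the question (D = 4 lists).
columns = [
    [118, 128, 512, 600, 800],
    [27, 85, 88, 97, 99],
    [1002, 2000, 2500, 6500, 7024],
    [225, 8700, 10000, 24000, 25874],
]

k = 2  # assumed number of buckets, taken from the example
buckets = [bucket_index(col, k) for col in columns]
print(buckets)  # one bucket index (0 or 1) per column
```

Two lists with the same elements in the same order always land in the same bucket. Order matters, since tuples are compared element by element. Keep in mind that this only sends identical lists to the same bucket; on its own it does not make merely similar lists collide.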
Alberich answered 9/5, 2016 at 21:16 Comment(0)
