How to generate unique ids for RecyclerView, in case of 2 ids per item?

Asked 15/11, 2017 at 8:10 Answered 15/11, 2017 at 10:54

Solved android android-recyclerview uniqueidentifier

Background

Suppose I have a RecyclerView, which has items that can only be unique if you look at 2 ids they have, but not just one of them.

The first Id is the primary one. Usually there aren't 2 items that have the same primary ID, but sometimes it might occur, which is why there is a secondary ID.

In my

The problem

The RecyclerView adapter needs to have a "long" type being returned:

https://developer.android.com/reference/android/support/v7/widget/RecyclerView.Adapter.html#getItemId(int)

What I've tried

The easy way to overcome this, is to have a HashMap and a counter.

The HashMap will contain the combined-keys, and the value will be the id that should be returned. The counter is used to generated the next id in case of a new combined key. The combined key can be a "Pair" class in this case.

Suppose each item in the RecyclerView data has 2 long-type keys:

HashMap<Pair<Long,Long>,Long> keyToIdMap=new HashMap();
long idGenerator=0;

this is what to do in getItemId :

Pair<Long,Long> combinedKey=new Pair(item.getPrimaryId(), item.getSecondary());
Long uniqueId=keyToIdMap.get(combinedKey);
if(uniqueId==null) 
  keyToIdMap.put(combinedKey,uniqueId=idGenerator++);
return uniqueId;

This has the drawback of taking more and more memory. Not much though, and it's very small and proportional to the data you already have, but still...

However, this has the advantage of being able to handle all types of IDs, and you can use even more IDs as you wish (just need something similar to Pair).

Another advantage is that it will use all IDs starting from 0.

The question

Is there perhaps a better way to achieve this?

Maybe a mathematical way? I remember I learned in the past of using prime numbers for similar tasks. Will it work here somehow?

Jobye answered 15/11, 2017 at 8:10 Comment(3)

What is possible range for your ids? String.format(%d:%d, id1, id2) might be simplest (but not fastest) solution. id1*1000000 + id2 far better, but might be out of range. – Tenne 15/11, 2017 at 8:20

@NorthernPoet String cannot be used, because the function needs long type. The math formula you gave is better, but if id1 is too large, it might return a negative value, which I'm not sure the function is ok with that. – Jobye 15/11, 2017 at 8:32

The problem with the formula is collisions. If we don't know anything about id1 and id2 we (1) potentially lose bits if the multiplication overflows, and (2) have no idea if this formula creates more likely collisions than random. A crypto hash would give you the best behavior regarding collisions and doesn't lose entropy. – Hypostatize 15/11, 2017 at 9:5

I think my original idea is the best one I can think of. Should cover all possible ids, with least possible collision.

Jobye answered 15/11, 2017 at 10:54 Comment(0)

Do the existing primary and secondary ids use the entire 64-bit range of longs? If not then it's possible to compute a unique 64-bit long from their values with e.g bit slicing.

Another approach would be to hash the two together with a hash with very low collisions (a crypto hash like SHA2 for example) and using the first 64 bits of the result. Having a range of 64 bits means you can comfortably have millions of items before the chance of a collision becomes likely - the chance of a collision is 50% when you add sqrt(64)=2**32 items, which is more than 4 billion.

Finally, having an unique independent mapping is very versatile and assuming the map is always accessible it's fine (it gets tricky when you try to synchronize that new id across machine etc). In Java you can attempt to increase performance by avoiding the boxed Longs and a separate Pair instance using a custom map implementation, but that's micro-optimizing.

Example using SHA1:

With Guava - the usage is clean and obvious.

HashFunction hf = Hashing.sha1();
long hashedId = hf.newHasher()
       .putLong(primary)
       .putLong(secondary)
       .hash()
       .asLong();

Just the standard JDK, it's pretty horrible and can probably be more efficient, should look something like this (I'm ignoring checked exceptions):

static void updateDigestWithLong(MessageDigest md, long l) {
  md.update((byte)l);
  md.update((byte)(l >> 8));
  md.update((byte)(l >> 16));
  md.update((byte)(l >> 24));
}

// this is from the Guava sources, can reimplement if you prefer
static long padToLong(bytes[] bytes) {
  long retVal = (bytes[0] & 0xFF);
  for (int i = 1; i < Math.min(bytes.length, 8); i++) {
    retVal |= (bytes[i] & 0xFFL) << (i * 8);
  }
  return retVal;
}

static long hashLongsToLong(long primary, long secondary) {
  MessageDigest md = MessageDigest.getInstance("SHA-1");
  updateDigestWithLong(md, primary);
  updateDigestWithLong(md, secondary);
  return padToLong(md.digest());
}

Hypostatize answered 15/11, 2017 at 8:29 Comment(4)

Given the case I've written (primary id that is usually not being shared with other items, and secondary as a backup in these rare cases), is there a good hash function that will be as unique as possible, using this assumptions ? About the HashMap solution, it's ok in my case, because RecyclerView is a part of the UI, which is very temporary anyway. Just thought maybe there is a better way than creating HashMap entries... I think I will stick with the HashMap solution. It's very simple to use. I can even remove entries from there if the items get removed. – Jobye 15/11, 2017 at 8:35

If all the 64 bits are in use, then I'm not familiar with a specific (128 bit -> 64 bit) function. – Hypostatize 15/11, 2017 at 8:53

You could have a cyclic one if needed. The secondary id is usually very small in my case. – Jobye 15/11, 2017 at 8:56

Can you please show how to use the "crypto hash" on Android ? Suppose you have 2 long type variables – Jobye 15/11, 2017 at 9:32