How to generate unique 64-bit integers from Python?
M

5

61

I need to generate unique 64-bit integers from Python. I've checked out the uuid module, but the UUIDs it generates are 128-bit integers, so that won't work.

Do you know of any way to generate unique 64-bit integers within Python? Thanks.

Martie answered 20/8, 2010 at 11:9 Comment(5)
How unique do they need to be? Unique within that program, or unique across every ID ever generated by any program on any computer (which is what UUID gives you)? - Anion
Dave - these are document IDs. Every ID ever generated needs to be unique. I could have multiple servers, each running Python processes. - Martie
Why not simply assign sequential numbers? They're unique. - Lavonnelaw
@Lavonnelaw - How do you coordinate different Python processes on different machines to assign sequential numbers? - Martie
(1) Why does that matter? Is it a requirement? If it's a requirement, then why isn't this requirement in the question? (2) That's what database servers are for. - Lavonnelaw
T
81

Just mask the 128-bit int:

>>> import uuid
>>> uuid.uuid4().int & (1<<64)-1
9518405196747027403L
>>> uuid.uuid4().int & (1<<64)-1
12558137269921983654L

These are more or less random, so there is a tiny chance of a collision.
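To put a rough number on that "tiny chance", the usual birthday approximation can be applied. The sketch below is only an estimate, and `collision_probability` is a hypothetical helper name, not part of any library. It also assumes that the low 64 bits of a uuid4 contain the two fixed variant bits, so only 62 of them are random; the 64-bit column is shown for comparison with a fully random source.

```python
import math

def collision_probability(n_ids: int, random_bits: int) -> float:
    """Approximate chance that at least two of n_ids random IDs collide."""
    space = 2 ** random_bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * space)).
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))

for n in (10**6, 10**9, 2**32):
    print(n, collision_probability(n, 62), collision_probability(n, 64))
```

With a million IDs the probability stays far below one in a million, but it climbs quickly as the count approaches 2**32.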

Perhaps the first 64 bits of uuid1 are safer to use:

>>> uuid.uuid1().int>>64
9392468011745350111L
>>> uuid.uuid1().int>>64
9407757923520418271L
>>> uuid.uuid1().int>>64
9418928317413528031L

These are largely based on the clock, so they are much less random, but the uniqueness is better.
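It may help to see exactly what the shifted value contains. The following sketch, which relies only on the field attributes of `uuid.UUID`, rebuilds the upper 64 bits from the three timestamp fields. One consequence worth keeping in mind: the node (MAC) and clock sequence are in the discarded half, so nothing in the result distinguishes one machine from another.

```python
import uuid

u = uuid.uuid1()
top64 = u.int >> 64

# The upper half is just the three timestamp fields packed together;
# the MAC-derived node and the clock sequence live in the lower half.
rebuilt = (u.time_low << 32) | (u.time_mid << 16) | u.time_hi_version
print(top64 == rebuilt)

# The version nibble (1) sits in bits 12-15 of the shifted value.
print((top64 >> 12) & 0xF)
```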

Transonic answered 20/8, 2010 at 11:14 Comment(5)
uuid1 reveals MAC address and time - uuid4 is more secure. - Finn
Right-shifting by 64 bits removes the MAC address and clock sequence, leaving only the timestamp (plus the version bits). - Tarriance
@LukasCenovsky, the uuid1 will be more likely to be unique precisely for that reason. It depends on whether security is required or not, but the trade-off is that for uuid4, collisions will be more likely. - Transonic
@JohnLaRooy the part with uuid1 is incorrect or misleading, because it creates UNSIGNED integers, not integers (an integer should be signed by default). I think that the correct way is something like this: int.from_bytes(uuid.uuid1().bytes, byteorder='big', signed=True) >> 64 - Hord
@JohnLaRooy, if you believe stanProkop's suggestion improves your answer, would you be willing to update it? My guess is that it is improved by the extra bit. - Kellen
L
32

64 bits unique

What's wrong with counting? A simple counter will create unique values. This is the simplest approach, and it's easy to be sure you won't repeat a value.

Or, if counting isn't good enough, try this.

>>> import random
>>> random.getrandbits(64)
5316191164430650570L

Depending on how you seed and use your random number generator, that should be unique.

You can -- of course -- do this incorrectly and get a repeating sequence of random numbers. Great care must be taken with how you handle seeds for a program that starts and stops.
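One possible way to make counting work across several processes is to reserve some of the 64 bits for a worker identifier and count in the rest. This is only a sketch under stated assumptions: the `CounterIds` class, the 16/48 bit split, and the `worker_id` and `start` parameters are all hypothetical, and it assumes something external (a config file or a database, say) hands each process a distinct worker id and that the counter position is persisted across restarts.

```python
import itertools

class CounterIds:
    """Hypothetical ID source: 16-bit worker id + 48-bit local counter."""

    COUNTER_BITS = 48

    def __init__(self, worker_id: int, start: int = 0):
        if not 0 <= worker_id < 2 ** 16:
            raise ValueError("worker_id must fit in 16 bits")
        # The worker id occupies the top 16 bits of every ID.
        self._prefix = worker_id << self.COUNTER_BITS
        # 'start' lets a restarted process resume after its last issued value.
        self._counter = itertools.count(start)

    def next_id(self) -> int:
        n = next(self._counter)
        if n >= 2 ** self.COUNTER_BITS:
            raise OverflowError("counter exhausted for this worker")
        return self._prefix | n

a = CounterIds(worker_id=1)
b = CounterIds(worker_id=2)
ids = [a.next_id() for _ in range(3)] + [b.next_id() for _ in range(3)]
print(ids)
```

Uniqueness here depends entirely on the worker ids being distinct and the counter never being reset, which is the coordination problem raised in the comments on the question.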

Lavonnelaw answered 20/8, 2010 at 13:59 Comment(6)
No matter how good your seeds are, you are likely to get repeats after approximately 2^32 IDs have been generated if you use the getrandbits() method. - Boden
The sequence is theoretically longer. "It produces 53-bit precision floats and has a period of 2**19937-1." Why would getrandbits() not have the full period? Does it generate multiple numbers? Even if it generates 64 distinct values and uses only one bit, the resulting period would be 2^311. - Lavonnelaw
How big is the seed? If you use the same seed, you will get the same random numbers. - Turpeth
How would you implement that "just count"? - Gawen
"Why would getrandbits() not have the full period?" It may have the full period, but there are only 2**64 distinct 64-bit integers, so you can't get a sequence of 2**19937-1 unique ones. Assuming a random distribution, you'd expect duplicates to start cropping up around the 2**32 mark. - Pellucid
What do you mean when you say that "the resulting period would be 2^311"? In order to get something that has that period, you need at least 311 bits to keep track of the state (such as a counter), because you need to represent it by something that can take on 2^311 different values. Does getrandbits represent the state using 311 bits? - Fictional
T
9

A 64-bit random number from the OS's random number generator rather than a PRNG:

>>> from struct import unpack; from os import urandom
>>> unpack("!Q", urandom(8))[0]
12494068718269657783L
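On Python 3 the same thing can be written without `struct`. The sketch below shows two equivalent-in-spirit options, `int.from_bytes` on `os.urandom(8)` and `secrets.randbits(64)` (available from Python 3.6), and checks that `int.from_bytes` agrees with the `"!Q"` unpacking used above. The variable names are illustrative only.

```python
import os
import secrets
import struct

# Same idea as the struct.unpack call: read 8 OS-provided random bytes
# as one unsigned big-endian integer.
from_urandom = int.from_bytes(os.urandom(8), byteorder="big")

# secrets (Python 3.6+) also draws from the OS's secure random source.
from_secrets = secrets.randbits(64)

# Both decodings of the same 8 bytes give the same integer.
raw = os.urandom(8)
same = struct.unpack("!Q", raw)[0] == int.from_bytes(raw, "big")
print(from_urandom, from_secrets, same)
```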
Tarriance answered 1/6, 2012 at 17:8 Comment(0)
D
2

You can use uuid4(), which generates a random 128-bit UUID. Right-shifting (>>) its integer value by 64 bits discards the lower half and keeps the upper 64 bits (128 - 64 = 64).

from uuid import uuid4

bit_size = 64
sized_unique_id = uuid4().int >> bit_size
print(sized_unique_id)
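A quick check of what those upper 64 bits look like may be useful before relying on them. This sketch samples a batch of shifted values and inspects bits 12-15, which is where the UUID version field ends up after the shift; the sample size of 1000 is arbitrary.

```python
from uuid import uuid4

samples = [uuid4().int >> 64 for _ in range(1000)]

# Bits 12-15 of the shifted value hold the UUID version, which is always 4.
version_nibbles = {(s >> 12) & 0xF for s in samples}
print(version_nibbles)
```

Since those four bits never vary, the result carries 60 random bits rather than 64.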
Doriandoric answered 7/11, 2018 at 14:47 Comment(2)
You'd do better to simply generate the bytes directly with e.g. os.urandom(8), or secrets.randbelow(2**64). For one thing, only 122 of the 128 bits of a uuid4 are randomly generated; the other 6 are fixed. Your method only gives you 60 random bits, not 64, which increases the chance of a random collision. - Farthest
The bit size is not 64; it's 60~64. - Henchman
H
0

Why not try this?

import uuid

uid = uuid.uuid1()

# Representations of uuid1()
print(repr(uid.bytes))  # k\x10\xa1n\x02\xe7\x11\xe8\xaeY\x00\x16>\x99\x0b\xdb
print(uid.int)          # 142313746482664936587190810281013480411
print(uid.hex)          # 6b10a16e02e711e8ae5900163e990bdb
Heim answered 27/10, 2021 at 18:34 Comment(0)
