How to shorten UUID V4 without making it non-unique/guessable
E

6

17

I have to generate a unique URL part that is "unguessable" and "resistant" to brute-force attacks. It also has to be as short as possible :) and all generated values have to be the same length. I was thinking about using UUID v4, which can be represented by 32 hex characters (without hyphens), e.g. de305d5475b4431badb2eb6b9e546014, but that's a bit too long. So my question is: how can I generate something unique that can be represented with URL-safe characters, has the same length for every generated value, and is shorter than 32 characters? (In Node.js or PL/pgSQL.)

Ethe answered 30/7, 2015 at 13:55 Comment(2)
Possible duplicate of "How do you create a random string in PostgreSQL?" - Norvell
This question has been asked before and has a ton of options for you to pick from. - Norvell
V
14

v4() generates a 128-bit number and returns it as a hyphenated hexadecimal string. In Node.js you can use Buffer to re-encode those bytes as a shorter base64 string:

const { v4 } = require('uuid');

function getRandomName() {
    const hexString = v4();
    console.log("hex:   ", hexString);
    
    // remove decoration
    const hexStringUndecorated = hexString.replace(/-/g, "");
    
    const base64String = Buffer
      .from(hexStringUndecorated, 'hex')
      .toString('base64')
    console.log("base64:", base64String);
    
    return base64String;
}

getRandomName()
console.log()
getRandomName()

Which produces:

hex:    93aa74d5-8caf-473f-af2f-42cf806ddfdc
base64: k6p01YyvRz+vL0LPgG3f3A==

hex:    6071c791-e848-4746-b1d2-0944334cad91
base64: YHHHkehIR0ax0glEM0ytkQ==

URLs?

Need something that is more compatible with URLs? You could use base64url, the URL-safe base64 variant that JWTs use (here via the base64url npm package).

const { v4 } = require("uuid");
const base64url = require("base64url");

function getRandomName() {
  const hexString = v4();
  console.log("hex:      ", hexString);

  // remove decoration
  const hexStringUndecorated = hexString.replace(/-/g, "");

  const buffer = Buffer.from(hexStringUndecorated, "hex");
  const str = base64url(buffer);
  console.log("base64Url:", str);

  return str;
}

getRandomName();
console.log();
getRandomName();

This produces:

hex:       93aa74d5-8caf-473f-af2f-42cf806ddfdc
base64Url: k6p01YyvRz-vL0LPgG3f3A

hex:       6071c791-e848-4746-b1d2-0944334cad91
base64Url: YHHHkehIR0ax0glEM0ytkQ
Veronaveronese answered 17/10, 2020 at 9:51 Comment(3)
There is a bug in the code above: hexString.replace("-", "") removes only the first hyphen, so Buffer.from truncates the output and you get a strangely shortened base64 string. Use uuid.replace(/-/g, '') instead. - Hydrastis
6fa1ca99-a92b-4d2a-aac2-7c7977119ebc produces b6HKmakrTSqqwnx5dxGevA== - Continental
and bd23c8fd-0f62-49f4-9e51-8b5c97601a16 produces vSPI/Q9iSfSeUYtcl2AaFg==. What is written in the answer really confuses me. - Hep
D
5

UUID v4 does not actually guarantee uniqueness; it is just very, very unlikely that two randomly generated UUIDs will clash. That is why they need to be so long: the length reduces the chance of a clash. You can make it shorter, but the shorter you make it, the more likely it is that it won't actually be unique. UUID v4 is 128 bits long (122 of them random) because that is commonly considered "unique enough".
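
To put rough numbers on that trade-off, here is a minimal sketch using the standard birthday approximation p ≈ 1 − exp(−n² / 2^(b+1)) for n random IDs of b random bits each. The function name collisionProbability and the sample sizes are illustrative assumptions, not something from the answer above.

```javascript
// Approximate probability of at least one collision among n random IDs
// that each carry `bits` random bits (birthday approximation).
function collisionProbability(n, bits) {
  const space = Math.pow(2, bits);
  // -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
  return -Math.expm1(-(n * n) / (2 * space));
}

// A v4 UUID carries 122 random bits; the other 6 encode version and variant.
for (const bits of [32, 64, 122]) {
  console.log(bits + " bits, 1e6 IDs:", collisionProbability(1e6, bits));
}
```

With a million IDs, 32 random bits almost certainly collide, while 64 bits already push the estimate into the one-in-tens-of-millions range. How short is "short enough" depends on how many IDs you expect to issue.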

Dogie answered 30/7, 2015 at 14:5 Comment(2)
Hmm, can it be shorter with the additional use (somehow) of a Postgres sequence? It has to be unique only within a single database. - Ethe
You want to make it unguessable. If you add a sequence there AND shorten it, you're making it more guessable twice over. You basically have conflicting goals. - Dogie
C
5

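A UUID is 36 characters long with hyphens (32 without). You can shorten it to 22 characters (about 39% shorter than the hyphenated form, about 31% shorter than plain hex) while keeping it URL-safe and convertible back to the original.

Here is a pure Node.js solution that produces a URL-safe base64 string (the 'base64url' Buffer encoding requires a reasonably recent Node.js version):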

type UUID = string;
type Base64UUID = string;

/**
 * Convert uuid to base64url
 *
 * @example in: `f32a91da-c799-4e13-aa17-8c4d9e0323c9` out: `8yqR2seZThOqF4xNngMjyQ`
 */
export function uuidToBase64(uuid: UUID): Base64UUID {
  return Buffer.from(uuid.replace(/-/g, ''), 'hex').toString('base64url');
}

/**
 * Convert base64url to uuid
 *
 * @example in: `8yqR2seZThOqF4xNngMjyQ` out: `f32a91da-c799-4e13-aa17-8c4d9e0323c9`
 */
export function base64toUUID(base64: Base64UUID): UUID {
  const hex = Buffer.from(base64, 'base64url').toString('hex');

  return `${hex.substring(0, 8)}-${hex.substring(8, 12)}-${hex.substring(
    12,
    16,
  )}-${hex.substring(16, 20)}-${hex.substring(20)}`;
}

Test:

import { randomUUID } from "crypto";

// f32a91da-c799-4e13-aa17-8c4d9e0323c9
const uuid = randomUUID();

// 8yqR2seZThOqF4xNngMjyQ
const base64 = uuidToBase64(uuid);

// f32a91da-c799-4e13-aa17-8c4d9e0323c9
const uuidFromBase64 = base64toUUID(base64); 
Continental answered 22/5, 2022 at 18:36 Comment(0)
F
2

The short-uuid module does just that.

"Generate and translate standard UUIDs into shorter - or just different - formats and back."

It accepts custom character sets (and offers a few) to translate the UUID to and fro.

You can also base64-encode the UUID, which shortens it to 22 characters. Here's a playground.

Feaze answered 10/5, 2020 at 19:26 Comment(0)
I
1

It all depends on how guessable/unique it has to be.

My suggestion would be to generate 128 random bits and then encode them using base36 (at most 25 characters). That would give you a "nice" URL and it would be unique and probably unguessable enough.

If you want it even shorter you can use base64, but base64 needs two non-alphanumeric characters in its alphabet.
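
A minimal sketch of the base36 suggestion, assuming Node.js with BigInt support. The helper name randomBase36Id and the constant ID_LENGTH are hypothetical; the padding step is an addition so that every value has the same length, as the question requires.

```javascript
const { randomBytes } = require("crypto");

// 36^25 > 2^128 > 36^24, so 25 base36 digits always fit 128 bits.
const ID_LENGTH = 25;

// Hypothetical helper: 128 random bits as a fixed-length base36 string.
function randomBase36Id() {
  const hex = randomBytes(16).toString("hex"); // 16 bytes = 128 bits
  const value = BigInt("0x" + hex);
  // Left-pad with "0" so short values still come out at ID_LENGTH characters.
  return value.toString(36).padStart(ID_LENGTH, "0");
}

console.log(randomBase36Id());
```

The output uses only 0-9 and a-z, so it needs no escaping in a URL path. The cost compared with base64url is three extra characters (25 instead of 22).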

Isochor answered 25/5, 2016 at 13:35 Comment(0)
B
1

This is a fairly old thread, but I'd like to point out that the top answer does not produce the results it claims: it actually produces strings that are ~32 characters long, while the examples claim 8 characters. If you want more compression, convert the UUID to base 90 using this function.

Base64 takes 4 characters for every 3 bytes, and hex (Base16) takes 2 characters for each byte, so a Base64 string is about 67% the length of the hex string (roughly 33% shorter). If we pack more bits into each character we get even better compression. Base90 is ever so slightly shorter because of this: about 6.5 bits per character versus 6 for Base64.

const hex = "0123456789abcdef";
const base90 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!#$%&'()*+-./:<=>?@[]^_`{|}~";

/**
 * Converts a Base16 (hex) string to a Base90 string.
 * @param {String} number Hex represented string
 * @returns A Base90 representation of the hex string
 */
function convertToBase90(number) {
    var i,
        divide,
        newlen,
        numberMap = {},
        fromBase = hex.length,
        toBase = base90.length,
        length = number.length,
        result = typeof number === "string" ? "" : [];

    for (i = 0; i < length; i++) {
        numberMap[i] = hex.indexOf(number[i]);
    }
    do {
        divide = 0;
        newlen = 0;
        for (i = 0; i < length; i++) {
            divide = divide * fromBase + numberMap[i];
            if (divide >= toBase) {
                numberMap[newlen++] = parseInt(divide / toBase, 10);
                divide = divide % toBase;
            } else if (newlen > 0) {
                numberMap[newlen++] = 0;
            }
        }
        length = newlen;
        result = base90.slice(divide, divide + 1).concat(result);
    } while (newlen !== 0);

    return result;
}

/**
 * Compresses a UUID String to base 90 resulting in a shorter UUID String
 * @param {String} uuid The UUID string to compress
 * @returns A compressed UUID String.
 */
function compressUUID(uuid) {
    uuid = uuid.replace(/-/g, "");
    return convertToBase90(uuid);
}

Over a few million random UUIDs this generates no duplicates and produces the following output:

Lengths:
Avg: 19.959995 Max:  20 Min:  17

Examples:
Hex:      68f75ee7-deb6-4c5c-b315-3cc6bd7ca0fd
Base 90:  atP!.AcGJh1(eW]1LfAh

Hex:      91fb8316-f033-40d1-974d-20751b831c4e
Base 90:  ew-}Kv&nK?y@~xip5/0e

Hex:      4cb167ee-eb4b-4a76-90f2-6ced439d5ca5
Base 90:  7Ng]V/:0$PeS-K?!uTed
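
For converting back to the original UUID, here is a hedged sketch of an inverse based on BigInt. It assumes the same 90-character alphabet as above; the names BASE90, encodeBase90 and decodeBase90 are hypothetical. The compact encoder is included only so the round trip can be checked in one file, and it is meant to mirror convertToBase90 for UUID input rather than replace it.

```javascript
const BASE90 =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!#$%&'()*+-./:<=>?@[]^_`{|}~";

// Compact BigInt encoder: most significant digit first, no leading zeros.
function encodeBase90(uuid) {
  let value = BigInt("0x" + uuid.replace(/-/g, ""));
  let out = "";
  do {
    out = BASE90[Number(value % 90n)] + out;
    value /= 90n;
  } while (value > 0n);
  return out;
}

// Hypothetical inverse: base90 string back to a hyphenated UUID.
function decodeBase90(text) {
  let value = 0n;
  for (const ch of text) {
    const digit = BASE90.indexOf(ch);
    if (digit < 0) throw new Error("Invalid base90 character: " + ch);
    value = value * 90n + BigInt(digit);
  }
  // Restore the leading zeros that the numeric conversion dropped.
  const hex = value.toString(16).padStart(32, "0");
  return [
    hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16),
    hex.slice(16, 20), hex.slice(20),
  ].join("-");
}

const uuid = "68f75ee7-deb6-4c5c-b315-3cc6bd7ca0fd";
console.log(decodeBase90(encodeBase90(uuid)) === uuid);
```

Because the lengths above vary between 17 and 20, a fixed length can be had by left-padding the encoded string with "0" to 20 characters (for example with padStart(20, "0")). The decoder still works on padded input, since leading "0" digits do not change the value.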
Bonspiel answered 15/2, 2022 at 19:16 Comment(3)
It's not usable in a URL. - Continental
@Continental oh shoot you're right. I missed that parameter of the question. I'll still leave it here for anyone else who has the same question but can use non-URL characters. - Bonspiel
Do you have code to decode back to the original string? - Crowning
