Inserting and selecting UUIDs as binary(16)
Asked Answered
L

7

67

I don't understand why

SELECT UUID();

Returns something like:

3f06af63-a93c-11e4-9797-00505690773f

But if I insert it into a binary(16) field (the UUID() function) with for instance a BEFORE INSERT trigger and run a select, it returns something like:

0782ef48-a439-11

Note that these two UUIDs are not the same data.

I realize binary and an UUID string doesn't look identical, but shouldn't the selected data at least be just as long? Otherwise how can it possibly be equally likely to be unique?

Is it better to store it as char(36)? I just need it to be unique to prevent duplicate inserts. It is never selected or used for joins.

EDIT:

before trigger would be like:

BEGIN

if NEW.UUID IS NULL THEN

NEW.UUID = UUID();

END IF

END
Lauder answered 31/1, 2015 at 11:36 Comment(14)
Show how you're doing the INSERT.Itin
BINARY(16) can only hold 16 characters. So it will contain the first 16 characters of the UUID that you store in it.Itin
According to #10950702 that's not the case. Also #17727182Lauder
That answer uses UNHEX() to convert the UUID to a number that will fit into 16 bytes.Itin
No matter how you spin it, wouldn't that reduce the complexity by around 50% then? This has me very confused.Lauder
I don't understand the question. Complexity of what?Itin
Likelyness that the UUID is unique.Lauder
dev.mysql.com/doc/refman/5.0/en/… explains how it ensures it's unique.Itin
Okay we're completely talking past each other. I know that the initial UUID is more or less guaranteed to be unique,but if you cut away 50% of the data, it probably isn't anymore. So why do people hex or unhex an UUID and reduce the length by 50%? That makes no sense to me.Lauder
Why are you cutting away 50% of the data? When you use UNHEX() every 2 characters becomes one byte of the result, so it will then fit in BINARY(16).Itin
There's something I just don't understand at all about hex, then. I'm sorry.Lauder
Hex digits 0 through F are the numbers 0 through 15 in decimal. Each hex digit corresponds to 4 bits, so 2 hex digits is 8 bits, which is 1 byte.Itin
So to properly save the UUID i should either: A. Hex it to compact it to 16 bytes, or B. Store it as char(36) instead?Lauder
Yes, that's correct.Itin
L
139

So, as a response to comments. The correct way to store a 36-char UUID as binary(16) is to perform the insert in a manner like:

INSERT INTO sometable (UUID) VALUES
       (UNHEX(REPLACE("3f06af63-a93c-11e4-9797-00505690773f", "-","")))

UNHEX because an UUID is already a hexed value. We trim (REPLACE) the dashes in the statement to bring the length down to 32 characters (our 16 bytes represented as HEX). You can do this at any point before storing it, obviously, so it doesn't have to be handled by the database.

You may retrieve the UUID like this:

SELECT HEX(UUID) FROM sometable;

Just in case someone comes across this thread and is unsure how this works.

And remember: If you're selecting a row using the UUID, use UNHEX() on the condition:

SELECT * FROM sometable WHERE UUID = UNHEX('3f06af63a93c11e4979700505690773f');

or literal notation (as mentioned by Alexis Wilke):

SELECT * FROM sometable WHERE UUID = 0x3f06af63a93c11e4979700505690773f;

And NOT HEX()on the column:

SELECT * FROM sometable WHERE HEX(UUID) = '3f06af63a93c11e4979700505690773f';

The last solution, while it works, requires that MySQL HEXes all UUIDs before it can determine which rows match. It's very inefficient.

Edit: If you're using MySQL 8 you should have a look at the UUID functions as mentioned in SlyDave's answer. This answer is still correct, but it doesn't optimise the UUID indexes which can be done natively using those functions. If you're on < MySQL 8 or MariaDB, you can implement Devon's polyfill, which provides identical functionality on previous versions of MySQL.

Lauder answered 31/1, 2015 at 13:35 Comment(2)
Great question and answer. An off-topic question, when the client UI retrieves and update the record later, this UUID is the key. Should I remember the binary or the string at client side for posting back later?Unrivaled
@Unrivaled The client (and your API) should handle the UUID as a string, with or without dashes.Lauder
C
70

As of MySQL 8 you can use two new UUID functions:

  • BIN_TO_UUID

    SELECT BIN_TO_UUID(uuid, true) AS uuid FROM foo;
    -- 3f06af63-a93c-11e4-9797-00505690773f
    
  • UUID_TO_BIN

    INSERT INTO foo (uuid) VALUES (UUID_TO_BIN('3f06af63-a93c-11e4-9797-00505690773f', true));
    

This method also supports rearranging the time component of the uuid to enhance indexing performance (by ordering it chronologically), simply set the second argument to true - this only works for UUID1.

If you are using the true on UUID_TO_BIN flag for indexing performance (recommended), you must also set it on BIN_TO_UUID otherwise it won't convert back properly.

See the documentation for further details.

Cryptogam answered 19/7, 2017 at 9:46 Comment(5)
After a lot of research, we've decided to migrate from MySQL to PostgreSQL because they have a UUID data type (just like auto increment, it auto inserts and more importantly, you can just read it directly using a string). Otherwise we have to refactor our entire codebase everywhere we have an update or read using an id with both mysql 5.7 or 8.Meghan
@Meghan MySQL has a UUID() function for that specific purpose as well. While it's not a datatype, it's equivalent to binary(16) and generates insert-friendly sequential (not random) UUIDs.Lauder
Thanks @Lauder - but ultimately it would mean we have to update every single one of our queries across our entire codebase to include this function..Meghan
Or you could use a trigger, @Meghan :)Lauder
@Meghan - How do you make migration from MySQL to PostgreSQL easier than changing the codebase to UUID inserts? I don't really see how that would work. Can you elaborate?Lauder
M
22

Polyfill for BIN_TO_UUID and UUID_TO_BIN for MySQL 5 or MariaDB with the swap_flag parameter.

DELIMITER $$

CREATE FUNCTION BIN_TO_UUID(b BINARY(16), f BOOLEAN)
RETURNS CHAR(36)
DETERMINISTIC
BEGIN
   DECLARE hexStr CHAR(32);
   SET hexStr = HEX(b);
   RETURN LOWER(CONCAT(
        IF(f,SUBSTR(hexStr, 9, 8),SUBSTR(hexStr, 1, 8)), '-',
        IF(f,SUBSTR(hexStr, 5, 4),SUBSTR(hexStr, 9, 4)), '-',
        IF(f,SUBSTR(hexStr, 1, 4),SUBSTR(hexStr, 13, 4)), '-',
        SUBSTR(hexStr, 17, 4), '-',
        SUBSTR(hexStr, 21)
    ));
END$$


CREATE FUNCTION UUID_TO_BIN(uuid CHAR(36), f BOOLEAN)
RETURNS BINARY(16)
DETERMINISTIC
BEGIN
  RETURN UNHEX(CONCAT(
  IF(f,SUBSTRING(uuid, 15, 4),SUBSTRING(uuid, 1, 8)),
  SUBSTRING(uuid, 10, 4),
  IF(f,SUBSTRING(uuid, 1, 8),SUBSTRING(uuid, 15, 4)),
  SUBSTRING(uuid, 20, 4),
  SUBSTRING(uuid, 25))
  );
END$$

DELIMITER ;

--
-- Tests to demonstrate that it works correctly. These are the values taken from
-- https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin
--
-- If you run these SELECTs using the above functions, the 
-- output of the two columns should be exactly identical in all four cases.
SET @uuid = '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT HEX(UUID_TO_BIN(@uuid, 0)), '6CCD780CBABA102695645B8C656024DB';
SELECT HEX(UUID_TO_BIN(@uuid, 1)), '1026BABA6CCD780C95645B8C656024DB';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,0),0), '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,1),1), '6ccd780c-baba-1026-9564-5b8c656024db';

Included are the SELECT samples from https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin that demonstrate that the above code returns the exact same results as the 8.0 function. These functions are considered DETERMINISTIC as they always produce the same output for a given input. See https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html

Motorboating answered 19/9, 2019 at 16:55 Comment(7)
Exactly what I was looking for. Thanks!Likelihood
You saved my day! Since my host just support MySQL 5 i was going to be forced to rewrite all my BIN_TO_UUID calls!Rogerson
You should really add the SWAP_FLAG to both functions as that's what the native functions support: dev.mysql.com/doc/refman/8.0/en/… and dev.mysql.com/doc/refman/8.0/en/…Lauder
@Devon I figured it out and edited your answer. Thanks for pointing me in the right direction.Lauder
i tried to use this on my server, but when i use UUID_TO_BIN(uuid(),1) it gives me strange characters like ��6Z5�*,�U��).. why did i get this?Stemson
@MuhammadFauzi It produces a binary output. If you want to display the UUID, you'd have to do HEX(UUID_TO_BIN(uuid(),1)), or SELECT HEX(col) if you already put it into a binary column.Lauder
Coming back to read my explanation above; while it's not incorrect, it makes a lot more sense to just use UUID() alone if you want to be able to read the UUID. Once you wrap it in UUID_TO_BIN(<uuid>), you get binary output, which is the whole point. To read an existing binary UUID, use BIN_TO_UUID(<binary uuid>).Lauder
D
16

I am using MariaDB so BIN_TO_UUID functions family do not exist. I managed to get the corresponding values anyway.

bin -> hex

Here, uuid is the binary(16) value of an uuid; you'll use the value below to SELECT a readable version of it.

LOWER(CONCAT(
    SUBSTR(HEX(uuid), 1, 8), '-',
    SUBSTR(HEX(uuid), 9, 4), '-',
    SUBSTR(HEX(uuid), 13, 4), '-',
    SUBSTR(HEX(uuid), 17, 4), '-',
    SUBSTR(HEX(uuid), 21)
))

hex -> bin

Here, cc6e6d97-5501-11e7-b2cb-ceedca613421 is a readable version of an UUID, and you'll use the value below in a WHERE clause to seek for it.

UNHEX(REPLACE('cc6e6d97-5501-11e7-b2cb-ceedca613421', '-', ''))

Cheers

Deluxe answered 22/3, 2018 at 15:5 Comment(3)
Why don't you just replace the dashes like in my answer? What's the advantage of doing 5 substring calls?Lauder
You are totally right. I don't know either, head to the handlebars i suppose. Thanks :)Deluxe
Hehe. Okay. I'd also be doing the string-concat stuff in the application layer instead... there's really no reason to make the database do five HEX() and SUBSTR() per UUID when PHP, C#, java or whatever could easily insert the dashes by splitting the UUID. I actually don't work with the dashes anywhere in my application, since they're more or less redundant.Lauder
R
16

The other answers are correct. The UUID() function returns a 36 character string and that needs to be converted using the shown functions (UNHEX() or, on newer platforms, UUID_TO_BIN()).

However, if you use your own software to create your UUIDs, then you can use the Hexadecimal Literal notation instead.

So I would use the following with the MySQL UUID() function:

INSERT INTO sometable (id) VALUES (UNHEX(REPLACE(UUID(), '-', '')));  -- all versions
INSERT INTO sometable (id) VALUES (UUID_TO_BIN(UUID());               -- since v8.0

But use this in case I generate my own UUIDs;

INSERT INTO sometable (id) VALUES 0x3f06af63a93c11e4979700505690773f;

Similarly, you can use Hexadecimal Literals in your WHERE clauses:

SELECT * FROM sometable WHERE id = 0x3f06af63a93c11e4979700505690773f;

This will be faster if you do not have to convert your data to a UUID string each time.

Note: the 'x' in '0xaBc is case sensitive. The hexadecimal digits are not, however.

Refugio answered 8/1, 2019 at 17:15 Comment(3)
Came here looking for a solution with literals... thanks!Gallipot
if you are using a DB with charset not UTF-8 and a MariaDB Java connector with version >= 2.2.4, your insert should be like INSERT INTO sometable (id) VALUES (UNHEX(REPLACE(CONVERT(UUID() using utf8mb4), '-', ''))); to guarantee uniqueness.Johppa
@DaviCavalcanti I wonder why you'd need the "utf8" conversion. A UUID is only composed of ASCII characters anyway.Refugio
W
0

I am not sure why this is not mentioned here, but I always use this syntax:

SELECT * FROM blabla WHERE id = X'01903ace7c9173e98bd6c638938836c4';
Woodham answered 25/6 at 9:12 Comment(0)
F
-2

In MySQL 4.0 and above you can change the size of UUID like using the MID

SELECT MID(UUID(),1,32); # 32 characters long UUID
SELECT MID(UUID(),1,11); # 11 characters long UUID

As @nickdnk pointed out you shouldn't do this. The total length of UUID makes them unique. Striping a part of them can lead to non unique values.

Fanaticism answered 16/7, 2021 at 15:30 Comment(2)
You should not change the size of a UUID. It's designed to be unique and the different parts/sections are generated by different sources. Taking a subset of the UUID like this is asking for trouble.Lauder
@nickdnk, Yeah, you're right. I'm updating my answer.Fanaticism

© 2022 - 2024 — McMap. All rights reserved.