Performance difference between UUID, CHAR, and VARCHAR in PostgreSql table?
Asked Answered
G

3

44

I'm storing UUID v4 values in a PostgreSQL v9.4 table, under column "id".

When I create the table, is there any difference in following write or read performance whether I define the "id" column as VARCHAR(36), CHAR(36), or UUID data type?

Thanks!

Giuliana answered 24/8, 2015 at 18:23 Comment(2)
As far as I'm aware, the UUID type uses 16 bytes, basically storing the UUID as what it is: a 128-bit integer. A char(36) or varchar(36) would use 36 bytes. I can't seem to find where that's documented anymore, however. Beyond that, you need to be aware of the ramifications of how PostgreSQL does table/index clustering if your UUID is going to be your key and you plan on using the CLUSTER option.Davedaveda
simononsoftware.com/how-to-store-uuids-in-postgresqlParnassus
P
77

Use uuid. PostgreSQL has the native type for a reason.

It stores the uuid internally as a 128-bit binary field. Your other proposed options store it as hexadecimal, which is very inefficient in comparison.

Not only that, but:

  • uuid does a simple bytewise sort for ordering. text, char and varchar consider collations and locales, which is nonsensical for a uuid.

  • There is only one canonical respresentation of a uuid. The same is not true for text etc; you have to consider upper vs lower case hex, presence or absence of {...-...}s etc.

There's just no question. Use uuid.

The only other type that makes any sense is bytea, which at least can be used to store the 16 bytes of the uuid directly. This is what I'd do if I was using systems that couldn't cope with data types outside the basic set, like a really dumb ORM of some kind.

Parnassus answered 25/8, 2015 at 1:47 Comment(1)
Thank you very much for such detailed answer.Vuillard
A
9

UUID would be the fastest because its 128 bits -> 16 bytes and comparisons are done numerically.

Char(36) and varchar(36) seems to be the same and slow: http://www.depesz.com/2010/03/02/charx-vs-varcharx-vs-varchar-vs-text/.

The server should check EOF to determine the job of reading the value has finished or not for each character.

Also text comparison is slower than numerical comparison. And because UUID consists of 16 bytes, comparing UUID is much faster than comparing two texts of 36 characters.

Use native UUID for performance.

Ascendant answered 24/8, 2015 at 19:3 Comment(5)
Your explanation is incorrect. There is no significant performance difference between char and varchar here, at least as stored by PostgreSQL. Both are stored as VARLENA internally. Also, a UUID is 16 bytes (128 bits); I don't know where you get "5 hexadecimal values".Parnassus
Well, I also said that it's 128 bits and 16 bytes. And 5 hexadecimal values comes from the definition, read please: #32188997Ascendant
About char and varchar being the same, yyes, you're right: depesz.com/2010/03/02/charx-vs-varcharx-vs-varchar-vs-textAscendant
I looked up for UUID in the wikipedia, and 5 hexadecimal parts were only to show the 128 bits. From version 1 to version 5 now, UUID generating algorithm has changed. Now 128 bits are generated by smaller parts (mac address, date, random algorithm, sha-1 encryption,...) and shown as 5 hexadecimal values. Thus, showing as 5 parts has no effect on calculations & comparisons. Fixed in the answer.Ascendant
just to clarify, uuid doesn't exactly have "versions", and more recent versions are not necessarily better than older ones. these are just different types of uuid that are suited for different use cases.Electro
S
2

The index size is maybe the most notable difference: almost 86% more for VARCHAR.

From a performance perspective I didn't notice significant differences in PostgreSQL 9.5.

Scarlett answered 2/2, 2017 at 14:50 Comment(1)
Could you back this claim with data? Particular measurement, layout of the tables?Auriferous

© 2022 - 2024 — McMap. All rights reserved.