How do you create a random string that's suitable for a session ID in PostgreSQL?

Asked 19/10, 2010 at 17:5 Answered 8/4, 2021 at 18:9

138

I'd like to make a random string for use in session verification using PostgreSQL. I know I can get a random number with SELECT random(), so I tried SELECT md5(random()), but that doesn't work. How can I do this?

Moonfish answered 19/10, 2010 at 17:5 Comment(6)

Another solution can be found here https://mcmap.net/q/168234/-naming-each-row-in-the-table-with-a-random-2-word-string – Curtsy 3/12, 2012 at 0:24

The original question clearly speaks of randomness having a value beyond appearance. I've updated the title of the question to reflect @gersh's intent. – Certes 12/1, 2017 at 16:53

I've edited the title so that existing answers still make perfectly good sense, and Evan's answer bringing things a little more modern fits as well. I don't want to lock this age-old question for a content dispute - so let's make any additional edits accommodating to all the answers please. – Aran 12/1, 2017 at 21:13

Cool, let's see if @Moonfish can clarify this question because there is legitimate disagreement as to his original intention. If his original intention is what I assume it was, many of these answers need to be adjusted, downvoted or retracted. And, perhaps a new question about generating strings for testing purposes (or the like) should be raised (where random()ness isn't necessary). If it's not what I assume, then my answer needs to be catered to the refined question instead. – Certes 12/1, 2017 at 21:27

@EvanCarroll - gersh was last seen Nov 21 2015. – Scolex 12/1, 2017 at 21:32

For anyone comming to this question in year > 2017 consider Evan's answer https://mcmap.net/q/166080/-how-do-you-create-a-random-string-that-39-s-suitable-for-a-session-id-in-postgresql as it uses the methods that were not available when the questio nwas originally asked and answered. – Curmudgeon 12/1, 2017 at 21:34

123

I'd suggest this simple solution:

This is a quite simple function that returns a random string of the given length:

Create or replace function random_string(length integer) returns text as
$$
declare
  chars text[] := '{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}';
  result text := '';
  i integer := 0;
begin
  if length < 0 then
    raise exception 'Given length cannot be less than 0';
  end if;
  for i in 1..length loop
    result := result || chars[1+random()*(array_length(chars, 1)-1)];
  end loop;
  return result;
end;
$$ language plpgsql;

And the usage:

select random_string(15);

Example output:

select random_string(15) from generate_series(1,15);

  random_string
-----------------
 5emZKMYUB9C2vT6
 3i4JfnKraWduR0J
 R5xEfIZEllNynJR
 tMAxfql0iMWMIxM
 aPSYd7pDLcyibl2
 3fPDd54P5llb84Z
 VeywDb53oQfn9GZ
 BJGaXtfaIkN4NV8
 w1mvxzX33NTiBby
 knI1Opt4QDonHCJ
 P9KC5IBcLE0owBQ
 vvEEwc4qfV4VJLg
 ckpwwuG8YbMYQJi
 rFf6TchXTO3XsLs
 axdQvaLBitm6SDP
(15 rows)

Pater answered 19/10, 2010 at 21:34 Comment(7)

This solution uses the values at either end of the chars array - 0 and z - half as often as the rest. For a more even distribution of characters, I replaced chars[1+random()*(array_length(chars, 1)-1)] with chars[ceil(61 * random())] – Canice 15/3, 2013 at 4:8

random() gets called length times (like in many of the other solutions). Is there a more efficient way to choose from 62 characters each time? How does this perform compared to md5()? – Hambrick 10/2, 2014 at 1:58

I found another solution that uses ORDER BY random(). Which is faster? – Hambrick 10/2, 2014 at 15:17

Its worth noting that random may use erand48 which is not a CSPRNG, you are probably better off just using pgcrypto. – Sainthood 6/1, 2017 at 17:8

Not a good solution. Not much randomness, and you're implementing your own function. The answer is also 6 years old. Check out this for a totally different method using gen_random_uuid(): faster, more randomness, more efficiently stored in the database. – Certes 12/1, 2017 at 8:9

Good answer except that it doesn't use a secure random number generator and is therefore not so good for session IDs. See: #9816614 – Oas 27/2, 2017 at 18:50

@Oas my answer is more secure, and stores more efficiently anyway – Certes 6/1, 2021 at 18:17

300

You can fix your initial attempt like this:

SELECT md5(random()::text);

Much simpler than some of the other suggestions. :-)

Levitation answered 30/12, 2010 at 21:57 Comment(6)

Note that this returns strings over the "hex digits alphabet" {0..9,a..f} only. May not be sufficient -- depends on what you want to do with them. – Dormeuse 13/7, 2012 at 13:40

what is the length of the returned string? Is there a way to make it return a longer string? – Dislocation 26/6, 2014 at 1:53

When represented in hexadecimal, the length of an MD5 string is always 32 characters. If you wanted a string of length 64, you could concatenate 2 MD5 strings: SELECT concat(md5(random()::text), md5(random()::text)); And if you wanted somewhere in the middle (50 chars for example), you could take a substring of that: SELECT substr(concat(md5(random()::text), md5(random()::text)), 0, 50); – Differentiable 1/8, 2014 at 17:8

Not a very good solution for session ids, not much randomness. The answer is also 6 years old. Check out this for a totally different method using gen_random_uuid(): faster, more randomness, more efficiently stored in the database. – Certes 12/1, 2017 at 8:11

@Evan if you want more 'randomness' without an extension you can SELECT md5(random()::text||random()::text);, or SELECT md5(random()::text||random()::text||random()::text); – Dorty 27/12, 2018 at 16:45

That's three times more randomness, and three times the inefficient storage size and md5 overhead. – Certes 28/12, 2018 at 21:43

123

I'd suggest this simple solution:

This is a quite simple function that returns a random string of the given length:

Create or replace function random_string(length integer) returns text as
$$
declare
  chars text[] := '{0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}';
  result text := '';
  i integer := 0;
begin
  if length < 0 then
    raise exception 'Given length cannot be less than 0';
  end if;
  for i in 1..length loop
    result := result || chars[1+random()*(array_length(chars, 1)-1)];
  end loop;
  return result;
end;
$$ language plpgsql;

And the usage:

select random_string(15);

Example output:

select random_string(15) from generate_series(1,15);

  random_string
-----------------
 5emZKMYUB9C2vT6
 3i4JfnKraWduR0J
 R5xEfIZEllNynJR
 tMAxfql0iMWMIxM
 aPSYd7pDLcyibl2
 3fPDd54P5llb84Z
 VeywDb53oQfn9GZ
 BJGaXtfaIkN4NV8
 w1mvxzX33NTiBby
 knI1Opt4QDonHCJ
 P9KC5IBcLE0owBQ
 vvEEwc4qfV4VJLg
 ckpwwuG8YbMYQJi
 rFf6TchXTO3XsLs
 axdQvaLBitm6SDP
(15 rows)

Pater answered 19/10, 2010 at 21:34 Comment(7)

I found another solution that uses ORDER BY random(). Which is faster? – Hambrick 10/2, 2014 at 15:17

Its worth noting that random may use erand48 which is not a CSPRNG, you are probably better off just using pgcrypto. – Sainthood 6/1, 2017 at 17:8

Good answer except that it doesn't use a secure random number generator and is therefore not so good for session IDs. See: #9816614 – Oas 27/2, 2017 at 18:50

@Oas my answer is more secure, and stores more efficiently anyway – Certes 6/1, 2021 at 18:17

You can get 128 bits of random from a UUID. This is the method to get the job done in modern PostgreSQL.

CREATE EXTENSION pgcrypto;
SELECT gen_random_uuid();

           gen_random_uuid            
--------------------------------------
 202ed325-b8b1-477f-8494-02475973a28f

May be worth reading the docs on UUID too

The data type uuid stores Universally Unique Identifiers (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. (Some systems refer to this data type as a globally unique identifier, or GUID, instead.) This identifier is a 128-bit quantity that is generated by an algorithm chosen to make it very unlikely that the same identifier will be generated by anyone else in the known universe using the same algorithm. Therefore, for distributed systems, these identifiers provide a better uniqueness guarantee than sequence generators, which are only unique within a single database.

How rare is a collision with UUID, or guessable? Assuming they're random,

About 100 trillion version 4 UUIDs would need to be generated to have a 1 in a billion chance of a single duplicate ("collision"). The chance of one collision rises to 50% only after 261 UUIDs (2.3 x 10^18 or 2.3 quintillion) have been generated. Relating these numbers to databases, and considering the issue of whether the probability of a Version 4 UUID collision is negligible, consider a file containing 2.3 quintillion Version 4 UUIDs, with a 50% chance of containing one UUID collision. It would be 36 exabytes in size, assuming no other data or overhead, thousands of times larger than the largest databases currently in existence, which are on the order of petabytes. At the rate of 1 billion UUIDs generated per second, it would take 73 years to generate the UUIDs for the file. It would also require about 3.6 million 10-terabyte hard drives or tape cartridges to store it, assuming no backups or redundancy. Reading the file at a typical "disk-to-buffer" transfer rate of 1 gigabit per second would require over 3000 years for a single processor. Since the unrecoverable read error rate of drives is 1 bit per 1018 bits read, at best, while the file would contain about 1020 bits, just reading the file once from end to end would result, at least, in about 100 times more mis-read UUIDs than duplicates. Storage, network, power, and other hardware and software errors would undoubtedly be thousands of times more frequent than UUID duplication problems.

source: wikipedia

In summary,

UUID is standardized.
gen_random_uuid() is 128 bits of random stored in 128 bits (2**128 combinations). 0-waste.
random() only generates 52 bits of random in PostgreSQL (2**52 combinations).
md5() stored as UUID is 128 bits, but it can only be as random as its input (52 bits if using random())
md5() stored as text is 288 bits, but it only can only be as random as its input (52 bits if using random()) - over twice the size of a UUID and a fraction of the randomness)
md5() as a hash, can be so optimized that it doesn't effectively do much.
UUID is highly efficient for storage: PostgreSQL provides a type that is exactly 128 bits. Unlike text and varchar, etc which store as a varlena which has overhead for the length of the string.
PostgreSQL nifty UUID comes with some default operators, castings, and features.

Certes answered 12/1, 2017 at 8:8 Comment(6)

Partly incorrect: A properly generated random UUID has only 122 random bits since 4 bits are used for the version and 2 bits for the variant: en.wikipedia.org/wiki/… – Ervin 20/1, 2017 at 23:47

If the source doesn't do what's written there, then it's not a UUID and shouldn't be called as such by PostgreSQL. – Ervin 20/1, 2017 at 23:54

Note that trimming + encoding (e.g base64) removes entropy, and can end-up to collisions. – Pilaf 6/1, 2021 at 13:49

@Pilaf base64, being a reversible encoding, removes no entropy. – Preceptory 24/6, 2021 at 6:49

@Preceptory Trimming does – Pilaf 30/6, 2021 at 10:42

An upwote for clarifying the limits of randomness in MD5 solution (52 bits)! – Dit 31/3, 2023 at 7:28

Building on Marcin's solution, you could do this to use an arbitrary alphabet (in this case, all 62 ASCII alphanumeric characters):

SELECT array_to_string(array 
       ( 
              select substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', trunc(random() * 62)::integer + 1, 1)
              FROM   generate_series(1, 12)), '');

Quintinquintina answered 29/3, 2011 at 20:49 Comment(1)

Slow, not as random, or as efficient to store. Not a very good solution for session ids, not much randomness. The answer is also 6 years old. Check out this for a totally different method using gen_random_uuid(): faster, more randomness, more efficiently stored in the database. – Certes 12/1, 2017 at 8:12

Please use string_agg!

SELECT string_agg (substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', ceil (random() * 62)::integer, 1), '')
FROM   generate_series(1, 45);

I'm using this with MD5 to generate a UUID also. I just want a random value with more bits than a random () integer.

Rhapsody answered 23/3, 2016 at 0:58 Comment(1)

I suppose I could just concatenate random() until I get the number of bits I want. Oh well. – Rhapsody 23/3, 2016 at 1:25

I was playing with PostgreSQL recently, and I think I've found a little better solution, using only built-in PostgreSQL methods - no pl/pgsql. The only limitation is it currently generates only UPCASE strings, or numbers, or lower case strings.

template1=> SELECT array_to_string(ARRAY(SELECT chr((65 + round(random() * 25)) :: integer) FROM generate_series(1,12)), '');
 array_to_string
-----------------
 TFBEGODDVTDM

template1=> SELECT array_to_string(ARRAY(SELECT chr((48 + round(random() * 9)) :: integer) FROM generate_series(1,12)), '');
 array_to_string
-----------------
 868778103681

The second argument to the generate_series method dictates the length of the string.

Curmudgeon answered 30/12, 2010 at 19:23 Comment(4)

I like this, but found when I used it an an UPDATE statement, all rows were set to the same random password instead of unique passwords. I solved this by adding the primary key ID into the formula. I add it to the random value and the subtract it again. The randomness is not changed, but PostgreSQL is tricked into re-computing the values for each row. Here's an example, using a primary key name of "my_id": array_to_string(ARRAY(SELECT chr((65 + round((random()+my_id-my) * 25)) :: integer) FROM generate_series(1,8)), '') – Anaphase 18/11, 2011 at 19:53

The solution, that @MarkStosberg presented, worked as he said, but not as I expected; the produced data didn't match the pretended pattern (just letter case or just digits). I fixed by arithmetic moduling the random result:

array_to_string(ARRAY(SELECT chr((65 + round((random() * 25 + id) :: integer % 25 )) :: integer) FROM generate_series(1, 60)), '');

– Elmaelmajian 16/2, 2015 at 21:27

No. You are answering the 'How do I generate random session id' not 'How do I generate random string'. You've changed the meaning of the quesiton (and title), based on two words in the description. You're answering different question. and keep abusing your moderation power to change the question meanining. – Curmudgeon 12/1, 2017 at 20:54

Don't think double parens in chr are needed. SELECT chr(65 + round(random() * 25) :: integer) FROM generate_series should suffice as 65 is an integer anyway. – Clermontferrand 12/1 at 5:33

While not active by default, you could activate one of the core extensions:

CREATE EXTENSION IF NOT EXISTS pgcrypto;

Then your statement becomes a simple call to gen_salt() which generates a random string:

select gen_salt('md5') from generate_series(1,4);

 gen_salt
-----------
$1$M.QRlF4U
$1$cv7bNJDM
$1$av34779p
$1$ZQkrCXHD

The leading number is a hash identifier. Several algorithms are available each with their own identifier:

md5: $1$
bf: $2a$06$
des: no identifier
xdes: _J9..

More information on extensions:

pgCrypto: http://www.postgresql.org/docs/9.2/static/pgcrypto.html
Included Extensions: http://www.postgresql.org/docs/9.2/static/contrib.html

EDIT

As indicated by Evan Carrol, as of v9.4 you can use gen_random_uuid()

http://www.postgresql.org/docs/9.4/static/pgcrypto.html

Zodiac answered 21/3, 2013 at 22:49 Comment(3)

The generated salts seem too sequential to be really random, isn't it? – Gaddy 17/5, 2013 at 13:11

Are you referring to the $1$ ? That is a hash type identifier (md5==1), the rest is the randomized value. – Zodiac 17/5, 2013 at 13:25

Yes, that was my erroneous interpretation, thanks for the precision. – Gaddy 17/5, 2013 at 18:44

@Kavius recommended using pgcrypto, but instead of gen_salt, what about gen_random_bytes? And how about sha512 instead of md5?

create extension if not exists pgcrypto;
select digest(gen_random_bytes(1024), 'sha512');

Docs:

F.25.5. Random-Data Functions

gen_random_bytes(count integer) returns bytea

Returns count cryptographically strong random bytes. At most 1024 bytes can be extracted at a time. This is to avoid draining the randomness generator pool.

Wedge answered 28/8, 2014 at 20:59 Comment(3)

I wonder if the sha512 offers any benefit if the underlying data you are hashing is random. Assuming randomness, anything that encodes to a string should be sufficient, and the less computationally complex, the better (eg. base64 encoding?). -- sorry, old comment, but came up while discussing with someone at work – Zodiac 1/10, 2020 at 19:56

@JeffereyCave it's been 6 years, so I don't remember the context, but it looks like the OP was asking about session ids, and sha512 would have the advantage of being 4 times longer than md5, and thus a collision attack would be more difficult? – Wedge 2/10, 2020 at 17:2

I agree that longer is better... I was thinking your answer was complete at select gen_random_bytes(1024) .. and yes ... 6 years. Very arbitrary conversation with colleague brought it up. – Zodiac 3/10, 2020 at 18:16

The INTEGER parameter defines the length of the string. Guaranteed to cover all 62 alphanum characters with equal probability (unlike some other solutions floating around on the Internet).

CREATE OR REPLACE FUNCTION random_string(INTEGER)
RETURNS TEXT AS
$BODY$
SELECT array_to_string(
    ARRAY (
        SELECT substring(
            '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
            FROM (ceil(random()*62))::int FOR 1
        )
        FROM generate_series(1, $1)
    ), 
    ''
)
$BODY$
LANGUAGE sql VOLATILE;

Implore answered 13/7, 2012 at 13:38 Comment(2)

@EvanCarroll: in all fairness, gen_random_uuid() appeared in Version 9.4 as far as I can tell, which was released 2014-12-18, more than a year after the answer you downvoted. Additional nitpick: the answer is only 3 1/2 years old :-) But you're right, now that we have gen_random_uuid(), this is what should be used. Hence I'll upvote your answer. – Dormeuse 12/1, 2017 at 13:6

create extension if not exists pgcrypto;

then

SELECT encode(gen_random_bytes(20),'base64')

or even

SELECT encode(gen_random_bytes(20),'hex')

This is for 20 bytes = 160 bits of randomness (as long as sha1 for example).

Overwhelming answered 8/4, 2021 at 18:9 Comment(0)

select * from md5(to_char(random(), '0.9999999999999999'));

Monospermous answered 14/3, 2012 at 16:27 Comment(0)

I do not think that you are looking for a random string per se. What you would need for session verification is a string that is guaranteed to be unique. Do you store session verification information for auditing? In that case you need the string to be unique between sessions. I know of two, rather simple approaches:

Use a sequence. Good for use on a single database.
Use an UUID. Universally unique, so good on distributed environments too.

UUIDs are guaranteed to be unique by virtue of their algorithm for generation; effectively it is extremely unlikely that you will generate two identical numbers on any machine, at any time, ever (note that this is much stronger than on random strings, which have a far smaller periodicity than UUIDs).

You need to load the uuid-ossp extension to use UUIDs. Once installed, call any of the available uuid_generate_vXXX() functions in your SELECT, INSERT or UPDATE calls. The uuid type is a 16-byte numeral, but it also has a string representation.

Incur answered 13/2, 2014 at 3:46 Comment(4)

This seems like potentially dangerous advice. When it comes to session keys, you want uniqueness and randomness that is cryptographically random enough so as to preclude any reasonable chance of guessing it. The algorithms used by UUIDs guarantee uniqueness by non-random (mostly) mechanisms, which poses a security threat. – Chanda 19/2, 2015 at 21:6

@Chanda The whole purpose of UUIDs is that they are difficult to guess and highly random. Except for the v1 version they have a very high periodicity; v4 is fully 128-bit random. They are being used in every online banking transaction that you do. If they are good enough for that, they are good enough for pretty much anything else. – Incur 20/2, 2015 at 4:45

Well, what do you know. I didn't realize that had been addressed in Version 4. Thanks for correcting me! – Chanda 23/2, 2015 at 16:34

@Incur Small nit, V4 UUIDs are 122 bits of random, not 128. ;) – Unrepair 21/2, 2018 at 10:18

select encode(decode(md5(random()::text), 'hex')||decode(md5(random()::text), 'hex'), 'base64')

Tallulah answered 13/1, 2016 at 19:28 Comment(2)

I amend it to remove the forward-slash and plus sign that sometimes appears in the result and also to generate an uppercase result select upper(replace(replace(substring(encode(decode(md5(random()::text), 'hex')||decode(md5(random()::text), 'hex'), 'base64'), 0, 10), '/', 'A'), '+', 'Z')); – Glaudia 24/12, 2019 at 6:45

Can't you use TRANSLATE() instead of two REPLACE()s? – Rossiter 11/3, 2022 at 9:3

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags