I want to use an ICU system-insensitive sorting collation, to avoid sorting differences between postgres11-on-mac vs postgres11-on-Ubuntu. My first test was to dump out my existing Collate=en_US.UTF-8
and pg_restore them into a db created with Collate=en-US-x-icu
Create Database doc has this to say:
To create a database music with a different locale:
CREATE DATABASE music LC_COLLATE 'sv_SE.utf8' LC_CTYPE 'sv_SE.utf8' TEMPLATE template0;
I seem to have the required icu locales already:
select collname, collprovider from pg_collation where collname like 'en_US%';
collname | collprovider
------------------------+--------------
en_US.UTF-8 | c
en_US | c
en_US.ISO8859-15 | c
en_US.ISO8859-1 | c
en_US | c
en_US | c
en-US-x-icu | i 👈
en-US-u-va-posix-x-icu | i 👈
(8 rows)
But no luck when creating a database with either icu locales.
CREATE DATABASE test LC_COLLATE = 'en-US-x-icu' TEMPLATE template0;
ksysdb=# CREATE DATABASE test LC_COLLATE = 'en-US-x-icu' TEMPLATE template0;
ERROR: invalid locale name: "en-US-x-icu"
I can use LC_COLLATE
with other locales:
The LC_COLLATE
clause does seem to come with some strings attached, such as watching your encoding and specifying an appropriate template. But it seems to give error hints w non-ICU locales.
This works, for example: CREATE DATABASE test LC_COLLATE = 'en_US' TEMPLATE template0;
and this one gives a helpful user message:
ksysdb=# CREATE DATABASE test LC_COLLATE = 'en_US.ISO8859-15' TEMPLATE template0;
ERROR: encoding "UTF8" does not match locale "en_US.ISO8859-15"
DETAIL: The chosen LC_COLLATE setting requires encoding "LATIN9".
Note: a related question, PostgreSQL 10 on Linux - LC_COLLATE locale en_US.utf-8 not valid, doesn't seem all that relevant, as the answer talks about generating an OS-level locale to fix the issue. While the ICU locales, as far as I understand, are expressly intended to be separated from the underlying OS.
check_locale
insrc/backend/utils/adt/pg_locale.c
usessetlocale
to check if the locale name is valid. That is a C library function, consequently it will only work with C library collations. – AmandCREATE DATABASE collation_db TEMPLATE template0 ENCODING 'UTF8' ICU_SHORT_FORM = 'AN_CU_EX_NX_LROOT';
and even per cluster viainitdb --icu-short-form
. Nice examples in the old docs. – Stepaniestepbrother