I have a database that was set up with the default character set SQL_ASCII. I want to switch it to UNICODE. Is there an easy way to do that?
To change the encoding of your database:
- Dump your database
- Drop your database,
- Create new database with the different encoding
- Reload your data.
Make sure the client encoding is set correctly during all this.
Source: http://archives.postgresql.org/pgsql-novice/2006-03/msg00210.php
sudo -u postgres pg_dump your_db > /backups/postgresql.sql
... –
Expressionism First off, Daniel's answer is the correct, safe option.
For the specific case of changing from SQL_ASCII to something else, you can cheat and simply poke the pg_database catalogue to reassign the database encoding. This assumes you've already stored any non-ASCII characters in the expected encoding (or that you simply haven't used any non-ASCII characters).
Then you can do:
update pg_database set encoding = pg_char_to_encoding('UTF8') where datname = 'thedb'
This will not change the collation of the database, just how the encoded bytes are converted into characters (so now length('£123')
will return 4 instead of 5). If the database uses 'C' collation, there should be no change to ordering for ASCII strings. You'll likely need to rebuild any indices containing non-ASCII characters though.
Caveat emptor. Dumping and reloading provides a way to check your database content is actually in the encoding you expect, and this doesn't. And if it turns out you did have some wrongly-encoded data in the database, rescuing is going to be difficult. So if you possibly can, dump and reinitialise.
-bash: syntax error near unexpected token
('` –
Suckling psql
prompt. –
Siegbahn To change the encoding of your database:
- Dump your database
- Drop your database,
- Create new database with the different encoding
- Reload your data.
Make sure the client encoding is set correctly during all this.
Source: http://archives.postgresql.org/pgsql-novice/2006-03/msg00210.php
sudo -u postgres pg_dump your_db > /backups/postgresql.sql
... –
Expressionism Dumping a database with a specific encoding and try to restore it on another database with a different encoding could result in data corruption. Data encoding must be set BEFORE any data is inserted into the database.
Check this : When copying any other database, the encoding and locale settings cannot be changed from those of the source database, because that might result in corrupt data.
And this : Some locale categories must have their values fixed when the database is created. You can use different settings for different databases, but once a database is created, you cannot change them for that database anymore. LC_COLLATE and LC_CTYPE are these categories. They affect the sort order of indexes, so they must be kept fixed, or indexes on text columns would become corrupt. (But you can alleviate this restriction using collations, as discussed in Section 22.2.) The default values for these categories are determined when initdb is run, and those values are used when new databases are created, unless specified otherwise in the CREATE DATABASE command.
I would rather rebuild everything from the begining properly with a correct local encoding on your debian OS as explained here :
su root
Reconfigure your local settings :
dpkg-reconfigure locales
Choose your locale (like for instance for french in Switzerland : fr_CH.UTF8)
Uninstall and clean properly postgresql :
apt-get --purge remove postgresql\*
rm -r /etc/postgresql/
rm -r /etc/postgresql-common/
rm -r /var/lib/postgresql/
userdel -r postgres
groupdel postgres
Re-install postgresql :
aptitude install postgresql-9.1 postgresql-contrib-9.1 postgresql-doc-9.1
Now any new database will be automatically be created with correct encoding, LC_TYPE (character classification), and LC_COLLATE (string sort order).
# dump into file
pg_dump myDB > /tmp/myDB.sql
# create an empty db with the right encoding (on older versions the escaped single quotes are needed!)
psql -c 'CREATE DATABASE "tempDB" WITH OWNER = "myself" LC_COLLATE = '\''de_DE.utf8'\'' TEMPLATE template0;'
# import in the new DB
psql -d tempDB -1 -f /tmp/myDB.sql
# rename databases
psql -c 'ALTER DATABASE "myDB" RENAME TO "myDB_wrong_encoding";'
psql -c 'ALTER DATABASE "tempDB" RENAME TO "myDB";'
# see the result
psql myDB -c "SHOW LC_COLLATE"
Once everything runs fine, you can drop the database myDB_wrong_encoding
Daniel Kutik's answer is correct, but it can be even more safe, with database renaming.
So, the truly safe way is:
- Create new database with the different encoding and name
- Dump your database
- Restore dump to the new DB
- Test that your application runs correctly with the new DB
- Rename old DB to something meaningful
- Rename new DB
- Test application again
- Drop the old database
In case of emergency, just rename DBs back
I had the same issue in postgres 11 and I did change the database encoding using the below steps,
to update all the list of encoding
SET client_encoding = 'UTF8';
UPDATE pg_database SET datcollate='en_US.UTF-8', datctype='en_US.UTF-8' WHERE datname='postgres';
update pg_database set encoding = pg_char_to_encoding('UTF8') where datname = 'dbname' ;
make sure to apply the update
statment in template0
and template1
and postgres
Database
postgres=# UPDATE pg_database SET datcollate='en_US.UTF-8', datctype='en_US.UTF-8' WHERE datname='postgres';
UPDATE 1
postgres=# UPDATE pg_database SET datcollate='en_US.UTF-8', datctype='en_US.UTF-8' WHERE datname='template0';
UPDATE 1
postgres=# UPDATE pg_database SET datcollate='en_US.UTF-8', datctype='en_US.UTF-8' WHERE datname='template1';
UPDATE 1
© 2022 - 2024 — McMap. All rights reserved.