This question already has a lot of answers, but Mathias Bynens mentioned that 'utf8mb4' should be used instead of 'utf8' in order to have better UTF-8 support ('utf8' does not support 4 byte characters, fields are truncated on insert). I consider this to be an important difference. So here is yet another answer on how to set the default character set and collation. One that'll allow you to insert a pile of poo (💩).
This works on MySQL 5.5.35.
Note, that some of the settings may be optional. As I'm not entirely sure that I haven't forgotten anything, I'll make this answer a community wiki.
Old Settings
mysql> SHOW VARIABLES LIKE 'char%'; SHOW VARIABLES LIKE 'collation%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)
Config
# 💩 𝌆
# UTF-8 should be used instead of Latin1. Obviously.
# NOTE "utf8" in MySQL is NOT full UTF-8: http://mathiasbynens.be/notes/mysql-utf8mb4
[client]
default-character-set = utf8mb4
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
[mysql]
default-character-set = utf8mb4
New Settings
mysql> SHOW VARIABLES LIKE 'char%'; SHOW VARIABLES LIKE 'collation%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0.00 sec)
character_set_system is always utf8.
This won't affect existing tables, it's just the default setting (used for new tables).
The following ALTER code can be used to convert an existing table (without the dump-restore workaround):
ALTER DATABASE databasename CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Edit:
On a MySQL 5.0 server: character_set_client, character_set_connection, character_set_results, collation_connection remain at latin1. Issuing SET NAMES utf8
(utf8mb4 not available in that version) sets those to utf8 as well.
Caveat:
If you had a utf8 table with an index column of type VARCHAR(255), it can't be converted in some cases, because the maximum key length is exceeded (Specified key was too long; max key length is 767 bytes.
). If possible, reduce the column size from 255 to 191 (because 191 * 4 = 764 < 767 < 192 * 4 = 768). After that, the table can be converted.
utf8mb4
, i.e. real UTF-8 with full Unicode support. See How to support full Unicode in MySQL databases. – Elbertinautf8mb4
this is kind setting a dangerous precedent as it sits? – Calamanco