Inserting UTF-8 encoded string into UTF-8 encoded table gives incorrect string value.
PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9D\x84\x8E i...' for column 'body_value' at row 1: INSERT INTO
I have a 𝄎 character (U+1D10E, the bytes \xF0\x9D\x84\x8E shown in the error) in a string that mb_detect_encoding claims is UTF-8 encoded.
I try to insert this string into a MySQL table, which is defined with (among other things) DEFAULT CHARSET=utf8.
Edit: Drupal always does SET NAMES utf8 with an optional COLLATE (at least when talking to MySQL).
Edit 2: Some more details that appear to be relevant. I grab some text from a PostgreSQL database. I stick it onto an object, use mb_detect_encoding to verify that it's UTF-8, and persist the object to the database, using node_save. So while there is an HTTP request that triggers the import, the data does not come from the browser.
Edit 3: Data is denormalized over two tables:
SELECT character_set_name FROM information_schema.`COLUMNS` C WHERE table_schema = "[database]" AND table_name IN ("field_data_body", "field_revision_body") AND column_name = "body_value";
+--------------------+
| character_set_name |
+--------------------+
| utf8               |
| utf8               |
+--------------------+
Edit 4: Is it possible that the character is "too new"? I'm more than a little fuzzy on the relationship between Unicode and UTF-8, but this Wikipedia article implies that the character was standardized very recently.
I don't understand how that can fail with "Incorrect string value".
What does SELECT character_set_name FROM information_schema.`COLUMNS` C WHERE table_schema = "db_name" AND table_name = "table_name" AND column_name = "column_name"; give? – Taxation
MySQL's utf8 is only the BMP. Its utf8mb4 corresponds to the outside world's UTF-8 (and includes 4-byte characters). – Marcosmarcotte
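To make the BMP point concrete, here is a minimal Python sketch (not PHP or Drupal code) that decodes the four bytes from the error message and checks whether a string contains anything outside the BMP. The helper name needs_utf8mb4 is hypothetical, made up for this illustration; the only assumption is that a 3-byte utf8 column cannot hold characters above U+FFFF, as the comment above describes.

```python
# Bytes copied from the error message in the question.
raw = b"\xF0\x9D\x84\x8E"

# The bytes are valid UTF-8, so decoding succeeds; the data itself is fine.
char = raw.decode("utf-8")
code_point = ord(char)


def needs_utf8mb4(text: str) -> bool:
    """Hypothetical helper: True if any character lies outside the BMP.

    Characters above U+FFFF take 4 bytes in UTF-8, one more than a
    3-byte-per-character charset such as MySQL's utf8 can store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


print(f"U+{code_point:04X}")               # U+1D10E
print(len(char.encode("utf-8")))           # 4
print(needs_utf8mb4(char + " in a body"))  # True
print(needs_utf8mb4("plain text ü €"))     # False
```

So the string really is valid UTF-8, which is why mb_detect_encoding is happy with it; the insert fails only because the column's character set stops at three bytes per character.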