Incorrect string value: '\xC2\x9Fe 10...' for column
Asked Answered
F

4

12

We have a Old 5.1 Mysql server running on server 2003. Recently we move to a newer environment with Mysql 5.6 and server 2008. Now on the new server we keep getting errors when inserting special chars like 'Ã'.

Now I have checked the source encoding and it is UTF-8. But the old Mysql server was configured as latin1(Server / tables / colonms) with collation latin_swedish_ci and we did not receive any errors on the old environment.

Now I have done some testing since we are not live on the new environment. I have tried setting all tables to tables / colonms as well as latin1. In both cases I keep getting these errors.

What I noticed is that on the old server the servers default char-set is latin1 and on the new server its utf-8. Could that be the problem? I find this very strange because the source is utf-8.

Is there maybe some option to handle this that could be turned on on the old environment? I'm not sure if something like that exists. I did compare the settings within the mysql admin tool and apart from the default char-set it looks the same.

EDIT:

SHOW VARIABLES LIKE 'char%';

Old server:

+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | utf8                                          | *
| character_set_connection | utf8                                          | *
| character_set_database   | latin1                                        |
| character_set_filesystem | binary                                        |
| character_set_results    | utf8                                          | *
| character_set_server     | latin1                                        |
| character_set_system     | utf8                                          |

New Server:

+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | utf8mb4                                       | *
| character_set_connection | utf8mb4                                       | *
| character_set_database   | utf8                                          |
| character_set_filesystem | binary                                        |
| character_set_results    | utf8mb4                                       | *
| character_set_server     | utf8                                          |
| character_set_system     | utf8                                          |

As far as I understand from the article over at the MySQL site utf8mb4 is a super-set of utf8 this should not create a problem for encoding I think since they are basically identical on encoding right?

Felsite answered 23/6, 2015 at 14:3 Comment(1)
Yes, utf8mb4 is 'better' than utf8. Still, one needs to be consistent throughout MySQL. What is the context of 'Ã'? Of C29Fe? There may be additional clues there. (Still à is valid in both character sets and C29F is (I think) invalid in both.)Rusticus
B
3

The old UTF-8 of MySQL was not real UTF-8. If you try "special" characters (japanese or chinese) you'll probably end up with squares or question marks on your old server.

Your new server is now really using UTF-8 (mb4 stands for multi-bytes 4). The server receives UTF-8 characters but, obviously, can not store UTF-8 characters because your table are not using UTF-8. Convert all the tables to UTF-8 and the database to UTF-8 and you'll solve your problem.

You can do this with :

ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

Don't forget to backup before.

Source : https://mcmap.net/q/66180/-how-to-convert-an-entire-mysql-database-characterset-and-collation-to-utf-8

Bypass answered 3/7, 2015 at 15:2 Comment(1)
As far as I can see this worked on the new server. But the question that is still unsolved is why it worked on the old server. In the script I say use same settings as source. Therefor I would think it would work the same as on the old one? Or is there a difference as mentioned in encoding between the versions?Felsite
R
2
  1. First, since the old environment was working correctly, the first choice would be to use the same "character set" setting in the new environment. If you still have access to the 5.0 server, grab SHOW VARIABLES;.

5.0 defaulted to latin1; 5.6 defaults to utf8. This is mostly visible in

mysql> SHOW VARIABLES LIKE 'char%';
+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | utf8                                          | *
| character_set_connection | utf8                                          | *
| character_set_database   | latin1                                        |
| character_set_filesystem | binary                                        |
| character_set_results    | utf8                                          | *
| character_set_server     | latin1                                        |
| character_set_system     | utf8                                          |

SET NAMES utf8; sets the three flagged lines.

à is hex C3 in latin1 and C383 in utf8. More encodings here. Do this to see what is currently in a table:

SELECT col, HEX(col) FROM table WHERE ...
  1. Another possibility is that the "move" mangled the data. If you can do the same SELECT on both machines, and if they come out differently, then the migration was bad. Since there are many ways to move data, please provide the details of the migration so we can dissect what might have gone wrong.

  2. In your title, you have C29F. That is a strange one -- it is a control code APPLICATION PROGRAM COMMAND, which I have never heard of. (Note: It is not related to the à you mentioned later.) Please provide more examples of the problems; neither of those clues is helpful.

Rusticus answered 29/6, 2015 at 16:38 Comment(1)
See my edit. I have added both outputs from the servers. I have a test database on the new one and will insert some test data to get some more results / cases for you.Felsite
S
1

The significant part of this is that your old server had:

| character_set_database   | latin1 

while your new server has

| character_set_database   | utf8 

It does not matter that the connection and client are using utf8 if the database is using latin1, the tables will default to latin1 and so the data will be stored in latin1 and you will get your error. You can of course explicitly set the character set and collation for any table to be other than the database default.

I guess that when you migrated the database schema you did not edit the character encoding for the database, or the tables before running the migration script.

Now you can either change the database and each table manually, or you can edit the migration script and rerun it. Most migration script and database dumps will include the specific character set for each table as well as for the database, even when they are all the same.

Sunn answered 5/7, 2015 at 1:53 Comment(0)
E
0

One experienced I got when I was moving the my application to new env. I got some weird thing when inserting data related to data to be insert to table, my case it complained about the date was empty so it cannot be inserted to table (No change on source code. Only new env(Mysql server from 5.1 to 5.6, tomcat 6 to tomcat 7, new Suse server version).

I try to replace the mysql connector driver to newer version for my application and It resolved the issue.

Examination answered 23/6, 2015 at 14:37 Comment(1)
I just checked but we have the latest mysql connector odbc 5.3.4 installed on the machine.Felsite

© 2022 - 2024 — McMap. All rights reserved.