I have a large CSV file that I am going to load into a MySQL table. The data is encoded as UTF-8 because it includes some non-English characters. I have already set the character set of the corresponding column in the table to utf8. But when I load the file, the non-English characters turn into weird characters (when I do a SELECT on the table rows). Do I need to encode my data before I load it into the table? If yes, how can I do this? I am using Python to load the data, with the LOAD DATA LOCAL INFILE command. Thanks.
As stated in http://dev.mysql.com/doc/refman/5.1/en/load-data.html, you can specify the charset used by your CSV file with the optional CHARACTER SET clause of LOAD DATA LOCAL INFILE.
Try
LOAD DATA INFILE 'file'
IGNORE INTO TABLE table
CHARACTER SET UTF8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
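Since the question drives the load from Python, here is a minimal sketch of issuing a statement like the one above through a DB-API connection. The helper names build_load_statement and load_csv are hypothetical, and the utf8mb4 default and ';' delimiter are assumptions. conn is assumed to be an already-open connection that permits LOCAL INFILE (with MySQLdb this is usually requested via the local_infile connect option, and the server must allow it too).

```python
import re


def build_load_statement(table, charset="utf8mb4", delimiter=";"):
    """Return a LOAD DATA LOCAL INFILE statement with an explicit CHARACTER SET.

    The file path stays a %s placeholder so the driver can quote it.
    Table and charset are identifiers, so they are validated, not escaped.
    """
    for name in (table, charset):
        if not re.match(r"^[A-Za-z0-9_]+$", name):
            raise ValueError("unexpected identifier: %r" % (name,))
    # Keep the delimiter to one harmless character (no quote, backslash or %).
    if len(delimiter) != 1 or delimiter in "'\\%":
        raise ValueError("unsupported delimiter: %r" % (delimiter,))
    return (
        "LOAD DATA LOCAL INFILE %s "
        "INTO TABLE `{table}` "
        "CHARACTER SET {charset} "
        "FIELDS TERMINATED BY '{delimiter}' "
        "OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    ).format(table=table, charset=charset, delimiter=delimiter)


def load_csv(conn, path, table, charset="utf8mb4", delimiter=";"):
    """Run the statement on an open DB-API connection and commit."""
    sql = build_load_statement(table, charset, delimiter)
    cur = conn.cursor()
    try:
        cur.execute(sql, (path,))
        conn.commit()
    finally:
        cur.close()
```

Usage would look like load_csv(conn, '/path/to/data.csv', 'message_history'). Passing the path as a parameter relies on the driver substituting it client-side, which is how MySQLdb behaves; other drivers may differ.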
character set? I mean, should we write UTF8 or UTF-8? Quoted or not? Case sensitive? – Butterfingers
You do not need to encode the characters in your file, but you do need to make sure the file is encoded as UTF-8 before loading it into the database.
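One way to check that before loading is to try decoding the file as UTF-8 in Python. The sketch below is only an illustration: check_utf8 is a hypothetical helper name, and the chunk size is an arbitrary choice. It reads the file in chunks so a large CSV does not have to fit in memory, and it also reports whether the file starts with a UTF-8 byte order mark, which may otherwise end up as part of the first field.

```python
import codecs


def check_utf8(path, chunk_size=1 << 20):
    """Return (is_valid_utf8, has_bom) for the file at path."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    with open(path, "rb") as f:
        # Look at the first three bytes to detect a UTF-8 BOM.
        head = f.read(len(codecs.BOM_UTF8))
        has_bom = head == codecs.BOM_UTF8
        try:
            decoder.decode(head)
            # The incremental decoder copes with multi-byte characters
            # that are split across chunk boundaries.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                decoder.decode(chunk)
            # final=True raises if the file ends in the middle of a character.
            decoder.decode(b"", final=True)
        except UnicodeDecodeError:
            return False, has_bom
    return True, has_bom
```

If the check fails, the file is probably in another encoding (Latin-1 is a common one) and needs converting, or the matching charset should be named in the LOAD DATA statement instead.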
You should pass
init_command = 'SET NAMES UTF8'
use_unicode = True
charset = 'utf8'
when calling MySQLdb.connect(), e.g.
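import MySQLdb

dbconfig = {}
dbconfig['host'] = 'localhost'
dbconfig['user'] = ''    # fill in your own credentials
dbconfig['passwd'] = ''
dbconfig['db'] = ''
dbconfig['init_command'] = 'SET NAMES UTF8'  # run on connect so the session uses UTF-8
dbconfig['use_unicode'] = True               # return text columns as unicode strings
dbconfig['charset'] = 'utf8'                 # character set of the client connection
conn = MySQLdb.connect(**dbconfig)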
edit: ah, sorry, I see you've added that you're using "LOAD DATA LOCAL INFILE" -- this wasn't clear from your initial question :)
Try something like,
LOAD DATA LOCAL INFILE "file" INTO TABLE message_history CHARACTER SET UTF8 COLUMNS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"';
Original Structure,
CHARACTER SET utf8mb4, as described here: https://mcmap.net/q/65066/-quot-incorrect-string-value-quot-when-trying-to-insert-utf-8-into-mysql-via-jdbc – Stagger
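For context on that suggestion: MySQL's utf8 charset stores at most three bytes per character, so characters that need four bytes in UTF-8 (emoji, some CJK extensions) only fit in utf8mb4. A small Python 3 sketch, using a hypothetical helper name needs_utf8mb4, can tell you whether a given piece of text contains such characters:

```python
def needs_utf8mb4(text):
    """True if text has a character above U+FFFF (4 bytes in UTF-8)."""
    return any(ord(ch) > 0xFFFF for ch in text)


# U+1F600 is an emoji outside the Basic Multilingual Plane.
sample = "caf\u00e9 \U0001F600"
if needs_utf8mb4(sample):
    print("use CHARACTER SET utf8mb4 for this data")
```

Running this over a sample of the CSV rows is a quick way to decide, though simply using utf8mb4 for both the table and the load statement avoids the question.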