Does mysqldump handle binary data reliably?

46

I have some tables in MySQL 5.6 that contain large binary data in some fields. I want to know if I can trust dumps created by mysqldump and be sure that those binary fields will not be corrupted when transferring the dump files through systems like FTP, SCP and such. Also, should I force such systems to treat the dump files as binary transfers instead of ASCII?

Thanks in advance for any comments!

Sycophancy answered 15/5, 2013 at 7:22 Comment(3)
forums.devshed.com/mysql-help-4/… I always use binary ftp mode for all the files. Never had any corruption.Booklet
You should always check the import somehow. Ideally by running a data compare utility, but that often involves duplicating much of the transfer. But even binary diffing zipped dumps at both ends via checksums is better than simply hoping everything is ok.Arnica
I cannot believe that in 2021 this is still an issue, and that MariaDB's mysqldump still doesn't store binary fields correctly in the dump. At least a warning could be shown...Quinque
71

No, it is not always reliable when you have binary blobs. In that case you MUST use the "--hex-blob" flag to get correct results.

Caveat from comment below:

If you combine the --hex-blob with the -T flag (file per table) then the hex-blob flag will be ignored, silently
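A sketch of a --hex-blob invocation (the database name and credentials are placeholders carried over from the example below). The second part of the block just illustrates locally why the hex form is transfer-safe: it is plain ASCII, with od standing in for MySQL's HEX() function:

```shell
# Hypothetical dump invocation with --hex-blob (db name/credentials are
# placeholders); BINARY/BLOB columns are then written as 0x... hex literals:
#
#   mysqldump --single-transaction --hex-blob --databases myalarm \
#       -uroot -p"PASSWORD" | gzip > myalarm.sql.gz
#
# The hex form is plain ASCII, so it survives text-mode transfers. The same
# encoding done locally, with od standing in for MySQL's HEX():
hex=$(printf 'abc' | od -An -tx1 | tr -d ' \n')
echo "abc encodes as: $hex"   # MySQL writes this literal as 0x616263
```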

I have a case where these calls fail (importing on a different server, but both running CentOS 6/MariaDB 10):

mysqldump --single-transaction --routines --databases myalarm -uroot -p"PASSWORD" | gzip > FILENAME.sql.gz
gunzip < FILENAME.sql.gz | mysql -p"PASSWORD" -uroot --comments

It produces a file that silently fails to import. Adding "--skip-extended-insert" gives me a file that's much easier to debug, and I find that this line is generated but can't be read (but no error is reported either exporting or importing):

INSERT INTO `panels` VALUES (1003,1,257126,141,6562,1,88891,'??\\\?ŖeV???,NULL);

Note that the terminating quote on the binary data is missing in the original.

select hex(packet_key) from panels where id=1003;
--> DE77CF5C075CE002C596176556AAF9ED
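The hex encoding shown above is lossless and reversible with UNHEX(), which is what makes the workaround below safe. A quick local illustration of the encoding step, with printf/od standing in for the raw column bytes and MySQL's HEX() (the four byte values are taken from the start of the hex output above):

```shell
# Encode the bytes DE 77 CF 5C (octal escapes \336 \167 \317 \134) the same
# way MySQL's HEX() does; od stands in for HEX(), uppercased to match.
hex=$(printf '\336\167\317\134' | od -An -tx1 | tr -d ' \n' | tr a-f A-F)
echo "$hex"   # matches the leading bytes of HEX(packet_key) above
```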

The column is binary data:

CREATE TABLE `panels` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `enabled` tinyint(1) NOT NULL DEFAULT '1',
  `serial_number` int(10) unsigned NOT NULL,
  `panel_types_id` int(11) NOT NULL,
  `all_panels_id` int(11) NOT NULL,
  `installers_id` int(11) DEFAULT NULL,
  `users_id` int(11) DEFAULT NULL,
  `packet_key` binary(16) NOT NULL,
  `user_deleted` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  ...

So no, not only can you not necessarily trust mysqldump, you can't even rely on it to report an error when one occurs.


An ugly workaround I used was to mysqldump excluding the two afflicted tables by adding options like this to the dump:

--ignore-table=myalarm.panels 

Then I used this bash script hack. Basically, run a SELECT that produces the INSERT values, where the NULL columns are handled and the binary column gets turned into an UNHEX() call, like so:

(123,45678,UNHEX("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"),"2014-03-17 00:00:00",NULL),

Paste it into your editor of choice to play with it if you need to.

echo "SET UNIQUE_CHECKS=0;SET FOREIGN_KEY_CHECKS=0;DELETE FROM panels;INSERT INTO panels VALUES " > all.sql
mysql -uroot -p"PASSWORD" databasename -e "SELECT CONCAT('(',id,',', enabled,',', serial_number,',', panel_types_id,',', all_panels_id,',', IFNULL(CONVERT(installers_id,CHAR(20)),'NULL'),',', IFNULL(CONVERT(users_id,CHAR(20)),'NULL'), ',UNHEX(\"',HEX(packet_key),'\"),', IF(ISNULL(user_deleted),'NULL',CONCAT('\"', user_deleted,'\"')),'),') FROM panels" >> all.sql
echo "SET UNIQUE_CHECKS=1;SET FOREIGN_KEY_CHECKS=1;" >> all.sql

That gives me a file called "all.sql" that needs the final comma in the INSERT turned into a semicolon, then it can be run as above. I needed the "large import buffer" tweaks set in both the interactive mysql shell and the command line to process that file because it's large.

mysql ... --max_allowed_packet=1GB

When I reported the bug I was eventually pointed at the "--hex-blob" flag, which does the same as my workaround, but trivially from my side. Add that option, blobs get dumped as hex, the end.

Arnica answered 30/7, 2015 at 4:50 Comment(3)
Note that that bug is still flagged "can't repeat" two years later despite my attempts to re-open it and the MySQL people don't seem inclined to fix it.Arnica
"Add that option, blobs get dumped as hex, the end." Except they don't, not always anyway. If you combine --hex-blob with the -T flag (file per table) then the hex-blob flag will be ignored, silently! I have tested this on an older customer system running 5.6 and it is giving me great trouble.Sycee
@EllertvanKoperen that's a new and even more horrible twist. I'm really sorry to hear it.Arnica
18

The dumps generated from mysqldump can be trusted.

To avoid problems with encodings, binary transfers, etc., use the --hex-blob option, which translates each binary value into a hex literal (for example, 'abc' becomes 0x616263). It will make the dump bigger, but it is the most compatible and safest way to preserve the info, since the dump is then pure text, with no risk of special bytes from the binary data being misinterpreted in a text file.

You can ensure the integrity (and speed up the transfer) of the dump files by packing them into a rar or zip archive. That way you can easily detect if the file got corrupted during the transfer.
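A minimal sketch of that integrity check: compare checksums of the dump at both ends of the transfer. Here both "ends" are local files and the dump content is made up, just to keep the example self-contained:

```shell
# Fabricated one-line "dump" standing in for a real mysqldump output file.
printf 'INSERT INTO panels VALUES (1,0xDE77CF5C);\n' > dump_source.sql

# Stand-in for the FTP/SCP copy to the destination server.
cp dump_source.sql dump_received.sql

# Compare checksums computed independently at each end.
src=$(sha256sum dump_source.sql | cut -d' ' -f1)
dst=$(sha256sum dump_received.sql | cut -d' ' -f1)
if [ "$src" = "$dst" ]; then echo "transfer OK"; else echo "dump corrupted"; fi
```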

When you try to load it on your server, check that you have set in your my.cnf server config file

[mysqld]
max_allowed_packet=600M

or more if needed.

BTW, I just did a migration and dumped lots of binary data with mysqldump, and it worked perfectly.

Quechua answered 31/7, 2015 at 8:41 Comment(2)
The response of the MySQL team to the above bug is "will not fix, use --hex-blob workaround" so that does seem to be the best solution.Arnica
I'm not sure I can agree with the first sentence. Only some of the dumps generated from mysqldump can be trusted, since you have to use --hex-blob when creating the dump file.Briones
-9

Yes, you can trust dumps generated by mysqldump.

Yes, you should use binary transfer in order to avoid any encoding conversion during transfer. mysqldump adds control commands to the dump so that the server interprets the file in a specific encoding when reimporting. You do not want to change this encoding.
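For classic FTP clients that default to ASCII mode, the switch has to be made explicitly before the transfer. A hypothetical, non-interactive session (host and filename are placeholders); the key point is that `binary` comes before `put`:

```shell
# Write an FTP command script; ASCII mode may rewrite line-ending bytes
# inside binary data, so 'binary' (image mode) must precede the transfer.
cat > ftp_commands.txt <<'EOF'
binary
put myalarm.sql.gz
bye
EOF

# It would then be run as:  ftp -n example.com < ftp_commands.txt
cat ftp_commands.txt
```

scp and sftp always transfer byte for byte, so they need no such switch.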

Disproportionate answered 15/5, 2013 at 8:32 Comment(7)
mysqldump doesn't add control commands by default, --hex-blob flag must be specified to ensure no weird misinterpretations of the binary data inside a text file.Quechua
Sorry, but -1 from me. I also had to use the --hex-blob flag when doing a dump / scp / restore between linux boxes.Augustina
Downvoters: don't assume that something is wrong when you simply do not understand it. The --hex-blob workaround avoids the encoding issue, since it outputs ASCII characters only. If the reported bug can't be reproduced, it is simply because the problem exists between keyboard and chair, and MySQL can't fix this. If someone could provide a repeatable scenario, then the so-called "bug" could be fixed. Please give me such a scenario and then I will publicly apologise. Until then I will just assume you don't know what you are talking about.Disproportionate
@GusstavvGil Yes it does, doesn't it? i.sstatic.net/QkF8Q.pngDisproportionate
Sorry, but belittling a comment and telling people "they don't know what they are talking about" doesn't make your argument right. First: you are using a recent version of MySQL to justify a 4-year-old comment, really? Second: the initial question is about the dump not getting corrupted during an FTP transfer. That's where the --hex-blob flag makes the dump ASCII only, safe to send. You even acknowledged it. So making the dumps ASCII only makes it unnecessary to set up the FTP transfer modes; problem solved. Sorry, but you are the one who didn't understand the question in the first place.Quechua
@GusstavvGil Yes it does, doesn't it? i.sstatic.net/gOgmz.png. Imagine I create a file in some text editor, with binary data inside, then I transfer this file over FTP to a second server. When I open the file on the second computer with the same text editor, the data is corrupted. Does this mean that the text editor is "not reliable"? No, it means I don't know about FTP or SCP transfer mode. Oh my god, I have once again wasted my time trying to fill a pot that's already full of itself. Amusingly, notice that OP understood her own question and accepted my answer.Disproportionate
@Disproportionate if you read my answer you will see an example of the problem. If you read the bug report I submitted you will see the same example. Please, if you don't understand the example ask questions. I'd like the bug fixed, I don't really care how rude the person fixing it is.Arnica
