Compare 2 large sql files and find differences to recover data
Asked Answered
E

3

9

I have 2 large SQL files of around 8GB each. But, in the latest backup I find that one file has 300MB data missing.

I just want to compare which data is missing, so that I can check that was it just temporary data OR important data that has vanished.

On comparing both files via diff on Ubuntu 14.04 I always get memory allocation error. I have also tried other allowing more than memory solutions and all that, but still no help.

I want to gather all data which exists in sql1 but missing in sql2 to a new file sql3.

Please help!

EDIT: I moved from Simple MySQl-Server to Percona XtraDB Cluster recently, A lot of tables were converted from MyISAM to INNODB in the process. So, can that be a reason for 300MB decreases in mysqldump SQL files? I seriously doubt this because a SQL will be an SQL, but does INNODB SQL code is decreased in any case? An expert advice on this will help.

Ey answered 7/3, 2017 at 4:46 Comment(3)
is it from one table or multiple tablesJagged
@Jagged Its ALL databases. But, I think it doesn't matter, I just want to see which lines are missing from second sql file but exist in first sql file.Ey
Red-Gate SQL Compare is a great tool that will do this for you. red-gate.com/products/mysql/mysql-compareAplasia
C
2

SQL dumps comparison is quite hard to do when dealing with large amounts of data. I would try the following:

  1. Import each SQL file data into its own database
  2. Use one of the methods indicated here to compare database content (I assume the schema is the same). E.g. Toad for MySql

This way of comparison should be faster, as data manipulation is much faster when stored into database and also has the advantage the missing data can easily used. E.g.

SELECT *
FROM db1.sometable
WHERE NOT EXISTS (SELECT 1 
                  FROM db2.sometable 
                  WHERE db1.sometable.pkcol = db2.sometable.pk2)

will return the exact missing information into a convenient way.

Clint answered 7/3, 2017 at 4:52 Comment(5)
Your solution suggests diff which I have tried already and I am getting memory allocation errors because the files are very large and I do not have enough memory. I have 153 databases, I cannot compare 1 by 1.Ey
@rsharpy - no, skip the first answer. Try this or other from below. As mentined, the first should be to get data from each SQL in its own database and perform the comparison at database level. Dump data is harder to manipulate.Clint
@Alexei Thanks friend. If I am going to first import both SQL files into servers and then compare each database with other version, it will take me a lot of time. But, if I just get to know a quick line per line difference report from the 2 sql files, I will quickly understand what is missed.Ey
Toad for MySql works on Windows and I have Mac and Linux only.Ey
@rsharpy - you may take a look upon [mysqldbcompare]( dev.mysql.com/doc/mysql-utilities/1.5/en/mysqldbcompare.html).Clint
U
1

If you export the dump you can use tools like Beyond Compare, Semantic Merge, Winmerge, Code Compare or other diff tools.

Not that some tools (i.e. Beyond Compare) have 4096 characters limit for a row, which becomes a problem in the comparison (I got mad). It's possible to change that in Tools->FileFormat->[choose your format, maybe it is EverythingElse]->Conversion->64000 characters Per Line (this is the maximum).

Also you can try changing the fileformat to SQL(might not help much though; and it will slow your comparison).

enter image description here

Union answered 13/6, 2019 at 14:52 Comment(0)
O
0
  1. Export only the needed tables as separate CSV files. In phpMyAdmin you can do that with an option

phpMyAdmin option

  1. Use any version control app, like SourceTree or any file comparison app, like FileMerge, to compare the files.
Orbital answered 18/3 at 8:28 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.