Comparing content of 2 files with md5sum
How can I compare the md5 sums for 2 files in one command?

I can compute them each individually:

my_prompt$ md5sum file_1.sql
20f750ff1aa835965ec93bf36fd8cf22  file_1.sql

my_prompt$ md5sum file_2.sql
733d53913c366ee87b6ce677971be17e  file_2.sql

But I wonder how this can be combined into a single comparison. I have tried different approaches, all of which fail:

my_prompt$ md5sum file_1.sql == md5sum file_2.sql
my_prompt$ `md5sum file_1.sql` == `md5sum file_2.sql`
my_prompt$ (md5sum file_1.sql) == (md5sum file_2.sql)
my_prompt$ `md5sum file_1.sql` -eq `md5sum file_2.sql`

What am I missing here? I tried following "Compare md5 sums in bash script" and https://unix.stackexchange.com/questions/78338/a-simpler-way-of-comparing-md5-checksum, without luck.

Sniper answered 28/5, 2021 at 13:5 Comment(0)
13

You need a program or built-in that evaluates the comparison. Usually you would use test, [, or [[ to do so. With these, -eq compares integers, so use the string comparison = instead. Also note that md5sum prints the file name after the hash, so feed each file through standard input to keep the names out of the comparison.

[[ "$(md5sum < file_1.sql)" = "$(md5sum < file_2.sql)" ]]

The exit code $? of this command tells you whether the two hashes were equal.
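Because md5sum prints the file name next to the hash when it is given a file argument, it can help to wrap the comparison in a small function that looks at the hash field only. The following is a minimal sketch, not part of the original answer; the name same_md5 and the return code 2 for unreadable files are assumptions made for illustration.

```shell
# same_md5 is a hypothetical helper name chosen for this sketch.
# It returns 0 if both files have the same MD5 hash, 1 if the hashes
# differ, and 2 if either file cannot be read.
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    # Reading via stdin makes md5sum print "-" instead of the file name;
    # cut keeps only the hash (the first space-separated field).
    sum_a=$(md5sum < "$1" | cut -d' ' -f1)
    sum_b=$(md5sum < "$2" | cut -d' ' -f1)
    [ "$sum_a" = "$sum_b" ]
}

# Possible usage:
# if same_md5 file_1.sql file_2.sql; then echo equal; else echo different; fi
```

Used in an if statement like this, there is no need to inspect $? separately.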

However, you may want to use cmp instead. It compares the files directly, should be faster because it does not have to compute a hash and can stop at the first differing byte, and is also safer because it cannot report a false positive the way a hash comparison can when two different files happen to share the same hash.

cmp file_1.sql file_2.sql
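For use in a script, cmp's exit status can drive the branch directly. This is a small sketch under the assumption that only a yes/no answer is wanted; files_identical is a hypothetical name, and -s is the standard option that suppresses cmp's output.

```shell
# files_identical is a hypothetical helper name used for illustration.
# cmp exits 0 when the files match, 1 when they differ, and a value
# greater than 1 when an error occurs (for example a missing file).
files_identical() {
    cmp -s -- "$1" "$2"
}

# Possible usage:
# if files_identical file_1.sql file_2.sql; then
#     echo "Same content"
# else
#     echo "Different content"
# fi
```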
Keffer answered 28/5, 2021 at 13:15 Comment(7)
Exactly what I needed. Clear and precise explanation. – Sniper
One question: do you need a second call, such as echo $?, to get the exit code? – Sniper
Is there any way to get everything into one single call? – Sniper
Yes, but you usually don't access $?. Instead, you write something like if [[ a = b ]]; then echo equal; else echo different; fi. cmp tells you directly if the files are different and remains quiet if they are the same. – Keffer
It might be better to add -n 32 to compare only the 32 ASCII characters (16 bytes). – Omni
@RyanChen I don't understand. Does the sql file format start with a plain-text hexadecimal hash of the entire file? If it does, that would be a great shortcut to rule out equality. But I would do a full check anyway, to rule out false-positive equality (for cases like checking whether a copy is corrupted, this is absolutely necessary). If the first bytes differ, any reasonable implementation of cmp should already skip the rest. – Keffer
My reason for searching "compare two files checksum" was that if I searched with just "compare two files" I'd be flooded with diff-like and other results. But that's why I say that brainful people like you are required here on SO (who share solutions to the problem in general, not just answers). – Dday
6

By passing the filenames as arguments to the md5sum command, we have something like:

$ md5sum foo.json bar.json
07a9a5c765f5d861b506eabd02f5aa4b *foo.json
07a9a5c765f5d861b506eabd02f5aa4b *bar.json

So, we have to compare the first column of the md5sum output:

if [[ $(md5sum foo.json bar.json | awk '{print $1}' | uniq | wc -l) == 1 ]]
then
    echo "Identical files"
else
    echo "There are differences"
fi

If we need the return code, we can use the test command as follows:

test $(md5sum foo.json bar.json | awk '{print $1}' | uniq | wc -l) == 1

Let's break down the command:

$ md5sum foo.json bar.json
07a9a5c765f5d861b506eabd02f5aa4b *foo.json
07a9a5c765f5d861b506eabd02f5aa4b *bar.json

$ md5sum foo.json bar.json | awk '{print $1}'
07a9a5c765f5d861b506eabd02f5aa4b
07a9a5c765f5d861b506eabd02f5aa4b

$ md5sum foo.json bar.json | awk '{print $1}' | uniq
07a9a5c765f5d861b506eabd02f5aa4b

$ md5sum foo.json bar.json | awk '{print $1}' | uniq | wc -l
1

$ test $(md5sum foo.json bar.json | awk '{print $1}' | uniq | wc -l) == 1

$ echo $?
0
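The same pipeline extends to more than two files, since the count of distinct hashes is 1 only when every hash matches. Here is a hedged sketch of that idea as a function; all_same_md5 is a hypothetical name, and the extra check on md5sum's exit status is an addition of this sketch so that an unreadable file is not silently ignored.

```shell
# all_same_md5 is a hypothetical helper name used for illustration.
# It returns 0 if all given files share one MD5 hash, 1 if they do not,
# and 2 if fewer than two files are given or md5sum reports an error.
all_same_md5() {
    [ "$#" -ge 2 ] || return 2
    # md5sum exits non-zero if any file cannot be read.
    sums=$(md5sum -- "$@") || return 2
    # Keep the hash column, collapse repeated lines, count what is left.
    [ "$(printf '%s\n' "$sums" | awk '{print $1}' | uniq | wc -l)" -eq 1 ]
}

# Possible usage:
# all_same_md5 foo.json bar.json && echo "Identical files"
```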
Devanagari answered 28/5, 2021 at 13:26 Comment(0)
3

This will work with bash

$ md5sum file1 file2 | md5sum --check

You will get OK for both files if md5 are equal. You can also use this for 3 files or more.

Myke answered 20/1, 2022 at 6:58 Comment(3)
This will run, but it will not solve the problem of checking whether both files are identical. This line only checks whether the just-generated md5sums are correct. They have to be, because you just generated them. ;-) – Leschen
This does not work. The OK just shows that the md5sum of each file is verified and correct; it does not compare the files against each other. As long as each file matches its own md5sum, it will show OK, even if the two files are completely different. – Chlamydeous
--check will perform the hash again and compare it to the hash that was produced before the pipe. You're not comparing the two files; you're comparing each file against itself. – Irrevocable
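The point made in these comments can be reproduced with two deliberately different files. The following is a small demonstration sketch; the temporary directory and the names a.txt and b.txt are made up for the example.

```shell
# Create two files with different content in a scratch directory.
tmp=$(mktemp -d)
printf 'one\n' > "$tmp/a.txt"
printf 'two\n' > "$tmp/b.txt"

# --check re-hashes each file and compares it with its own line from the
# pipe, so both files are reported as OK although their contents differ.
check_out=$(cd "$tmp" && md5sum a.txt b.txt | md5sum --check)
echo "$check_out"

# Counting the distinct hashes does reveal the difference: the result is 2.
distinct=$(cd "$tmp" && md5sum a.txt b.txt | awk '{print $1}' | uniq | wc -l)
echo "distinct hashes: $distinct"

rm -rf "$tmp"
```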
-2

Following @Socowi's answer, there is a way to get the answer in one line:

[[ "$(md5sum file_1.sql)" = "$(md5sum file_2.sql)" ]] && echo "Same content" || echo "Different content"

&& and || act as "and" and "or". When && is followed by ||, the combination works much like a ternary operator in other programming languages, except that the || branch also runs if the command after && fails. In other words, if the md5sums are equal, then echo "Same content", else echo "Different content".

Abseil answered 17/1, 2023 at 11:6 Comment(1)
This does not work, because the file names are printed as well, so the result is always different! – Naominaor
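To see the effect described in this comment, one can compare the full md5sum lines and the hash fields side by side for two files with identical content. This is an illustrative sketch only; the scratch files are created just for the demonstration, and the ${var%% *} expansion is one assumed way of dropping everything after the hash.

```shell
# Two files with identical content but different names.
tmp=$(mktemp -d)
printf 'same\n' > "$tmp/file_1.sql"
cp "$tmp/file_1.sql" "$tmp/file_2.sql"

line_1=$(cd "$tmp" && md5sum file_1.sql)
line_2=$(cd "$tmp" && md5sum file_2.sql)

# The whole lines differ because each one ends with its own file name.
[ "$line_1" = "$line_2" ] && whole="Same content" || whole="Different content"

# ${var%% *} removes everything from the first space on, leaving the hash.
[ "${line_1%% *}" = "${line_2%% *}" ] && hashes="Same content" || hashes="Different content"

echo "whole lines: $whole"
echo "hashes only: $hashes"

rm -rf "$tmp"
```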
