Will changing a file name affect the MD5 Hash of a file?
H

7

104

Will changing a file name affect the MD5 hash of a file?

Hon answered 20/2, 2011 at 3:53 Comment(0)
D
37

The usual definition of "MD5 hash of a file" is that the hash is based on the file contents. The name can be freely changed.

$hash1 = md5_file('old_name.txt');
rename('old_name.txt', 'new_name.txt'); // change file name
$hash2 = md5_file('new_name.txt');

The two hash codes will be the same.

In some (fairly specialized) use cases, file metadata (name, timestamps, etc.) is part of the data used to compute the hash. Then

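// illustrative only: here the file name is hashed together with the contents
$hash1 = md5('old_name.txt' . file_get_contents('old_name.txt'));
rename('old_name.txt', 'new_name.txt'); // change file name
$hash2 = md5('new_name.txt' . file_get_contents('new_name.txt'));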

will produce two different hashes.
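
To make the two cases concrete, here is a minimal Python sketch. The helper names `content_md5` and `name_and_content_md5` are made up for illustration, and hashing the base name ahead of the bytes is only one hypothetical way a tool might mix metadata into the hash; it is not how md5sum works.

```python
import hashlib
import os
import tempfile


def content_md5(path):
    """MD5 of the file's bytes only (the usual meaning of 'MD5 of a file')."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def name_and_content_md5(path):
    """Hypothetical scheme: MD5 over the base name followed by the bytes."""
    h = hashlib.md5()
    h.update(os.path.basename(path).encode("utf-8"))
    with open(path, "rb") as f:
        h.update(f.read())
    return h.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "old_name.txt")
    new = os.path.join(tmp, "new_name.txt")
    with open(old, "wb") as f:
        f.write(b"some arbitrary content\n")

    content_before = content_md5(old)
    named_before = name_and_content_md5(old)

    os.rename(old, new)  # change the file name, leave the bytes alone

    content_after = content_md5(new)
    named_after = name_and_content_md5(new)

print("content-only hash unchanged:", content_before == content_after)
print("name+content hash unchanged:", named_before == named_after)
```

The content-only hash survives the rename because the name is never fed to the hash function; the second helper changes only because it was deliberately written to include the name.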

Deluxe answered 20/2, 2011 at 3:57 Comment(1)
The question is asking about the CLI tool "md5sum", not the algorithm in general.Ageold
A
232

No, the hash is of the file contents only. You can see this in the source for md5sum and its MD5 implementation. You can also test this if you have access to md5sum:

$ echo "some arbitrary content" > file1
$ cp file1 file2
$ md5sum file1
f0007cbddd79de02179de7de12bec4e6  file1
$ md5sum file2
f0007cbddd79de02179de7de12bec4e6  file2
$
Ageold answered 16/1, 2013 at 14:38 Comment(2)
You don't need to use Linux to know this. You can produce the same result on Mac OS X or Windows.Homeopathy
In case anyone's looking for the Windows equivalent like @alexandreMulatinho mentioned: replace md5sum with fciv and cp with copy, and it works just the same. If you then enter the Windows Subsystem for Linux, the md5sum hashes match the fciv ones.Tragic
H
6

On Linux with an ext filesystem, it will not, because a file's name is not stored in the file itself. It is stored in the directory entry (dentry) of the directory the file lives in, which maps the name to the file's inode. Changing a file name therefore has no effect on its md5sum on Linux. On Windows, I cannot be sure.
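
As a rough way to observe this, the Python sketch below renames a temporary file and compares the inode number and the MD5 of the contents before and after. It assumes an ordinary local filesystem where a rename within one directory keeps the inode; the file names are arbitrary.

```python
import hashlib
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "file1")
    new = os.path.join(tmp, "file2")
    with open(old, "wb") as f:
        f.write(b"some arbitrary content\n")

    inode_before = os.stat(old).st_ino
    with open(old, "rb") as f:
        md5_before = hashlib.md5(f.read()).hexdigest()

    # Renaming rewrites the directory entry; the file's data is not touched.
    os.rename(old, new)

    inode_after = os.stat(new).st_ino
    with open(new, "rb") as f:
        md5_after = hashlib.md5(f.read()).hexdigest()

print("same inode:", inode_before == inode_after)
print("same MD5:  ", md5_before == md5_after)
```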

Hauger answered 17/9, 2013 at 19:3 Comment(1)
Also Windows file systems don't store the filename in the file. A straightforward port of md5sum should behave as expected.Overburdensome
H
0

If the hash is computed from the file contents, it shouldn't.

Hydatid answered 20/2, 2011 at 3:55 Comment(1)
The question is asking about the CLI tool "md5sum", not the algorithm in general.Ageold
S
0

In ESXi (ESXi 5.5, to be precise), md5sum gives different results for files with the same content but different names. That leads me to believe that the VMFS-5 file structure includes the file name too. If we are not concerned about the file name, is there a way to check only the md5sum of the file content? I couldn't see any option. Any suggestions?

Shawnee answered 4/6, 2014 at 11:9 Comment(1)
Which files are you talking about? Virtual disk images (.vmdk)? In vmdk headers there are data which could depend on the file name and location. How did you rename the files in your test? --- Otherwise from the file content point of view VMFS is a normal file system and the content of files does not directly depend on their names.Saintmihiel
I
-1

In response to the comment, https://mcmap.net/q/204589/-will-changing-a-file-name-affect-the-md5-hash-of-a-file:

This works only if one file is a copy of another file, but not when two different files with different names are generated with exactly the same content. I have tried this:

nancy@nancy:~/Documents$ md5sum /home/nancy/Documents/1test.pdf
c5a445b7186dfb220ea79d2001acf3f1  /home/nancy/Documents/1test.pdf
nancy@nancy:~/Documents$ md5sum /home/nancy/Documents/2test.pdf
cefa063abf0c0a9e80b2b75e70100836  /home/nancy/Documents/2test.pdf

Both files, 1test.pdf and 2test.pdf, were created using GIMP. The same content was exported twice under two different names.
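
One way to check whether two such files are really byte-for-byte identical is to look for the first offset at which they differ. The Python sketch below does that; `first_difference` is a hypothetical helper, and the two small demo files merely stand in for the exported PDFs, which are not available here.

```python
import os
import tempfile


def first_difference(path_a, path_b, chunk_size=65536):
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is a prefix of the other.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                return None  # both files ended with no difference found
            offset += len(chunk_a)


with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "1test.bin")
    path_b = os.path.join(tmp, "2test.bin")
    # Made-up bytes: same visible content, one differing embedded byte.
    with open(path_a, "wb") as f:
        f.write(b"same header\nexport A\n")
    with open(path_b, "wb") as f:
        f.write(b"same header\nexport B\n")

    diff_offset = first_difference(path_a, path_b)
    self_offset = first_difference(path_a, path_a)

print("first differing offset:", diff_offset)
print("file compared with itself:", self_offset)
```

If the function returns an offset rather than None, the files do not have the same content, and different MD5 hashes are expected regardless of the file names.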

Ingulf answered 22/12, 2019 at 14:30 Comment(1)
That means the file contents are not exactly the same... The program must export different meta-data in eachOma
H
-2

1. The MD5 is calculated from the binary content of the file.
2. File name, last-modified time, etc. are metadata. MD5 does not depend on metadata.

I tested this with the steps below, using the "last modified" metadata:

i) I created a file named "a.txt", added some content, and computed its hash; say the hash is "xyz".
ii) Then I added a single space to the file and computed the hash again; say it returned "abc".
iii) I removed the change from step (ii), and on computing the hash again I got the initial hash ("xyz").

This shows that even though the file's metadata has changed, the hash stays the same as long as the file content is unaltered.
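
A more direct variant of this test changes only the timestamps and leaves the bytes alone. The Python sketch below is one way to do that with `os.utime`; the file name and the fixed timestamp are arbitrary choices for illustration.

```python
import hashlib
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.txt")
    with open(path, "wb") as f:
        f.write(b"some content\n")

    with open(path, "rb") as f:
        hash_before = hashlib.md5(f.read()).hexdigest()
    mtime_before = os.stat(path).st_mtime

    # Set access and modification times to a fixed moment in the past
    # (seconds since the epoch); the file's bytes are not modified.
    os.utime(path, (1_000_000_000, 1_000_000_000))

    with open(path, "rb") as f:
        hash_after = hashlib.md5(f.read()).hexdigest()
    mtime_after = os.stat(path).st_mtime

print("mtime changed:", mtime_before != mtime_after)
print("hash changed: ", hash_before != hash_after)
```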

Hope it helps.

Hack answered 24/4, 2019 at 10:1 Comment(0)
