Why does the same JAR file have a different hash every time I build it?

4

19

I've been thinking about checking a JAR file's hash value to determine whether it has changed, but it turns out the same JAR has a different hash every time I build it (whether I export it as a JAR file from Eclipse or build it with Maven). I've removed the date values from the manifest file, but the hash still differs. Is there something in bytecode generation that includes a timestamp?

Thamos answered 16/5, 2017 at 5:42 Comment(1)
To answer your other question: no, there are no timestamps in the bytecode. But the ZIP file does, of course, have timestamps. If you want reproducible Java builds, you should check out Bazel. - Magavern
11

A JAR file is a ZIP file, and every entry in it carries a last-modified date in both its local file header and its central directory file header. Those dates change from build to build, which is why the hashes of your builds differ.

If you run the jar command on the exact same set of files (with the same file dates) and skip manifest creation, you should get a byte-identical JAR file, provided the order of the files inside the ZIP does not change.
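
A minimal sketch of that claim, using only java.util.zip rather than the jar tool itself. The class name DeterministicJarDemo, the entry names, and the fixed instant are made up for illustration. It writes the same entries in sorted order with a pinned timestamp, then again with a timestamp one minute later, and compares SHA-256 hashes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class DeterministicJarDemo {

    // Writes the entries in sorted name order, stamping each one with the given time.
    public static byte[] buildArchive(Map<String, byte[]> files, long entryTimeMillis)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            // TreeMap fixes the entry order regardless of how the input map iterates.
            for (Map.Entry<String, byte[]> file : new TreeMap<>(files).entrySet()) {
                ZipEntry entry = new ZipEntry(file.getKey());
                // Stored as a DOS date/time, converted using the JVM's default time zone.
                entry.setTime(entryTimeMillis);
                zip.putNextEntry(entry);
                zip.write(file.getValue());
                zip.closeEntry();
            }
        }
        return out.toByteArray();
    }

    // Hex-encoded SHA-256 of the given bytes.
    public static String sha256(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<String, byte[]> files = new TreeMap<>();
        // Hypothetical entry names and contents, not real class files.
        files.put("com/example/Hello.class", "fake bytecode".getBytes(StandardCharsets.UTF_8));
        files.put("app.properties", "name=demo".getBytes(StandardCharsets.UTF_8));

        long fixed = 1_500_000_000_000L; // arbitrary fixed instant
        long later = fixed + 60_000L;    // one minute later

        String first = sha256(buildArchive(files, fixed));
        String again = sha256(buildArchive(files, fixed));
        String shifted = sha256(buildArchive(files, later));

        System.out.println("same timestamps, same hash: " + first.equals(again));
        System.out.println("different timestamps, same hash: " + first.equals(shifted));
    }
}
```

Because the timestamp is converted with the default time zone, two machines would also need the same time zone setting for this sketch to produce identical bytes.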

Bennettbenni answered 16/5, 2017 at 6:10 Comment(5)
Is there a way to tell it to ignore file dates? Or remove them from headers? - Thamos
@Thamos Doesn't seem to be possible. But I would recommend switching to another way of generating the hash, e.g. run the checksum task on the files before you pack them into the JAR: ant.apache.org/manual/Tasks/checksum.html - Bennettbenni
Seems legitimate. Thanks for your answer, mate. - Thamos
You might want to check out Bazel. It modifies the Java compiler to strip out all the timestamps, leading to fully reproducible builds. - Magavern
@Magavern Thanks for pointing to Bazel. I couldn't find a quick reference to the Java timestamp handling; maybe the Bazel team should promote this interesting feature more prominently. - Bennettbenni
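
The idea from the comments above, hashing the contents instead of the archive bytes, can also be applied to an already built JAR. The following is only a sketch under that assumption, not the Ant checksum task: the class name JarContentHash and the way names and per-entry digests are combined are choices made here for illustration. It ignores entry timestamps and entry order, so two JARs with the same files hash the same.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class JarContentHash {

    // Hashes entry names and entry contents only; timestamps and order are ignored.
    public static String contentHash(InputStream archive)
            throws IOException, NoSuchAlgorithmException {
        // Sorted by name so the order inside the archive does not matter.
        Map<String, byte[]> digestByName = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(archive)) {
            ZipEntry entry;
            byte[] buffer = new byte[8192];
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories have no content to hash
                }
                MessageDigest entryDigest = MessageDigest.getInstance("SHA-256");
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    entryDigest.update(buffer, 0, read);
                }
                digestByName.put(entry.getName(), entryDigest.digest());
            }
        }

        MessageDigest total = MessageDigest.getInstance("SHA-256");
        for (Map.Entry<String, byte[]> e : digestByName.entrySet()) {
            total.update(e.getKey().getBytes(StandardCharsets.UTF_8));
            total.update((byte) 0); // separator between name and digest
            total.update(e.getValue());
        }

        StringBuilder hex = new StringBuilder();
        for (byte b : total.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the JAR to inspect.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(contentHash(in));
        }
    }
}
```

A date written into MANIFEST.MF is part of an entry's content, so it still changes this hash. The same applies to nested archives, which are hashed as opaque bytes.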
4

I had the same issue with Gradle builds. In my case, my .war file included many built .jar files.

In Gradle, the Jar and War tasks are both essentially variants of the Zip task, which has a property called "preserveFileTimestamps" (https://docs.gradle.org/current/dsl/org.gradle.api.tasks.bundling.Zip.html#org.gradle.api.tasks.bundling.Zip:preserveFileTimestamps). To make the SHAs the same, set this property to false for both the jar and war tasks, for example somewhere in build.gradle:

plugins.withType(WarPlugin).whenPluginAdded {
    war {
        preserveFileTimestamps = false
    }
}
jar {
    preserveFileTimestamps = false
}

Another note: if you build on macOS, make sure .DS_Store files don't get into the built archive, as they will also cause different SHAs.

To stop macOS from writing them (note that this particular setting covers network volumes), run this in the terminal:

defaults write com.apple.desktopservices DSDontWriteNetworkStores true

Then reboot the machine. You will still have to delete the existing .DS_Store files, so run this from inside your project folder:

find . -name '.DS_Store' -exec rm {} \;

If you want the SHAs to match even when building on different operating systems, set the reproducibleFileOrder property to true for both the war and jar tasks, and make sure the umask is the same on every system you build on. Gradle apparently stores the file attributes inside the war/jar files, and I got different SHAs when those attributes differed.
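
When two builds still disagree, it helps to see which entries differ and how. The sketch below is one way to check, assuming two archive paths passed on the command line; the class name JarMetadataDiff is made up for this example. It lists each entry's position, name, timestamp and CRC using java.util.zip.ZipFile and reports the positions where the two archives disagree. File permissions are not covered, since java.util.zip does not expose them.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarMetadataDiff {

    // One line per entry, in the order ZipFile reports them
    // (in practice, the order of the central directory).
    public static List<String> describe(Path archive) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry e = entries.nextElement();
                lines.add(String.format("%s time=%d crc=%08x",
                        e.getName(), e.getTime(), e.getCrc()));
            }
        }
        return lines;
    }

    // Compares the two listings position by position and returns the mismatches.
    public static List<String> differences(Path left, Path right) throws IOException {
        List<String> a = describe(left);
        List<String> b = describe(right);
        List<String> result = new ArrayList<>();
        int count = Math.max(a.size(), b.size());
        for (int i = 0; i < count; i++) {
            String l = i < a.size() ? a.get(i) : "<missing>";
            String r = i < b.size() ? b.get(i) : "<missing>";
            if (!l.equals(r)) {
                result.add("entry " + i + ": " + l + " <> " + r);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are the two archives to compare.
        for (String line : differences(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(line);
        }
    }
}
```

A mismatch in name at the same position points to file order, a mismatch in time only points to timestamps, and a mismatch in CRC means the file contents themselves differ.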

With all of this in place, I got the same SHAs for the artifacts wherever I built them.

Cheers

Colossus answered 14/1, 2019 at 15:41 Comment(1)
Great tip, sergey! The preserveFileTimestamps option works perfectly. This one option significantly improves the Docker cache hit rate when building a Gradle project with the gradle docker prepare plugin. - Guru
4

The solution that worked best for me was the following in my Gradle file (note that I also blank out the manifest date, which some tasks change):

// Prevent manifest from changing every build
project.tasks.withType(Jar) {
    manifest.attributes Date: ''
}

// Prevent timestamps from appearing in JAR and use reproducible file order
tasks.withType(AbstractArchiveTask) {
    preserveFileTimestamps = false
    reproducibleFileOrder = true
}

Inspired by: https://dzone.com/articles/reproducible-builds-in-java

Archaeozoic answered 10/10, 2019 at 13:31 Comment(0)
2

Getting reproducible builds with Java, i.e. builds that always produce the same binary output, requires some tweaks, since Java was not reproducible-friendly from the beginning: JAR files, with their file order and timestamps, are the first natural source of variation. In addition to the issues caused by Java itself, some Maven plugins introduce further variations: see the Maven Reproducible/Verifiable Builds wiki page (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74682318)

For the Apache Maven build tool, popular with Java projects, you can use reproducible-build-maven-plugin (https://zlika.github.io/reproducible-build-maven-plugin). For the sbt build tool, popular with Scala projects, there is the sbt-reproducible-builds plugin (https://github.com/raboof/sbt-reproducible-builds). For Gradle, see https://docs.gradle.org/current/userguide/working_with_files.html#sec:reproducible_archives

For general information on 'Reproducible Builds', see https://reproducible-builds.org

Dishevel answered 28/8, 2019 at 20:0 Comment(0)
