Golang md5 Sum() function

package main

import (
    "crypto/md5"
    "fmt"
)

func main() {
    hash := md5.New()
    b := []byte("test")
    fmt.Printf("%x\n", hash.Sum(b))
    hash.Write(b)
    fmt.Printf("%x\n", hash.Sum(nil))
}

Output:

*md5.digest74657374d41d8cd98f00b204e9800998ecf8427e
098f6bcd4621d373cade4e832627b4f6

Could someone please explain why I get different results for the two prints?

Gnomon answered 15/6, 2014 at 21:33 Comment(1)
Have you got the correct answer for this? I can see in the comments that you are not clear about the result. I have an answer for this. - Rabinowitz

I'm building on the already good answers. I'm not sure Sum is actually the function you want. From the hash.Hash documentation:

// Sum appends the current hash to b and returns the resulting slice.
// It does not change the underlying hash state.
Sum(b []byte) []byte

This function has a dual use-case, which you seem to mix in an unfortunate way. The use-cases are:

  1. Computing the hash of a single run
  2. Chaining the output of several runs

In case you simply want to compute the hash of something, either use md5.Sum(data) or

digest := md5.New()
digest.Write(data)
hash := digest.Sum(nil)

This code will, according to the excerpt of the documentation above, append the checksum of data to nil, resulting in the checksum of data.
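To make the two options concrete, here is a minimal sketch comparing them. The helper names oneShot and streaming are made up for this illustration; only md5.Sum, md5.New, Write and Sum come from the standard library.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// oneShot (hypothetical helper) hashes data with the package-level md5.Sum,
// which returns a fixed-size [16]byte array.
func oneShot(data []byte) string {
	sum := md5.Sum(data)
	return fmt.Sprintf("%x", sum)
}

// streaming (hypothetical helper) hashes data through the hash.Hash interface:
// Write feeds the bytes in, Sum(nil) appends the checksum to a nil slice.
func streaming(data []byte) string {
	digest := md5.New()
	digest.Write(data)
	return fmt.Sprintf("%x", digest.Sum(nil))
}

func main() {
	data := []byte("test")
	fmt.Println(oneShot(data))   // 098f6bcd4621d373cade4e832627b4f6
	fmt.Println(streaming(data)) // 098f6bcd4621d373cade4e832627b4f6
}
```

Both paths should print the same checksum; the streaming form is mainly useful when the data arrives in pieces.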

If you want to chain several blocks of hashes, the second use-case of hash.Sum, you can do it like this:

hashed := make([]byte, 0)
for hasData {
    digest.Write(data)
    hashed = digest.Sum(hashed)
}

This will append each iteration's hash to the already computed hashes. Probably not what you want.
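As a rough, runnable version of the loop above: the chunk list and the helper names chain and hexOf are assumptions for this sketch, standing in for hasData and data. Note that the digest keeps accumulating, so each appended checksum covers everything written so far, not just the latest chunk.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// chain (hypothetical helper) writes each chunk to one digest and appends
// the running checksum to the result after every write.
func chain(chunks [][]byte) []byte {
	digest := md5.New()
	hashed := make([]byte, 0, len(chunks)*md5.Size)
	for _, chunk := range chunks {
		digest.Write(chunk)
		hashed = digest.Sum(hashed) // grows by md5.Size (16) bytes
	}
	return hashed
}

// hexOf (hypothetical helper) formats a byte slice as lowercase hex.
func hexOf(b []byte) string {
	return fmt.Sprintf("%x", b)
}

func main() {
	hashed := chain([][]byte{[]byte("te"), []byte("st")})
	fmt.Println(len(hashed))              // 32: two 16-byte checksums
	fmt.Println(hexOf(hashed[md5.Size:])) // 098f6bcd4621d373cade4e832627b4f6, the checksum of "test"
}
```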

So, now you should be able to see why your code is failing. If not, take this commented version of your code (on the Go Playground):

hash := md5.New()
b := []byte("test")
fmt.Printf("%x\n", hash.Sum(b))             // gives 74657374<hash> (74657374 = "test")
fmt.Printf("%x\n", hash.Sum([]byte("AAA"))) // gives 414141<hash> (41 = 'A')
fmt.Printf("%x\n", hash.Sum(nil))           // gives <hash> as append(nil, hash) == hash

fmt.Printf("%x\n", hash.Sum(b))             // gives 74657374<hash> (74657374 = "test")
fmt.Printf("%x\n", hash.Sum([]byte("AAA"))) // gives 414141<hash> (41 = 'A')
hash.Write(b)
fmt.Printf("%x\n", hash.Sum(nil))           // gives a completely different hash since internal bytes changed due to Write()
Incisive answered 15/6, 2014 at 23:57 Comment(0)

You have two ways to get the MD5 sum of a byte slice:

func main() {
    hash := md5.New()
    b := []byte("test")
    hash.Write(b)
    fmt.Printf("way one : %x\n", hash.Sum(nil))
    fmt.Printf("way two : %x\n", md5.Sum(b))
}

According to http://golang.org/src/pkg/crypto/md5/md5.go#L88, your hash.Sum(b) is like calling append(b, md5-hash-of-empty-input), because nothing has been written to the hash yet.

The definition of Sum:

func (d0 *digest) Sum(in []byte) []byte {
    // Make a copy of d0 so that caller can keep writing and summing.
    d := *d0
    hash := d.checkSum()
    return append(in, hash[:]...)
}

When you call Sum(nil), it appends the result of d.checkSum() to nil, so you get just the checksum as a byte slice. If you call Sum with a non-empty slice, it appends the result of d.checkSum() to your input.
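A small sketch to check this append behaviour. The helpers sumWithPrefix, appendedByHand and sameResult are invented for this example; it assumes only the documented behaviour of hash.Hash.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"fmt"
)

// sumWithPrefix (hypothetical helper) returns h.Sum(prefix) for a hash
// that has consumed data.
func sumWithPrefix(data, prefix []byte) []byte {
	h := md5.New()
	h.Write(data)
	return h.Sum(prefix)
}

// appendedByHand (hypothetical helper) builds the same bytes explicitly:
// a copy of prefix followed by the plain checksum from Sum(nil).
func appendedByHand(data, prefix []byte) []byte {
	h := md5.New()
	h.Write(data)
	checksum := h.Sum(nil)
	out := append([]byte{}, prefix...) // copy so prefix is left untouched
	return append(out, checksum...)
}

// sameResult reports whether both constructions produce identical bytes.
func sameResult(data, prefix []byte) bool {
	return bytes.Equal(sumWithPrefix(data, prefix), appendedByHand(data, prefix))
}

func main() {
	fmt.Println(sameResult([]byte("test"), []byte("AAA"))) // true
	// Nothing written: prefix "AAA" (414141) followed by the MD5 of empty input.
	fmt.Printf("%x\n", sumWithPrefix(nil, []byte("AAA"))) // 414141d41d8cd98f00b204e9800998ecf8427e
}
```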

Metapsychology answered 15/6, 2014 at 22:13 Comment(4)
This doesn't answer why Sum(nil) is different. - Gnomon
@Gnomon As far as I can see, the only reason why your Sum(nil) returns something different is that you call Write beforehand, which makes hash produce a different checksum the next time checkSum() is called. See play.golang.org/p/8yvaj7MvLB - Incisive
Thanks, but the answer is still vague, since the hash of an empty md5 hash does not have that number of digits. Thanks @Incisive, that makes it clearer to me, but it still does not explain why the format of the printed text is different. - Gnomon
"%x" prints the hex representation of a byte array; calling Sum(b) appends the hash of "" to the value of b. - Metapsychology

From the docs:

    // Sum appends the current hash to b and returns the resulting slice.
    // It does not change the underlying hash state.
    Sum(b []byte) []byte

So "74657374d41d8cd98f00b204e9800998ecf8427e" is actually the hex representation of "test" (74657374) followed by the checksum of the hash in its initial state, i.e. the MD5 of empty input (d41d8cd98f00b204e9800998ecf8427e).

fmt.Printf("%x", []byte("test"))

will result in... "74657374"!

So basically hash.Sum(b) is not doing what you think it does. The second statement is the right hash.

Myramyrah answered 15/6, 2014 at 22:3 Comment(1)
Thanks, it's clear to me now what Sum(b) is returning, but it's still unclear why Sum(nil) is different. - Gnomon

To answer your question directly:

why/how do I get different result for the two print ?

Ans:

hash := md5.New()

This creates a new, empty md5 hash instance. When you then call hash.Sum(b), it does not hash b: it appends the checksum of the still-empty hash to b. Hence you get 74657374d41d8cd98f00b204e9800998ecf8427e as output, which is the bytes of b followed by the MD5 of empty input.

In the next statement, hash.Write(b) writes b to the hash instance. Calling hash.Sum(nil) then computes the MD5 of the bytes you just wrote and appends it to nil, giving 098f6bcd4621d373cade4e832627b4f6. It is not combined with the previous output, because Sum does not change the underlying hash state.

This is the reason you are getting these outputs.
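The two steps can be reproduced in isolation with a sketch like the following. The function names beforeWrite and afterWrite are made up here; the sequence of calls mirrors the question's program.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// beforeWrite (hypothetical helper) returns the hex of hash.Sum(b) on a
// fresh hash: the bytes of b followed by the MD5 of empty input.
func beforeWrite(b []byte) string {
	hash := md5.New()
	return fmt.Sprintf("%x", hash.Sum(b))
}

// afterWrite (hypothetical helper) repeats the question's full sequence:
// Sum(b), then Write(b), then Sum(nil).
func afterWrite(b []byte) string {
	hash := md5.New()
	_ = hash.Sum(b) // leaves the hash state unchanged
	hash.Write(b)
	return fmt.Sprintf("%x", hash.Sum(nil))
}

func main() {
	b := []byte("test")
	fmt.Println(beforeWrite(b)) // 74657374d41d8cd98f00b204e9800998ecf8427e
	fmt.Println(afterWrite(b))  // 098f6bcd4621d373cade4e832627b4f6
}
```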

For your reference look at the Sum API:

func (d0 *digest) Sum(in []byte) []byte {
    // Make a copy of d0 so that caller can keep writing and summing.
    d := *d0
    hash := d.checkSum()
    return append(in, hash[:]...)
}
Rabinowitz answered 9/3, 2017 at 12:48 Comment(0)
