Little-endian data and SHA-256

I have to generate SHA-256 hashes of data that is in little-endian form. I would like to know whether I have to convert it to big-endian first before running the SHA-256 algorithm, or whether the algorithm is "endian-agnostic".

EDIT: Sorry, I think I wasn't clear. What I would like to know is the following: the SHA-256 algorithm requires padding the end of a message with certain bits. The first step is to append a 1 bit to the message, then to pad with zeros, and at the very end to append the length of the message in bits. What I would like to know is whether this padding can be performed in little-endian. For example, for a 640-bit message, I could write the last word as 0x280 (in big endian) or 0x80020000 (byte-swapped, in little endian). Can this padding be done in little endian?

Clarenceclarenceux answered 7/6, 2011 at 18:10 Comment(1)
This is tricky... though in general, SHA does not care about endianness. To the hash, any input is just a binary "blob", processed as a multiple of 512-bit blocks (with padding added if necessary). In that respect, endianness is irrelevant. On the other hand, if you have, for example, a struct once in little and once in big endian and you hash both, they will of course produce different hashes. But that's because they're different binary data, not because the hash cares. - Bogtrotter

The SHA-256 implementation itself should take care of padding - you shouldn't have to deal with that unless you're implementing your own specialized SHA-256 code. If you are, note that the padding rules specified in the "pre-processing step" say that the length is a 64-bit big-endian integer. See SHA-2 - Wikipedia

It's hard to even figure out what "endian-agnostic" would mean, but the order of all the bits, bytes and words for a hash algorithm matters a whole lot, so I sure wouldn't use that term.
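
For illustration, a minimal sketch of that length-field rule in C, assuming the total message length in bits is already known (the helper name put_length_be64 is hypothetical): it writes the bit length into the last 8 bytes of the final padded block, most significant byte first, as the pre-processing step requires.

    #include <stdint.h>

    /* Hypothetical helper: encode the message length in bits as a 64-bit
       big-endian value into the last 8 bytes of the final padded block. */
    static void put_length_be64(uint8_t last8[8], uint64_t bitlen)
    {
        for (int i = 0; i < 8; i++)
            last8[i] = (uint8_t)(bitlen >> (56 - 8 * i));
    }

For the 640-bit message from the question this produces the bytes 00 00 00 00 00 00 02 80, i.e. 0x280 written out in big-endian order, regardless of the host's byte order.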

Translocation answered 8/7, 2011 at 23:6 Comment(0)
A
8

SHA-256 is endian-agnostic if all you want is a good hash. But if you are writing your own SHA-256 and want to get the same results as a correct implementation, then you must play games on little-endian hardware. SHA-256 combines arithmetic addition (mod 2^32) with boolean operations and is therefore not endian-agnostic internally.
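
As a sketch of those "games", assuming uint32_t words and 64-byte blocks (the function name load_block_be is hypothetical): the 16 words of each 512-bit block have to be loaded in big-endian byte order before the additions and boolean rounds.

    #include <stdint.h>

    /* Load the 16 message words of a 512-bit block as big-endian 32-bit
       integers. On a big-endian host this matches a plain memcpy; on a
       little-endian host it performs the byte swap referred to above. */
    static void load_block_be(uint32_t w[16], const uint8_t block[64])
    {
        for (int i = 0; i < 16; i++)
            w[i] = ((uint32_t)block[4 * i]     << 24) |
                   ((uint32_t)block[4 * i + 1] << 16) |
                   ((uint32_t)block[4 * i + 2] <<  8) |
                    (uint32_t)block[4 * i + 3];
    }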

Astrophotography answered 10/12, 2012 at 21:34 Comment(0)

Let me reply regarding SHA-256 as well as SHA-512. In short: the algorithm itself is endian-agnostic. The endian-sensitive parts are where data is imported from a byte buffer into the algorithm's working variables and where it is exported back to the digest result - also a byte buffer. If the import/export involves casting, then endianness matters.

Where could casting occur? In SHA-512 there is a working buffer of 128 bytes. In my code it's defined like this:

    union
    {
        U64   w[80];        /* see the U64 example below */
        byte  buffer[128];
    };

Input data is copied into this byte buffer and the work is then done on w. This means the data was cast to a 64-bit type, so it has to be byte-swapped; in my case it's swapped on little-endian machines.

A better method would be to prepare a get macro that takes each byte and places it in its correct place in the U64 type.

When the algorithm is done, the digest result is output from the working variables to a byte buffer; if this is done with memcpy, it will also have to be swapped.
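
For example, a sketch of that export step which avoids the memcpy-then-swap, assuming plain uint64_t state words rather than the two-word U64 struct shown below (the function name is hypothetical):

    #include <stdint.h>

    /* Store the eight 64-bit state words into the 64-byte digest buffer in
       big-endian byte order, so no separate swap pass is needed on
       little-endian machines. */
    static void export_digest_be(uint8_t out[64], const uint64_t state[8])
    {
        for (int i = 0; i < 8; i++)
            for (int j = 0; j < 8; j++)
                out[8 * i + j] = (uint8_t)(state[i] >> (56 - 8 * j));
    }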

Another cast can occur when implementing SHA-512 - which is designed for 64-bit machines - on a 32-bit machine. In my case I have a 64-bit type defined as:

    typedef struct {
        uint high;
        uint low;
    } U64;

Assume I define it for little endian as well, as follows:

    typedef struct {
        uint low;
        uint high;
    } U64;

And then the k constant array is initialized like this:

    static const U64 k[80] =
    {
        {0xD728AE22, 0x428A2F98}, {0x23EF65CD, 0x71374491}, ...
        ...
        ...
    };

But I need the logical value of k[0].high to be the same on any machine, so in this example I will need another k array with the high and low values swapped for little-endian builds.

After the data is stored in the working variables, any bitwise manipulation will have the same result on both big- and little-endian machines.

A good method is to avoid any casting: import bytes from the input buffer into your working variables using a macro, work with logical values without thinking about the memory mapping, and export the output to the digest result with a macro.

Macros for taking 32 bits from a byte buffer into a 32-bit integer (BE = big-endian, LE = little-endian):

    #define GET_BE_BYTES_FROM32(a)       \
        ((((NQ_UINT32) (a)[0]) << 24) |  \
         (((NQ_UINT32) (a)[1]) << 16) |  \
         (((NQ_UINT32) (a)[2]) << 8)  |  \
         ((NQ_UINT32) (a)[3]))

    #define GET_LE_BYTES_FROM32(a)       \
        ((((NQ_UINT32) (a)[3]) << 24) |  \
         (((NQ_UINT32) (a)[2]) << 16) |  \
         (((NQ_UINT32) (a)[1]) << 8)  |  \
         ((NQ_UINT32) (a)[0]))
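
A 64-bit analogue for the two-word U64 struct shown above could be built on the same idea (a hypothetical sketch, filling high from the first four big-endian bytes and low from the next four):

    /* Sketch: fill the high/low halves of the U64 struct from 8 big-endian
       bytes, so the logical value is the same on any host. */
    #define GET_BE_BYTES_FROM64(dst, a)                    \
        do {                                               \
            (dst).high = GET_BE_BYTES_FROM32(a);           \
            (dst).low  = GET_BE_BYTES_FROM32((a) + 4);     \
        } while (0)
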
Vanny answered 24/8, 2016 at 11:4 Comment(0)
