When is htonl(x) != ntohl(x) ? (Or when is converting to and from Network Byte Order not equivalent on the same machine?)
C

3

9

Regarding htonl and ntohl: when would either of these two lines of code evaluate to false?

 htonl(x) == ntohl(x);

 htonl(ntohl(x)) == htonl(htonl(x));

In other words, when are these two operations not equivalent on the same machine? The only scenario I can think of is a machine that does not work on 2's complement for representing integers.

Is the reason for having two functions largely historical, for coding clarity, or something else?

Do any modern architectures or environments exist today where converting to and from network byte order on the same machine is not the same code in either direction?

Canikin answered 23/7, 2012 at 17:45 Comment(7)
Did you mean x != ntohl(htonl(x))? Also, is there a specific environment where you got conflicting results, or some code? - Jugglery
I understand endianness and network byte order completely. My question is more curiosity about why there are two conversion functions to begin with. Put another way, when is ntohl(htonl(x)) != htonl(htonl(x))? - Canikin
ntohl(htonl(x)) will always be equivalent to htonl(ntohl(x)) on any machine, since the two functions are inverses of one another. - Johnnie
Ah, I get your question. That does seem a bit odd. The necessity for byte reordering is clear, but why wasn't it just one function that would switch between orderings each time it was called? You don't need to know the endianness of the input to have a general solution for changing it back and forth. That is curious, if I didn't completely butcher your question, that is :P - Jugglery
@DanielDiPaolo I think what OP is getting at is that htonl(htonl(x)) = x = ntohl(htonl(x)) in all situations. - Jugglery
I think even if the two functions were equivalent, it would still be desirable to have two different functions in order to keep the intent of the code clear. With something like swap_int32(x) it's not obvious whether you mean to be internalizing or externalizing the data. - Jayejaylene
Fun fact: a function that is its own inverse is an involution (took me a bit to remember that term). Might be a good way to describe ntohl and htonl. - Jugglery
C
7

I couldn't find the original draft of the POSIX spec, but a recent one I found online has a hint.

Network byte order may not be convenient for processing actual values. For this, it is more sensible for values to be stored as ordinary integers. This is known as “host byte order”. In host byte order:

The most significant bit might not be stored in the first byte in address order.

**Bits might not be allocated to bytes in any obvious order at all.**

8-bit values stored in uint8_t objects do not require conversion to or from host byte order, as they have the same representation. 16 and 32-bit values can be converted using the htonl(), htons(), ntohl(), and ntohs() functions.

Interestingly, though, the following statement is made under the discussion of

The POSIX standard explicitly requires 8-bit char and two’s-complement arithmetic.

So that basically rules out my idea of a 1's complement machine implementation.

But the "any obvious order at all" statement suggests that the POSIX committee at least considered the possibility of POSIX/Unix running on something other than a big- or little-endian machine. As such, the possibility that htonl and ntohl need different implementations can't be ruled out.

So the short answer is "htonl and ntohl are the same implementation, but the interface of two different functions is for future compatibility with the unknown."

Canikin answered 25/7, 2012 at 8:52 Comment(2)
+1. HTML version available here. - Bister
So, a crazy machine might use network byte order rotated right by 8 bits as its host byte order. On such a machine, htonl and ntohl would not be the same: htonl would have to rotate left by 8 bits, and ntohl would have to rotate right by 8 bits. - Woolsey
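
The rotated layout imagined in the comment above can be modelled in a few lines of C. This is only a sketch: `swap32`, `rot_htonl`, and `rot_ntohl` are hypothetical names, and the functions treat values as abstract 32-bit integers rather than inspecting memory. It contrasts a full byte swap, which is its own inverse, with an 8-bit rotation, which is not.

```c
#include <stdint.h>

/* Full byte reversal, the operation a little-endian host can use for
 * both htonl and ntohl. Applying it twice returns the input. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Hypothetical host whose byte order is network order rotated right
 * by 8 bits. Converting to network order rotates left by 8 bits... */
static uint32_t rot_htonl(uint32_t host)
{
    return (host << 8) | (host >> 24);
}

/* ...and converting back to host order rotates right by 8 bits. */
static uint32_t rot_ntohl(uint32_t net)
{
    return (net >> 8) | (net << 24);
}
```

With x = 0x11223344, swap32(swap32(x)) gives back x, while rot_htonl(x) is 0x22334411 and rot_ntohl(x) is 0x44112233, so on such a machine the two directions would need different code.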
G
8

I wrote a TCP/IP stack for a UNIVAC 1100 series mainframe many years ago. This was a 36 bit, word addressable computer architecture with 1's complement arithmetic.

When this machine did communications I/O, 8 bit bytes arriving from the outside world would get put into the lower 8 bits of each 9 bit quarter-word. So on this system, ntohl() would squeeze 8 bits in each quarter word down into the lower 32 bits of the word (with the top 4 bits zero) so you could do arithmetic on it.

Likewise, htonl() would take the lower 32 bits in a word and undo this operation to put each 8 bit quantity into the lower 8 bits of each 9 bit quarter word.

So to answer the original question, the ntohl() and htonl() operations on this computer architecture were very different from each other.

For example:

COMP*                                 . COMPRESS A WORD
          LSSL      A0,36             . CLEAR OUT A0
          LSSL      A1,1              . THROW AWAY TOP BIT
          LDSL      A0,8              . GET 8 GOOD ONE'S
          LSSL      A1,1              .
          LDSL      A0,8              .
          LSSL      A1,1              .
          LDSL      A0,8              .
          LSSL      A1,1              .
          LDSL      A0,8              .
          J         0,X9              .
.
DCOMP*                                . DECOMPRESS A WORD
          LSSL      A0,36             . CLEAR A0
          LSSL      A1,4              . THROW OUT NOISE
          LDSL      A0,8              . MOVE 8 GOOD BITS
          LSSL      A0,1              . ADD 1 NOISE BIT
          LDSL      A0,8              . MOVE 8 GOOD BITS
          LSSL      A0,1              . ADD 1 NOISE BIT
          LDSL      A0,8              . MOVE 8 GOOD BITS
          LSSL      A0,1              . ADD 1 NOISE BIT
          LDSL      A0,8              . MOVE 8 GOOD BITS
          J         0,X9              .

COMP is the equivalent of ntohl() and DCOMP of htonl(). For those not familiar with UNIVAC 1100 assembly code :-), LSSL is "Left Single Shift Logical", which shifts a register left by a number of positions. LDSL is "Left Double Shift Logical", which shifts a pair of registers left by the specified count. So LDSL A0,8 shifts the concatenated A0, A1 registers left 8 bits, shifting the high 8 bits of A1 into the lower 8 bits of A0.
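
For readers who find C easier to follow than 1100 assembly, here is a rough model of the same two operations. It is a sketch and not the original code: it assumes the 36-bit word is held in the low 36 bits of a `uint64_t`, and `univac_ntohl` and `univac_htonl` are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Model of COMP: take the low 8 bits of each 9-bit quarter-word,
 * most significant quarter first, and pack them into 32 bits.
 * The top bit of every quarter-word is discarded. */
static uint32_t univac_ntohl(uint64_t word36)
{
    uint32_t out = 0;
    for (int q = 3; q >= 0; --q) {
        out = (out << 8) | (uint32_t)((word36 >> (9 * q)) & 0xFFu);
    }
    return out;
}

/* Model of DCOMP: spread a 32-bit value back out so that each
 * 8-bit quantity sits in the low 8 bits of a 9-bit quarter-word,
 * with the ninth bit of each quarter-word left as zero. */
static uint64_t univac_htonl(uint32_t value)
{
    uint64_t out = 0;
    for (int q = 3; q >= 0; --q) {
        out = (out << 9) | ((value >> (8 * q)) & 0xFFu);
    }
    return out;
}
```

In this model univac_htonl(0x100) yields 0x200, whereas univac_ntohl(0x100) yields 0 because bit 8 is one of the discarded bits, which shows the two directions are different operations.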

This code was written in 1981 for a UNIVAC 1108. Some years later, when we had an 1100/90 and it grew a C compiler, I started a port of the BSD NET/2 TCP/IP implementation and implemented ntohl() and htonl() in a similar way. Sadly, I never completed that work.

If you wonder why some of the Internet RFCs use the term "octet", it's because some computers in the day (like PDP-10s, Univacs, etc.) had "bytes" that were not 8 bits. An "octet" was defined specifically to be an 8-bit byte.

Ginsberg answered 30/7, 2016 at 14:36 Comment(2)
UNIVAC 1100 --> images.fineartamerica.com/images-medium-large-5/… - Paget
A similar example is at github.com/akrsnr/c/blob/master/hton_ntoh_difference.c - Paget
S
-2

Not all machines have the same endianness, and these functions take care of that. Network byte order is defined to be big-endian. If you run ntohl on a big-endian machine, the output will be the same as the input (because the host's endianness matches the network's). If your machine is little-endian, ntohl will convert the data from big-endian to little-endian. The same can be said about htonl (it converts host data to network byte order when necessary). To answer your question, these two operations are not equivalent when you're transmitting data between two machines with different endianness.
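
The behaviour described above can be sketched without knowing the host's endianness in advance. The functions below are hypothetical stand-ins (`sketch_htonl` and `sketch_ntohl` are not library names) and assume a host with 8-bit bytes and a 4-byte `uint32_t`, as POSIX requires. One writes the bytes most significant first, the other reads them back in that order.

```c
#include <stdint.h>
#include <string.h>

/* Host to network: lay the value out most significant byte first,
 * then reinterpret those four bytes as a uint32_t. */
static uint32_t sketch_htonl(uint32_t host)
{
    unsigned char wire[4];
    uint32_t net;

    wire[0] = (unsigned char)(host >> 24);
    wire[1] = (unsigned char)(host >> 16);
    wire[2] = (unsigned char)(host >> 8);
    wire[3] = (unsigned char)host;
    memcpy(&net, wire, sizeof net);
    return net;
}

/* Network to host: view the value as four bytes in memory order and
 * reassemble them, treating the first byte as most significant. */
static uint32_t sketch_ntohl(uint32_t net)
{
    unsigned char wire[4];

    memcpy(wire, &net, sizeof wire);
    return ((uint32_t)wire[0] << 24) | ((uint32_t)wire[1] << 16) |
           ((uint32_t)wire[2] << 8)  |  (uint32_t)wire[3];
}
```

On a big-endian host both functions return their argument unchanged, and on a little-endian host both reverse the four bytes, so they produce the same result for every input even though the source code of the two differs.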

Silverman answered 23/7, 2012 at 17:52 Comment(5)
That wasn't my question. When are these operations not equivalent on the same machine? - Canikin
@Canikin it seems like he answers it. On big-endian machines, ntohl(x) == x and htonl(x) == x. - Johnnie
I understand. It seems that ntohl and htonl would run the same underlying conversion routine and thus they'd always be equivalent on the same machine, but I haven't been able to find any documentation to verify this. - Silverman
He's answering the wrong question. I'm asking what kind of computer exists such that "htonl(x) != ntohl(x)" is not a true expression. - Canikin
Or what I really meant in the above comment: when does "htonl(x) == ntohl(x)" not evaluate to true? - Canikin
