Understanding htonl() and ntohl()

I am trying to use Unix sockets to test sending some UDP packets to localhost.

It is my understanding that when setting the IP address and port in order to send packets, I should fill my sockaddr_in with values converted to network byte order. I am on OS X and I'm astonished that this

printf("ntohl: %d\n", ntohl(4711));
printf("htonl: %d\n", htonl(4711));
printf("plain: %d\n", 4711);

Prints

ntohl: 1729232896
htonl: 1729232896
plain: 4711

So neither function actually returns the plain value. I would have expected to see either the results differ, as x86 is little-endian (afaik), or be identical and the same as the actual number 4711. Clearly I do not understand what htonl and ntohl and their variants do. What am I missing?

The relevant code is this:

int main(int argc, char *argv[])
{
   if (argc != 4)
   {
      fprintf(stderr, "%s\n", HELP);
      exit(-1);
   }

   in_addr_t rec_addr = inet_addr(argv[1]); // first arg is '127.0.0.1'
   in_port_t rec_port = atoi(argv[2]);      // second arg is port number
   printf("Address is %s\nPort is %d\n", argv[1], rec_port);
   char* inpath = argv[3];

   char* file_buf;
   unsigned long file_size = readFile(inpath, &file_buf); // I am trying to send a file
   if (file_size > 0)
   {
      struct sockaddr_in dest;
      dest.sin_family      = AF_INET;
      dest.sin_addr.s_addr = rec_addr; // here I would use htons
      dest.sin_port        = rec_port;
      printf("ntohs: %d\n", ntohl(4711));
      printf("htons: %d\n", htonl(4711));
      printf("plain: %d\n", 4711);
      int socket_fd = socket(AF_INET, SOCK_DGRAM, 0);
      if (socket_fd != -1)
      {
         int error;
         error = sendto(socket_fd, file_buf, file_size + 1, 0, (struct sockaddr*)&dest, sizeof(dest));
         if (error == -1)
            fprintf(stderr, "%s\n", strerror(errno));
         else printf("Sent %d bytes.\n", error);
      }
   }

   free(file_buf);
   return 0;
}
Stirk answered 28/4, 2016 at 20:11 Comment(2)
Note that your text says "htons" and "ntohs", but you're actually calling htonl() and ntohl(). - Paella
@JohnBollinger Yes, that came from trying both with the same result, thanks for the remark. - Stirk
S
13

Both functions reverse the bytes' order (on little-endian machines). Why would that return the argument itself?

Try htons(ntohs(4711)) and ntohs(htons(4711)).

Saul answered 28/4, 2016 at 20:14 Comment(2)
Um…right. I guess the real question is then why I am unable to send anything to localhost unless I use the plain IP address, but that is a separate issue. - Stirk
"Reverse the bytes" only IF they have to be reversed on your system. Either both of them reverse the bytes or neither of them does; it depends on your architecture. - Elijah
L
33

As others have mentioned, both htons and ntohs reverse the byte order on a little-endian machine, and are no-ops on big-endian machines.

What wasn't mentioned is that these functions take a 16-bit value and return a 16-bit value. If you want to convert 32-bit values, you want to use htonl and ntohl instead.

The names of these functions come from the traditional sizes of certain datatypes. The s stands for short while the l stands for long. A short is typically 16-bit while on older systems long was 32-bit.

In your code, you don't need to call htonl on rec_addr, because that value was returned by inet_addr, and that function returns the address in network byte order.

You do however need to call htons on rec_port.

Lorient answered 28/4, 2016 at 20:22 Comment(1)
If I try to send packets between two programs running on localhost, and treat rec_port with htons, 8080 turns into 36895. - Stirk
F
14

"Network byte order" always means big endian.

"Host byte order" depends on architecture of host. Depending on CPU, host byte order may be little endian, big endian or something else. (g)libc adapts to host architecture.

Because the Intel architecture is little endian, both functions do the same thing there: they reverse the byte order.

Froghopper answered 28/4, 2016 at 20:18 Comment(1)
"Network byte order" evolved to mean big endian. It was not always that way. It is extraordinarily common now. en.wikipedia.org/wiki/Endianness#Networking - Worktable
S
10

These functions are poorly named. Host-to-network and network-to-host are actually the same operation, and really should be called 'change endianness if this is a little-endian machine'.

So on a little-endian machine you do

net (i.e. BE) number = htonl/ntohl(LE number)

and send the BE number on the wire. And when you get a big-endian number from the wire

LE number = htonl/ntohl(net, i.e. BE, number)

On a big-endian machine

net (i.e. BE) number = htonl/ntohl(BE number)

and

BE number = htonl/ntohl(net, i.e. BE, number)

and in the last two cases you see that these functions do nothing.

Stour answered 28/4, 2016 at 20:22 Comment(4)
Well, be careful. Although there are few, if any, machines these days that use a byte order different from both little-endian and big-endian, there have been such machines in the past, with some prominence. It is at least conceivable that there are present or future machines where, say, htonl() is not its own inverse. POSIX engineered these functions to be able to handle that if they should ever need to. - Paella
OK, then it should be called 'change endianness to big if this is not a big-endian machine'. - Stour
Even if the two functions do the same thing, having different names for them is still useful because it clarifies the code's intent, i.e. if I see x=ntohl(blah) I know that the value assigned to x will be semantically meaningful, whereas if I see x=htonl(blah) I know that the value of x is not one that I can rely on to be native-endian. If the code were just x=maybe_endian_swap(blah), then it would not be obvious to me whether I can e.g. printf("%i\n", x); afterwards and see a semantically meaningful value printed. - Dovecote
Endianness is not a point of view of origin, it's a point of view of destination: commandcenter.blogspot.com/2012/04/byte-order-fallacy.html - Elijah
