I came across two threads:
Socket with recv-timeout: What is wrong with this code?
Reading / Writing to a socket using a FILE stream in c
one uses htonl
and the other doesn't.
Which is right?
Since other constants like INADDR_LOOPBACK
are in host byte order, I submit that all the constants in this family should have htonl
applied to them, including INADDR_ANY
.
(Note: I wrote this answer while @Mat was editing; his answer now also says it's better to be consistent and always use htonl
.)
Rationale
It is a hazard to future maintainers of your code if you write it like this:
if (some_condition)
sa.s_addr = htonl(INADDR_LOOPBACK);
else
sa.s_addr = INADDR_ANY;
If I were reviewing this code, I would immediately question why one of the constants has htonl
applied and the other does not. And I would report it as a bug, whether or not I happened to have the "inside knowledge" that INADDR_ANY
is always 0 so converting it is a no-op.
The code you write is not only about having the correct runtime behavior; it should also be obvious where possible, and easy to believe correct. For this reason you should not strip out the htonl
around INADDR_ANY
. The three reasons for not using htonl
that I can see are:
htonl
because they will know it does nothing (since they know the value of the constant by heart).
htonl
. I submit that people who know C but do not have expertise with its socket APIs will find it easier to maintain code if it is consistent. I will agree to disagree with your assertion that it is better to write code which is not friendly to newbies. –
Injun htonl
and friends, so there will be zero performance impact. –
Meeker INADDR_ANY
is the "any address" in IPV4. That address is 0.0.0.0
in dotted notation, so 0x00000000
in hex on any endianness. Passing it through htonl
has no effect.
Now if you want to wonder about other macro constants, look at INADDR_LOOPBACK
if it's defined on your platform. Chances are it will be a macro like this:
#define INADDR_LOOPBACK 0x7f000001 /* 127.0.0.1 */
(from linux/in.h
, equivalent definition in winsock.h
).
So for INADDR_LOOPBACK
, an htonl
is necessary.
For consistency, it could thus be better to use htonl
in all cases.
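To make the byte-order point concrete, here is a minimal sketch that looks at how a 32-bit value is laid out in memory. The helper name `is_wire_loopback` is made up for this illustration; only `htonl`, `INADDR_ANY` and `INADDR_LOOPBACK` come from the system headers.

```c
#include <arpa/inet.h>   /* htonl */
#include <netinet/in.h>  /* INADDR_ANY, INADDR_LOOPBACK */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the bytes of `value`, read in memory
 * order (the order they would go on the wire), spell 127.0.0.1. */
static int is_wire_loopback(uint32_t value)
{
    unsigned char b[4];

    memcpy(b, &value, sizeof b);
    return b[0] == 127 && b[1] == 0 && b[2] == 0 && b[3] == 1;
}
```

`is_wire_loopback(htonl(INADDR_LOOPBACK))` holds on any host, while `is_wire_loopback(INADDR_LOOPBACK)` holds only on a big-endian one. For `INADDR_ANY` all four bytes are zero either way, which is why `htonl(INADDR_ANY) == INADDR_ANY`.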
htonl()
introduces a code maintenance bug is naive -- you clearly haven't actually done any sockets interface programming. There is no occasion when maintenance would require changing an INADDR_ANY to an INADDR_LOOPBACK, and there is no such person as a qualified maintainer of sockets code who doesn't know that INADDR_ANY stands for zero. I would say better not to dumb it down and invite an unqualified maintainer into the source -- they cost more than they benefit. –
Medorra Neither is right, in the sense that both INADDR_ANY
and htonl
are deprecated, and lead to complex, ugly code that only works with IPv4. Switch to using getaddrinfo
for all of your socket address creation needs:
struct addrinfo *ai, hints = { .ai_flags = AI_PASSIVE|AI_ADDRCONFIG };
getaddrinfo(0, "1234", &hints, &ai);
Replace "1234"
with your port number or service name.
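For context, a slightly fuller sketch of how the result of that call might be turned into a listening socket. The helper name `listen_on` is an assumption of this example, and so are the choices of `SOCK_STREAM` and of leaving out `AI_ADDRCONFIG`; it is not meant as the one true way to do it.

```c
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening stream socket bound to the
 * wildcard address for `service` (port number or service name as a
 * string), or -1 on failure. */
static int listen_on(const char *service)
{
    struct addrinfo hints, *res, *ai;
    int sock = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6, whatever is offered */
    hints.ai_socktype = SOCK_STREAM;  /* assumption: a TCP-style server */
    hints.ai_flags = AI_PASSIVE;      /* NULL node means the wildcard address */

    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;

    /* Try each candidate until one can be bound and put into listen state. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (sock == -1)
            continue;
        if (bind(sock, ai->ai_addr, ai->ai_addrlen) == 0 &&
            listen(sock, SOMAXCONN) == 0)
            break;                    /* success: keep this socket */
        close(sock);
        sock = -1;
    }
    freeaddrinfo(res);
    return sock;
}
```

Note that neither `INADDR_ANY` nor `htonl` appears anywhere: the wildcard address arrives already encoded inside `ai_addr`.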
atoi
than using htons
, but otherwise, OK. –
Medorra AI_NUMERICSERV
is only needed to inhibit string-based service lookup. If your service string is a number anyway, it should be a no-op. But it couldn't hurt to include it. –
Downhearted /etc/services
or first try atoi
on the parameter? –
Medorra /etc/services
when the argument is numeric. Then again glibc never ceases to amaze me, so you might want to use strace
and check that... ;-) –
Downhearted /etc/services
was not opened when the service was a number (as in your answer). It was opened when the service was a name (like "asp"). This was using glibc 2.13 (the current stable). –
Injun AI_NUMERICSERV
. The only use would be explicitly rejecting non-numeric service names, but you could just as easily have rejected them earlier yourself. Thanks for checking. –
Downhearted getaddrinfo
is not a panacea, and that the much simpler inet_pton
should be used if possible - blog.powerdns.com/2014/05/21/… –
Adulterine inet_pton
cannot work, for example, with link-local addresses requiring a scope id, unless you add address-family-specific logic on top of it. The blog post you linked is about a stupid glibc bug which has hopefully been reported and fixed. If not somebody should do that. In the post-Drepper era glibc is much better about actually fixing bugs rather than closing them as WONTFIX. –
Downhearted Stevens uses htonl(INADDR_ANY)
consistently in the book UNIX Network Programming (my copy is from 1990).
The current release version of FreeBSD defines 12 INADDR_
constants in netinet/in.h
; 9 of the 12 require htonl()
for proper functionality. (The 9 are INADDR_LOOPBACK
and 8 other multicast group addresses such as INADDR_ALLHOSTS_GROUP
and INADDR_ALLMDNS_GROUP
.)
In practice, it makes no difference whether you use INADDR_ANY
or htonl(INADDR_ANY)
, other than the possible performance hit from htonl()
. And even that possible performance hit may not exist -- with my 64-bit gcc 4.2.1
, turning on any level of optimization at all seems to activate compile-time htonl()
conversion of constants.
In theory it would be possible for some implementer to redefine INADDR_ANY
to a value where htonl()
actually does something, but such a change would break tens of thousands of existing pieces of code out there and wouldn't survive in the "real world"... Too much code exists which depends explicitly or implicitly on INADDR_ANY
being defined as some sort of zero-valued integer. Stevens likely didn't intend for anyone to assume that INADDR_ANY
is always zero when he wrote:
cli_addr.sin_addr.s_addr = htonl(INADDR_ANY); cli_addr.sin_port = htons(0);
In assigning a local address for the client using
bind
, we set the Internet address to INADDR_ANY
and the 16-bit Internet port to zero.
Was going to add this as a comment, but it got a little long-winded ...
I think it's clear from the answers and the commentary here that htonl()
needs to be used on these constants (albeit that calling it on INADDR_ANY
and INADDR_NONE
are tantamount to no-ops). The problem that I see as to where the confusion arises is that it is not explicitly called out in documentation - someone please correct me if I simply missed it, but I have not seen in the man pages, nor in the include header where it explicitly states that the defines for INADDR_*
are in host order. Again, not a big deal for INADDR_ANY
, INADDR_NONE
, and INADDR_BROADCAST
, but it is significant for INADDR_LOOPBACK
.
Now, I've done quite a bit of low-level socket work in C, but the loopback address rarely, if ever, gets used in my code. Although this topic is over a year old, this very problem just jumped up to bite me in the behind today, and it was because I went on the mistaken assumption that the addresses defined in the include header are in network order. Not sure why I had that idea - probably because the in_addr
structure needs to have the address in network order, inet_aton
and inet_addr
return their values in network order, and so my logical assumption was that these constants would be usable as-is. Throwing together a quick 5-liner to test that theory showed me otherwise. If any of the powers-that-be happen to see this, I would make the suggestion to explicitly call out that the values are, in fact, in host order, not network order, and that htonl()
should be applied to them. For consistency's sake, I would also suggest, as others have done so already here, that htonl()
be used for all of the INADDR_*
values, even if it does nothing to the value.
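A quick test of the kind described above could look roughly like the sketch below. It is not the original five-liner, just an illustration; the helper name `matches_parsed_loopback` is made up here.

```c
#include <arpa/inet.h>   /* inet_addr, htonl */
#include <netinet/in.h>  /* in_addr_t, INADDR_LOOPBACK */

/* Hypothetical helper: returns 1 if `candidate` equals what inet_addr()
 * produces for "127.0.0.1", i.e. if it is already in network byte order. */
static int matches_parsed_loopback(in_addr_t candidate)
{
    return candidate == inet_addr("127.0.0.1");
}
```

`matches_parsed_loopback(htonl(INADDR_LOOPBACK))` is true everywhere; passing the bare `INADDR_LOOPBACK` is true only on a big-endian host, which is exactly the trap described above.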
Let's summarize it a little bit, as none of the previous answers seems to be up to date and I may not be the last person who will see this question page. Opinions have been given both for wrapping the INADDR_ANY constant in htonl and for omitting the call.
Nowadays (and it's been nowadays for quite some time now) system libraries are mostly IPv6 ready, so we use IPv4 as well as IPv6. The situation with IPv6 is much easier as the data structures and constants don't suffer from byte order. One would use 'in6addr_any' as well as 'in6addr_loopback' (both struct in6_addr type) and both of them are constant objects in the network byte order.
See why IPv6 doesn't suffer from the same problem (if IPv4 addresses were defined as four byte arrays they wouldn't suffer either):
struct in_addr {
uint32_t s_addr; /* address in network byte order */
};
struct in6_addr {
unsigned char s6_addr[16]; /* IPv6 address */
};
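As a small sketch of what that looks like in practice (the function name `make_loopback6` is hypothetical, chosen for this example), the IPv6 constant is copied as-is and only the port still needs a conversion:

```c
#include <arpa/inet.h>   /* htons */
#include <netinet/in.h>  /* struct sockaddr_in6, in6addr_loopback */
#include <string.h>
#include <sys/socket.h>  /* AF_INET6 */

/* Hypothetical helper: build a socket address for [::1]:port. */
static struct sockaddr_in6 make_loopback6(unsigned short port)
{
    struct sockaddr_in6 sa;

    memset(&sa, 0, sizeof sa);
    sa.sin6_family = AF_INET6;
    sa.sin6_port = htons(port);       /* the port is an integer, so convert it */
    sa.sin6_addr = in6addr_loopback;  /* a byte array: plain struct copy */
    return sa;
}
```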
For IPv4, it would be nice to also have 'inaddr_any' and 'inaddr_loopback' as 'struct in_addr' constants (so that they can also be compared with memcmp or copied with memcpy). Indeed it might be a good idea to create them in your program as they aren't provided by glibc and other libraries:
const struct in_addr inaddr_loopback = { htonl(INADDR_LOOPBACK) };
With glibc, this only works for me inside a function (and I can't make it static
), as htonl
is not a macro but an ordinary function.
The problem is that glibc (in contrast with what was claimed in other answers) doesn't provide htonl as a macro but rather as a function. Therefore you would have to:
static const struct in_addr inaddr_any = { 0 };
#if BYTE_ORDER == BIG_ENDIAN
static const struct in_addr inaddr_loopback = { 0x7f000001 };
#elif BYTE_ORDER == LITTLE_ENDIAN
static const struct in_addr inaddr_loopback = { 0x0100007f };
#else
#error Neither big endian nor little endian
#endif
That would be a really nice addition to the headers and then you could work with IPv4 constants as easily as you can with IPv6.
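If the preprocessor branches feel too fragile, one alternative (an assumption of mine, not something the headers provide) is to build the value at run time inside a function, where calling htonl is always allowed. Both function names below are made up for the sketch.

```c
#include <arpa/inet.h>   /* htonl */
#include <netinet/in.h>  /* struct in_addr, INADDR_LOOPBACK */
#include <string.h>

/* Hypothetical accessor: returns 127.0.0.1 as a struct in_addr,
 * already in network byte order. */
static struct in_addr inaddr_loopback_value(void)
{
    struct in_addr a;

    a.s_addr = htonl(INADDR_LOOPBACK);
    return a;
}

/* Hypothetical helper: byte-wise comparison of two IPv4 addresses. */
static int in_addr_equal(const struct in_addr *a, const struct in_addr *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

The price is a function call instead of a constant object, but the result can be compared with memcmp or copied with memcpy just like the IPv6 constants.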
But then to implement that, I had to use some constants to initialize that. When I know the respective bytes exactly, I don't need any constants. Just as some people claim that htonl()
is redundant for a constant that evaluates to zero, anyone else could claim that the constant itself is redundant as well. And he would be right.
In code I prefer explicit to implicit. Therefore, if those constants (like INADDR_ANY, INADDR_ALL, INADDR_LOOPBACK) are all consistently in host byte order, the only correct approach is to treat them that way. See for example (when not using the above constant):
struct in_addr address4 = { htonl(use_loopback ? INADDR_LOOPBACK : INADDR_ANY) };
Of course you could say that you don't need to call htonl
for INADDR_ANY and therefore you could:
struct in_addr address4 = { use_loopback ? htonl(INADDR_LOOPBACK) : INADDR_ANY };
But if the byte order of the constant is ignored because it's zero anyway, then I don't see much logic in using the constant at all. The same applies to INADDR_ALL, as it's just as easy to type 0xffffffff.
Another way to get around it is to avoid setting those values directly altogether:
struct in_addr address4;
inet_pton(AF_INET, "127.0.0.1", &address4);
This adds a little bit of useless processing but it has no byte order problems and it is virtually the same for IPv4 and IPv6 (you just change the address string).
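Since inet_pton reports failure through its return value, a version with checking might look like the sketch below. The function name `parse_numeric_address` and the convention of returning the address family are assumptions of this example.

```c
#include <arpa/inet.h>   /* inet_pton */
#include <netinet/in.h>  /* struct in_addr, struct in6_addr */
#include <sys/socket.h>  /* AF_INET, AF_INET6, AF_UNSPEC */

/* Hypothetical helper: parse a numeric address string. Returns AF_INET and
 * fills *out4, or AF_INET6 and fills *out6, or AF_UNSPEC if `text` is
 * neither a valid IPv4 nor a valid IPv6 address. */
static int parse_numeric_address(const char *text,
                                 struct in_addr *out4,
                                 struct in6_addr *out6)
{
    if (inet_pton(AF_INET, text, out4) == 1)
        return AF_INET;
    if (inet_pton(AF_INET6, text, out6) == 1)
        return AF_INET6;
    return AF_UNSPEC;  /* inet_pton does no name lookup, so "localhost" ends up here */
}
```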
But the question is why are you doing that at all. If you want to connect()
to IPv4 localhost (but sometimes to IPv6 localhost, or just any hostname), getaddrinfo() (mentioned in one of the answers) is much better for that, as:
It is a function used for translating any hostname/service/family/socktype/protocol
to a list of matching struct addrinfo
records.
Each struct addrinfo
includes a polymorphic pointer to struct sockaddr
that you can directly use with connect()
. Therefore you don't need to care about the construction of struct sockaddr_in
, typecasting (via a pointer) to struct sockaddr
, etc.
struct addrinfo *ai, hints = { .ai_family = AF_INET }; getaddrinfo(0, "1234", &hints, &ai);
records that in turn include pointers to polymorphic struct sockaddr
structures which you need for the connect()
call.
So, the conclusion is:
1) The standard API fails to provide directly usable struct in_addr
constants (instead it provides rather useless unsigned integer constants in host order).
struct addrinfo *result, *item;
struct addrinfo hints = { .ai_family = AF_INET, .ai_protocol = IPPROTO_TCP };
int error, sock;
/* The service argument is a string: a port number or a service name. */
error = getaddrinfo(NULL, "80", &hints, &result);
if (error)
...
/* Try each returned address until one of them connects. */
for (item = result; item; item = item->ai_next) {
    sock = socket(item->ai_family, item->ai_socktype, item->ai_protocol);
    if (sock == -1)
        continue;
    if (connect(sock, item->ai_addr, item->ai_addrlen) != -1) {
        fprintf(stderr, "Connected successfully.\n");
        break;
    }
    close(sock);
}
freeaddrinfo(result);
When you are sure your query is selective enough that it only returns one result, you could do (omitting error handling for brevity) the following:
struct addrinfo *result, hints = { .ai_family = AF_INET, .ai_protocol = IPPROTO_TCP };
getaddrinfo(NULL, "80", &hints, &result);
sock = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
connect(sock, result->ai_addr, result->ai_addrlen);
If you're afraid getaddrinfo()
might be significantly slower than using the constants, the system library is the best place to fix that. A good implementation would just return the requested loopback address when node
is null and hints.ai_family
is set.
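As far as I know, POSIX specifies this behaviour: with a null node name and AI_PASSIVE not set, getaddrinfo() returns the loopback address, and with AI_PASSIVE set it returns the wildcard address. A rough sketch to observe it (the helper name `default_ipv4_node` and the service string "80" are arbitrary choices for the example):

```c
#include <netdb.h>
#include <netinet/in.h>  /* struct sockaddr_in, struct in_addr */
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: ask getaddrinfo() which IPv4 address it picks for a
 * NULL node with the given ai_flags. Returns 0 and fills *out on success,
 * -1 on failure. */
static int default_ipv4_node(int flags, struct in_addr *out)
{
    struct addrinfo hints, *res;
    struct sockaddr_in sin;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    if (getaddrinfo(NULL, "80", &hints, &res) != 0)
        return -1;

    /* Copy out of the generic sockaddr instead of casting the pointer. */
    memcpy(&sin, res->ai_addr, sizeof sin);
    *out = sin.sin_addr;
    freeaddrinfo(res);
    return 0;
}
```

Calling it with flags 0 should yield 127.0.0.1, and with AI_PASSIVE it should yield 0.0.0.0, both already in network byte order.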
I don't usually like to answer when there is already a "decent" answer. In this case, I am going to make an exception because information I added to these answers is being misconstrued.
INADDR_ANY
is defined as an all-zero-bits IPv4 address, 0.0.0.0
or 0x00000000
. Calling htonl()
on this value will result in the same value, zero. Therefore, calling htonl()
on this constant value is not technically necessary.
INADDR_ALL
is defined as an all-one-bits IPv4 address, 255.255.255.255
or 0xFFFFFFFF
. Calling htonl()
with INADDR_ALL
will return INADDR_ALL
. Again, calling htonl()
is not technically necessary.
Another constant defined in the header files is INADDR_LOOPBACK
, defined as 127.0.0.1
, or 0x7F000001
. This value is given in host byte order, and cannot be passed to the sockets interface without htonl()
. You must use htonl()
with this constant.
Some would suggest that consistency and code readability demand that programmers use htonl()
for any constant named INADDR_*
-- because it is required for some of them. These posters are wrong.
An example given in this thread is:
if (some_condition)
sa.s_addr = htonl(INADDR_LOOPBACK);
else
sa.s_addr = INADDR_ANY;
Quoting from "John Zwinck":
"If I were reviewing this code, I would immediately question why one of the constants has htonl applied and the other does not. And I report it as a bug, whether or not I happened to have the "inside knowledge" that INADDR_ANY is always 0 so converting it is a no-op. And I think (and hope) many other maintainers would do the same."
If I were receiving such a bug report, I would immediately throw it away. This process would save me a lot of time, fielding bug reports from people who don't have the "basic minimum knowledge" that INADDR_ANY
is always 0. (Suggesting that knowing the values of INADDR_ANY
et al. somehow violates encapsulation or whatever is another non-starter -- the same numbers are used in the netstat
output and inside the kernel. Programmers need to know the actual numerical values. People who don't know aren't lacking inside knowledge, they are lacking basic knowledge of the area.)
Really, if you have a programmer maintaining sockets code, and that programmer doesn't know the bit patterns of INADDR_ANY and INADDR_ALL, you are already in trouble. Wrapping 0 in a macro which returns 0 is the kind of mentality that is a slave to meaningless consistency and doesn't respect domain knowledge.
Maintaining sockets code is about more than understanding C. If you don't understand the difference between INADDR_LOOPBACK
and INADDR_ANY
at a level compatible with netstat
output, then you are dangerous in that code and shouldn't be changing it.
Straw-man arguments proposed by Zwinck regarding the needless use of htonl()
:
This is a straw argument because we have a portrayal that experienced socket programmers know the value of INADDR_ANY
by heart. This is like writing that only an experienced C programmer knows the value of NULL
by heart. Writing "by heart" gives the impression that the number is slightly difficult to memorize, perhaps a few digits, such as 127.0.0.1
. But no, we are hyperbolically discussing the difficulty of memorizing the patterns named "all zero bits" and "all one bits."
Considering that these numerical values appear in the output of, e.g., netstat
and other system utilities, and also considering that some of these values appear in IP headers, there is no such thing as a competent sockets programmer who does not know these values, whether by heart or by brain. In fact, attempting sockets programming without knowing these basics can be dangerous to the network availability.
This argument is intended to be absurd and dismissive, so it doesn't need much refuting.
It's hard to know where this argument came from. It could be an attempt to supply stupid-seeming arguments to the opposition. In any case, not using the htonl()
macro makes no difference to performance when you provide a constant and use a typical C compiler -- the constant expressions are reduced to a constant in either case.
A reason not to use htonl()
with INADDR_ANY is that most experienced sockets programmers know that it is not needed. What's more, those programmers who do not know need to learn. There is no extra "cost" in using htonl()
, the trouble is the cost of establishing a coding standard which fosters ignorance of such critically important values.
By definition, encapsulation fosters ignorance. That very ignorance is the usual benefit of using an encapsulated interface -- knowledge is expensive and finite, therefore encapsulation is usually good. The question becomes: which efforts of programming are best enhanced via encapsulation? Are there programming tasks which are disserved by encapsulation?
It is not technically incorrect to use htonl()
, because it has no effect on this value. However, arguments that you should use it may be misleading.
There are those who would argue that a better situation would be one in which the developer did not need to know that INADDR_ANY
is all zeroes and so on. This land of ignorance is worse, not better. Consider that these "magic values" are used throughout various interfaces with TCP/IP. For example, when configuring Apache, if you would like to listen only to IPv4 (and not IPv6), you must specify:
Listen 0.0.0.0:80
I have run into programmers who mistakenly supplied the local IP address instead of INADDR_ANY
(0.0.0.0) above. These programmers don't know what INADDR_ANY
is, and they probably wrap it in htonl()
while they are at it. This is the land of abstraction-thinking and encapsulating.
The ideas of "encapsulation" and "abstraction" have been widely accepted and too-widely applied, but they do not always apply. In the domain of IPv4 addressing, it's not appropriate to treat these constant values as "abstract" -- they are converted directly into bits on the wire.
My point is this: there is no "correct" usage of INADDR_ANY
with htonl()
-- both are equivalent. I would not recommend adopting a requirement that the value be used any particular way, because the INADDR_X
family of constants has only four members, and only one of them, INADDR_LOOPBACK
has a value which is different depending on byte ordering. It is better to just know this fact than to establish a standard for using the values which turns a "blind eye" to the bit patterns of the values.
In many other APIs, it is valuable for programmers to proceed without knowing the numeric value or bit patterns of constants used by the APIs. In the case of the sockets API, these bit patterns and values are used as input and displayed pervasively. It is better to know these values numerically than to spend time thinking about using htonl()
on them.
When programming in C, especially, most "use" of the sockets API involves grabbing some other person's source code, and adapting it. This is another reason it is so important to know what INADDR_ANY
is before touching a line which uses it.
htonl
while the distinct member, INADDR_LOOPBACK
means something very different from the rest. Enabling sleepy "not enough coffee" changes is a non-goal in that terrain. Thinking beyond the INADDR_X
family, you can't make assumptions about another family of defined constants, they might already include the htonl()
. –
Medorra
htonl()
doesn't do anything to the result -- zero in results in zero out; similarly, for INADDR_ALL, 0xFFFFFFFF in to htonl()
results in 0xFFFFFFFF out. However, INADDR_LOOPBACK is different -- it is specified in host byte order as 0x7F000001. For this constant, use of htonl()
is required. – Medorra
ifconfig
,netstat
,tcpdump
, then they're going to be nothing but confused. If they have basic knowledge of this area, they won't be confused. – Medorra