What is the rationale for non-addressable functions in namespace std?

[namespace.std] disallows taking the address of, or a reference to, most functions in the std namespace. This is a big pitfall, as it often seems to work to pass a standard-library function as an argument, even though this could stop working, or worse, on a different compiler.
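For example, a minimal sketch (whether the commented-out line compiles at all is up to the implementation):

#include <algorithm>
#include <cmath>
#include <vector>

int main() {
    std::vector<double> v{1.0, std::nan(""), 3.0};

    // Often appears to work, but std::isnan is not an addressable function,
    // so the behavior is unspecified (possibly ill-formed):
    // auto it = std::find_if(v.begin(), v.end(),
    //                        static_cast<bool (*)(double)>(std::isnan));

    // Portable alternative: wrap the call in a lambda.
    auto it = std::find_if(v.begin(), v.end(),
                           [](double x) { return std::isnan(x); });
    return it != v.end() ? 0 : 1;
}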

Presumably, this was done to allow implementations to optimize the standard library specially. This restriction makes C++ harder to use.

Can you give explicit examples of how C++ implementations benefit from this restriction on the std namespace?

If these optimizations are so important as to warrant making C++ harder to use, why do non-standard libraries not seem to need the same restriction?

Precedency answered 1/7, 2022 at 16:51 Comment(13)
Related: Can a pointer be formed to a non-addressable function from STD in an unevaluated context?. The answer there touches upon one reason for the rule. Also mentioned in Holding or passing around non-addressable-functions since C++20Marcomarconi
The rationale as I've understood it is to allow implementations freedom. I have myself been bitten by this in the past when I moved "working" code and tried compiling it on a different platform where the "functions" were compiler built-ins that one couldn't possibly take the addresses of since they were not even real functions.Camisole
Most probably it's not about optimization, but about allowing the library to silently overload or template some functions.Murguia
Some non-addressable functions have traditionally been implemented as assembly instructions emitted in place. Kind of a "hard coded" inline— a kind of intrinsic. Other compilers may implement them as "proper" functions, however relying on that will be non-portable. (I recall one compiler implemented some of them as C-preprocessor macros.) As a legit workaround, you can wrap them in a lambda to make them addressable (via the lambda).Spreadeagle
I once read something along the lines of what Eljay wrote. In my words from memory: the standard only specifies what happens when you call a function with certain arguments, e.g. foo(1,1), but that does not necessarily imply that there is a function with signature void(int,int); it could be void(int,int,int=0) or something else entirely.Lingcod
"harder to use" is a very generic term and also very subjective. I understand your curiosity and the question is completely reasonable. However your reasoning that "this restriction makes C++ harder to use" is a personal opinion and the resulting question that is built on that opinion "warrant making C++ harder to use" has no real objective answer. Hence no one really tackled that part of your question.Pickens
Btw, I would suggest removing the language-lawyer tag and using language-design instead. This isn't an answer you can find by reading and lawyering about the standard.Lingcod
@Pickens It makes the language specification longer (one more rule) and it makes the language less consistent (instead of one concept, a function, we now have two, an addressable and a non-addressable function). I think it's hard to argue that eliminating non-addressable functions wouldn't make the language easier to use. My question is at what cost?Precedency
@Spreadeagle given that modern compilers can inline incredibly complex things -- e.g., I've even seen gcc inline calls to function pointers in a constexpr array of function pointers -- I wonder if this rationale of compiler intrinsics being easier to inline was previously valid but is now just out of date.Precedency
@Murguia that is a good theory, so I'm curious if there are examples.Precedency
An intrinsic is not the same as an inline. When you use the AVX intrinsic _mm256_zeroall(), it's injecting the corresponding assembly instruction in place. There is no function; it's not inlining a function. It's more like (very old school) a bunch of emit code that blindly outputs arbitrary bytes into the function (presumably those bytes are carefully curated machine code bytes).Spreadeagle
@Spreadeagle Yes, but given how good inlining has gotten, there's really no penalty for throwing an inline function around an intrinsic. E.g., suppose that std::copy can be implemented more efficiently as an intrinsic. Why not have __builtin_std_copy as the intrinsic and std::copy as an inline wrapper around that?Precedency
Ahh, I see. Yes, that could be done, and would allow intrinsic NAFs and macro NAFs (as implementation details) to be made addressable while still allowing for inlining (because optimizers are amazingly powerful these days), without any loss of expressivity or performance. Due to backwards compatibility, I hold no hope that WG21 would be on board with that kind of change for the NAFs called out in the standard. Alas. (I've been burned by NAFs. Likely all of us here have been burned by them, at one time or another. Learning opportunity.)Spreadeagle

Firstly, it's worth noting that this design did not originate in C; it's entirely new in C++. In C, taking the address of a standard library function is explicitly permitted:

For the same syntactic reason, it is permitted to take the address of a library function even if it is also defined as a macro. [89]

[89] This means that an implementation must provide an actual function for each library function, even if it also provides a macro for that function.

- C89 Standard, 4.1.6 Use of library functions

These strict guarantees would be very restrictive in C++ though, for a number of reasons. As a disclaimer, I haven't been able to find quotes from Bjarne himself, so everything that I'm about to say is a collection of community consensus and personal experience.

1. Adding overloads may break source compatibility

Say you have a function:

bool is_even(int x) { return x % 2 == 0; }

It may initially be safe to call std::partition(begin, end, is_even), but if overloads of is_even for long and long long were later added alongside the int one, the name is_even would denote an overload set, and passing it would become ambiguous and therefore ill-formed.
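A minimal sketch of how that breakage looks in practice:

#include <algorithm>
#include <vector>

bool is_even(int x)  { return x % 2 == 0; }
// The overload added later:
bool is_even(long x) { return x % 2 == 0; }

int main() {
    std::vector<int> v{1, 2, 3, 4};
    // No longer compiles: is_even now names an overload set, and the
    // compiler cannot deduce which overload to pass.
    // std::partition(v.begin(), v.end(), is_even);
    // Workaround: wrap the call in a lambda, which picks the int overload.
    std::partition(v.begin(), v.end(), [](int x) { return is_even(x); });
    return 0;
}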

Essentially, any addressable function cannot receive extra overloads in the future because it breaks existing code. This is why [namespace.std] specifically says "possibly ill-formed".

2. Signatures are more prone to change than in C

Another way to break compatibility is to make an existing function more generic. The restriction gives the standard library the freedom to do exactly that, for example turning:

// possible historical implementation in <math.h> until C++11
#define isnan(x) implementation-defined

into

bool isnan(long double); //possibly with overloads for float and double

and subsequently into

bool isnan(std::floating_point auto);

With features such as function overloading and templates, the implementation of a function can change drastically over time.

Of course, no one could have foreseen these drastic changes in the math library, but the restrictions on non-addressable functions have made them possible without breaking any conforming code.
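A minimal sketch of why calls keep working across all of these changes, while taking an address would not:

#include <cmath>

int main() {
    double d = 0.0;

    // Calling works no matter how the library spells isnan
    // (a macro, a set of overloads, or a constrained template):
    bool result = std::isnan(d);

    // Taking its address would bake one concrete signature into the program;
    // [namespace.std] leaves this unspecified, and it may not compile at all:
    // bool (*p)(double) = &std::isnan;

    return result ? 1 : 0;
}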

3. Functions may not have an address

There are two possible reasons why a function might not have any address:

  • it is an intrinsic function
    • this means a function call just tells the compiler to produce some IR instructions, and an actual function might not even exist
  • it is an immediate function (i.e. consteval, C++20)

The former reason may have been a significant contributor to the decision. Nowadays, there is usually an inline function wrapper around any intrinsics used in the standard library, but it was not obvious back in the day that this would become common practice.
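A hedged sketch of what such a wrapper might look like (the name lib_memcpy is made up; __builtin_memcpy is a GCC/Clang intrinsic):

#include <cstddef>

// An ordinary, addressable inline function that merely forwards to a
// compiler intrinsic, keeping both inlinability and addressability.
inline void* lib_memcpy(void* dst, const void* src, std::size_t n) {
    return __builtin_memcpy(dst, src, n);
}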

A more modern example of intrinsic standard library functions is std::move, which was made "kinda intrinsic" in the MSVC STL. See Improving the State of Debug Performance in C++.
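The latter reason, immediate functions, can be illustrated with a short sketch (square is a made-up example, not a standard function):

// C++20: an immediate (consteval) function never exists at run time,
// so there is no address that could escape constant evaluation.
consteval int square(int x) { return x * x; }

int main() {
    constexpr int n = square(5);   // fine: evaluated at compile time
    // auto p = &square;           // error: the pointer would leak out of
    //                             // constant evaluation
    return n;
}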

4. "Functions" may be function objects

In C++, it is also possible to implement a function as a function object, such as:

// inline variables were only added in C++17, but compilers could have
// offered the same functionality as an extension before that
inline const struct {
    float operator()(float x) const { return __builtin_sqrtf(x); } // e.g. forward to an intrinsic
} sqrt{};

If a "function" is actually a function object, then taking its address behaves differently: you don't get a function pointer. Calling it, however, behaves the same (except that ADL does not take place).
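A minimal sketch of that difference, using a made-up function object my_abs:

#include <iostream>

// Hypothetical function object standing in for a library "function".
inline const struct {
    float operator()(float x) const { return x < 0 ? -x : x; }
} my_abs{};

int main() {
    std::cout << my_abs(-2.5f) << '\n';   // calls look exactly like function calls
    // float (*p)(float) = &my_abs;       // error: my_abs is an object, so &my_abs
    //                                    // is not a function pointer
    return 0;
}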

This is yet another form of flexibility that is made possible by making functions non-addressable.

Conclusion

Making it possible to take the address of standard library functions would have significantly reduced the flexibility of implementers. Almost any change, such as adding overloads, would break compatibility, effectively freezing the evolution of the standard library.

This is not a big issue in C, where function signatures are frozen anyway, but would have significant negative consequences in C++.

Catharinecatharsis answered 31/8, 2023 at 20:21 Comment(3)
I wonder if #4 will become more common now that C++23 allows static operator().Precedency
It seems to me that #4 is not really possible unless some other mechanism already rules out ADL or overloads. Ironically it also seems to me that implementing functions as inline function object instances (~niebloids?) also breaks if the standard extends the overloads. I guess it's different because the standard doesn't care about breaking standard library implementations :)Anastice
#4 is actually applied in all known implementations - currently, ranges algorithms are implemented as function objects in all implementations. And no implementation will differentiate niebloids and CPOs (MSVC STL did, but reverted the decision recently).Darendaresay
