Why is writing to a non-const object after casting away const of pointer to that object not UB?

According to the C++ Standard it's okay to cast away const from the pointer and write to the object if the object is not originally const itself. So that this:

 const Type* object = new Type();
 const_cast<Type*>( object )->Modify();

is okay, but this:

 const Type object;
 const_cast<Type*>( &object )->Modify();

is UB.

The reasoning is that when the object itself is const the compiler is allowed to optimize accesses to it, for example, not perform repeated reads because repeated reads make no sense on an object that doesn't change.

The question is how would the compiler know which objects are actually const? For example, I have a function:

void function( const Type* object )
{
    const_cast<Type*>( object )->Modify();
}

and it is compiled into a static lib and the compiler has no idea for which objects it will be called.

Now the calling code can do this:

Type* object = new Type();
function( object );

and it will be fine, or it can do this:

const Type object;
function( &object );

and it will be undefined behavior.

How is the compiler supposed to adhere to such requirements? How is it supposed to make the former work without making the latter work?

Asymptomatic answered 16/12, 2011 at 6:33 Comment(3)
Why do you make a promise if you intend to break it right away? const is a promise from the programmer to the compiler (and a contract that other programmers reusing the component agree on), no more and no less. The compiler may or may not do something differently according to that promise, but that is circumstantial. Now, the thing is, if something is not constant, you should not give that promise in the first place.Disdain
@Damon: In real life one party writes the function, the other writes the calling code and they can't affect each other.Asymptomatic
@Daemon There are cases where you do keep the promise -- that is, the object is unchanged when the function ends -- but you make temporary changes to it during execution, for various reasons.Environmentalist

When you ask "How is it supposed to make the former work without making the latter work?", the answer is that an implementation is only required to make the former work. Unless it wants to help the programmer, it needn't make any extra effort to make the latter fail in some particular way. Undefined behavior gives the implementation a freedom, not an obligation.

Take a more concrete example. In this example, in f() the compiler may set up the return value to be 10 before it calls EvilMutate because cobj.member is const once cobj's constructor is complete and may not subsequently be written to. It cannot make the same assumption in g() even if only a const function is called. If EvilMutate attempts to mutate member when called on cobj in f() undefined behavior occurs and the implementation need not make any subsequent actions have any particular effect.

The compiler's ability to assume that a genuinely const object won't change is protected by the fact that changing it would cause undefined behavior. That fact imposes no additional requirements on the compiler, only on the programmer.

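struct Type {
    int member;
    void Mutate();
    void EvilMutate() const; // const member function that may try to modify member anyway
    Type() : member(10) {}
};


int f()
{
    const Type cobj;    // genuinely const object
    cobj.EvilMutate();  // undefined behavior if this writes to member
    return cobj.member; // the compiler may simply return 10
}

int g()
{
    Type obj;           // non-const object
    obj.EvilMutate();   // well defined even if this writes to member
    return obj.member;  // must be read after the call
}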
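
As a minimal sketch of the well-defined half of this example, assume a hypothetical body for EvilMutate (the answer leaves it undefined) that casts away const on this and writes to member. The name g_defined and the value 42 are made up for illustration. Calling the same function on the const object in f() would be undefined behavior, so that path is deliberately not exercised here.

```cpp
#include <cassert>

struct Type {
    int member;
    void EvilMutate() const;
    Type() : member(10) {}
};

// Hypothetical body (not in the original): writes through a const_cast'ed this.
void Type::EvilMutate() const
{
    const_cast<Type*>(this)->member = 42;
}

// Same shape as g() above: obj is not const, so the write is well defined
// and the value read after the call must be the new one.
int g_defined()
{
    Type obj;
    obj.EvilMutate();
    return obj.member;
}

void demo_g_defined()
{
    assert(g_defined() == 42);
}
```

Nothing in EvilMutate itself is wrong or right; whether the write is defined depends entirely on how the object it is called on was defined.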
Roundelay answered 16/12, 2011 at 10:40 Comment(0)

The compiler can perform optimization only on const objects, not on references/pointers to const objects (see this question). In your example, there is no way the compiler can optimize function, but it can optimize the code using a const Type. Since the compiler assumes this object to be constant, modifying it (by calling function) can do anything, including crashing your program (for example, if the object is stored in read-only memory) or working like the non-const version (if the modification does not interfere with the optimizations).

The non-const version has no problem and is perfectly defined, you just modify a non-const object so everything is fine.
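
To illustrate the distinction, here is a sketch (the names bump and delta_after_bump are invented for this example) of why a pointer to const gives the compiler nothing to rely on by itself: the pointee may be a non-const object that other code legitimately modifies, so the second read has to be performed.

```cpp
#include <cassert>

// Hypothetical helper: modifies the pointee through a pointer to const.
// Well defined only when *p refers to an object that was not defined const.
void bump(const int* p)
{
    ++*const_cast<int*>(p);
}

// The compiler may not assume *p is unchanged across bump(p): the const in
// the parameter type only restricts what can be done through p without a cast.
int delta_after_bump(const int* p)
{
    int before = *p;
    bump(p);
    return *p - before;
}

void demo_pointer_to_const()
{
    int value = 5; // non-const object, so the modification is defined
    assert(delta_after_bump(&value) == 1);
    assert(value == 6);
}
```

Passing the address of a const int to delta_after_bump would compile just the same, and that call is the one with undefined behavior.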

Capita answered 16/12, 2011 at 9:14 Comment(2)
The compiler can optimize function if it inlines the call, or creates a separate definition that must only be called for objects defined as const. Both possibilities are becoming more and more likely, nowadays even if function is defined in a separate translation unit.Intercostal
@hvd: you are right, I overlooked inlining since it is not really an optimization of function per se, but the possibility of having two versions of a function depending on the constness of the object given did not come to my mind and is very interesting.Capita

If an object is declared const, an implementation is allowed to store it in such a way that attempts to modify it could cause hardware traps, without having any obligation to ensure any particular behavior for those traps. If one constructs a pointer to const that refers to such an object, recipients of that pointer will not generally be allowed to write through it, and would thus be in no danger of triggering those hardware traps. If code casts away the const-ness and writes through the pointer, a compiler would be under no obligation to protect the programmer against any hardware oddities that might occur.

Further, in the event that a compiler can tell that a const object is always going to contain a particular sequence of bytes, it could inform the linker of that, and allow the linker to see if that sequence of bytes occurs anywhere in the code and, if so, regard the address of the const object as being the location of that sequence of bytes (complying with various restrictions about different objects having unique addresses might be a little tricky, but it would be permissible). If the compiler told the linker that a const char[4] was always supposed to contain a sequence of bytes that happened to appear within the compiled code for some function, a linker could assign to that variable the address within the code where that byte sequence appears. If the const was never written, such behavior would save four bytes, but writing to the const would arbitrarily change the meaning of the other code.

If writing to an object after casting away const was always UB, the ability to cast away const-ness wouldn't be very useful. As it is, the ability often plays a role in situations where a piece of code holds onto pointers--some of which are const and some of which will need to be written--for the benefit of other code. If casting away the const-ness of const pointers to non-const objects weren't defined behavior, the code which is holding the pointers would need to know which pointers are const and which ones will need to be written. Because const-casting is allowed, however, it is sufficient for the code holding the pointers to declare them all as const, and for code which knows that a pointer identifies a non-const object and wants to write it, to cast it to a non-const pointer.
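
One common instance of this pattern can be sketched with a hypothetical find_char (a stand-in for the strchr-style functions discussed in the comments, not a standard function). The search is written once against const char*, and the non-const overload casts the result back. The write in the demo is defined because the array it targets was never const.

```cpp
#include <cassert>
#include <cstring>

// Single implementation, written against pointer to const.
const char* find_char(const char* s, char c)
{
    for (; *s != '\0'; ++s) {
        if (*s == c) {
            return s;
        }
    }
    return nullptr;
}

// Non-const overload: reuses the const version and casts the result back.
char* find_char(char* s, char c)
{
    return const_cast<char*>(find_char(static_cast<const char*>(s), c));
}

void demo_find_char()
{
    char text[] = "Foo!"; // non-const array
    char* hit = find_char(text, '!');
    assert(hit != nullptr);
    *hit = '?';           // defined: text is not const
    assert(std::strcmp(text, "Foo?") == 0);

    const char fixed[] = "Foo!";
    assert(find_char(fixed, 'o') == fixed + 1); // const overload, read only
}
```

The explicit static_cast in the non-const overload selects the const version and avoids infinite recursion.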

It might be helpful if C++ had forms of const (and volatile) qualifiers which could be used on pointers to instruct the compiler that it may (or, in the case of volatile, should) regard the pointer as identifying a const and/or volatile object even if the compiler knows what the object is, and knows that it isn't const and/or isn't declared volatile. The former would allow a compiler to assume that the object identified by a pointer wouldn't change during a pointer's lifetime, and cache data based upon that; the latter would allow for cases where a variable may need to support volatile accesses in some rare situations (typically at program startup) but where the compiler should be able to cache its value after that. I know of no proposals to add such features, though.

Terrorist answered 23/7, 2015 at 16:13 Comment(17)
"If writing to an object after casting away const was always UB, the ability to cast away const-ness wouldn't be very useful." IIRC const_cast was introduced to deal with "legacy" APIs that are not const-correct; i.e. to deal with cases where a function does not modify the object pointed to, but doesn't take a T const* but a T*. (D&E uses strchr as an example)Fremd
@dyp: The strchr function is a nice example of something which handles pointers which might or might not be const for the benefit of other code which might or might not need to write to them. In the days before templates, it may have been worthwhile to have separate const- and non-const implementations for some very-frequently-used methods, but having to code all such functions twice would have been sufficiently painful that almost any kludge to accomplish a const-cast would have been justifiable. Once templates were added things might have been less painful at the source code level, but...Terrorist
"which handles pointers which might or might not be const for the benefit of other code which might or might not need to write to them" strchr was designed well before const made it into C or C++. D&E suggests in the said example to introduce an overload char const* strchr(const char* p, char c) { return strchr(const_cast<char*>(p), c); } Later, Stroustrup even writes "Note that the result of casting away const from an object originally defined const is undefined (§13.3)" which deviates from today's rules, but illuminates the original purpose of const_cast.Fremd
...compilation times and code size would still have been burdened by the need to compile separate const-pointer and non-const-pointer versions of a lot of methods (even if char *foo(char*) and char const *foo(char const*) perform the same action, I think the C++ standard would require that their addresses compare as distinct; thus, if char *bar(char*) and char const *bar(char const*) call the above methods, their code couldn't match unless the linker kept track of a "real" address and a "reported" address for each function (with the latter identifying a JMP to the real one).Terrorist
D&E actually suggests that strchr overload is inline; so the compiler should only export it if it is indeed not inlined. However, due to the function being essentially a no-op, I think this is quite unlikely (<=> it will most likely be inlined). Yes, it will impact compilation times, but simplify const correctness. Just a trade-off. (Interestingly, TC++PL says const_cast is used "for getting write access to something declared as const")Fremd
@dyp: The strchr overload you show makes use of a const-cast. One could perhaps minimize code duplication for the particular case of strchr by starting with a version with a const argument/return and then char * strchr(char *src, char c) { const char *r = strchr((char const*)src, c); return r ? src+(r-src) : 0;}, but in most cases code duplication would end up being highly contagious.Terrorist
The overload is supposed to be a C++ addition to the (unchanged and unchangeable?) C standard library. That is, char* strchr(char*, char) is the legacy API, and we wrap it via char const* strchr(char const*, char) for the cases where the argument is a char const* and we want const-correctness. A better example would be some strcpy(char*, char*), which we would wrap via strcpy(char* d, char const* s) { return strcpy(d, const_cast<char*>(s)); }Fremd
@dyp: My point is that the need for const-casting would exist even without legacy APIs. If C's strchr didn't exist, but if const-casting weren't available and one wanted to offer a C++ version with overloads for both const-ness forms, trying to avoid duplicated code would be difficult. In some cases it may be possible to have a "non-const" wrapper around a "const" function even without the ability to const-cast, but const-casting would generally be necessary to minimize code duplication even if legacy APIs weren't an issue.Terrorist
I agree you can use const-casting for DRY (even though I typically avoid it in favour of other solutions like forwarding to a function template). However, I disagree with your assessment in your answer that "If writing to an object after casting away const was always UB, the ability to cast away const-ness wouldn't be very useful". As I described earlier, I think one of the main reasons why casting away constness has been allowed has nothing to do with regaining write-access through a const-qualified pointer (strchr with const; legacy APIs).Fremd
@dyp: If the fact that a pointer was passed to strchr as a const char* meant that writing the pointer returned from it would be forbidden despite the return type, then it would be necessary to have a separate method for the same purpose which accepted and returned a char* and never converted it to a const char*. Of course, using a separate method wouldn't satisfy the need for compatibility, but even if that weren't an issue the violation of DRY would still be a legitimate objection to such a rule. As for templates, I'm not sure to what extent they would be able to avoid duplicate...Terrorist
...machine code. My understanding is that the addresses of const char* foo(const char *) and char *foo(char *) are required to be distinct; while it might be possible for a compiler to generate two entry points with JMP instructions to one implementation, and have any direct calls simply invoke the shared implementation directly, if a compiler can't have separate addresses for "get pointer to function" and "call function", the two implementations of a function which accepts either a char* or const char* and passes the pointer to a suitable overload of foo would end up...Terrorist
I'm sorry, I don't quite understand your latest example: "despite the return type" A const-correct strchr forwards the cv-qualification of the first argument. So if you pass a const char*, you get a const char* back. If const_cast couldn't remove constness for writing, then it would be forbidden to write through the return value because of the return type, not despite it. What exactly is the signature of the strchr you're talking about?Fremd
...being slightly different, thus requiring that the compiler generate two functionally-identical functions. Being able to cast away const avoids the need for such duplication.Terrorist
@dyp: Given char const foo1[] = "Foo!"; and char foo2[] = "Foo!";, in order to allow both if (strchr(foo1,ch)) ... and *strchr(foo2,'!')='?'; it is necessary not only that strchr accept a const char* and cast away the const before returning the result, but also that casting away const makes the resulting pointer writable.Terrorist
@dyp: Further, if casting away const wouldn't enable writing a const pointer to a non-const object, then a method like char *findExclamationMark(char const *st) { return strchr(st, '!'); } would also need to have two versions--one for a const char* and one for a non-const char*.Terrorist
Sorry, I still don't quite understand why you're starting with (I assume) some char const* strchr(char const*, char) instead of the historical case char* strchr(char*, char). With the latter, you don't need the "cast const away and be writeable" property. Also, the wrapper is not exported if it is inlined. I have tried to demonstrate both in this demo: coliru.stacked-crooked.com/a/407cfd846cb9bac8 Fremd
Let us continue this discussion in chat.Fremd

Undefined behavior means undefined behavior. The specification makes no guarantees what will happen.

That doesn't mean it won't do what you intend. Just that you're outside of the boundary of behavior that the specification states should work. The specification is there to say what will happen when you do certain things. Outside of the protection of the spec, all bets are off.

But just because you're off the edge of the map does not mean that you will encounter a dragon. Maybe it'll be a fluffy bunny.

Think of it like this:

class BaseClass {};
class Derived : public BaseClass {};

BaseClass *pDerived = new Derived();
BaseClass *pBase = new BaseClass();

Derived *pLegal = static_cast<Derived*>(pDerived);
Derived *pIllegal = static_cast<Derived*>(pBase);

C++ defines one of these casts to be perfectly valid. The other yields undefined behavior. Does that mean that a C++ compiler actually checks the type and flips the "undefined behavior" switch? No.

It means that the C++ compiler will more than likely assume that pBase is actually a Derived and therefore perform the pointer arithmetic needed to convert pBase into a Derived*. If it isn't actually a Derived, then you get undefined results.

That pointer arithmetic may in fact be a no-op; it may do nothing. Or it may actually do something. It doesn't matter; you are now outside of the realm of behavior defined by the specification. If the pointer arithmetic is a no-op, then everything may appear to work perfectly.
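
As a sketch of a case where that adjustment is usually not a no-op, assume a hypothetical hierarchy with two non-empty bases (none of these types come from the example above). Converting a Second* back to the derived type has to undo the base-class offset, and the result is only meaningful because the pointer really does refer to part of a Both object.

```cpp
#include <cassert>

struct First  { int a = 1; };
struct Second { int b = 2; };
struct Both : First, Second { int c = 3; };

// Valid only when p points at the Second subobject of a Both.
Both* to_both(Second* p)
{
    return static_cast<Both*>(p);
}

void demo_downcast()
{
    Both object;
    Second* as_second = &object;     // implicit upcast, may adjust the address
    Both* back = to_both(as_second); // downcast undoes that adjustment
    assert(back == &object);
    assert(back->c == 3);
}
```

On common ABIs as_second holds a different address than &object, though the standard does not promise any particular layout. Passing a pointer to a standalone Second would compile identically, and using the result would be undefined.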

It's not that the compiler "knows" that in one instance it's undefined and in another it's defined. It's that the specification does not say what will happen. It may appear to work. It may not. The only times that it will work are when it is done properly in accord with the specification.

The same goes for const casts. If the const cast is applied to an object that was not originally const, then the spec says that writing through the result will work. If the object was originally const, then the spec says that anything can happen.

Snowman answered 16/12, 2011 at 6:37 Comment(5)
I can't agree about "all cases" - it's okay to cast away const if the object is not originally const.Asymptomatic
Where does the specification say that? Where does it say that you can cast away const if the object wasn't "originally" const?Snowman
This answer has a Standard reference https://mcmap.net/q/659880/-is-using-const_cast-for-read-only-access-to-a-const-object-allowed - 7.1.5.1/4Asymptomatic
If casting away const was always undefined behavior, do you think the language would provide const_cast?Capita
@LucTouraille: Being able to cast away const-ness is useful in two scenarios: (1) One wants to pass a const to a function which takes a non-const pointer parameter, but won't actually write to it; (2) a function takes a pointer to something that may or may not be const, has some means outside the pointer of knowing whether it is in fact const, and may want to write to it if it isn't. Casting away const in either scenario could be useful even if the other scenario was UB. In fact, both scenarios are okay.Terrorist

In theory, const objects are allowed to be stored in read-only memory in some cases, which would cause obvious problems if you try to modify the object. A more likely case is that the definition of the object is visible at some point, so that the compiler can actually see that the object is defined as const and can optimise based on the assumption that members of that object do not change. If you call a non-const function on a const object to set a member, and then read that member, the compiler could bypass the read of that member if it already knows the value. After all, you defined the object as const: you promised that that value wouldn't change.

Undefined behaviour is tricky in that it often seems to work as you expect, until you make one slight modification.

Intercostal answered 16/12, 2011 at 6:43 Comment(0)
