Address held by pointer changes after pointer is deleted

In the following code, why is the address held by pointer x changing after the delete? As I understand it, the delete call should free the allocated memory on the heap, but it shouldn't change the address held by the pointer.

#include <iostream>
#include <cstdlib>
using namespace std;

int main()
{
    int* x = new int;
    *x = 2;

    cout << x << endl << *x << endl ;

    delete x;

    cout << x << endl;

    system("Pause");
    return 0;
}

OUTPUT:
01103ED8
2
00008123

Observations: I'm using Visual Studio 2013 and Windows 8. Reportedly this doesn't work the same in other compilers. Also, I understand this is bad practice and that I should just reassign the pointer to NULL after its deletion; I'm simply trying to understand what is driving this weird behaviour.

Surmullet answered 22/12, 2013 at 0:40 Comment(2)
Weird behavior is a subset of undefined behavior.Sudan
@Sudan (not to mention that it doesn't even require UB for delete to change the pointer's value.)Vauntcourier

As I understand it, the delete call should free the allocated memory on the heap, but it shouldn't change the address held by the pointer.

Well, why not? It's perfectly legal output -- reading a pointer after having deleted it leads to undefined behavior. And that includes the pointer's value changing. (In fact, that doesn't even need UB; a deleted pointer can really point anywhere.)
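
For what it's worth, here is a minimal sketch of the idiom the question already mentions: overwrite the pointer right after the delete, so that its stale value is never read at all. The helper name destroy_and_reset and the demo function are hypothetical, not something from the original code.

```cpp
#include <cassert>

// Hypothetical helper (not from the question): deletes the object and
// immediately overwrites the pointer, so the stale value is never read.
template <typename T>
void destroy_and_reset(T*& p)
{
    delete p;      // from here on, the old value of p is an invalid pointer value
    p = nullptr;   // assigning is a write, not a read, so it is always allowed
}

// Small demonstration of the helper.
void demo_destroy_and_reset()
{
    int* x = new int(2);
    destroy_and_reset(x);
    assert(x == nullptr);   // reading x is fine again: it holds a null pointer

    destroy_and_reset(x);   // deleting a null pointer is a no-op
    assert(x == nullptr);
}
```

Calling demo_destroy_and_reset() from main should run without tripping either assertion. Whatever the implementation does to the variable during delete no longer matters, because the program never looks at it.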

Vauntcourier answered 22/12, 2013 at 0:44 Comment(15)
It's definitely the case that dereferencing a pointer after having deleted it provokes UB, but I am uncertain about examining the value of the pointer. In particular, I would have expected int *x, *y; x = new int(0); y = x; delete x; assert(x == y); to be strictly conforming and not trigger the assertion. Can you cite where the standard licenses delete to change the pointer value?Crashing
@Zack Don't be uncertain about that, touching the pointer in any way (except assigning a new, valid address to it) after it has been invalidated is UB.Vauntcourier
@H2CO3 Seriously, chapter and verse please.Crashing
@Zack You seem to be way too sceptical. Why? C++11, 3.7.4.2 p4 : "The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined."Vauntcourier
@BenjaminLindley Asserting that "but this is not UB because I assume XYZ" is.Vauntcourier
Certainly. But as nobody in this comment thread asserted anything like that, I can't see how it's relevant.Silber
@BenjaminLindley "It's definitely the case that dereferencing a pointer after having deleted it provokes UB, but I am uncertain about examining the value of the pointer" - isn't that enough?Vauntcourier
No, it is not. Stating that he is uncertain about whether it is UB is not anywhere close to asserting that it is not UB.Silber
@BenjaminLindley I can't believe you don't feel the strong assertion "but this is not UB because he's not dereferencing the pointer" behind that sentence. It's just formulated politely (or sugar-coated).Vauntcourier
Well, perhaps it's because I myself was just as uncertain, and appreciate the citation. Thank you.Silber
@BenjaminLindley You're welcome! It's worth reading a bit more around in that section (for me too, I'm no C++ expert...), there are interesting things to be found.Vauntcourier
@Zack No, I am not. But ask some veteran C++ people around here, and you will get the same answer. This is basically the same case as reading the value of an uninitialized variable (and that's UB too).Vauntcourier
@Zack Also, here are some answers to the issue, found on the comp.lang.c++.moderated mailing list: one, two, three - these all point out the reason why it is undefined behavior to even read the value of the invalid pointer.Vauntcourier
@Zack Furthermore, here are answers to questions on Stack Overflow, by some highly reputed C++ programmers that support my claim: one, two.Vauntcourier
@H2CO3 Have now read everything you linked and the relevant standard sections; conclusions too long for a comment; see new answer to this question.Crashing

Having read relevant bits of both C++98 and C++11 [N3485], and all the stuff H2CO3 pointed to:

Neither edition of the standard adequately describes what an "invalid pointer" is, under what circumstances they are created, or what their semantics are. Therefore, it is unclear to me whether or not the OP's code was intended to provoke undefined behavior, but de facto it does (since anything that the standard does not clearly define is, tautologically, undefined). The text is improved in C++11 but is still inadequate.

As a matter of language design, the following program certainly does exhibit unspecified behavior as marked, which is fine. It may, but should not, also exhibit undefined behavior as marked; in other words, to the extent that this program exhibits undefined behavior, that is IMNSHO a defect in the standard. Concretely, copying the value of an "invalid" pointer, and performing equality comparisons on such pointers, should not be UB. I specifically reject the argument to the contrary from hypothetical hardware that traps on merely loading a pointer to unmapped memory into a register. (Note: I cannot find text in C++11 corresponding to C11 6.5.2.3 footnote 95, regarding the legitimacy of writing one union member and reading another; this program assumes that the result of this operation is unspecified but not undefined (except insofar as it might involve a trap representation), as it is in C.)

#include <string.h>
#include <stdio.h>

union ptr {
    int *val;
    unsigned char repr[sizeof(int *)];
};

int main(void)
{
    ptr a, b, c, d, e;

    a.val = new int(0);
    b.val = a.val;
    memcpy(c.repr, a.repr, sizeof(int *));

    delete a.val;
    d.val = a.val; // copy may, but should not, provoke UB
    memcpy(e.repr, a.repr, sizeof(int *));

    // accesses to b.val and d.val may, but should not, provoke UB
    // result of comparison is unspecified (may, but should not, be undefined)
    printf("b %c= d\n", b.val == d.val ? '=' : '!');

    // result of comparison is unspecified
    printf("c %c= e\n", memcmp(c.repr, e.repr, sizeof(int *)) ? '!' : '=');
}

This is all of the relevant text from C++98:

[3.7.3.2p4] If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined. [footnote: On some implementations, it causes a system-generated runtime fault.]

The problem is that there is no definition of "using an invalid pointer value", so we get to argue about what qualifies. There is a clue to the committee's intent in the discussion of iterators (a category which is defined to include bare pointers):

[24.1p5] ... Iterators can also have singular values that are not associated with any container. [Example: After the declaration of an uninitialized pointer x (as with int* x; [sic]), x must always be assumed to have a singular value of a pointer.] Results of most expressions are undefined for singular values; the only exception is an assignment of a non-singular value to an iterator that holds a singular value. In this case the singular value is overwritten the same way as any other value. Dereferenceable and past-the-end values are always non-singular.

It seems at least plausible to assume that an "invalid pointer" is also meant to be an example of a "singular iterator", but there is no text to back this up; going in the opposite direction, there is no text confirming the (equally plausible) assumption that an uninitialized pointer value is meant to be an "invalid pointer" as well as a "singular iterator". So the hair-splitters among us might not accept "results of most expressions are undefined" as clarifying what qualifies as use of an invalid pointer.
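
To make the one exception in [24.1p5] concrete, here is a small sketch of the only thing the quoted text permits on a singular value: overwriting it with a non-singular one. The function names are made up for illustration, and the iterator variant assumes a non-empty vector.

```cpp
#include <cassert>
#include <vector>

// Pointer case, mirroring the standard's own "int* x;" example.
int read_through_assigned_pointer(int& target)
{
    int* p;          // uninitialized: must be assumed to hold a singular value
    p = &target;     // permitted: assigning a non-singular value over it
    return *p;       // p is now dereferenceable, hence non-singular
}

// Iterator case; assumes v is non-empty (an assumption of this sketch).
int first_element(const std::vector<int>& v)
{
    std::vector<int>::const_iterator it;  // default-initialized: treated as singular
    it = v.begin();                       // permitted: overwrite with a non-singular value
    return *it;
}
```

Anything else done with p or it before the assignment (copying, comparing, printing) falls under "results of most expressions are undefined".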

C++11 has changed the text corresponding to 3.7.3.2p4 somewhat:

[3.7.4.2p4] ... Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior. [footnote: Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.]

(the text elided by the ellipsis is unchanged) We now have somewhat more clarity as to what is meant by "use of an invalid pointer value", and we can now say that the OP's code's semantics are definitely implementation-defined (but might be implementation-defined to be undefined). There is also a new paragraph in the discussion of iterators:

[24.2.1p10] An invalid iterator is an iterator that may be singular.

which confirms that "invalid pointer" and "singular iterator" are effectively the same thing. The remaining confusion in C++11 is largely about the exact circumstances that produce invalid/singular pointers/iterators; there should be a detailed chart of pointer/iterator lifecycle transitions (like there is for *values). And, as with C++98, the standard is defective to the extent it does not guarantee that copying-from and equality comparison upon such values are valid (not undefined).
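
Until the standard offers such guarantees, one way to compare "the address before" against another address without ever touching an invalid pointer value is to convert to an integer while the pointers are still valid. This is only a sketch: it assumes the implementation provides std::uintptr_t (an optional type), the pointer-to-integer mapping is implementation-defined, and the names address_of and same_address_before_delete are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: records an address as an integer.
// Must be called while the pointer value is still valid.
std::uintptr_t address_of(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p);
}

bool same_address_before_delete()
{
    int* x = new int(0);
    int* y = x;

    // Take the integer snapshots before the delete.
    const std::uintptr_t ax = address_of(x);
    const std::uintptr_t ay = address_of(y);

    delete x;
    x = nullptr;   // overwrite both pointers; their old values are never read again
    y = nullptr;

    return ax == ay;   // plain integer comparison, no pointer values involved
}
```

On the implementations I would expect, same_address_before_delete() returns true, since both snapshots come from the same valid pointer value. The integers carry no validity state, so comparing or printing them afterwards does not depend on how "use of an invalid pointer value" is read.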

Crashing answered 22/12, 2013 at 15:26 Comment(9)
"copying the value of an "invalid" pointer, and performing equality comparisons on such pointers, should not be UB" - You can always make your own language if you don't like C++. The reason this is in the standard is that the "hypothetical" hardware you were addressing isn't hypothetical. Also, the pointers being invalid is the very same issue as reading an uninitialized variable (trap representations, anyone?).Vauntcourier
@H2CO3 You may have guessed that I don't much care for trap representations either ;-) Anyhow, no, I do not have to just take or leave the language, I am allowed to think the standard is buggy and to point out the bugs. I have good and solid reasons for objecting to the language as it is -- largely the difficulty of being certain that one never copies or compares such values. Your own citations included an observation that given some vector<T*> x, assigning to x[n] (n in range, invalid pointer in slot) might trigger a read from x[n] and therefore be UB.Crashing
I have guessed that, but the thing is, if you are going to write code that relies on UB, that's still not good until the standard disagrees with you (in particular, you will be regarded as an uninformed programmer). You are free to think this and that, and you are free to share your opinion with the committee, but you can't deny the current state of the standard. Nevertheless, I'm not arguing that I like this being UB, and in fact, I think that C++ has quite a few serious flaws which should be amended.Vauntcourier
@H2CO3: As Zack shows here, the behavior is implementation-defined, not undefined, in C++11. (And no, Zack, the implementation cannot define the behavior as "undefined". "Undefined behavior" means it can vary from run to run, the compiler can assume you NEVER do that and optimize accordingly, and so on. Implementation-defined means it must do something specific and the documentation for your compiler must say what that is. Otherwise the distinction would be meaningless.) Very good discussion. +1Tanatanach
@Tanatanach The C++11 footnote I quoted gives the implementation license to define this specific situation as runtime-undefined behavior.Crashing
@Zack: But a "runtime fault" is not undefined behavior... "Undefined behavior" has a very specific meaning in the spec. It does not mean "something like a seg fault"; it means any behavior of the program whatsoever will comply with the standard. If the author(s) intended this to be undefined, they would have used that wording instead of "implementation-defined" (which also has a very specific meaning). Anyhow thanks for doing the legwork on this. All of it is news to me.Tanatanach
@Nemo: I would suggest that one of the big mistakes the authors of C made was failing to recognize a category of behavior where implementations had to make a documented and testable selection from among alternatives which could include inconsistent or undefined behaviors. I would posit that a very common program requirement is "Given valid input, code must behave in a precisely-defined fashion; given invalid input, code can do anything within broad constraints." In many cases, the cost for a compiler to put some loose behavioral guarantees on arithmetic overflow...Notebook
...may be far below the extra cost of validation code sufficient to prevent all arithmetic overflows resulting from invalid inputs. In cases where even loose behavioral guarantees would suffice to meet constraints, an optimizer which requires the programmer to add sufficient checks to prevent any form of overflow may end up yielding a program that's less efficient than would have been possible if a programmer could indicate that a compiler must offer some loose guarantees about overflows, and could then let such overflows happen.Notebook
@nemo,supercat C++11 doesn't define the term "system-generated runtime fault", which means a system-generated runtime fault qualifies as an "undefined operation" for purposes of the language in [intro.execution] about "if any such execution contains an undefined operation", and that language brings in all the baggage of explicitly-stated UB. It would be nice if the standard made a distinction between UB that the compiler is and isn't allowed to assume never happens, but that's not what we have right now.Crashing
