Is it legal to compare dangling pointers?

3

73

Is it legal to compare dangling pointers?

#include <iostream>

int main()
{
    int *p, *q;
    {
        int a;
        p = &a;
    }
    {
        int b;
        q = &b;
    }
    // Both a and b are out of scope here, so p and q are dangling.
    std::cout << (p == q) << '\n';
}

Note how both p and q point to objects that have already vanished. Is this legal?

Slily answered 7/6, 2015 at 13:27 Comment(14)
Define "legal".Nevernever
At least not undefined behaviour.Localize
@rightfold Do I run the risk of getting a cease-and-desist from a language lawyer?Slily
Of some relevance: #17025366Ritaritardando
@OliverCharlesworth ergh, a mixed C and C++ question.. the two languages have considerably different rules in this area. The C standard unambiguously says that p and q are indeterminate here.Lingulate
As a data point, gcc optimizes int*f(){int a;return &a;} to return 0;.Biggs
This kinda needs to become two parts... (1) is it valid to use a dangling pointer to stack object, and (2) if so, what's the result of the comparison. I've tried to address both in my answerLingulate
I would like to know what is the use for doing thisBattista
@EdHeal: There is value in rigour. Take any course in formal semantics to find out why.Corunna
@LightnessRacesinOrbit - what is the pragmatic use for this? As to formal semantics that was a headache that I lost long ago after finishing my MScBattista
@EdHeal: As I said, if you want to know the pragmatic outcome of rigorously studying formal semantics, [re-]take a course on it. Answering that is way out of the scope of this comment thread. The language-lawyer tag exists for questions in that domain. I'm not saying that everyone needs to do so in order to simply produce computer programs, but then not every question need be tagged language-lawyer, and your implication that there's no value in it whatsoever is short-sighted.Corunna
@LightnessRacesinOrbit - I just cannot think of any use for it - and wish to be enlightened to any useBattista
@EdHeal I suppose, in this particular case you could detect if the compiler was doing some optimizationLingulate
A pointer may point legally to anywhere when it may be assigned to NULL. What you should worry about to be illegal is to read its content.Ofay
59

Introduction: The first issue is whether it is legal to use the value of p at all.

After a has been destroyed, p acquires what is known as an invalid pointer value. Quote from N4430 (for discussion of N4430's status see the "Note" below):

When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of the deallocated storage become invalid pointer values.

The behaviour when an invalid pointer value is used is also covered in the same section of N4430 (and almost identical text appears in C++14 [basic.stc.dynamic.deallocation]/4):

Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.

[ Footnote: Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault. — end footnote ]

So you will need to consult your implementation's documentation to find out what should happen here (since C++14).

The term use in the above quotes means necessitating lvalue-to-rvalue conversion, as in C++14 [conv.lval]/2:

When an lvalue-to-rvalue conversion is applied to an expression e, and [...] the object to which the glvalue refers contains an invalid pointer value, the behaviour is implementation-defined.


History: In C++11 this said undefined rather than implementation-defined; it was changed by DR1438. See the edit history of this post for the full quotes.


Application to p == q: Supposing we have accepted in C++14+N4430 that the result of evaluating p and q is implementation-defined, and that the implementation does not define that a hardware trap occurs; [expr.eq]/2 says:

Two pointers compare equal if they are both null, both point to the same function, or both represent the same address (3.9.2), otherwise they compare unequal.

Since it's implementation-defined what values are obtained when p and q are evaluated, we can't say for sure what will happen here. But it must be either implementation-defined or unspecified.
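For contrast, the three cases in the [expr.eq]/2 quote are well-defined as long as the pointers hold valid values. The following is only a minimal sketch; the function names are hypothetical and chosen for illustration, not taken from the standard or from the question.

```cpp
#include <cassert>

// Case 1: both pointers are null, so they compare equal.
bool both_null_compare_equal() {
    int* p = nullptr;
    int* q = nullptr;
    return p == q;
}

// Case 2: both pointers represent the address of the same live object.
bool same_object_compares_equal() {
    int a = 0;
    int* p = &a;
    int* q = &a;
    return p == q;
}

// Case 3: two distinct objects that are alive at the same time have
// distinct addresses, so the pointers compare unequal.
bool distinct_live_objects_compare_unequal() {
    int a = 0;
    int b = 0;
    int* p = &a;
    int* q = &b;
    return p != q;
}

// Hypothetical helper that checks all three cases at once.
void check_valid_pointer_cases() {
    assert(both_null_compare_equal());
    assert(same_object_compares_equal());
    assert(distinct_live_objects_compare_unequal());
}
```

None of these functions reads a pointer after the pointed-to object has been destroyed, which is exactly what separates them from the p == q in the question.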

g++ appears to exhibit unspecified behaviour in this case; depending on the -O switch I was able to have it say either 1 or 0, corresponding to whether or not the same memory address was re-used for b after a had been destroyed.
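If the goal is only to observe whether the storage was re-used, one possible way to sidestep the invalid pointer values is to convert each address to an integer while its object is still alive and compare the integers afterwards. This is a sketch under assumptions: AddressPair, capture_addresses and storage_was_reused are hypothetical names, std::uintptr_t is an optional type, and the pointer-to-integer mapping is itself implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the two recorded addresses.
struct AddressPair {
    std::uintptr_t first;
    std::uintptr_t second;
};

// Record each address as an integer while the object is still alive.
// No pointer value is read after its object has been destroyed.
AddressPair capture_addresses() {
    AddressPair out{};
    {
        int a = 0;
        out.first = reinterpret_cast<std::uintptr_t>(&a);
        // Assumption: converting the same live address twice gives the
        // same integer, as it does on mainstream implementations.
        assert(out.first == reinterpret_cast<std::uintptr_t>(&a));
    }
    {
        int b = 0;
        out.second = reinterpret_cast<std::uintptr_t>(&b);
    }
    return out;
}

// Comparing two integers is always well-defined; whether they are equal
// still depends on whether the compiler re-used a's storage for b.
bool storage_was_reused() {
    AddressPair addrs = capture_addresses();
    return addrs.first == addrs.second;
}
```

The result can still be true or false depending on the compiler and the -O level, so this does not make the outcome predictable; it only moves the comparison onto integer values.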


Note about N4430: This is a proposed defect resolution to C++14 that hasn't been accepted yet. It cleans up a lot of wording surrounding object lifetime, invalid pointers, subobjects, unions, and array bounds access.

In the C++14 text, it is defined under [basic.stc.dynamic.deallocation]/4 and subsequent paragraphs that an invalid pointer value arises when delete is used. However it's not clearly stated whether or not the same principle applies to static or automatic storage.

There is a definition of "valid pointer" in [basic.compound]/3 but it is too vague to use sensibly. The [basic.life]/5 (footnote) refers to the same text to define the behaviour of pointers to objects of static storage duration, which suggests that it was meant to apply to all types of storage.

In N4430 the text is moved from that section up one level so that it does clearly apply to all storage durations. There is a note attached:

Drafting note: this should apply to all storage durations that can end, not just to dynamic storage duration. On an implementation supporting threads or segmented stacks, thread and automatic storage may behave in the same way that dynamic storage does.


My opinion: I don't see any consistent way to interpret the standard (pre-N4430) other than to say that p acquires an invalid pointer value. The behaviour doesn't seem to be covered by any other section besides what we have already looked at. So I am happy to treat the N4430 wording as representing the intent of the standard in this case.


Lingulate answered 7/6, 2015 at 13:28 Comment(38)
Show Standard quotes.Localize
How is printing a pointer value any less legal than comparing it for equality?Slily
@Slily it's equally legal (i.e.: not)Lingulate
@MattMcNabb I am very interested in the rationale behind your answer, because as far as I know what you have stated is not true - at all.Orgel
The value in the pointer is not "invalid", see [basic.compound]p3.Orgel
@FilipRoséen-refp can you quote which part of that section is relevant? I don't see anything in particularLingulate
@Filip: "A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer (4.10)." I can see why you'd want to use this to say that these are not invalid pointers, but they obviously are. I think this is a wording defect because the sentence I just quoted tells us what values the pointer is physically capable of holding, as distinct from the notion of "validity" that describes whether it actually points to an object (and it's pretty obvious that the rules Matt quotes should apply to dangling pointers; what else could they apply to?!).Corunna
@LightnessRacesinOrbit Comparing two pointers does not mandate that they point to a living object, this is covered in [expr.eq]p1: "Pointers of the same type (after pointer conversions) can be compared for equality. Two pointers of the same type compare equal if and only if they are both null, both point to the same function, or both represent the same address (3.9.2)."Orgel
@LightnessRacesinOrbit "the address of a byte in memory" seems a bit vague, this is a crappy definitionLingulate
@FilipRoséen-refp pointers cannot be compared until after lvalue-to-rvalue conversion has been performed on the glvalues given as argument, and it is that conversion which causes UB (or implementation-defined)Lingulate
@MattMcNabb: Indeed. Not only is it an abstraction leak, but it contradicts with other wording that requires pointers to point to objects.Corunna
@Localize standard quotes includedLingulate
@LightnessRacesinOrbit it is in p2 in N3797, I was (by mistake) looking at N3337. @MattMcNabb the lvalue-to-rvalue conversion is not an issue, since the values of the pointers are not invalid. The quote related to invalid pointer values and undefined behavior are relevant for cases such as int * p = reinterpret_cast<int*> (0xDEADBEEF); // might trap.Orgel
@FilipRoseen: According to your argument, that could not trap, since the memory at 0xDEADBEEF does, in fact, contain a byte, and 0xDEADBEEF is an address.Localize
@FilipRoséen-refp: Please quote a standard, not a draft.Corunna
@FilipRoséen-refp what's your argument to say that the pointers are not invalid? 3.7.4.2/4 shows that it depends on where the pointer is pointing to as to whether it is valid or not.Lingulate
@Localize No, because the address is not obtained by taking the address of an object - as such there is nothing saying that it is a valid location for a pointer-to-int.Orgel
maybe we need a defect report for this, I don't see any reason why deallocated stack objects should have a different status to deallocated heap objectsLingulate
@LightnessRacesinOrbit currently on the move, don't have access to the official release at this time (but if this discussion keeps on growing I will quote the relevant document - and probably write my own answer - later tonight. I have a dinner to attend in 30 minutes, just killing some time at the uni).Orgel
@LightnessRacesinOrbit Please buy me a copy of the standard so I can do that (it'd be great if you could mail me a printed copy, so I can show the actual standard in my answers instead of just its content, which appears to be of no relevance to you (the content, I mean)). Btw., Filip says he would also be interested in a printed copy.Collegium
pointer validity is also talked about in container contexts, several operations "invalidate pointers" and those may have been to stack space (e.g. a std::string using SSO)Lingulate
The rest of us don't buy the Standard. We quote the newest freely available draft, usually FDIS or so, but the wording of such matters does not tend to change much.Localize
@LightnessRacesinOrbit If you know the difference between an Nxxxx document, a FDIS, and an official standard, then you ought to recognize the N-number corresponding to the closest approximation to the official standard which is publicly available online for free. It is ludicrous to expect people to spend several hundred dollars just to have a little more persuasive force in what amounts to a bar-bet argument.Councilwoman
@zwol: actually, it's quite reasonable to stipulate any barrier to entry in order to shoot someone down in what amounts to a bar-bet argument. The point is to win, not to be right ;-) If getting to the right answer was the point, then of course Lightness could have said "...and the published standard is the same/different", rather than attempting to discredit the quotation without replacing it. I mean, I think Lightness is right, but the issue with Filip's quotes is that they don't support his claims, not that they're inaccurate.Approach
@zwol: It is ludicrous to expect anyone to take the word of a standard quote that is not from a standard. As for naming the draft, it is the responsibility of the author, not the reader, to perform the relevant translation from Nxxxx to "C++yy FDIS". Of course Steve is right in that I am only making a side-request; I'm not suggesting that Filip's failure to cite a standard has anything to do with his being wrong: that's a totally different issue. :PCorunna
@Puppy: And by "the rest of us" you mean you and a select few of your internet friends.Corunna
@LightnessRacesinOrbit Personally I'm quite fine with N3936 quotes unless someone with a copy of C++14 specifically steps in and points out a difference (of which, AFAIK, there are none). Same goes for C++11 and N3337.Lingulate
@LightnessRacesInOrbit: I'm afraid that without a statistical study, there is no way to demonstrate that people buy the Standard to quote in random internet discussions.Localize
@LightnessRacesinOrbit: Look, just give up and buy printed copy of standard for everybody with C++ tag on stackoverflow. Then you'll never have to request citation from standard again. Much faster and more efficient than talking about it.Various
@Lightness: I'm simply viewing the original comment you made that is now deleted, in which you use "the rest of us" in exactly the same way as me. I merely find it hypocritical of you.Localize
I disagree with the last line that it may also be unspecified behavior.Zonate
@hackks can you clarify? The implementation may define something like "the value of the pointer is unspecified", or "the value representation remains unchanged", or something, and then the == operator may or may not find they both give the same byte. I have observed true and false with g++ -std=c++14 depending on optimization settingLingulate
@MattMcNabb; Both pointers may point to the same location or may not. The result will be either true or false. I do not understand why the value of the pointer would be unspecified?Zonate
@Zonate It is the result of p == q that I am saying would be unspecified. (That was OP's original question)Lingulate
@MattMcNabb; I was talking about your last comment. What I am saying is that two pointers which are not initialized with any address may or may not point to the same address. But it is guaranteed that they will point to some memory location (random). In that case the result of == operator will not be unspecified.Zonate
@Zonate my answer is answering OP's question of comparing two pointers which pointed to objects, after the objects have been freed. This is different to "uninitialized pointer". Uninitialized pointers contain indeterminate values and attempting to compare two indeterminate values causes UB (N3936 [dcl.init]/12)Lingulate
N4430, if adopted, would settle the "invalid" issue.Weirdo
Would it be ok to return and compare uintptr_t instead?Chu
4

Historically, there have been some systems where using a pointer as an rvalue might cause the system to fetch some information identified by some bits in that pointer. For example, if a pointer could contain the address of an object's header along with an offset into the object, fetching a pointer could cause the system to also fetch some information from that header. If the object has ceased to exist, the attempt to fetch information from its header could fail with arbitrary consequences.

That having been said, in the vast majority of C implementations, all pointers that were alive at some particular moment in time will forever hold the same relationships with regard to the relational and subtraction operators as they had at that particular time. Indeed, in most implementations if one has char *p, one may determine whether it identifies part of an object identified by char *base; size_t size; by checking whether (size_t)(p-base) < size; such comparison will work even retrospectively if there is any overlap in the objects' lifetime.
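The range check described above can be sketched as follows. The function name points_into is hypothetical. As far as the Standard is concerned, the subtraction is only defined when p and base point into the same array (or one past its end), so this sketch exercises it only in that situation; relying on it for unrelated or dead objects is the implementation-specific behaviour the paragraph describes.

```cpp
#include <cassert>
#include <cstddef>

// True when p lies within the size bytes starting at base.
// A p below base gives a negative difference, which becomes a very large
// std::size_t after the cast and therefore fails the comparison.
bool points_into(const char* p, const char* base, std::size_t size) {
    return static_cast<std::size_t>(p - base) < size;
}

// Hypothetical self-check that stays inside a single array, where the
// pointer subtraction is well-defined.
void check_points_into() {
    char buf[16] = {};
    assert(points_into(buf + 5, buf, 16));       // inside the range
    assert(!points_into(buf + 16, buf, 16));     // one past the end
    assert(!points_into(buf + 2, buf + 4, 8));   // below the sub-range
    assert(!points_into(buf + 12, buf + 4, 8));  // above the sub-range
}
```

The single unsigned comparison replaces the two checks p >= base and p < base + size, which is why the idiom is popular on flat-address-space platforms.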

Unfortunately, the Standard defines no means by which code can indicate that it requires any of the latter guarantees, nor is there a standard means by which code can ask whether a particular implementation can promise any of the latter behaviors and refuse compilation if it does not. Further, some hyper-modern implementations will regard any use of relational or subtraction operators on two pointers as a promise by the programmer that the pointers in question will always identify the same live object, and omit any code which would only be relevant if that assumption didn't hold. Consequently, even though many hardware platforms would be able to offer guarantees that would be useful to many algorithms, there's no safe way by which code can exploit any such guarantees even if code will never need to run on hardware which does not naturally provide them.

Renaldorenard answered 8/6, 2015 at 16:5 Comment(0)
-3

The pointers contain the addresses of the variables they reference. The addresses remain valid even when the variables that used to be stored there are released, destroyed, or unavailable. As long as you don't try to use the values at those addresses you are safe; dereferencing the pointers, as in *p and *q, is undefined.

Obviously the result is implementation-defined, so this code example can be used to study the features of your compiler if one doesn't want to dig into assembly code.

Whether this is a meaningful practice is a totally different discussion.

Marlin answered 10/6, 2015 at 14:46 Comment(2)
It's not simply "legal", it's "implementation-defined".Heinous
The result of (p == q) is "implementation-defined", I agree.Marlin
