Can I read any readable valid memory location via a (unsigned) char* in C++?

3

7

My search-fu seems lacking today.

I would like to know whether it is legal according to standard C++ to inspect "any" memory location via an (unsigned(?)) char*. By any location I mean any valid address of an object or array (or inside an array) inside the program.

By way of example:

void passAnyObjectOrArrayOrSomethingElseValid(void* pObj) {
   unsigned char* pMemory = static_cast<unsigned char*>(pObj);
   MyTypeIdentifyier x = tryToFigureOutWhatThisIs(pMemory);
}

Disclaimer: This question is purely academic. I do not intend to put this into production code! By legal I mean whether it is really legal according to the standard, that is, whether it would work on 100% of all implementations. (Not just on x86 or some common hardware.)

Sub-question: Is static_cast the right tool to get from the void* address to the char* pointer?

Delwyn answered 9/7, 2011 at 12:51 Comment(0)
8

C++ assumes strict aliasing, which means the compiler may assume that two pointers of fundamentally different types do not refer to the same object.

However, as correctly pointed out by bdonlan, the standard makes an exception for char and unsigned char pointers.

Thus, in general, reading an arbitrary address (which might hold an object of any type) through a pointer of an unrelated type is undefined behaviour, but for the particular case of unsigned char, as in the question, it is allowed (ISO 14882:2003 3.10(15)).
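
A minimal sketch of the permitted case, assuming a platform with 8-bit bytes for the two-digit hex formatting. The helper name dumpBytes is made up for illustration and does not come from the question:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: renders the object representation of any object
// as a hex string by reading it through an unsigned char pointer.
std::string dumpBytes(const void* pObj, std::size_t size) {
    assert(pObj != nullptr);  // must point at a valid object
    const unsigned char* p = static_cast<const unsigned char*>(pObj);
    std::string out;
    char buf[3];  // two hex digits plus the terminating NUL
    for (std::size_t i = 0; i < size; ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(p[i]));
        out += buf;
    }
    return out;
}
```

Calling dumpBytes(&obj, sizeof obj) stays inside the object, which is what keeps the reads valid; which hex digits come out depends on the implementation.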

static_cast does compile-time type checking, so it is unlikely to always work. In such a case, you will want reinterpret_cast.
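
For comparison, here is a small sketch of the two spellings. The names viaVoid and direct are illustrative only. As far as I can tell, static_cast is accepted when the source is already a void*, while going straight from a typed pointer such as int* to unsigned char* requires reinterpret_cast. Since C++11 both routes are specified to give the same pointer value for object pointers.

```cpp
#include <cassert>

// Hypothetical helper: the question's route, starting from a void*.
inline unsigned char* viaVoid(void* p) {
    assert(p != nullptr);
    return static_cast<unsigned char*>(p);  // void* -> unsigned char* is a valid static_cast
}

// Hypothetical helper: casting directly from a typed pointer.
template <typename T>
unsigned char* direct(T* p) {
    // static_cast<unsigned char*>(p) would be rejected here for T = int;
    // reinterpret_cast is the cast that compiles.
    return reinterpret_cast<unsigned char*>(p);
}
```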

Sheri answered 9/7, 2011 at 13:2 Comment(6)
That's not true: anyone who comes up with a better idea can search for reinterpret_cast, but they can't search for C-style casts. In addition, C-style casts can become even worse casts without warning, like const_cast. - Abrahan
Good point... being able to text search for reinterpret_cast is a big, big plus; I'd never have thought of that. - Sheri
unsigned char and char are permitted to alias with any other type; see ISO/IEC 9899:1999 (E) §6.5/7, last bullet, as well as footnote 73. It is, however, undefined behavior to pass an arbitrary address (i.e., one not derived from taking the address of a valid object) for other reasons (§6.5.6/8, §6.5.3.2/4, etc.). - Cutting
Note that 9899:1999 is the C99 spec, not C++, but it's likely to be the same, since C++ tries to be generally compatible when it comes to semantics... - Cutting
You're right, that would be ISO 14882:2003 3.10(15), I'll update the answer. - Sheri
@Damon: q("static_cast does compile-time type checking, so it is unlikely ...") -- since I pass a void*, there is nothing static_cast could check. The question here really is whether to use static_cast or reinterpret_cast, and since I do not need any cast to get from an object to a void pointer (i.e. void* p = &object; is valid), I thought static_cast was more appropriate than reinterpret_cast. Obviously, if I directly cast an object pointer to a char pointer, I need reinterpret_cast, so that may be an argument to also use reinterpret_cast here. - Delwyn
2

Per ISO/IEC 9899:1999 (E) §6.5/7:

 7. An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • [...]
  • a character type

So it is legal (in C) to dereference and examine a (valid) pointer via unsigned char. However, the contents you'll find there are unspecified; tryToFigureOutWhatThisIs has no well-defined way of actually figuring out what it's looking at. I don't have a copy of the C++ spec here, but I suspect it uses the same definition, in order to maintain compatibility.
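
To illustrate why the contents are not portable, here is a small sketch. The helper name byteOfOne is hypothetical. It reads one byte of the value 1u, and which index holds the nonzero byte depends on the implementation's byte order:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the byte at the given index in the object
// representation of the unsigned int value 1.
unsigned char byteOfOne(std::size_t index) {
    unsigned int one = 1u;
    assert(index < sizeof one);  // stay inside the object
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&one);
    return p[index];  // reading is allowed; the value found is implementation-specific
}
```

On a typical little-endian machine byteOfOne(0) yields 1, while on a typical big-endian machine it yields 0, so code that interprets the raw bytes cannot be written once for all implementations.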

Cutting answered 9/7, 2011 at 15:55 Comment(1)
C++ has similar language: "a char or unsigned char type" rather than "a character type". - Uracil
0

You can only use a char*, not an unsigned char*. Using an unsigned char* will break strict aliasing rules and invoke undefined behaviour, but there is an exception for char*. However, trying to actually do anything with the memory you read is very highly dubious and very likely to do something undefined. That's why it's rarely done in idiomatic C++ code.

Abrahan answered 9/7, 2011 at 13:6 Comment(5)
Hmm, sure about that? I thought the exception applied to unsigned char as well. (I don't think I ever actually looked it up, however.) - Heilungkiang
@jalf: I'm pretty sure that I saw a Standard quote and it only said char*. - Abrahan
ISO/IEC 9899:1999 (E) §6.5/7 references "a character type"; this allows for either signed or unsigned accesses. It's not the C++ spec, but it's likely to be the same... - Cutting
In both C++03 and C++11 it is definitely char and unsigned char. Apparently, signed char is bad even if char is a signed type. See 3.10 [basic.lval]/15 in C++03 and 3.10 [basic.lval]/10 in C++11. - Uracil
@Dennis, @bdonlan: thanks, very helpful. I asked because I wasn't sure, so great to see the relevant standard quotes. :) - Heilungkiang
