0xDEADBEEF vs. NULL

10

26

Throughout various code, I have seen memory allocation in debug builds with NULL...

memset(ptr,NULL,size);

Or with 0xDEADBEEF...

memset(ptr,0xDEADBEEF,size);
  1. What are the advantages of using each one, and what is the generally preferred way to achieve this in C/C++?
  2. If a pointer was assigned a value of 0xDEADBEEF, couldn't it still dereference to valid data?
Dovecote answered 6/5, 2011 at 6:29 Comment(18)
Maybe an answer of this question will help you... maybe...Rena
Why the memset? Why not just ptr = NULL?Madrid
@Fred: the point is to mark the memory pointed to by ptr, not the pointer itself. Typically, the ptr is set to NULL after the memory that it points to has been marked.Kaifeng
Neither. Stop having pointers to things that don't exist.Prosthetics
@FredOverflow, It's not a pointer. It's a buffer of memory.Expiry
@FredOverflow: I believe this question is about initializing memory when using a custom allocator.Bootless
@GMan: I assume this is about a custom allocator, i.e. what ends up getting called when you call new MyClass, before the constructor actually gets called.Bootless
@Ebo: I don't understand, why initialize the memory at all then? It's just going to be over-written.Prosthetics
@GMan: He said "debug build". See my explanation below. It's very common in debug builds to initialize memory with a value that clearly identifies uninitialized memory, like 0xcdcdcdcd (which is what I believe Microsoft's debug allocator uses). It's extremely useful.Bootless
@EboMike: marking also happens when memory is freed in VC debug builds.Kaifeng
@Ebo: Hm, right. I never really had much of a problem there, so I'll just say out of this one. :)Prosthetics
@sean: Exactly, although with a different value (0xdddddddd IIRC) to clearly identify "deleted memory" when looking at it in the debugger.Bootless
@Ebo: and a different signature for buffer bounds (dead man's zone)Kaifeng
@sean: Guard words, correct. Freeing memory will also verify that the guard words around the allocation are still intact and assert if that's not the case. The first line of defense in trapping buffer and array overruns.Bootless
One thing to emphasize here: pointers are NOT usually set to magic values (as that does not help). What they point to is usually painted with a magic value by the memory allocator to indicate state (not allocated/just allocated/released). This is not usually done by the program but rather by the memory allocator.Hug
Thing is: assuming CHAR_BIT is 8, memset(ptr, 0xDEADBEEF, size); and memset(ptr, 0xEF, size); have the exact same effect.Torse
@trinithis: the description of memset() (7.21.6.1 in the Standard) gives the prototype as void *memset(void *s, int c, size_t n); and says: "The memset() function shall copy c (converted to an unsigned char) into each of the first n bytes of the object pointed to by s.". Furthermore, I tested it and verified that my implementation behaves as the Standard describes.Torse
if needed, initialize to 0xCC insteadDyan
61
  1. Using either memset(ptr, NULL, size) or memset(ptr, 0xDEADBEEF, size) is a clear indication that the author did not understand what they were doing.

    Firstly, memset(ptr, NULL, size) will indeed zero-out a memory block in C and C++ if NULL is defined as an integral zero.

    However, using NULL to represent the zero value in this context is not an acceptable practice. NULL is a macro introduced specifically for pointer contexts. The second parameter of memset is an integer, not a pointer. The proper way to zero-out a memory block would be memset(ptr, 0, size). Note: 0 not NULL. I'd say that even memset(ptr, '\0', size) looks better than memset(ptr, NULL, size).

    Moreover, the most recent (at the moment) C++ standard - C++11 - allows defining NULL as nullptr. The nullptr value is not implicitly convertible to type int, which means that the above code is not guaranteed to compile in C++11 and later.

    In the C language (and your question is tagged C as well) the macro NULL can expand to (void *) 0. Even in C, (void *) 0 is not implicitly convertible to type int, which means that in the general case memset(ptr, NULL, size) is simply invalid code in C.

    Secondly, even though the second parameter of memset has type int, the function interprets it as an unsigned char value. It means that only the lowest byte of the value is used to fill the destination memory block. For this reason memset(ptr, 0xDEADBEEF, size) will compile, but will not fill the target memory region with 0xDEADBEEF values, as the author of the code probably naively hoped. memset(ptr, 0xDEADBEEF, size) is equivalent to memset(ptr, 0xEF, size) (assuming 8-bit chars). While this is probably good enough to fill some memory region with intentional "garbage", things like memset(ptr, NULL, size) or memset(ptr, 0xDEADBEEF, size) still betray a major lack of professionalism on the author's part.

    Again, as other answers have already noted, the idea here is to fill the unused memory with a "garbage" value. Zero is certainly not a good idea in this case, since it is not "garbagy" enough. When using memset you are limited to one-byte values, like 0xAB or 0xEF. If this is good enough for your purposes, use memset. If you want a more expressive and unique garbage value, like 0xDEADBEEF or 0xBAADF00D, you won't be able to use memset with it. You'll have to write a dedicated function that can fill a memory region with a 4-byte pattern.

  2. A pointer in C and C++ cannot be assigned an arbitrary integer value (other than a Null Pointer Constant, i.e. zero). Such assignment can only be achieved by forcing the integral value into the pointer with an explicit cast. Formally speaking, the result of such a cast is implementation defined. The resultant value can certainly point to valid data.
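Such a dedicated function might look like the following minimal sketch (the name fill_pattern32 is illustrative, not from the original answer); it replicates the pattern in host byte order:

```c
#include <stdint.h>
#include <string.h>

/* Fill a memory region with a repeating 4-byte pattern, since memset
 * can only replicate a single byte. Trailing bytes (when size is not
 * a multiple of 4) receive the pattern's leading bytes. */
static void fill_pattern32(void *dst, uint32_t pattern, size_t size)
{
    unsigned char bytes[4];
    memcpy(bytes, &pattern, sizeof bytes);  /* host byte order */

    unsigned char *p = dst;
    for (size_t i = 0; i < size; ++i)
        p[i] = bytes[i % 4];
}
```

Reading any aligned 4-byte word back out of the filled region then reproduces the original pattern.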

Quotidian answered 7/5, 2011 at 18:15 Comment(4)
Normally your answer 1 is correct and 2 is incorrect, but there could be implementations with 32-bit chars, which would make memset(ptr, 0xDEADBEEF, size) mean just that. Also, on some platforms there are alignment requirements that mean dereferencing 0xDEADBEEF would fail in some cases (and the implementation might not produce non-aligned pointers).Jardena
@skyking: There's nothing "incorrect" in my second point. Meanwhile, good point about implementations with 32-bit chars.Quotidian
Well, that depends on what you mean by normal and valid data; a bit clumsy of me to express it that way. What is clear is that on most platforms 0xDEADBEEF can't point to data that is required or chosen (by the compiler) to reside at an aligned address. It also depends on what's meant by "valid data" - if you only require that you're able to access the data without a segmentation fault, then on an x86 platform you would pass, but if you require that the data is actually valid then it's a less normal case, since the data would need to be bool or char.Jardena
...in addition one could make the case that x86_64 is more normal these days and then you would end up with a full 64-bit value that is not valid as an address.Jardena
10

Writing 0xDEADBEEF or another non-zero bit pattern is a good idea to be able to catch both write-after-delete and read-after-delete uses.

1) Write after delete

By writing a specific pattern you can check if a block that has already been deallocated was written over later by buggy code; in our debug memory manager we use a free list of blocks, and before recycling a memory block we check that our custom pattern is still written all over the block. Of course it's sort of "late" when we discover the problem, but still much earlier than when it would be discovered not doing the check. Also, we have a special function that is called periodically, and that can also be called on demand, that just goes through the list of all freed memory blocks checking their consistency, so we can call this function often when chasing a bug. Using 0x00000000 as the value wouldn't be as effective because zero may well be exactly the value that buggy code wants to write in the already deallocated block, e.g. zeroing a field or setting a pointer to NULL (it's much more unlikely that the buggy code wants to write 0xDEADBEEF).
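A minimal sketch of that kind of consistency check, under the assumption that freed blocks were filled with a single known byte (the function name and fill value are illustrative, not the answer's actual allocator):

```c
#include <stddef.h>
#include <string.h>

#define FREED_FILL 0xEF  /* illustrative one-byte fill value for freed blocks */

/* Returns 1 if a freed block still holds the fill pattern end to end,
 * i.e. nothing wrote into it after deallocation; 0 otherwise. */
static int freed_block_intact(const unsigned char *block, size_t size)
{
    for (size_t i = 0; i < size; ++i)
        if (block[i] != FREED_FILL)
            return 0;  /* write-after-delete detected */
    return 1;
}
```

A debug allocator would run this over every block on its free list before recycling it, and assert on failure.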

2) Read after delete

Leaving the content of a deallocated block untouched, or even writing just zeros, will increase the possibility that someone reading the content of a dead memory block will still find the values reasonable and compatible with invariants (e.g. a NULL pointer, as on many architectures NULL is just binary zeroes, or the integer 0, the ASCII NUL char, or a double value 0.0). By instead writing "strange" patterns like 0xDEADBEEF, most of the code that accesses those bytes in read mode will probably find strange, unreasonable values (e.g. the integer -559038737 or a double with value -1.1885959257070704e+148), hopefully triggering some other self-consistency check assertion.
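Those "unreasonable" values can be reproduced by reinterpreting the fill bytes; a sketch assuming a 32-bit int, a 64-bit IEEE-754 double, and type punning via memcpy (function names are illustrative):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 4-byte fill pattern as a signed 32-bit integer. */
static int32_t pattern_as_int32(uint32_t bits)
{
    int32_t v;
    memcpy(&v, &bits, sizeof v);  /* well-defined type punning */
    return v;
}

/* Reinterpret eight fill bytes as an IEEE-754 double. */
static double pattern_as_double(uint64_t bits)
{
    double v;
    memcpy(&v, &bits, sizeof v);
    return v;
}
```

0xDEADBEEF comes out as the integer -559038737, and 0xDEADBEEFDEADBEEF as a huge negative double, neither of which looks like a plausible live value.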

Of course nothing is really specific to the bit pattern 0xDEADBEEF; actually we use different patterns for freed blocks, the before-block area and the after-block area, and our memory manager also writes another (address-dependent) specific bit pattern to the content part of any memory block before giving it to the application (this is to help finding uses of uninitialized memory).

Calycle answered 6/5, 2011 at 6:49 Comment(0)
9

I would definitely recommend 0xDEADBEEF. It clearly identifies uninitialized variables, and accesses to uninitialized pointers.

Because the value is odd, dereferencing a 0xdeadbeef pointer will definitely crash on the PowerPC architecture when loading a word, and will very likely crash on other architectures too, since the memory is likely to be outside the process' address space.

Zeroing out memory is a convenience since many structures/classes have member variables that use 0 as their initial value, but I would very much recommend initializing each member in the constructor rather than using the default memory fill. You will really want to be on top of whether or not you properly initialized your variables.

Bootless answered 6/5, 2011 at 6:34 Comment(3)
I didn't downvote, but I'd say the statement "the memory is likely to be outside the process' address space" is certainly dead wrong. On any 32-bit architecture 0xDEADBEEF is guaranteed to be inside the process address space, by definition.Quotidian
However, dereferencing a pointer into the zero page usually results in a segfault; the same might not be true for 0xDEADBEEF. The only reason dereferencing it could result in an error is that it's not 4-byte aligned (and that's probably also the reason PPC doesn't like it).Trilby
@Jasper: It might not be true for dereferencing 0 either, and on Windows it will result in an error because that memory doesn't belong to the process. However, the point isn't just to force a segfault. It's to create a value that says "this is uninitialized" when a bug occurs and you're tracking it down. If that bug happens to be a segfault, then great, you caught it early.Gery
6

http://en.wikipedia.org/wiki/Hexspeak

These "magic" numbers are a debugging aid to identify bad pointers, uninitialized memory etc. You want a value that is unlikely to occur during normal execution and something that is visible when doing memory dumps or inspecting variables. Initializing to zero is less useful in this regard. I would guess that when you see people initialize to zero it is because they need to have that value at zero. A pointer with a value of 0xDEADBEEF could point to a valid memory location, so it's a bad idea to use that as an alternative to NULL.

Sabotage answered 6/5, 2011 at 6:35 Comment(1)
Whether 0xDEADBEEF can point to valid memory or not depends on the implementation: it can't point to valid memory under Windows, for example (since Windows maps kernel code into this address). And it's unaligned, so it generally can't point to the beginning of an object. It's a good choice for uninitialized memory (as opposed to a null pointer - your distinction of the two is good).Forgotten
4

One reason that you null the buffer or set it to a special value is that you can easily tell in the debugger whether the buffer's contents are valid or not.

Dereferencing a pointer with the value 0xDEADBEEF is almost always dangerous (it will probably crash your program/system) because in most cases you have no idea what is stored there.

Expiry answered 6/5, 2011 at 6:38 Comment(2)
It's not really so much "what is stored [at address 0xDEADBEEF]" being inherently mysterious or dangerous, as that the address is unlikely to be part of your virtual address space, causing a memory access violation / SIGSEGV or similar. Still, in my experience it's more common for memory content to be overwritten with DEADBEEF than for pointers to be loaded with it, though of course a pointer to an overwritten structure therefore becomes DEADBEEF indirectly....Skive
...except that, of course, memset cannot be used to overwrite a memory region with a 4-byte patternQuotidian
1

DEADBEEF is an example of hexspeak. With it, you as a programmer intentionally convey an error condition.

Factfinding answered 6/5, 2011 at 6:35 Comment(2)
I already knew that, my question is whether I should use 0xDEADBEEF for uninitialized memory over null.Dovecote
a bit lame to downvote for that; it wasn't clear from your question you already knew this. Although clearly not THE answer to your question, I think it's still useful and on topic information.Factfinding
1

I would personally recommend using NULL (or 0x0), as it represents null as expected and comes in handy for comparisons. Imagine you are using a char * and somewhere along the way it gets set to DEADBEEF for some reason (I don't know why); with NULL, at least your debugger will come in very handy to tell you that it's 0x0.Rugged

Rugged answered 6/5, 2011 at 6:41 Comment(1)
The problem is the amount of data that has a valid value of 0x00. When you see 0xDEADBEEF, or remnants of it, in the debugger, you know you have screwed up. When you see lots of 0's, you have no idea.Distributary
1

I would go for NULL because it's much easier to mass zero out memory than to go through later and set all the pointers to 0xDEADBEEF. In addition, there's nothing at all stopping 0xDEADBEEF from being a valid memory address on x86- admittedly, it would be unusual, but far from impossible. NULL is more reliable.

Ultimately, look: NULL is the language convention. 0xDEADBEEF just looks pretty, and that's it. You gain nothing from it. Libraries will check for NULL pointers; they don't check for 0xDEADBEEF pointers. In C++ the idea of the null pointer isn't even tied to a zero value, just indicated with the literal zero, and in C++0x there are nullptr and nullptr_t.

Kratz answered 6/5, 2011 at 6:46 Comment(2)
Zeroing memory will increase the likelihood that a read-after-delete or write-after-delete will go unnoticed.Calycle
The fact that libraries check for NULL pointers is why you probably shouldn't do this, unless you are 100% sure the behavior will continue into release builds. If you forget to initialize some data, the surrounding code will happily ignore it, because you probably intended to initialize it to zero anyhow. Then, you switch to release build, and it goes back to actually looking like uninitialized data and all your if(!p) checks become worthless. Just be sure to pick a value that your system guarantees is invalid; 0 is generally not the only option.Gery
0

Vote me down if this is too opinion-y for StackOverflow but I think this whole discussion is a symptom of a glaring hole in the toolchain we use to make software.

Detecting uninitialized variables by initializing memory with "garbage-y" values detects only some kinds of errors in some kinds of data.

And detecting uninitialized variables in debug builds but not in release builds is like following safety procedures only when testing an aircraft and telling the flying public to be satisfied with "well, it tested OK".

WE NEED HARDWARE SUPPORT for detecting uninitialized variables. As in something like an "invalid" bit that accompanies every addressable unit of memory (= a byte on most of our machines), which is set by the OS in every byte that VirtualAlloc() (et al., or equivalents on other OSes) hands over to applications, which is automatically cleared when the byte is written to, but which causes an exception if read first.

Memory is cheap enough for this and processors are fast enough for this. This would end the reliance on "funny" patterns and keep us all honest to boot.
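Tools like Valgrind's Memcheck and MemorySanitizer approximate such an "invalid" bit in software using shadow memory. A toy model of the idea, with all names hypothetical:

```c
#include <stdlib.h>

/* Toy software model of the proposed per-byte "invalid" bit: a shadow
 * array tracks whether each byte has been written since allocation. */
typedef struct {
    unsigned char *data;
    unsigned char *valid;  /* 0 = never written ("invalid"), 1 = written */
    size_t size;
} tracked_buf;

static tracked_buf tracked_alloc(size_t size)
{
    tracked_buf b = { malloc(size), calloc(size, 1), size };
    return b;
}

static void tracked_write(tracked_buf *b, size_t i, unsigned char v)
{
    b->data[i] = v;
    b->valid[i] = 1;  /* writing clears the "invalid" state */
}

/* Returns 0 on an uninitialized read; real hardware would trap instead. */
static int tracked_read(const tracked_buf *b, size_t i, unsigned char *out)
{
    if (!b->valid[i])
        return 0;  /* read-before-write detected */
    *out = b->data[i];
    return 1;
}
```

The software version pays the overhead in instructions rather than in the extra physical bits the answer proposes.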

Orlina answered 11/5, 2011 at 16:31 Comment(4)
This hardware support would come down to relying on software being written correctly to make use of it. In general the hardware has no way of knowing when a variable goes from being initialized to uninitialized, because that is a software concept. Further, you're talking about an extremely non-trivial amount of resources dedicated to this. Even just a single "dirty" bit would require an additional 12.5% increase in physical memory on modern systems, and likely more virtual memory, since you can't just ask Windows to write 9 bits to the hard drive.Gery
There are many levels of error detection used when debugging modern software. This is just one of them. Nobody presents it as a silver bullet or as the "one true way" to do it. Normally, it is just a single step in the system of security measures. And this single step, despite being rather simple, has proven to be quite effective at what it is supposed to do. The very nature of this measure makes it more appropriate in debug builds (although I can see that sometimes it can be applicable in release builds as well).Quotidian
Re Andrey's: No argument that garbage-y fill is useful - of course it is. I'm greedy: I want more. Re Dennis's: it was not my intention to design the hardware solution, but one could imagine a single instruction per 4K block, issued by the OS on malloc/VirtualAlloc/etc; the "invalid" bit would be invisible to software. It costs 12.5% memory - sounds cheap to me for the benefit.Orlina
Yes, it will only detect some errors, but on the other hand it does detect some errors, and those errors are often quite serious. Note that using "garbage-y" values is not about improving run-time safety - it's more about making it more likely that faults will be evident. The analogy with the airplane would be to simulate malfunctions during test flights (for example, turning one engine off) - it's not so much failing to follow safety procedures when testing as deliberately breaking things to see if it would still work.Jardena
0

Note that the second argument to memset is supposed to be a byte: it is converted to an unsigned char, so only the lowest byte is used. 0xDEADBEEF would on most platforms convert to 0xEF (and to something else on some odd platform).
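That truncation can be checked directly; a minimal sketch assuming 8-bit chars (the function name is illustrative):

```c
#include <string.h>

/* memset converts its int argument to unsigned char, so passing
 * 0xDEADBEEF fills with only the low byte 0xEF (assuming 8-bit chars;
 * some compilers warn about the narrowing constant). */
static int memset_uses_low_byte_only(void)
{
    unsigned char a[8], b[8];
    memset(a, 0xDEADBEEF, sizeof a);
    memset(b, 0xEF, sizeof b);
    return memcmp(a, b, sizeof a) == 0;
}
```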

Also note that the second argument is supposed to formally be an int which NULL isn't.

Now for the advantage of doing this kind of initialization. First, of course, the behavior becomes more deterministic (even if this technically lands us in undefined behavior, in practice the behavior would be consistent).

Having deterministic behavior will mean that debugging becomes easier, when you found a bug you would "only" have to provide the same input and the fault will manifest itself.

Now when you select which value you would use you should select a value that most likely will result in bad behavior - which means the use of uninitialized data would more likely result in a fault being observed. This means that you would have to use some knowledge of the platform in question (however many of them behave quite similar).

If the memory is used to hold pointers, then having cleared the memory will indeed mean that you get a NULL pointer, and normally dereferencing that will result in a segmentation fault (which will be observed as a fault). However, if you use it in another way, for example as an arithmetic type, then you will get 0, and for many applications that is not that odd a number.

If you instead use 0xDEADBEEF you will get a quite large integer; when interpreting the data as floating point it will also be a quite large number (IIRC). If interpreting it as text, it will be very long and contain non-ASCII characters, and if you use UTF-8 encoding it will likely be invalid. Now, if used as a pointer on some platforms, it would fail alignment requirements for some types; also, on some platforms that region of memory might be mapped out anyway (note that on x86_64 the value of the pointer would be 0xDEADBEEFDEADBEEF, which is out of range for an address).

Note that while filling with 0xEF will have pretty much similar properties, if you want to fill the memory with 0xDEADBEEF you would need to use a custom function since memset doesn't do the trick.

Jardena answered 7/4, 2016 at 8:38 Comment(0)
