Is apparent NULL pointer dereference in C actually pointer arithmetic?

I've got this piece of code. It appears to dereference a null pointer here, but then bitwise-ANDs the result with unsigned int. I really don't understand the whole part. What is it intended to do? Is this a form of pointer arithmetic?

#include <stdio.h>

struct hi
{
    long a;
    int b;
    long c;
};

int main()
{
    struct hi ob={3,4,5};
    struct hi *ptr=&ob;
    int num= (unsigned int) & (((struct hi *)0)->b);

    printf("%d",num);
    printf("%d",*(int *)((char *)ptr + (unsigned int) & (((struct hi *)0)->b)));
}

The output I get is 44. But how does it work?

Mcclean answered 26/12, 2010 at 4:48 Comment(2)
The answers describe how this manages to work, but don't do that: it hurts to look at it. - Gallipot
See also #714463 - Racketeer
3

This is not an "and"; it is the unary address-of operator applied to the operand on its right.
This is a standard hack for getting the offset of a struct member. You cast 0 to a pointer to struct hi, then refer to the 'b' member and take its address. Then you add this offset to the pointer "ptr", which gives the real address of the 'b' field of the struct pointed to by ptr, which is ob. Then you cast that address to an int pointer (because b is an int), dereference it, and print the result. This is the 2nd print. The first print outputs num, which is 4 not because b's value is 4, but because 4 is the offset of the b field in the hi struct. That offset is sizeof(long) on your platform, because b follows a, and a is a long... Hope this makes sense :)

Knisley answered 26/12, 2010 at 4:56 Comment(4)
So is it something like this: I take address 0, find the offset of my structure member from 0, and then add it to the original address, i.e. ptr? - Mcclean
Yes. Google for offsetof, mentioned in the other answers; you will find some examples of why this is useful. - Knisley
And ((struct hi *)0)->b; gives me a segmentation fault, but when I add the '&' operator it works. How is that? - Mcclean
Well, when you ask for ((struct hi *)0)->b you are asking for the value stored at address 4, which you don't have access to. When you ask for & of that, you are asking for the address itself, which is 4, and that is a valid calculation. You can talk about what the question is, but you can't ask it. - Knisley
8

It isn't really dereferencing the null pointer. You should look at the whole expression. What the code says is: take the number 0, treat it as a struct hi *, select member b in the struct it points to, and take the address of that member. The result of this operation is the offset of member b from the beginning of the struct. When you add it to the pointer and read through the result, you get member b, which equals 4.

Ijssel answered 26/12, 2010 at 4:53 Comment(1)
@James that's what I meant :) - Ijssel
7

This gives you the offset in bytes of the b field inside the hi struct

((struct hi *)0) is a pointer to a hi struct, starting at address 0.

(((struct hi *)0)->b) is the b field of the above struct

& (((struct hi *)0)->b) is the address of the above field. Because the hi struct is located at address 0, this is the offset of b within the struct.

(unsigned int) & (((struct hi *)0)->b) is a conversion of that from the address type to unsigned int, so that it can be used as a number.

You're not actually dereferencing a NULL pointer. You're just doing pointer arithmetic.


Accessing (((struct hi *)0)->b) will give you a segmentation fault because you're trying to access a forbidden memory location.

Using & (((struct hi *)0)->b) does not give you segmentation fault because you're only taking the address of that forbidden memory location, but you're not trying to access said location.

Sternutatory answered 26/12, 2010 at 4:52 Comment(3)
Is there any reason to use this method over offsetof? - Zilvia
offsetof is probably implemented this way in some header file. So it's better to use offsetof (because if this way doesn't work, the people who wrote the header probably used one that does), but it is essentially the same. - Ijssel
And ((struct hi *)0)->b; gives me a segmentation fault, but when I add the '&' operator it works. How is that? - Mcclean
3

You must be using a 32-bit compile (or a 64-bit compile on Windows).

The first expression - for num - is a common implementation of the offsetof macro from <stddef.h>; it is not portable, but it often works.

The second expression adds that offset (4) to the base address of the object that ptr points to and reads the int stored there, which is the value 4 in the structure.

Your output does not include a newline - it probably should (the behaviour is not completely portable because it is implementation defined if you don't include the newline: C99 §7.19.2: "Whether the last line requires a terminating new-line character is implementation-defined."). On a Unix box, it is messy because the next prompt will appear immediately after the 44.

Yorker answered 26/12, 2010 at 4:54 Comment(2)
The second expression actually adds it to ptr, which points to the struct, so given the offset of b you get 4 because that's what b is equal to. - Ijssel
Behavior of what is undefined if you don't include a newline? - Tabulate
0

Just to clarify: you need to understand the difference between a real NULL-pointer dereference and an expression that only looks like one. For the exact forms &*p and &p[i], the C standard says that neither operator is evaluated. For &p->member it makes no such explicit promise, but in practice compilers typically do not emit the memory access when the & (address-of) operator is applied to the result.

So &(((struct T *)0)->b) does not load anything through ->; it simply computes the address that many bytes past address 0, treating 0 as a struct T *. This really obfuscates things for beginners. However, it's widely used in the Linux kernel, and it is what makes list_entry, list_head and the various pointer-arithmetic tricks that newcomers find hard to follow actually work.

In any event, it's a programmatic way of finding the offset of 'b' within the struct T object. It's used in offsetof as well as in list_head helpers such as list_entry.

For more information - you can read about this within Robert Love's Book titled "Linux Kernel Development".

Irksome answered 1/1, 2011 at 21:51 Comment(1)
Aren't all address computations required to operate upon and yield either addresses of addressable objects, or else the addresses immediately following addressable objects? Would not an implementation be entitled to trigger nasal demons any time an address computation works with or yields anything else? - Fruit
