void pointer = int pointer = float pointer
P

7

5

I have a void pointer pointing to a memory address. Then, I do

  • int pointer = the void pointer

  • float pointer = the void pointer

and then dereference them to get the values.

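#include <stdio.h>

int main(void)
{
    int x = 25;

    void   *p  = &x;
    int    *pi = p;
    float  *pf = p;   /* the conversion compiles, but x is still an int */
    double *pd = p;

    printf("x: %d\n", x);
    printf("*p: %d\n", *(int *)p);
    printf("*pi: %d\n", *pi);
    printf("*pf: %f\n", *pf);   /* reads an int object as a float */
    printf("*pd: %f\n", *pd);   /* reads sizeof(double) bytes from an int-sized object */

    return 0;
}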

The output of dereferencing pi (int pointer) is 25. However, the output of dereferencing pf (float pointer) is 0.000. Also, dereferencing pd (double pointer) outputs a negative fraction that keeps changing.

Why is this, and is it related to endianness (my CPU is little endian)?

Pastelki answered 21/12, 2015 at 16:8 Comment(7)
There are a ton of UBs going on here. Is this a homework problem?Trifle
What is the desired behaviour? You don't mention it. You play with pointers and types, you get some experimental results and you ask why they occur. But what did you expect, and why did you expect that?Tanto
If you expect all the pointers to show 25, that can't be true; read about the mantissa.Apyretic
@DanielDaranas I thought it was obvious that I expect the same output of 25.Pastelki
@Pastelki I thought it was obvious that I expect the same output of 25. Why would you expect that? Different types such as int, float, and double can have different sizes, and in the case of float and double do, which more than implies a different internal representation.Cadge
What is the in-memory binary representation of A) an int = 25, B) a float = 25, and C) a double = 25? Are they the same? If not, how do they differ? Happy learning experience!Became
Possible duplicate of Dereferencing on casting the void pointer to float*/int*Decree
A
7

As per the C standard, you're allowed to convert any object pointer to void * and convert it back; the result compares equal to the original pointer.

To quote C11, chapter §6.3.2.3

[...] A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

That is why, when you cast the void pointer to int *, de-reference and print the result, it prints properly.
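The round trip described above can be sketched as follows. This is only an illustration; the function names are made up for this example and do not come from the question.

```c
#include <stdbool.h>

/* Converts an int pointer to void * and back, then reports whether the
   restored pointer compares equal to the original (C11 6.3.2.3). */
bool round_trip_preserves_pointer(int *original)
{
    void *generic  = original;  /* implicit conversion to void * */
    int  *restored = generic;   /* implicit conversion back to int * */
    return restored == original;
}

/* Reads the value through the void pointer after casting it back to int *.
   This is well defined because the pointed-to object really is an int. */
int read_through_void(int *original)
{
    void *generic = original;
    return *(int *)generic;
}
```

Calling read_through_void(&x) with x set to 25 returns 25, which matches what the question observed for pi.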

However, the standard does not guarantee that you can dereference that pointer as a different data type. Doing so essentially invokes undefined behaviour.

So, dereferencing pf or pd to get a float or double is undefined behavior, as you're trying to read the memory allocated for an int as a float or double. There's a clear mismatch, which leads to the UB.

To elaborate, int and float (and double) have different internal representations, so casting a pointer to another type and then dereferencing it to get a value of that other type won't work.

Related, C11, chapter §6.5.3.2

[...] If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

and for the invalid value part, (emphasis mine)

Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.

Antelope answered 21/12, 2015 at 16:13 Comment(10)
If p is of void * type then it should be converted to float * and double * too.Dualpurpose
@Dualpurpose but the attempt to dereference will cause the UB, no? Please correct me sir if i'm wrong.Antelope
I am not sure yet. looking at the standard.Dualpurpose
@SouravGhosh AFAIK it is UB. C typecasts just tell the compiler to "treat the variable in memory as a certain type" regardless of the original type. There is a no value/type conversion going on. (edited, missed a no)Myogenic
@DaanTimmer that is what my understanding is, too. However, let me check once more....Antelope
@SouravGhosh; Yes. Now it makes sense. Thanks for the quote.Dualpurpose
@SouravGhosh So here's what I understood so far. The reason that int * dereferencing is the only one that works is that the original datatype (x) is int. Endianness has nothing to do with this. Real number representations have nothing to do with this. Correct?Pastelki
Just another question to clarify, as int and float are certainly same size (4 bytes), can we expect to read the float value represented by the binary of 25 in pf (ie 0x00000019 and would read the float value 3.5E-44 [which gives 0.000 if not displayed correctly]), or even this cannot be told?Barbarous
@YannRoth once you violate the strict aliasing rule, you'll invoke UB, so nothing more is guaranteed. :)Antelope
@Dualpurpose Thanks to you too sir, glad I'm able to clarify. :)Antelope
A
4

In addition to the answers before, I think that what you were expecting could not be accomplished because of the way floating-point numbers are represented.

Integers are typically stored in two's complement, which basically means that the number is stored as one piece. Floats, on the other hand, are stored differently, using a sign, an exponent, and a fraction (mantissa). Read here.

So the conversion you had in mind is impossible: you take a number stored as raw bits and look at it as if it were encoded differently, which gives unexpected results even if the conversion were legitimate.
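To make the difference in encoding concrete, here is a minimal sketch that extracts the bit pattern of a float and of a double. It assumes IEEE 754 binary32 and binary64 formats (the usual case, but not required by the C standard); the helper names are invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits and double is 64 bits wide. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
_Static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Returns the raw bit pattern of a float by copying its bytes. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Returns the raw bit pattern of a double by copying its bytes. */
uint64_t double_bits(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Under those assumptions, float_bits(25.0f) is 0x41C80000 and double_bits(25.0) is 0x4039000000000000, while the int 25 is simply 0x00000019. The three patterns have nothing in common, so reading one as another cannot give 25.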

Apyretic answered 21/12, 2015 at 16:38 Comment(0)
J
3

So... here's probably what's going on.

However the output of dereferencing pf(float pointer) is 0.000

It's not 0. It's just really tiny.

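You have 4-byte integers. Taking the value 5 as an example, the integer looks like this in memory on a little-endian machine...

5        0        0        0
00000101 00000000 00000000 00000000

Read as a 32-bit value that is 0x00000005, which interpreted as an IEEE 754 float looks like...

sign  exponent  fraction
   0  00000000  0000000 00000000 00000101
   +   2**-126  * (5 * 2**-23)

So, you're outputting a float, but it's incredibly tiny. The exponent field is all zeros, so this is a denormalized number with the value 5 * 2^-149, which is virtually indistinguishable from 0.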

If you try printing the float with printf("*pf: %e\n", *pf); then it should give you something meaningful but small: 7.006492e-45
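As a rough check of that number, the sketch below splits a 32-bit pattern into the three IEEE 754 single-precision fields and computes the value it encodes. It assumes the binary32 layout and ignores infinities and NaNs (exponent field 255); the struct and function names are hypothetical.

```c
#include <stdint.h>

struct float_fields {
    unsigned sign;      /* 1 bit   */
    unsigned exponent;  /* 8 bits, biased by 127 */
    uint32_t fraction;  /* 23 bits */
};

/* Splits a 32-bit pattern into sign, exponent and fraction fields. */
struct float_fields split_float_bits(uint32_t bits)
{
    struct float_fields f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Multiplies x by 2^power using repeated doubling or halving. */
static double scale_by_power_of_two(double x, int power)
{
    for (; power > 0; power--) x *= 2.0;
    for (; power < 0; power++) x /= 2.0;
    return x;
}

/* Value encoded by a finite binary32 pattern. */
double value_of_float_bits(uint32_t bits)
{
    struct float_fields f = split_float_bits(bits);
    double magnitude;
    if (f.exponent == 0) {
        /* Denormalized: no implicit leading 1, exponent fixed at -126. */
        magnitude = scale_by_power_of_two((double)f.fraction, -126 - 23);
    } else {
        /* Normalized: implicit leading 1 in front of the fraction. */
        magnitude = scale_by_power_of_two((double)(f.fraction | 0x800000u),
                                          (int)f.exponent - 127 - 23);
    }
    return f.sign ? -magnitude : magnitude;
}
```

For the pattern 0x00000005 the exponent field is 0 and the fraction is 5, so the value is 5 * 2^-149, which is about 7.006492e-45 and agrees with the %e output.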

Also dereferencing pd (double pointer) outputs a negative fraction that keeps changing?

Doubles are 8 bytes, but x only provides 4 bytes. The changing negative fraction is the result of also reading the 4 bytes of memory next to x, which do not belong to it. Their contents are arbitrary, and it's normal to see them change with every run.

Jaggy answered 21/12, 2015 at 16:43 Comment(2)
There's something wrong with this answer because 2^-117 isn't 7e-45. I suspect it has to do with denormalized numbers, but I haven't been able to figure out the details of this behavior. The gist of the answer (5 parses out to a very tiny float) is still true though.Jaggy
Excellent explanation. +1 for the memory representation of floatsPastelki
T
2

There are two kinds of UBs going on here:

1) Strict aliasing

What is the strict aliasing rule?

"Strict aliasing is an assumption, made by the C (or C++) compiler, that dereferencing pointers to objects of different types will never refer to the same memory location (i.e. alias each other.)"

However, strict aliasing can be turned off as a compiler extension, for example with -fno-strict-aliasing in GCC. In that case your pf version would work in practice, although the result is implementation-specific and assumes nothing else has gone wrong (float and int are both 32-bit types with 32-bit alignment on most computers). If your computer uses IEEE 754 single precision, you get a very small denormalized floating-point number, which explains the result you observe.

Strict aliasing is a controversial feature of standard C (considered a bug by a lot of people), and it makes reinterpret casts (aka type punning) more difficult and hackier than they would otherwise be.

Until you understand type punning well, and know how it behaves with your compiler version and hardware, you should avoid doing it.
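If the goal really is to reinterpret the bits of an int as a float, one commonly used approach that does not dereference a mismatched pointer is to copy the bytes with memcpy. The following is a sketch under the assumption that int32_t and float have the same size and that float is IEEE 754 single precision; int_bits_as_float is a name invented here.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: both types occupy 4 bytes on this platform. */
_Static_assert(sizeof(float) == sizeof(int32_t), "sizes must match");

/* Copies the object representation of an int32_t into a float.
   No float lvalue is ever used to read the int object itself. */
float int_bits_as_float(int32_t value)
{
    float result;
    memcpy(&result, &value, sizeof result);
    return result;
}
```

With those assumptions, int_bits_as_float(25) yields the denormalized value 25 * 2^-149, roughly 3.5e-44, which %f prints as 0.000000.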

2) Out-of-bounds memory access

Your pointer points to a memory space as large as an int, but you dereference it as a double, which is usually twice the size of an int. You are basically reading half a double of garbage from somewhere else in memory, which is why your double keeps changing.

Trifle answered 21/12, 2015 at 16:38 Comment(0)
A
1

The types int, float, and double have different memory layouts, representations, and interpretations.

On my machine, int is 4 bytes, float is 4 bytes, and double is 8 bytes.

Here is how you explain the results you are seeing.

Dereferencing the int pointer works, obviously, because the original data was an int.

Dereferencing the float pointer, the compiler generates code to interpret the contents of 4 bytes in memory as a float. The value in those 4 bytes, when interpreted as a float, is so close to zero that %f prints it as 0.000000. Look up how float is represented in memory.

Dereferencing the double pointer, the compiler generates code to interpret the contents in memory as a double. Because a double is larger than an int, this accesses the 4 bytes of the original int and an extra 4 bytes on the stack. Because the contents of these extra 4 bytes depend on the state of the stack and are unpredictable from run to run, you see varying values that correspond to interpreting the entire 8 bytes as a double.
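To see that the varying output comes from the extra bytes rather than from x itself, here is a small sketch that supplies those extra bytes explicitly. It copies the int into a zero-filled buffer as large as a double and then reads the buffer as a double. The function name is made up, and the exact value depends on the platform's sizes, byte order, and floating-point format.

```c
#include <string.h>

/* Places the bytes of an int at the start of a zero-filled, double-sized
   buffer and returns that buffer interpreted as a double. Every byte that
   is read has a known value, so the result does not change between runs. */
double int_bytes_padded_as_double(int value)
{
    unsigned char buffer[sizeof(double)] = {0};
    size_t count = sizeof value < sizeof buffer ? sizeof value : sizeof buffer;
    double result;

    memcpy(buffer, &value, count);           /* the bytes that belong to the int */
    memcpy(&result, buffer, sizeof result);  /* all bytes of the double are defined */
    return result;
}
```

With the padding bytes fixed at zero the function returns the same tiny value on every call, unlike *pd in the question, where the neighbouring bytes are whatever happens to be on the stack.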

Amann answered 21/12, 2015 at 16:43 Comment(0)
T
1

In the following,

printf("x: %d\n", x); //OK
printf("*p: %d\n", *(int *)p); //OK
printf("*pi: %d\n", *pi); //OK
printf("*pf: %f\n", *pf); // UB
printf("*pd: %f\n", *pd); // UB

The accesses in the first 3 printfs are fine, as you are accessing an int through an lvalue of type int. But the next 2 are not fine, as they violate C11 §6.5, paragraph 7 (Expressions).

int is not a type compatible with float or double, so the accesses in the last two printf() calls cause undefined behaviour.

C11, §6.5, paragraph 7 states:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of the object,

— a type that is the signed or unsigned type corresponding to the effective type of the object,

— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

— a character type.
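The last item in that list is what makes byte-wise inspection legal: any object may be read through a character type. A minimal sketch, with hypothetical helper names, that also shows where endianness actually comes in:

```c
#include <stddef.h>

/* Returns one byte of the int's object representation.
   The caller must keep index below sizeof(int). */
unsigned char byte_of_int(const int *object, size_t index)
{
    const unsigned char *bytes = (const unsigned char *)object;
    return bytes[index];
}

/* Returns 1 if the least significant byte of an int is stored first. */
int is_little_endian(void)
{
    const int probe = 1;
    return byte_of_int(&probe, 0) == 1;
}
```

On a little-endian machine byte_of_int(&x, 0) is 25 for x = 25, and on a big-endian machine the 25 sits in the last byte instead. Endianness only changes the order of the bytes, not how a float or double interprets them.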

Trinitrobenzene answered 21/12, 2015 at 16:48 Comment(0)
S
0

The term "C" is used to describe two languages: one invented by K&R, in which pointers identify physical memory locations, and one derived from it, which works the same way when pointers are read and written in ways that abide by certain rules but may behave in arbitrary fashion when they are used in other ways. While the latter language is defined by the Standards, the former is what became popular for microcomputer programming in the 1980s.

One of the major impediments to generating efficient machine code from C code is that compilers can't tell what pointers might alias what variables. Thus, any time code accesses a pointer that might point to a given variable, generated code is required to ensure that the contents of the memory identified by the pointer and the contents of the variable match. That can be very expensive. The people writing the C89 Standard decided that compilers should be allowed to assume that named variables (static and automatic) will only be accessed using pointers of their own type or character types; the people writing C99 decided to add additional restrictions for allocated storage as well.

Some compilers offer means by which code can ensure that accesses using different types will go through memory (or at least behave as though they are doing so), but unfortunately I don't think there's any standard for that. C11 added a memory model for use with multi-threading which should be capable of achieving the required semantics, but I don't think compilers are required to honor such semantics in cases where they can tell that there's no way for outside threads to access something [even if going through memory would be necessary to achieve correct single-thread semantics].

If you're using gcc and want to have memory semantics that work as K&R intended, use the "-fno-strict-aliasing" command-line option. To make code efficient it will be necessary to make substantial use of the "restrict" qualifier which was added in C99. While the authors of gcc seem to have focused more on type-based aliasing rules than "restrict", the latter should allow more useful optimizations.
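For illustration, a function using the restrict qualifier might look like the sketch below. The function and its parameters are invented for this example; the qualifier is a promise by the caller that the two arrays do not overlap, which lets the compiler keep values in registers across iterations.

```c
#include <stddef.h>

/* Writes src[i] * factor into dst[i] for count elements.
   restrict promises that dst and src never refer to overlapping storage;
   calling this with overlapping arrays is undefined behaviour. */
void scale_into(float *restrict dst, const float *restrict src,
                float factor, size_t count)
{
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i] * factor;
}
```

Unlike type-based aliasing rules, this assumption is stated explicitly at the call boundary, so it applies even when both pointers have the same type.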

Sanguinary answered 23/12, 2015 at 16:21 Comment(0)
