Consequences of warning “dereferencing type-punned pointer will break strict-aliasing rules”

Asked 23/5, 2014 at 5:0 Answered 23/5, 2014 at 6:8

Solved c casting compiler-optimization strict-aliasing pointer-aliasing

I have read some questions on similar topics and some related material, but my question is mainly about understanding the warning for the code below. I do not want a fix! I understand there are two ways to avoid the warning: a union or memcpy.

uint32 localval;
void * DataPtr;
localval = something;
(*(float32*)(DataPtr))= (*(const float32*)((const void*)(&localval)));

Please note the following significant points:
1. Both types involved in the cast here are 32 bits wide (or am I wrong?).
2. Both are local variables.

Compiler-specific points:
1. The code is supposed to be platform independent; this is a requirement!
2. I compiled with GCC and it worked as expected (I could reinterpret the int as a float), which is why I ignored the warning.

My questions
1. What optimizations could the compiler perform in this aliasing case?
2. As both would occupy the same size (correct me if not), what could be the side effects of such a compiler optimization?
3. Can I safely ignore the warning or turn off aliasing?
4. If the compiler hasn't performed an optimization and my program is not broken after my first compilation, can I safely assume that the compiler will behave the same way every time (i.e. not perform the optimization)?
5. Does the aliasing apply to a void * typecast too, or is it applicable only to the standard typecasts (int, float, etc.)?
6. What are the effects if I disable the aliasing rules?

Edited
1. based on R's and Matt McNabb's corrections
2. added new questions

Contravene answered 23/5, 2014 at 5:0 Comment(4)

For insight into what optimizations the strict aliasing allows the compiler to do, see this question #99150 – Corpse 23/5, 2014 at 5:20

In light of the fact that it is undefined behaviour, "platform independent" is not going to be an option. You'd have to research the behaviour for each platform. – Eccentric 23/5, 2014 at 5:47

Ignoring the aliasing problem, you should still use memcpy. Code might not work correctly if these 2 types have different alignment requirements. That is not likely when both types have the same size, but it is a possibility, and it shouldn't be ignored when writing portable code. – Indebted 23/5, 2014 at 6:39

Using memcpy is standard compliant, therefore portable, and in this case it is as efficient as it can be... the compiler generates one instruction, a 32-bit store: check it out coliru.stacked-crooked.com/a/0c8fecda1194b87b, look for movl %esi, (%rdi) – Corpse 23/5, 2014 at 8:5

Language standards try to strike a balance between the sometimes competing interests of programmers that will use the language and compiler writers that want to use a broad set of optimizations to generate reasonably fast code.

Keeping variables in registers is one such optimization. For variables that are "live" in a section of a program the compiler tries to allocate them in registers. Storing at the address in a pointer could store anywhere in the program's address space - which would invalidate every single variable in a register. Sometimes the compiler could analyze a program and figure out where a pointer could or could not be pointing, but the C (and C++) language standards consider this an undue burden, and for "system" type of programs often an impossible task.

So the language standards relax the constraints by specifying that certain constructs lead to "undefined behavior" so the compiler writer can assume they don't happen and generate better code under that assumption. In the case of strict aliasing the compromise reached is that if you store to memory using one pointer type, then variables of a different type are assumed to be unchanged, and thus can be kept in registers, or stores and loads to these other types can be reordered with respect to the pointer store.
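As a minimal sketch of the assumption described above (the function name store_both and its parameters are made up for illustration and do not come from the question), consider a function that stores through an int pointer and a float pointer. Under strict aliasing the compiler may assume the two stores hit different objects, so it is allowed to return the constant 1 without reloading *ip from memory.

```c
/* Hypothetical helper, for illustration only. */
int store_both(int *ip, float *fp)
{
    *ip = 1;      /* store through an int lvalue */
    *fp = 2.0f;   /* store through a float lvalue; assumed not to modify *ip */
    return *ip;   /* may be compiled as "return 1" with no reload */
}
```

When the two pointers really do refer to distinct objects, as the rules require, the result is the same with or without the optimization. The difference only shows up if a caller passes pointers to the same memory, which is exactly the undefined case.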

There are many examples of this kind of optimization in the paper "Undefined Behavior: What Happened to My Code?"

http://pdos.csail.mit.edu/papers/ub:apsys12.pdf

The paper includes an example of a strict-aliasing violation in the Linux kernel. Apparently the kernel avoids the problem by telling the compiler not to use the strict-aliasing rule for optimizations: "The Linux kernel uses -fno-strict-aliasing to disable optimizations based on strict aliasing."

struct iw_event {
    uint16_t len; /* Real length of this stuff */
    ...
};
static inline char * iwe_stream_add_event(
    char * stream, /* Stream of events */
    char * ends, /* End of stream */
    struct iw_event *iwe, /* Payload */
    int event_len ) /* Size of payload */
{
    /* Check if it's possible */
    if (likely((stream + event_len) < ends)) {
        iwe->len = event_len;
        memcpy(stream, (char *) iwe, event_len);
        stream += event_len;
    }
    return stream;
}

Figure 7: A strict aliasing violation, in include/net/iw_handler.h of the Linux kernel, which uses GCC’s -fno-strict-aliasing to prevent possible reordering.

2.6 Type-Punned Pointer Dereference

C gives programmers the freedom to cast pointers of one type to another. Pointer casts are often abused to reinterpret a given object with a different type, a trick known as type-punning. By doing so, the programmer expects that two pointers of different types point to the same memory location (i.e., aliasing). However, the C standard has strict rules for aliasing. In particular, with only a few exceptions, two pointers of different types do not alias [19, 6.5]. Violating strict aliasing leads to undefined behavior. Figure 7 shows an example from the Linux kernel. The function first updates iwe->len, and then copies the content of iwe, which contains the updated iwe->len, to a buffer stream using memcpy. Note that the Linux kernel provides its own optimized memcpy implementation. In this case, when event_len is a constant 8 on 32-bit systems, the code expands as follows.

iwe->len = 8;
*(int *)stream = *(int *)((char *)iwe);
*((int *)stream + 1) = *((int *)((char *)iwe) + 1);

The expanded code first writes 8 to iwe->len, which is of type uint16_t, and then reads iwe, which points to the same memory location of iwe->len, using a different type int. According to the strict aliasing rule, GCC concludes that the read and the write do not happen at the same memory location, because they use different pointer types, and reorders the two operations. The generated code thus copies a stale iwe->len value. The Linux kernel uses -fno-strict-aliasing to disable optimizations based on strict aliasing.

Answers

1) What optimizations could the compiler perform in this aliasing case ?

The language standard is very specific about the semantics (behavior) of a strictly conforming program, and the burden is on the compiler writer or language implementor to get it right. Once the programmer crosses the line and invokes undefined behavior, the standard is clear that the burden of proof that this will work as intended falls on the programmer, not on the compiler writer. The compiler in this case has been nice enough to warn that undefined behavior has been invoked, although it is under no obligation to do even that. Sometimes, annoyingly, people will tell you that at this point "anything can happen", usually followed by some joke or exaggeration.

In the case of your program the compiler could generate code that is "typical for the platform": store the value of something to localval, then load from localval and store at DataPtr, like you intended. But understand that it is under no obligation to do so. It sees the store to localval as a store to something of uint32 type, it sees the dereference (*(const float32*)((const void*)(&localval))) as a load from a float32 type, and it concludes that these are not accesses to the same location. So localval can live in a register containing something while the load reads from the uninitialized location on the stack that is reserved for localval in case the compiler needs to "spill" that register back to its "automatic" storage (the stack). The compiler may or may not store localval to memory before dereferencing the pointer and loading from memory.

Depending on what follows in your code, it may also decide that localval isn't used and that the assignment of something has no side effect, so it may treat that assignment as "dead code" and not even perform the assignment to a register.

2) As both would occupy the same size (correct me if not), what could be the side effects of such a compiler optimization?

The effect could be that an undefined value is stored at the address pointed to by DataPtr.

3) Can I safely ignore the warning or turn off aliasing ?

That is specific to the compiler you are using - if the compiler documents a way to turn off the strict aliasing optimizations then yes, with whatever caveats the compiler makes.

4) If the compiler hasn't performed an optimization and my program is not broken after my first compilation, can I safely assume that the compiler will behave the same way every time (i.e. not perform the optimization)?

Maybe. Sometimes very small changes in another part of your program can change what the compiler does with this code. Consider, for example, what happens if the function is inlined: it could be thrown into the mix with some other part of your code. See this SO question.

5) Does the aliasing apply to a void * typecast too ? or is it applicable only for the standard typecasts (int,float etc...) ?

You cannot dereference a void * so the compiler just cares about the type of your final cast (and in C++ it would gripe if you convert a const to non-const and vice-versa).

6) What are the effects if I disable the aliasing rules?

See your compiler's documentation. In general you will get slower code. If you do this (as the Linux kernel chose to do in the example from the paper above), limit it to a small compilation unit containing only the functions where it is necessary.

Conclusion

I understand your questions come from curiosity and from trying to better understand how this works (or might not work). You mentioned that the code is required to be portable. By implication, the program is then required to be compliant and not invoke undefined behavior (remember, the burden is on you if it does). In this case, as you pointed out in the question, one solution is to use memcpy. As it turns out, that not only makes your code compliant and therefore portable, it also does what you intend in the most efficient way possible: on current gcc at optimization level -O3 the compiler converts the memcpy into a single instruction that stores the value of localval at the address pointed to by DataPtr. See it live in coliru here and look for the movl %esi, (%rdi) instruction.
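A minimal sketch of the memcpy approach follows. It assumes the question's uint32 and float32 typedefs correspond to uint32_t and float, and the function name store_bits_as_float is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide, like the question's float32. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float and uint32_t must have the same size");

/* Hypothetical helper: copies the object representation of localval
   into the float object that data_ptr points to. */
void store_bits_as_float(void *data_ptr, uint32_t localval)
{
    memcpy(data_ptr, &localval, sizeof localval);
}
```

For example, on a platform that uses IEEE 754 single precision for float, storing the bit pattern 0x3F800000 this way leaves 1.0f in the destination.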

Corpse answered 23/5, 2014 at 6:8 Comment(4)

What useful optimizations would be prevented by requiring that a compiler recognize aliasing in cases where a pointer to an object is cast from its "real" type to a new type between the last access as the old type and any access as the new type, and all accesses using the cast pointer or a pointer derived from it are performed before the object is next accessed via any other means? From an implementation perspective, all that would require would be to have a cast from T to U be regarded as a potential read and write of a T except in cases where the compiler could determine otherwise. – Tsan 14/4, 2016 at 20:5

If, given T *p, a compiler were required to regard *(U*)p = whatever; as potentially accessing things of type T or type U, the compiler could still keep things of other types in memory across the access. By contrast, given memcpy(p, &whatever, sizeof (T)), a compiler which can't tell how p was produced would have to flush/invalidate all register-cached values which have been exposed to the outside world, regardless of their types. I'd think limiting the effects to things of type T and U would be much more efficient. – Tsan 14/4, 2016 at 20:9

Out of curiosity, is there any evidence that the authors of C89 were really trying to strike the aforementioned balance? Upon further consideration, I think they figured that it was better to trust that people writing compilers for platforms where aliasing was widely used would try to support that precedent, than to mandate that compilers recognize aliasing in cases where it would be useless (e.g. if any bit pattern other than all zeroes would be a trap representation for either int or float, there would be no reason to mandate that a compiler recognize aliasing between those types). – Tsan 2/9, 2016 at 23:43

If they were trying to strike a balance, their rationale should have explicitly justified ignoring aliasing even in cases where a sane compiler would think it likely, rather than presenting an example where it would be unlikely and saying compilers need not provide for such dubious possibilities. – Tsan 2/9, 2016 at 23:45

You have an incomplete example (as written, it exhibits UB since localval is uninitialized) so let me complete it:

uint32 localval;
void * DataPtr;
DataPtr = something;
localval = 42;
(*(float32*)(DataPtr))= (*(const float32*)((const void*)(&localval)));

Now, since localval has type uint32 and *(const float32*)((const void*)(&localval)) has type float32, they cannot alias, so the compiler is free to reorder the last two statements with respect to each other. This would obviously result in behavior different from what you want.

The correct way to write this is:

memcpy(DataPtr, &localval, sizeof localval);

Straka answered 23/5, 2014 at 5:20 Comment(14)

Thanks for your comment, i understand that this is not allowed, and i understand that memcpy is a better solution. But as i understand although the types are different to the compiler and it says i cannot alias, I am interested to know what could be the possible optimisations in this specific usecase.. (where the types are same) – Contravene 23/5, 2014 at 5:32

Since the types are different, your program is not allowed to do what you're trying to do; doing so invokes undefined behavior, and therefore, the compiler may do anything it likes. In particular, the compiler is likely to assume that objects of different types never alias, so that it can perform the reordering I described. – Straka 23/5, 2014 at 6:2

@R: if i understood correctly... The right side operation could give me a value which i didnt expect and the DataPtr might be copied with this unexpected value ? – Contravene 23/5, 2014 at 7:9

In practice that's what's likely to happen, but formally anything can happen since the behavior is undefined. – Straka 23/5, 2014 at 11:34

The correct thing should be to drop the piano on compiler writers who, given T *p;, are unwilling to regard *(U*)p=whatever; as being a potential access to things of type T or U, but would instead insist that programmers use memcpy, which compilers must treat as a potential access to things of every type even when the programmer knows the pointer will never identify anything that isn't a T or U. I don't know why compiler writers decided they could make the world a better place if they force programmers to write code which is harder to read and less optimizable, but they did. – Tsan 14/4, 2016 at 20:23

@supercat: That only works in special cases that can't be classified in any reasonable way. Consider *(U*)idfunc(p) where the definition of idfunc might or might not be visible and idfunc returns void *. In order to get the behavior you want, you have to drop essentially all non-aliasing assumptions, and that means dropping essentially all opportunities for vectorization. – Straka 14/4, 2016 at 20:38

@R..: Except in the scenario where idfunc is being called indirectly, I don't see why it would be returning a U* rather than a void* unless it was supposed to work with things of multiple types in which case aliasing assumptions would seem somewhat dangerous unless the function uses memcpy internally which would already force the compiler to drop all non-aliasing assumptions. – Tsan 14/4, 2016 at 22:55

@R..: I will grant that there could be some situations in which the particular function which is called will only work with things of type U, though other functions identified by the same pointer type would work with things of other types, and for that scenario it would be helpful to have a means of telling the compiler that the void* will really only be used to access things of type U, but I can't believe that situation would arise anywhere near as often as other cases where code has to use memcpy to throw all aliasing assumptions out the window. – Tsan 14/4, 2016 at 22:57

@R..: Besides--if the compiler can't see into the function that's being called, it will generally have no reason to believe that function wouldn't be capable of accessing anything and everything that has ever been exposed to the outside world and isn't targeted by any live restrict pointers, so I'm not sure exactly what aliasing assumptions would be lost in any common scenario even there. – Tsan 14/4, 2016 at 22:58

@supercat: In my example, the store takes place after idfunc returns, and could be reordered with respect to other loads and stores that take place after the return. – Straka 15/4, 2016 at 4:39

@R..: If idfunc is opaque, I'm not sure I see the advantage of taking a load or store which would be sequenced after the use of the returned pointer and reordering it to precede that use, given that (because idfunc is opaque) the load or store would need to remain after the call. Given code like *(T*)func1() = func2();, the fact that the call to func2() could occur between the casting operation and the use of the pointer would imply that func2 itself cannot make use of the object identified by func1. – Tsan 15/4, 2016 at 5:0

@R..: I'm not sure what situations you're seeing where significant code reordering optimizations would be blocked, which would be anywhere near as common as situations where use of memcpy would have even worse effects. Further, there are a fair number of places where TBAA is not sufficient to allow code to be reordered optimally, and directives to take care of those would also take care of the tougher TBAA-related cases. – Tsan 15/4, 2016 at 5:4

@R..: Upon further consideration, I can see situations where things need to be kept as void* even though the code that reads and writes the pointers will use the same type of target. That could be handled, though, somewhat intuitively by saying that a pointer cast should be interpreted as saying "Something is about to be accessed as a weird type" but an implicit conversion through void* should not. What's most important, though, is that efficient memory management needs, at minimum, the ability to interpret a range of memory that has been used as one type, as holding... – Tsan 15/4, 2016 at 16:45

...objects of another type that initially have Indeterminate Values, without having to overwrite the memory first, efficient vectorization and other techniques require the ability to move writes across unrelated writes, and C99 presently lacks the semantics to meet either need [C89 could meet the former need if one assumes the authors meant that no named object may be accessed using a pointer of another type--an assumption consistent with the example in the rationale]. One wouldn't have to add much to the language for it to meet both needs, but presently it meets neither. – Tsan 15/4, 2016 at 16:50

The const makes no difference. To check if the types are the same size, you can compare sizeof (uint32) to sizeof (float32). It's also possible that the two types have differing alignment requirements.
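The size and alignment comparison mentioned above can be sketched with C11 facilities. This assumes the question's uint32 and float32 correspond to uint32_t and float, and the function name same_alignment is made up for illustration.

```c
#include <stdint.h>

/* Compile-time check: fails the build if the two types differ in size. */
_Static_assert(sizeof(uint32_t) == sizeof(float),
               "uint32_t and float differ in size");

/* Hypothetical helper: returns 1 if both types have the same
   alignment requirement on this platform, 0 otherwise. */
int same_alignment(void)
{
    return _Alignof(uint32_t) == _Alignof(float);
}
```

Passing both checks only rules out size and alignment mismatches; it does nothing about the aliasing violation itself.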

Those things aside, reading the memory of localval as if it had a float stored in it is undefined behaviour; that's what the strict aliasing rules say.

6.5#6:

The effective type of an object for an access to its stored value is the declared type of the object, if any.

6.5#7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types

localval has effective type uint32 , and the list of "the following types" doesn't include float32 so this is a violation of the aliasing rules.

If you were aliasing in dynamically allocated memory, then it is different. There's no "declared type", so the "effective type" is whatever was last stored in the object. You could malloc(sizeof (uint32)), and then store a float32 in it and read it back.
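The dynamically allocated case described above might look like the following sketch. It assumes uint32_t and float stand in for the question's uint32 and float32; the function name store_and_read_float and the -1.0f failure value are made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* The allocation is sized for uint32_t, so a float must fit in it. */
_Static_assert(sizeof(float) <= sizeof(uint32_t),
               "float must fit in the uint32_t-sized allocation");

/* Hypothetical helper: returns the float read back from malloc'ed
   memory, or -1.0f if the allocation fails. */
float store_and_read_float(float value)
{
    void *mem = malloc(sizeof(uint32_t));  /* no declared type yet */
    float result;

    if (mem == NULL)
        return -1.0f;

    *(float *)mem = value;    /* the store makes float the effective type */
    result = *(float *)mem;   /* read through an lvalue of the same type */
    free(mem);
    return result;
}
```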

To sum up, you seem to be asking "I know this is undefined, but can I rely on my compiler successfully doing it?" To answer that question you will have to specify what your compiler is, and what switches you are invoking it with, at least.

Of course there is also the option of adjusting your code so it does not violate the strict-aliasing rules, but you haven't provided enough background info to proceed down this track.
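One such adjustment is the union approach the question mentions. The sketch below assumes uint32_t and float stand in for uint32 and float32; the names pun32 and pun_via_union are made up for illustration. Reading a union member other than the one last stored reinterprets the bytes in C (C99 with TC3 and later), but this does not carry over to C++.

```c
#include <stdint.h>

/* Hypothetical union overlaying the two 32-bit types. */
union pun32 {
    uint32_t u;
    float    f;
};

/* Hypothetical helper: reinterprets the bits of a uint32_t as a float. */
float pun_via_union(uint32_t bits)
{
    union pun32 p;
    p.u = bits;   /* store as an integer */
    return p.f;   /* read the same bytes as a float */
}
```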

Eccentric answered 23/5, 2014 at 5:18 Comment(2)

Just a nitpick, but I think for malloced memory the rule is that the type is the type of the first store. – Graeco 23/5, 2014 at 5:20

@JensGustedt the text in N1256 for that case is "the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value". I take that to mean that the effective type is set for subsequent reads - but it does not prevent a subsequent write from setting a new effective type. (Of course my interpretation may or may not be what the standard-writers intended) – Eccentric 23/5, 2014 at 5:23
