Is strict aliasing violated when a void pointer intervenes?

Does the following example violate the strict aliasing rule?

In file a.c

#include <stddef.h>
#include <stdint.h>

extern void func_takes_word(uint32_t word);

void func(void *obj, size_t size_in_words)
{
    for (size_t i = 0; i < size_in_words; i++)
        func_takes_word(*(((uint32_t *)obj)+i)); // <--- Here
}

In file b.c

struct some_struct
{
    uint32_t num_0;
    uint32_t num_1;
    uint32_t num_2;
};

extern void func(void *obj, size_t size_in_words);

void some_func(void)
{
    struct some_struct stc = {0, 1, 59}; // assume no padding
    func((void *)&stc, sizeof(struct some_struct)/sizeof(uint32_t));
}

One could say there is a violation, because I pass a struct some_struct pointer to func, which casts it to a uint32_t pointer and then accesses the value.

But, since func takes a void pointer, and since func is in a different compilation unit than the caller, the compiler cannot "see" such a violation.


And what about the following example? As I understand it, there is no violation and it fully complies with strict aliasing:

extern void func_takes_word(uint32_t word);

void func(void *obj, size_t size_in_words)
{
    uint32_t word;

    for (size_t i = 0; i < size_in_words; i++)
    {
        // copy byte by byte instead of calling memcpy (or using union
        // type punning), for learning purposes
        *(char *)&word = *((char *)obj + (sizeof(uint32_t) * i));
        *(((char *)&word) + 1) = *((char *)obj + (sizeof(uint32_t) * i) + 1);
        *(((char *)&word) + 2) = *((char *)obj + (sizeof(uint32_t) * i) + 2);
        *(((char *)&word) + 3) = *((char *)obj + (sizeof(uint32_t) * i) + 3);
        func_takes_word(word);
    }
}

Am I correct?

Illiquid answered 4/9, 2019 at 16:31 Comment(0)

This is mostly a duplicate question, but I am going to write out an answer to it anyway, because I can't find an earlier answer that discusses the specific issues of casting through void * and separate compilation.

First off, let us imagine a simpler version of your code:

#include <stddef.h>
#include <stdint.h>

extern void func_takes_word(uint32_t word);

struct __attribute__((packed, aligned(_Alignof(uint32_t)))) some_struct
{
    uint32_t num_0;
    uint32_t num_1;
    uint32_t num_2;
};

void some_func(void)
{
    struct some_struct stc = {0, 1, 59};

    for (size_t i = 0; i < sizeof(struct some_struct) / sizeof(uint32_t); i++)
        func_takes_word(*(((uint32_t *)&stc) + i));
}

(The GCC __attribute__((packed, aligned(...))) annotation is present only to exclude the possibility of any problems due to padding or misalignment. Everything I say below would still be true if you took it out.)

According to the most straightforward interpretation of C2011 as written, this code does violate the "strict aliasing" rules (N1570: 6.2.7 and 6.5p6,7). The type struct some_struct is not compatible with the type uint32_t. Therefore, taking the address of an object with declared type struct some_struct, casting the resultant pointer to type uint32_t *, adding a nonzero offset, and dereferencing the cast pointer has undefined behavior. It really is that simple. (EDIT: If the pointer is not offset, the dereference has well-defined behavior, because of a special-case rule hiding in section 6.7.2.1p15, which I totally forgot about. Thanks to dbush for pointing this out.)
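To make the EDIT above concrete, here is a minimal sketch of the one access that this reading treats as well-defined, next to the portable way of reaching the other members. The helper names read_first_word and read_third_word are made up for illustration and are not part of the question's code.

```c
#include <stdint.h>

struct some_struct
{
    uint32_t num_0;
    uint32_t num_1;
    uint32_t num_2;
};

/* A pointer to a struct, suitably converted, points to its first member
   (N1570 6.7.2.1p15), so this reads stc->num_0 without any offset. */
uint32_t read_first_word(struct some_struct *stc)
{
    return *(uint32_t *)stc;
}

/* The other members are reached by name rather than by offsetting the
   converted pointer, which is the step the answer calls undefined. */
uint32_t read_third_word(struct some_struct *stc)
{
    return stc->num_2;
}
```

With stc initialized to {0, 1, 59} as in the question, read_first_word(&stc) yields num_0 and read_third_word(&stc) yields num_2; what stays off-limits under this reading is ((uint32_t *)&stc)[1] and beyond.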

Many people angrily resist this interpretation of the standard and insist that the committee must have meant something else, because there are millions, if not billions, of lines of "legacy" C code out there that do exactly the above and expect it to work. Not to mention that it's unclear how you could do anything useful whatsoever with offsetof under this interpretation. But the text really does say this, there's no other plausible interpretation, and the wording of the relevant sections of the standard has been mostly unchanged since the original 1989 ANSI C. I think we have to assume that the committee's lack of interest in changing the text, for thirty years now, despite several formal requests for clarification or correction, means it says what they wanted it to say.


Now, regarding casting through void * and/or splitting up the operations so that the original "effective type" of the object is not visible to the code that performs the dereference: These make no difference. Your original pair of translation units still has undefined behavior.

Casts through void * make no difference because the rules in section 6.5p6 say nothing about intermediate casts. They only talk about the "effective type" of the actual object in memory, and the type of the lvalue expression used to access the object. So, it doesn't matter what types the pointer may have had in between the time when the object's address was taken and the time when the pointer was dereferenced (as long as none of the casts destroy information, which is guaranteed not to happen for casts from object types to void * and back).

Splitting the operations up, so that the original "effective type" of the object is not visible (statically) to the code that performs the dereference, makes no difference because the C standard places no limits whatsoever on the sophistication of the analysis the compiler is allowed to perform before deciding whether an access is allowed. In particular, an implementation that tags every byte of memory with its "effective type", and performs runtime checks on every dereference, has been explicitly endorsed by the committee (not in the text of the standard, but in DR responses; I don't remember how long ago this was, and WG14's website is not very searchable). An implementation is also allowed to do arbitrarily aggressive inlining and interprocedural analysis, during translation phase 8 ("link-time optimization") as well as phase 7. Collapsing your original program into my "simpler version" is well within the capabilities of current-generation whole-program optimizing compilers.


As pointed out in the comments on the question, you may be able to rely on knowledge of how sophisticated a specific implementation's optimizer is, or on an implementation's overt extensions (e.g. __attribute__((noinline))) to control whether or not you get machine code that behaves as intended despite the undefined behavior. The C standard even explicitly licenses you to do this, by defining a distinction between a "conforming program" and a "strictly conforming program" (N1570: section 4). A program that relies on one particular implementation's treatment of undefined behavior can still be conforming but is not strictly conforming, and its authors have to be aware that it might break when ported to a different implementation (including, perhaps, a newer version of the same compiler).
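For comparison, here is a minimal sketch of a version that does not depend on any of this, using the memcpy approach the question mentions setting aside. The helper name load_word is hypothetical, and the sketch assumes the object has no padding and that index is in range.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the index-th 32-bit word out of an arbitrary
   object. The object is only ever read as bytes, and the value returned
   comes from a genuine uint32_t object, so no lvalue of the wrong type is
   used to access the original object. */
uint32_t load_word(const void *obj, size_t index)
{
    const unsigned char *bytes = obj;
    uint32_t word;

    memcpy(&word, bytes + index * sizeof word, sizeof word);
    return word;
}
```

The loop body in func would then read func_takes_word(load_word(obj, i)); for a fixed small size like this, mainstream optimizing compilers typically turn the memcpy into a plain load, though that is a quality-of-implementation matter rather than a guarantee.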

Stoical answered 6/9, 2019 at 15:17 Comment(7)
A pointer to a struct can be safely converted to a pointer to its first member as per 6.7.2.1p15, so *(uint32_t *)obj or ((uint32_t *)obj)[0] is valid but ((uint32_t *)obj)[1] is not. - Deteriorate
@Deteriorate Thanks, I totally forgot about 6.7.2.1p15. Answer edited. - Stoical
@dbush: Where does the Standard say that an aggregate can have its storage accessed using an lvalue of member type? N1570 6.5p7 provides that an aggregate's member can be accessed using an lvalue of containing aggregate type, but requiring that compilers support the reverse in all corner cases would needlessly impair optimizations. Instead, the Standard relies upon compilers to support lvalue accesses in cases relevant to their intended purposes, regardless of whether the Standard requires them to or not. - Paniagua
Can you share a reference for "An implementation is also allowed to do arbitrarily aggressive inlining and interprocedural analysis, during translation phase 8 ("link-time optimization")"? I didn't find any occurrence in the standard. - Illiquid
@user2162550 That's hard; it's allowed because nothing says it isn't allowed. The best you're going to get within the text of the standard is section 5.1.2.3, I think, but you have to read very carefully, paying attention to both what is said and what isn't said, and think about the implications of what is and isn't considered to be "observable behavior." - Stoical
Doesn't this imply that you cannot implement 'the' memcpy function in C without running into UB? - Jardiniere
@Jardiniere There is a special case for char * that makes the most straightforward implementation of memcpy well-defined. However, if you want to write a fancier version that accesses memory in bigger chunks, there's no way to do that in strictly conforming C. - Stoical

The rules given in N1570 6.5p7 partition accesses into two categories:

  1. Those which all conforming implementations would be required to process, in all cases, in a manner consistent with the definition and descriptions of "object" and "stored".

  2. Those which conforming implementations may or may not process in such fashion, at their leisure, presumably taking into account the needs of their customers.

The authors only expected this partition to be relevant in situations where an implementation's customers might not expect to need a certain construct, but where they might want to use code which does need it. There are many constructs which pretty much everyone would agree compilers should support, but which actually fall into the second category above. Contrary to what the authors of clang and gcc seem to believe, the failure of the Standard to mandate support for a construct cannot reasonably be viewed as passing any judgment as to whether most (if not all) compilers should support it anyway.

The way the Standard is written, even something like:

struct S { int x[1]; } s;
void test1(void)
{
  s.x[0] = 1;
}

which is downright tame compared to your example, falls in the second category above. Further, relatively little code would rely upon a compiler, given something close to your example such as:

struct S { int x[1], y[1], z[1]; } s;
int test2(int index)
{
  s.y[0] = 1;
  s.x[index] = 2;
  return s.y[0];
}

to allow for the possibility that the access to s.x[index] might affect s.y[0]. Code which needs to access memory in "interesting" ways generally does so with constructs that could be easily recognized by any compiler that cared to look for them, e.g. given something closer to your example:

struct S { int x[1], y[1], z[1]; } s;
int test3(int index)
{
  s.y[0] = 1;
  ((int*)&s)[index] = 2;
  return s.y[0];
}

it would seem unlikely that programmers would cast a struct S* to an int* absent an intention to access it in "unusual" fashion, and thus a compiler that treats such a cast as an indication that it should allow for such things would be unlikely to block useful optimizations.

The Standard makes no distinction between test2 and test3, and I'm not aware of anything in the clang or gcc documentation that would do so either. Although current versions of gcc seem to make such a distinction, and although current clang misses the optimization for both functions, I would not rely upon either compiler to support test3 or other functions requiring similar semantics meaningfully unless or until they explicitly document such support.
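For readers who need test3's intended effect today without depending on either compiler's undocumented behavior, one possible sketch routes the store through memcpy at a byte offset. The name test3_memcpy is hypothetical, and the sketch assumes struct S has no padding and that index is 0, 1, or 2.

```c
#include <stddef.h>
#include <string.h>

struct S { int x[1], y[1], z[1]; } s;

/* Hypothetical rewrite of test3: the store is a byte-wise copy into s, so
   the compiler has to allow for it changing any member, including y[0].
   Assumes S has no padding and 0 <= index < 3. */
int test3_memcpy(int index)
{
    int value = 2;

    s.y[0] = 1;
    memcpy((unsigned char *)&s + (size_t)index * sizeof(int),
           &value, sizeof value);
    return s.y[0];
}
```

Under these assumptions, test3_memcpy(1) overwrites s.y[0] and returns 2, while any other valid index leaves s.y[0] at 1.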

Paniagua answered 7/9, 2019 at 20:31 Comment(0)
