Unexpected optimization of strlen when aliasing 2-d array
Asked Answered
F

5

36

Here is my code:

#include <string.h>
#include <stdio.h>

typedef char BUF[8];

typedef struct
{
    BUF b[23];
} S;

S s;

int main()
{
    int n;

    memcpy(&s, "1234567812345678", 17);

    n = strlen((char *)&s.b) / sizeof(BUF);
    printf("%d\n", n);

    n = strlen((char *)&s) / sizeof(BUF);
    printf("%d\n", n);
}

Using gcc 8.3.0 or 8.2.1 with any optimization level except -O0, this outputs 0 2 when I was expecting 2 2. The compiler decided that the strlen is bounded to b[0] and therefore can never equal or exceed the value being divided by.

Is this a bug in my code or a bug in the compiler?

This isn't spelled out in the standard clearly, but I thought the mainstream interpretation of pointer provenance was that for any object X, the code (char *)&X should generate a pointer that can iterate over the whole of X -- this concept should hold even if X happens to have sub-arrays as internal structure.

(Bonus question, is there a gcc flag to turn off this specific optimization?)

Frisian answered 8/11, 2019 at 2:35 Comment(30)
I've heard there were evil professors out in the wild that would pose questions of this type, but I always considered the likelihood more akin to the chance of meeting Bigfoot -- until now... So much packed into so few lines.. It's a good one.Weissberg
@DavidRankin-ReinstateMonica Likewise... the theoretical has become the realityFrisian
@DavidRankin-ReinstateMonica I think that you have your 8 and 23 switched. Its an array of 23 arrays of 8.Spireme
@AviBerger - very correct your are. I new dyslexia would appear at some point. (along with senility...)Weissberg
edit - Where I"m stuck, is the address of the struct will be the address of its first member. You are guaranteed no initial padding. So the address of the struct will be the address of the char [23][8] array. Which on access will be char (*)[8], with the same address of s or s.b. But taking the address again &s.b will have the type pointer to char (*)[8] while &s would be a pointer to struct S. Now s.b on access should/would be subject to 6.3.2.1(p3) resulting in the bounds of char (*)[8] being limited to b[0]. But that's as far as I get :)Weissberg
I also think the key here is to remove the (char *) casts and think about whether the use of strlen() is on a compatible type. That's why I don't think there would be a compiler flag to deal with the specific optimization. But my certainty there is why this has been posted as a comment and not an answer.Weissberg
This isn't spelled out in the standard clearly When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object. If this is not clear then I don't even know...Erinnerinna
@Frisian When the division / sizeof(BUF) is skipped, what is your output? 16 16?Campney
Ref: My gcc 7.4.0 reports 2 2 under various options.Campney
@LanguageLawyer so you're saying it's clearly a compiler bug? (&s.b being a pointer to the 23x8 array subobject)Frisian
Closely related: https://mcmap.net/q/341648/-dereferencing-an-out-of-bound-pointer-that-contains-the-address-of-an-object-array-of-array/963864Cyrene
very interesting. here godbolt.org/z/yRwErF you can see it gives right result, but in here godbolt.org/z/ZCjbBt it shows result of zero. we can take it to extreme in here godbolt.org/z/dMcrdy and it still result in zero. Any ideas?Fraya
and even this godbolt.org/z/vfsSn7 is very weird. still compiler gives result of zero.Fraya
Obviously, s and s.b are not at the same address. How come that's an optimization rather than an alignment question?Straightout
@Straightout the standard guarantees they are at the same address (struct cannot have initial padding)Frisian
@Straightout moreover, take this for example godbolt.org/z/9YQgru where the example uses only &s.b, still result is zeroFraya
@DavidRankin-ReinstateMonica "resulting in the bounds of char (*)[8] being limited to b[0]. But that's as far as I get" I think that nails it. since s.b is limited to b[0] it is limited to 8 characters, and hence two options: (1) out-of-bound access in case there are 8 non-null characters, which is UB, (2) there is a null character, in which the len is less than 8, hence dividing by 8 gives zero. So putting together (1)+(2) compiler can use the UB to give same result to both casesFraya
@user2162550 I don't see where "s.b is limited to b[0]" is specified by the standard (the compiler clearly treats it as such but the passage quoted by Language Lawyer seems to explicitly say that s.b is not limited to any subobject)Frisian
I've seen a few other gcc8 strlen bugs (externals.io/message/103041), but their solution of -fno-optimize-strlen didn't work here.Aiaia
You could indicate gcc to generate the assembly output to see what it is really doing.Preexist
@Preexist it replaces n = strlen((char *)&s.b) / sizeof(BUF); with setting n to 0Frisian
is there a gcc flag to turn off this specific optimization? -fno-builtin-strlen seems to fix it.Subsume
I don't see a reason to get 0 2 nor I get. &s is the same as &s.b and they both evaluate to the same value.Spartan
Not sure if this answers the question, but the string isn't 17 chars long, it's 16. Also, your buffers aren't initialised, so your char arrays will end with garbage, which will confuse strlen.Epigone
@AdamJRichardson global variables are zero-initialized, and string literals end in a null terminatorFrisian
Oh, that's neat. Didn't know that. Thanks. The string length point is still valid though.Epigone
@AdamJRichardson the 17 is intentional, to copy the 16 digits plus the null terminatorFrisian
Ok. Well, that's me out of ideas then. :)Epigone
@KamilCuk, found another flags wich affect this behaviour. Check my answer please.Keli
@Aiaia that -fno-optimize-strlen didn't work. But other flags did. See my aswer.Keli
K
1

I checked this and it reproduced with -O1 on gcc 8.3, so I just opened list of gcc optimization flags here and started experimenting with them one by one. It turned out that disabling only sparse conditional constant propagation with -fno-tree-ccp made the problem disappear (oh luck, I planned to test couples of flags if testing one by one gives no result).

Then I switched to -O2 but did not erase -fno-tree-ccp flag. It reproduced again. I said "OK" and just started testing additional -O2 flags. It again appeared that disabling single Value Range Propagation additionaly leads to intended 2 2 output. I then erased that first -fno-tree-ccp flag, but it started reproducing again. So for -O2 you can specify -O2 -fno-tree-ccp -fno-tree-vrp to make yor program work as expected.

I did not erase these flags, but switched to -O3 then. Problem did not reproduced.

So both of these two optimization techniques in gcc 8.3 lead to such a strange behaviour (maybe they use something common internally):

  • Sparse conditional constant propagation on trees
  • Value Range Propagation on trees

I'm not pro in all that stuff to explain what and why is happening there, maybe someone else could explain. But for sure you can specify -fno-tree-ccp -fno-tree-vrp flags to disable these optimizaton techniques for your code to work as expected.

“The harder I work, the luckier I get.” – Samuel Goldwyn

Edit

As @KamilCuk noted in question comments, -fno-builtin-strlen leads to inteded behaviour too, so most probably there is a compiler bug in combination of built-in strlen and another optimization, that is intended to cut off dead code, statically determine possible expression values and propagate constants through a program. I thought compiler most probably mistakenly considered something, that determines string length in its strlen implementation (maybe in combination with integer division and/or two-dimensional arrays) as dead code and cut it off or calculated it as 0 at compile time. So I decided to play a little bit with the code to check the theories and eliminate other possible "participants" of the bug. I came to this minimal example of the behaviour, which confirmed my thoughts:

int main()
{
    // note that "7" - inner arrays size, you can put any other number here
    char b[23][7]; // local variable, no structs, no typedefs
    memcpy(&b[0][0], "12345678123456781234", 21);

    printf("%d\n", strlen(&b[0][0]) / 8); // greater than that "7" !!!
    printf("%d\n", strlen(&b[0][0]) / 7);
    printf("%d\n", strlen(&b[0][0]) / 6); // less than that "7" !!!
    printf("%d\n", strlen(&b[0][0])); // without division
}

0

0

3

20

I think we can consider this a bug in gcc.

I think -fno-builtin-strlen is better solution for the problem, as it works for all optimization levels alone and built-in strlen seems to be less powerful optimization technique, especially if your program doesn't use strlen() a lot. Still -fno-tree-ccp -fno-tree-vrp is also an option.

Keli answered 16/6, 2020 at 11:10 Comment(2)
Thanks, useful info. At this stage my feeling is that it is an incorrect optimization (i.e. bug) in gcc 8) . But good to have that workaround of using those particular flags .Frisian
@Frisian edited the answer after reading some comments.Keli
B
0

The way you defined the struct confused me at first because I don't think I've ever tried creating a type of an array. Doing it that way can be dangerous too because the if someone tried to pass it in to a function, they might think they're passing by value but would actually pass by reference. Regardless of style, if I needed to create a type like that, I would do something like:

//typedef char BUF[8];

//do it this way instead
typedef struct
{
    char x[8];
} BUF;

typedef struct
{
    BUF b[23];
} S;

If I define it that way, then it returns the expected value either way. See it here.

Braunschweig answered 13/6, 2020 at 16:52 Comment(4)
Your change makes the code fundamentally different. If you don't like the typedef then rewrite the code using char[8] instead of BUF everywhereFrisian
Yeah, I agree; it is different. Just playing around with different gcc versions, it looks like gcc improperly optimizes it in v8.x but didn't care in v7.x or v9.x. The way I implemented it works through all three and still preserves (from what I understand) the intent of the memory mapping (23x 8 byte blocks). It's an interesting find, for sure, and I learned something new about typedef'ing arrays.Braunschweig
@Braunschweig I reproduced it without typedef. See my answerKeli
@Braunschweig and also without structs.Keli
H
-2

There are some issues that I can see and they can be affected by how the compiler decides to layout memory.

    n = strlen((char *)&s.b) / sizeof(BUF);
    printf("%d\n", n);

In the above code s.b is a 23 entry array of an array of 8 characters. When you refer to just s.b you are getting the address of the first entry in the 23 byte array (and the first byte in the 8 character array). When the code says &s.b, this is asking for the address of the address of the array. Under the covers, the compiler is more than likely generating some local storage, storing the address of the array in there and supplying the address of the local storage to strlen.

You have 2 possible solutions. They are:

    n = strlen((char *)s.b) / sizeof(BUF);
    printf("%d\n", n);

or

    n = strlen((char *)&s.b[0]) / sizeof(BUF);
    printf("%d\n", n);

I also tried to run your program and demonstrate the issue, but both clang and the version of gcc I have with any -O options still worked as you expected. For what it's worth, I'm running clang version 9.0.0-2 and gcc version 9.2.1 on x86_64-pc-linux-gnu).

Halftimbered answered 1/2, 2020 at 21:54 Comment(0)
L
-3

I think this is may be a bug in gcc. I found a couple solutions, but the easiest seems to be to create a proxy function with the noinline attribute. Then you're not losing out on any other optimizations, just the ones related to strlen.

int  __attribute__ ((noinline)) _strlen(char *x) { return strlen(x); }
#define strlen _strlen

int main(){
    int n;

    memcpy(&s, "1234567812345678", 17);
    n = strlen((char *)&s.b) / sizeof(BUF);
    printf("%d\n", n);

    n = strlen((char *)&s) / sizeof(BUF);
    printf("%d\n", n);
}

You can see the output in the Compiler Explorer. https://godbolt.org/z/U2L9us

Logger answered 6/5, 2020 at 23:5 Comment(1)
You can do that in much cleaner way specifying -fno-builtin-strlen, as @Subsume commented.Keli
P
-5

There are errors in the code.

 memcpy(&s, "1234567812345678", 17);

for example, is risky, even though s starts with b should be:

 memcpy(&s.b, "1234567812345678", 17);

The second strlen() has also errors

n = strlen((char *)&s) / sizeof(BUF);

for example, should be:

n = strlen((char *)&s.b) / sizeof(BUF);

The string s.b, if copied correctly, should be 17 letters long. Not sure how structs are stored in memory, if they are aligned. Have you checked that s.b actually contains the 17 characters copied?

So a strlen(s.b) should show 17

The printf only shows integer numbers, as %d is integer, and the variable n is declared to be an integer. sizeof(BUF), should be 8

So a 17 divided by 8 (17/8) should print 2 as n is declared as integer. As memcpy was used to copy data to s and not to s.b, I would guess that as this has to do with memory alignments; assuming it is a 64 bit computer, than there can be 8 characters on one memory address.

For instance, lets assume that someone has called a malloc(1), than the next "free space" are not aligned...

The second strlen call, shows the correct number, as the string copy was done to the s struct instead of to s.b

Prefiguration answered 29/1, 2020 at 22:53 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.