Super weird segfault with gcc 4.7 -- Bug?

Here is a piece of code that I've been trying to compile:

#include <cstdio>

#define N 3

struct Data {
    int A[N][N];
    int B[N];
};

int foo(int uloc, const int A[N][N], const int B[N])
{
    for(unsigned int j = 0; j < N; j++) {
        for( int i = 0; i < N; i++) {
            for( int r = 0; r < N ; r++) {
                for( int q = 0; q < N ; q++) {
                   uloc += B[i]*A[r][j] + B[j];
                }
            }
        }
    }
    return uloc;
}

int apply(const Data *d)
{
    return foo(4,d->A,d->B);
}

int main(int, char **)
{
    Data d;
    for(int i = 0; i < N; ++i) {
        for(int j = 0; j < N; ++j) {
            d.A[i][j] = 0.0;
        }
        d.B[i] = 0.0;
    }

    int res = 11 + apply(&d);

    printf("%d\n",res);
    return 0;
}

Yes, it looks quite strange, and does not do anything useful at all at the moment, but it is the most concise version of a much larger program which I had the problem with initially.

It compiles and runs just fine with GCC(G++) 4.4 and 4.6, but if I use GCC 4.7, and enable third level optimizations:

g++-4.7 -g -O3 prog.cpp -o prog

I get a segmentation fault when running it. Gdb does not really give much information on what went wrong:

(gdb) run
Starting program: /home/kalle/work/code/advect_diff/c++/strunt 

Program received signal SIGSEGV, Segmentation fault.
apply (d=d@entry=0x7fffffffe1a0) at src/strunt.cpp:25
25      int apply(const Data *d)
(gdb) bt
#0  apply (d=d@entry=0x7fffffffe1a0) at src/strunt.cpp:25
#1  0x00000000004004cc in main () at src/strunt.cpp:34

I've tried tweaking the code in different ways to see if the error goes away. It seems necessary to have all of the four loop levels in foo, and I have not been able to reproduce it by having a single level of function calls. Oh yeah, the outermost loop must use an unsigned loop index.
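One way to make the observation about the loop index concrete is to keep two copies of the loop nest that differ only in the type of the outermost index, and compare their results. This is a minimal sketch, not part of the original program: the names kN, foo_unsigned, foo_signed and variants_agree are made up for illustration, and the input values are arbitrary. On a correct compiler both variants must return the same number; whether the signed variant avoids the crash on an affected build is only what the description above suggests, not something this sketch proves.

```cpp
#include <cassert>

static const int kN = 3;  // same size as N in the question (hypothetical name)

// Same loop nest as foo in the question: the outermost index is unsigned.
int foo_unsigned(int uloc, const int A[kN][kN], const int B[kN])
{
    for (unsigned int j = 0; j < kN; j++)
        for (int i = 0; i < kN; i++)
            for (int r = 0; r < kN; r++)
                for (int q = 0; q < kN; q++)
                    uloc += B[i] * A[r][j] + B[j];
    return uloc;
}

// Hypothetical variant: identical, except the outermost index is signed.
int foo_signed(int uloc, const int A[kN][kN], const int B[kN])
{
    for (int j = 0; j < kN; j++)
        for (int i = 0; i < kN; i++)
            for (int r = 0; r < kN; r++)
                for (int q = 0; q < kN; q++)
                    uloc += B[i] * A[r][j] + B[j];
    return uloc;
}

// Returns true when both variants agree on a small hand-filled input.
bool variants_agree()
{
    const int A[kN][kN] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
    const int B[kN] = {1, 2, 3};
    return foo_unsigned(4, A, B) == foo_signed(4, A, B);
}
```

Calling variants_agree() from main at different optimization levels gives a quick check of whether the two forms still behave identically.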

I'm starting to suspect that this is a bug in the compiler or runtime, since it is specific to version 4.7 and I cannot see what memory accesses are invalid.

Any insight into what is going on would be very much appreciated.

It is possible to get the same situation with the C-version of GCC, with a slight modification of the code.

My system is:

Debian wheezy Linux 3.2.0-4-amd64 GCC 4.7.2-5


Okay so I looked at the disassembly offered by gdb, but I'm afraid it doesn't say much to me:

Dump of assembler code for function apply(Data const*):
   0x0000000000400760 <+0>: push   %r13
   0x0000000000400762 <+2>: movabs $0x400000000,%r8
   0x000000000040076c <+12>:    push   %r12
   0x000000000040076e <+14>:    push   %rbp
   0x000000000040076f <+15>:    push   %rbx
   0x0000000000400770 <+16>:    mov    0x24(%rdi),%ecx
=> 0x0000000000400773 <+19>:    mov    (%rdi,%r8,1),%ebp
   0x0000000000400777 <+23>:    mov    0x18(%rdi),%r10d
   0x000000000040077b <+27>:    mov    $0x4,%r8b
   0x000000000040077e <+30>:    mov    0x28(%rdi),%edx
   0x0000000000400781 <+33>:    mov    0x2c(%rdi),%eax
   0x0000000000400784 <+36>:    mov    %ecx,%ebx
   0x0000000000400786 <+38>:    mov    (%rdi,%r8,1),%r11d
   0x000000000040078a <+42>:    mov    0x1c(%rdi),%r9d
   0x000000000040078e <+46>:    imul   %ebp,%ebx
   0x0000000000400791 <+49>:    mov    $0x8,%r8b
   0x0000000000400794 <+52>:    mov    0x20(%rdi),%esi

What should I see when I look at this?


Edit 2015-08-13: This seems to be fixed in g++ 4.8 and later.

Lura answered 30/1, 2014 at 15:25 Comment(5)
Yes, it's a bug. Look at the generated assembly around where it segfaults. - Eritrea
I knew you could omit argument names in a function prototype, but I've never seen anyone omit the names in an actual function definition before. - St
Looks like it fails with 4.6 too: coliru.stacked-crooked.com/a/1d50e7c5360796e6 - Ence
To me this seems very clear-cut: arrays at every scope other than namespace scope are uninitialized. Accessing them is undefined behavior. Your code behaves accordingly. Notice that on gcc-4.8/clang-4.3 the output of your program is not a segfault but plain random. - Lederman
@Lederman Initialization added now. The program behaves identically. - Lura

It is indeed, unfortunately, a bug in gcc. I haven't the slightest idea what it is doing there, but here is the generated assembly for the apply function (I compiled it without main, by the way, and foo is inlined into it):

_Z5applyPK4Data:
        pushq   %r13
        movabsq $17179869184, %r8
        pushq   %r12
        pushq   %rbp
        pushq   %rbx
        movl    36(%rdi), %ecx
        movl    (%rdi,%r8), %ebp
        movl    24(%rdi), %r10d

and it crashes exactly at movl (%rdi,%r8), %ebp, since it adds a nonsensical 0x400000000 to %rdi (the first parameter, thus the pointer to Data) and dereferences the result.
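The arithmetic behind that bogus address can be checked with a few lines of code. The sketch below only reproduces the numbers visible in the listings: the constant 17179869184 from the movabsq, and the pointer value 0x7fffffffe1a0 that gdb printed for d in the question. The names kR8, constant_is_4_shifted_32 and faulting_address are made up for illustration.

```cpp
#include <cassert>
#include <stdint.h>

// Constant that movabsq loads into %r8 in the listing above.
static const uint64_t kR8 = 17179869184ULL;

// True when the constant equals 0x400000000, i.e. the value 4 shifted
// left by 32 bits.
bool constant_is_4_shifted_32()
{
    return kR8 == 0x400000000ULL && kR8 == (4ULL << 32);
}

// Address that movl (%rdi,%r8), %ebp reads for a given pointer in %rdi.
uint64_t faulting_address(uint64_t rdi)
{
    return rdi + kR8;
}
```

With the pointer from the gdb session, faulting_address(0x7fffffffe1a0) is 0x8003ffffe1a0, which lies 16 GiB past the Data object and far outside anything the program has mapped. Since Data holds ints, the 4 looks like a small element offset that ended up in the upper half of the register, though that reading is a guess.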

Eritrea answered 30/1, 2014 at 16:2 Comment(0)

You never initialized d. Its value is indeterminate, and trying to do math with its contents is undefined behavior. (Even trying to read its values without doing anything with them is undefined behavior.) Initialize d and see what happens.
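As a minimal sketch of one way to give d determinate contents, an empty aggregate initializer zeroes every member without hand-written loops. The struct mirrors Data from the question; kN, make_zeroed and sum_all are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>

static const int kN = 3;  // same size as N in the question (hypothetical name)

struct Data {
    int A[kN][kN];
    int B[kN];
};

// Returns a Data whose members are all zero.
Data make_zeroed()
{
    Data d = {};  // empty aggregate initializer: every int member becomes 0
    return d;
}

// Sums every element; only meaningful on an object with determinate values.
int sum_all(const Data &d)
{
    int total = 0;
    for (int i = 0; i < kN; ++i) {
        for (int j = 0; j < kN; ++j) {
            total += d.A[i][j];
        }
        total += d.B[i];
    }
    return total;
}
```

Replacing the plain declaration in main with Data d = make_zeroed(); (or simply Data d = {};) removes the indeterminate values from the picture, so any remaining crash cannot be blamed on them.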


Now that you've initialized d and it still fails, that looks like a real compiler bug. Try updating to 4.7.3 or 4.8.2; if the problem persists, submit a bug report. (The list of known bugs currently appears to be empty, or at least the link is going somewhere that only lists non-bugs.)

St answered 30/1, 2014 at 15:31 Comment(13)
That wouldn't cause a segfault; in other words, it segfaults with initialization too. - Eritrea
@PlasmaHH: The compiler is free to observe that d is never initialized, assume that it will never be used, and not bother to actually allocate it somewhere the program is allowed to read. If the program still segfaults when d is initialized, then it's a compiler bug. - St
@user2357112: As I said, it segfaults with initialization too. Also, what the compiler is free to do and what a sane compiler usually does are two different things. The OP is specifically asking about implementation behaviour, not for a language lawyer's POV. Looking at the generated assembly immediately reveals that it's a bug, and gets the OP somewhere. - Eritrea
It segfaults whether initialized or not. I removed the initialization for brevity. - Lura
@kalj: Show us the version where it's initialized. Perhaps you initialized it wrong. - St
Oh, and make sure that the code you show us actually demonstrates the error when run. If you remove anything for brevity, make sure the problem doesn't disappear with those parts removed. - St
You might want to clarify that it is not d that is uninitialized but its member arrays, and that even trying to read those variables is undefined behavior. - Lederman
To all of you: d is default-initialised and is a perfectly valid object. Accessing d.A and d.B is also perfectly safe. But the values inside those arrays are rubbish, because that is what was left in memory at that place. There is no undefined behavior at all! - Bisk
@Davidbrcz: C++03, section 8.5 part 9 says otherwise. Non-static PODs with no initializer get an indeterminate initial value. If C++11 contradicts this, a standard reference would be appreciated. - St
@user2357112: The value is indeterminate, but where is it written that doing calculations with indeterminate int values is UB? - Eritrea
@user2357112 Yeah, indeterminate initial value. That is what I meant by rubbish. There is still no undefined behaviour. You don't know what you are going to read, but it is safe to read it. - Bisk
I can't edit my last comment, but I might be wrong after all =/ - Bisk
For everyone who still thinks the initialization is a problem, I've added it now. And it behaves exactly the same as before. - Lura