Different Static Global Variables Share the Same Memory Address

Summary

I have several C source files that all declare individual identically named static global variables. My understanding is that the static global variable in each file should be visible only within that file and should not have external linkage applied, but in fact I can see when debugging that the identically named variables share the same memory address.

It is like the static keyword is being ignored and the global variables are being treated as extern instead. Why is this?

Example Code

foo.c:

/* Private variables -----------------------------------*/
static myEnumType myVar = VALUE_A;

/* Exported functions ----------------------------------*/
void someFooFunc(void) {
    myVar = VALUE_B;
}

bar.c:

/* Private variables -----------------------------------*/
static myEnumType myVar = VALUE_A;

/* Exported functions ----------------------------------*/
void someBarFunc(void) {
    myVar = VALUE_C;
}

baz.c:

/* Private variables -----------------------------------*/
static myEnumType myVar = VALUE_A;

/* Exported functions ----------------------------------*/
void someBazFunc(void) {
    myVar = VALUE_D;
}

Debugging Observations

  1. Set breakpoints on the myVar = ... line inside each function.
  2. Call someFooFunc, someBarFunc, and someBazFunc in that order from main.
  3. Inside someFooFunc, myVar is initially set to VALUE_A; after stepping over the line it is set to VALUE_B.
  4. Inside someBarFunc, myVar is for some reason initially set to VALUE_B before stepping over the line, not VALUE_A as I'd expect, indicating the linker may have merged the separate global variables because they have an identical name.
  5. The same goes for someBazFunc when it is called.
  6. If I use the debugger to evaluate the value of &myVar when at each breakpoint the same address is given.

Tools & Flags

Toolchain: GNU ARM GCC (6.2 2016q4)

Compiler options:

arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mlong-calls -O1 -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -ffreestanding -fno-move-loop-invariants -Wall -Wextra  -g3 -DDEBUG -DTRACE -DOS_USE_TRACE_ITM -DSTM32L476xx -I"../include" -I"../system/include" -I"../system/include/cmsis" -I"../system/include/stm32l4xx" -I"../system/include/cmsis/device" -I"../foo/inc" -std=gnu11 -MMD -MP -MF"foo/src/foo.d" -MT"foo/src/foo.o" -c -o "foo/src/foo.o" "../foo/src/foo.c"

Linker options:

arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mlong-calls -O1 -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -ffreestanding -fno-move-loop-invariants -Wall -Wextra  -g3 -T mem.ld -T libs.ld -T sections.ld -nostartfiles -Xlinker --gc-sections -L"../ldscripts" -Wl,-Map,"myProj.map" --specs=nano.specs -o ...
Groundwork answered 28/6, 2017 at 12:26 Comment(8)
This might be some name mangling issue in the debugger, causing it to trick you. Instead of trusting the debugger, try to print the variables' addresses and values from inside their respective translation units. – Jemie
Well, it is possible that the identical naming of your variables in different modules screws up the debugger's symbol resolution. Consider taking a look at the assembly code of someFooFunc, someBarFunc and someBazFunc; this might give you a hint on whether these variables actually share the same address (which should not be the case). – Karrah
Why do you use the C frontend for compiling, but g++ for linking? – Coverlet
Make your program so that the behaviour would differ if the variables did or did not have separate storage, and confirm the output by running the program. Perhaps the linker detects that it can do what it's doing because it doesn't affect the program. – Measles
I'm using the GCC ARM Embedded toolchain, called from Eclipse running the GNU ARM Eclipse plug-ins. This setup uses arm-none-eabi-g++ for linking. – Groundwork
Which addresses do you get when qualifying the variable names with the filenames? – Coverlet
Note that there are features of C++ that would solve your problem for you, and that mostly using the C subset of C++ may work for your code base. – Shopworn
GDB's manual on program variables describes how to refer to a particular variable: one is bar.c::myVar and the other is foo.c::myVar. It also recommends using -gstabs if available, and hopefully you don't have a class foo with member c. – Mismanage
K
23

NOTE: I do understand that OP's target platform is ARM, but I'm nevertheless posting an answer in terms of x86. The reason is that I have no ARM backend handy, and the question is not limited to a particular architecture.

Here's a simple test setup. Note that I'm using int instead of a custom enum typedef, since it should not matter at all.

foo.c

static int myVar = 1;

int someFooFunc(void)
{
        myVar += 2;
        return myVar;
}

bar.c

static int myVar = 1;

int someBarFunc(void)
{
        myVar += 3;
        return myVar;
}

main.c

#include <stdio.h>

int someFooFunc(void);
int someBarFunc(void);

int main(int argc, char* argv[])
{
        printf("%d\n", someFooFunc());
        printf("%d\n", someBarFunc());
        return 0;
}

I'm compiling it on x86_64 Ubuntu 14.04 with GCC 4.8.4:

$ g++ main.c foo.c bar.c
$ ./a.out
3
4

Obtaining such results effectively means that the myVar variables in foo.c and bar.c are different: if they shared storage, the second call would print 6 (3 + 3) rather than 4. If you look at the disassembly (via objdump -D ./a.out):

000000000040052d <_Z11someFooFuncv>:
  40052d:       55                      push   %rbp
  40052e:       48 89 e5                mov    %rsp,%rbp
  400531:       8b 05 09 0b 20 00       mov    0x200b09(%rip),%eax        # 601040 <_ZL5myVar>
  400537:       83 c0 02                add    $0x2,%eax
  40053a:       89 05 00 0b 20 00       mov    %eax,0x200b00(%rip)        # 601040 <_ZL5myVar>
  400540:       8b 05 fa 0a 20 00       mov    0x200afa(%rip),%eax        # 601040 <_ZL5myVar>
  400546:       5d                      pop    %rbp
  400547:       c3                      retq

0000000000400548 <_Z11someBarFuncv>:
  400548:       55                      push   %rbp
  400549:       48 89 e5                mov    %rsp,%rbp
  40054c:       8b 05 f2 0a 20 00       mov    0x200af2(%rip),%eax        # 601044 <_ZL5myVar>
  400552:       83 c0 03                add    $0x3,%eax
  400555:       89 05 e9 0a 20 00       mov    %eax,0x200ae9(%rip)        # 601044 <_ZL5myVar>
  40055b:       8b 05 e3 0a 20 00       mov    0x200ae3(%rip),%eax        # 601044 <_ZL5myVar>
  400561:       5d                      pop    %rbp
  400562:       c3                      retq   

You can see that the actual addresses of the static variables in different modules are indeed different: 0x601040 for foo.c and 0x601044 for bar.c. However, both are associated with the same symbol name, _ZL5myVar (mangled because g++ compiles these files as C++), which really screws up GDB's logic.

You can double-check that by means of objdump -t ./a.out:

0000000000601040 l     O .data  0000000000000004              _ZL5myVar
0000000000601044 l     O .data  0000000000000004              _ZL5myVar

Yet again, different addresses, same symbol name. How GDB resolves this conflict is purely implementation-dependent.

I strongly believe that this is what is happening in your case as well. However, to be doubly sure, you might want to try these steps in your environment.

Karrah answered 28/6, 2017 at 12:44 Comment(2)
printf is not easily available to me (cross compiling for an embedded target), but storing the return values from someFooFunc and someBarFunc and checking them with the debugger gives me 3 and 4 as in your example, so as you said, it looks like it's the debugger getting confused by the identical variable names. Thanks for your help! – Groundwork
@Groundwork well, I used printf as an example, and you figured out the rest :) In fact, printf is just a fancy decorator; as you might see, I did not use it to obtain any low-level information. Cheers! – Karrah
C
3

so.s (to make the linker happy)

.globl _start
_start: b _start

one.c

static unsigned int hello = 4;
static unsigned int one = 5;
void fun1 ( void )
{
    hello=5;
    one=6;
}

two.c

static unsigned int hello = 4;
static unsigned int two = 5;
void fun2 ( void )
{
    hello=5;
    two=6;
}

three.c

static unsigned int hello = 4;
static unsigned int three = 5;
void fun3 ( void )
{
    hello=5;
    three=6;
}

First off, if you optimize, then this is completely dead code and you should not expect to see any of these variables. The functions are not static, so they don't disappear:

Disassembly of section .text:

08000000 <_start>:
 8000000:   eafffffe    b   8000000 <_start>

08000004 <fun1>:
 8000004:   e12fff1e    bx  lr

08000008 <fun2>:
 8000008:   e12fff1e    bx  lr

0800000c <fun3>:
 800000c:   e12fff1e    bx  lr

If you don't optimize, then:

08000000 <_start>:
 8000000:   eafffffe    b   8000000 <_start>

08000004 <fun1>:
 8000004:   e52db004    push    {r11}       ; (str r11, [sp, #-4]!)
 8000008:   e28db000    add r11, sp, #0
 800000c:   e59f3020    ldr r3, [pc, #32]   ; 8000034 <fun1+0x30>
 8000010:   e3a02005    mov r2, #5
 8000014:   e5832000    str r2, [r3]
 8000018:   e59f3018    ldr r3, [pc, #24]   ; 8000038 <fun1+0x34>
 800001c:   e3a02006    mov r2, #6
 8000020:   e5832000    str r2, [r3]
 8000024:   e1a00000    nop         ; (mov r0, r0)
 8000028:   e28bd000    add sp, r11, #0
 800002c:   e49db004    pop {r11}       ; (ldr r11, [sp], #4)
 8000030:   e12fff1e    bx  lr
 8000034:   20000000    andcs   r0, r0, r0
 8000038:   20000004    andcs   r0, r0, r4

0800003c <fun2>:
 800003c:   e52db004    push    {r11}       ; (str r11, [sp, #-4]!)
 8000040:   e28db000    add r11, sp, #0
 8000044:   e59f3020    ldr r3, [pc, #32]   ; 800006c <fun2+0x30>
 8000048:   e3a02005    mov r2, #5
 800004c:   e5832000    str r2, [r3]
 8000050:   e59f3018    ldr r3, [pc, #24]   ; 8000070 <fun2+0x34>
 8000054:   e3a02006    mov r2, #6
 8000058:   e5832000    str r2, [r3]
 800005c:   e1a00000    nop         ; (mov r0, r0)
 8000060:   e28bd000    add sp, r11, #0
 8000064:   e49db004    pop {r11}       ; (ldr r11, [sp], #4)
 8000068:   e12fff1e    bx  lr
 800006c:   20000008    andcs   r0, r0, r8
 8000070:   2000000c    andcs   r0, r0, r12

08000074 <fun3>:
 8000074:   e52db004    push    {r11}       ; (str r11, [sp, #-4]!)
 8000078:   e28db000    add r11, sp, #0
 800007c:   e59f3020    ldr r3, [pc, #32]   ; 80000a4 <fun3+0x30>
 8000080:   e3a02005    mov r2, #5
 8000084:   e5832000    str r2, [r3]
 8000088:   e59f3018    ldr r3, [pc, #24]   ; 80000a8 <fun3+0x34>
 800008c:   e3a02006    mov r2, #6
 8000090:   e5832000    str r2, [r3]
 8000094:   e1a00000    nop         ; (mov r0, r0)
 8000098:   e28bd000    add sp, r11, #0
 800009c:   e49db004    pop {r11}       ; (ldr r11, [sp], #4)
 80000a0:   e12fff1e    bx  lr
 80000a4:   20000010    andcs   r0, r0, r0, lsl r0
 80000a8:   20000014    andcs   r0, r0, r4, lsl r0

Disassembly of section .data:

20000000 <hello>:
20000000:   00000004    andeq   r0, r0, r4

20000004 <one>:
20000004:   00000005    andeq   r0, r0, r5

20000008 <hello>:
20000008:   00000004    andeq   r0, r0, r4

2000000c <two>:
2000000c:   00000005    andeq   r0, r0, r5

20000010 <hello>:
20000010:   00000004    andeq   r0, r0, r4

Three hello variables are created. (You should notice by now that there is no reason to start up the debugger; this can all be answered by simply examining the compiler and linker output. The debugger just gets in the way.)

 800000c:   e59f3020    ldr r3, [pc, #32]   ; 8000034 <fun1+0x30>

 8000034:   20000000    andcs   r0, r0, r0

 8000044:   e59f3020    ldr r3, [pc, #32]   ; 800006c <fun2+0x30>

 800006c:   20000008    andcs   r0, r0, r8

 800007c:   e59f3020    ldr r3, [pc, #32]   ; 80000a4 <fun3+0x30>

 80000a4:   20000010    andcs   r0, r0, r0, lsl r0

20000000 <hello>:
20000000:   00000004    andeq   r0, r0, r4

20000008 <hello>:
20000008:   00000004    andeq   r0, r0, r4

20000010 <hello>:
20000010:   00000004    andeq   r0, r0, r4

Each function is accessing its own separate version of the static global. They are not combined into one shared global.

Calandracalandria answered 28/6, 2017 at 14:26 Comment(0)
D
3

The answers thus far have demonstrated that it should work as written, but the actual answer is only in the comments so I will post it as an answer.

What you’re seeing is a debugger artifact, not the real situation. In my experience, this should be your first guess for any truly weird observation within the debugger. Verify the observation in the actual running program before going on, for example with an old-fashioned debug printf statement.
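
A minimal sketch of such an in-program check, for targets where printf is awkward. The accessor names fooMyVarAddr and barMyVarAddr and the two check functions are hypothetical, not from the question. Since a single file cannot hold two file-scope definitions of myVar, function-local statics stand in for them here; in the real project each accessor would sit in its own .c file next to the file-scope static.

```c
#include <stdbool.h>

/* Stand-in for foo.c: hypothetical accessor exposing the address of its
 * private myVar. In the real foo.c, myVar would be the file-scope static. */
int *fooMyVarAddr(void)
{
    static int myVar = 1;
    return &myVar;
}

/* Stand-in for bar.c, with an identically named private variable. */
int *barMyVarAddr(void)
{
    static int myVar = 1;
    return &myVar;
}

/* Address check: true when the two variables occupy different storage. */
bool myVarsAreDistinct(void)
{
    return fooMyVarAddr() != barMyVarAddr();
}

/* Behavioural check: a write to one variable must not be visible
 * through the other. */
bool myVarsAreIndependent(void)
{
    *fooMyVarAddr() = 3;
    *barMyVarAddr() = 4;
    return *fooMyVarAddr() == 3 && *barMyVarAddr() == 4;
}
```

Calling both checks early in main and storing the results in a volatile variable (or toggling a pin) gives an answer that does not depend on how the debugger resolves the name myVar. If either check returns false, the variables really do share storage.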

Dextrocular answered 28/6, 2017 at 19:39 Comment(0)
