What are data segment initializers?

I'm constructing a linker script for a bare-metal application running on the STM32F3Discovery board. It uses the startup code from the CMSIS drivers in the STM32Cube_FW_F3 package, specifically the stm32f303xc.s file.

The above file, a fragment of which is pasted below, has a reference to _sidata:

/* start address for the initialization values of the .data section.
defined in linker script */
.word   _sidata
/* start address for the .data section. defined in linker script */
.word   _sdata
/* end address for the .data section. defined in linker script */
.word   _edata
/* start address for the .bss section. defined in linker script */
.word   _sbss
/* end address for the .bss section. defined in linker script */
.word   _ebss

The references to the start and end of the data and bss sections are self-explanatory. On the other hand, I'm unable to find anything about data segment initializers. _sidata is used directly after the SP is set post-reset.

stm32f303xc.s

    .section    .text.Reset_Handler
    .weak   Reset_Handler
    .type   Reset_Handler, %function
Reset_Handler:
  ldr   sp, =_estack    /* Atollic update: set stack pointer */

/* Copy the data segment initializers from flash to SRAM */
  movs  r1, #0
  b LoopCopyDataInit

CopyDataInit:
    ldr r3, =_sidata
    ldr r3, [r3, r1]
    str r3, [r0, r1]
    adds    r1, r1, #4

LoopCopyDataInit:
    ldr r0, =_sdata
    ldr r3, =_edata
    adds    r2, r0, r1
    cmp r2, r3
    bcc CopyDataInit
    ldr r2, =_sbss
    b   LoopFillZerobss
/* Zero fill the bss segment. */
FillZerobss:
    movs    r3, #0
    str r3, [r2], #4

LoopFillZerobss:
    ldr r3, = _ebss
    cmp r2, r3
    bcc FillZerobss

/* Call the clock system initialization function.*/
    bl  SystemInit
/* Call static constructors */
    bl __libc_init_array
/* Call the application's entry point.*/
    bl  main

Which fragment of memory should _sidata point at and how is it related to the data segment?

Dipole answered 17/11, 2019 at 16:28 Comment(0)

The data segment is going to be located in RAM. Since RAM does not hold its contents on power loss, the initial values of the data segment have to be copied from flash on startup. A copy of the initial content of the .data segment is located at the _sidata label for this purpose; the startup code copies it into the actual data segment.
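As a rough illustration of that copy step, here is a host-side C sketch. The arrays flash_image and ram_data and the function names are hypothetical stand-ins: on the real target the three addresses come from the linker script (_sidata, _sdata, _edata) rather than from arrays.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the initial values of .data from their stored copy (in flash on the
 * real target) to their run-time location in RAM.
 * sidata: start of the stored copy; sdata/edata: start/end of .data in RAM.
 * Returns the number of 32-bit words copied. */
size_t copy_data_init(const uint32_t *sidata, uint32_t *sdata,
                      const uint32_t *edata)
{
    size_t words = 0;
    while (sdata < edata) {
        *sdata++ = *sidata++;
        words++;
    }
    return words;
}

/* Hypothetical stand-ins for the two memory regions (host demo only). */
static const uint32_t flash_image[3] = { 0x11111111u, 0x22222222u, 0x33333333u };
static uint32_t ram_data[3];

/* Run the copy on the stand-in arrays, as startup code would do once
 * before main(). */
size_t demo_startup_copy(void)
{
    return copy_data_init(flash_image, ram_data, ram_data + 3);
}
```

After the copy, every word in the RAM region holds the value that was stored at the same offset in the flash copy.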

Negatron answered 17/11, 2019 at 16:41 Comment(6)
So essentially _sidata should point at the load address of the .data section, which is in the ROM? - Dipole
@Dipole No. _sidata is a copy of the data section located in ROM. The actual .data section is located in RAM. - Negatron
@bartlomiej.n Yes, the load address of .data; but no, not in ROM: the data is mutable, so it has to be loaded into RAM. - Seamanlike
@Negatron What should the _sidata symbol point at then in the linker script, since .data is located in RAM? Or should the address be hardcoded in the application itself, based on where it is located in ROM? - Dipole
@Dipole I believe I have already answered this question. The linker places a copy of the data segment into ROM, which the initialisation code then copies into RAM where the actual data segment resides. The start address of this copy is _sidata. How the linker is instructed to do that depends on what linker you use. Please read the linker script and the documentation provided by your vendor for details. Note that nothing in this is hardcoded. The address _sidata is computed by the linker based on where that copy ends up in ROM. - Negatron
The first question/assertion here is pedantically correct. _sidata points to the load address. The load address or LMA is the ROM address. The 'virtual' address or VMA is the one in RAM. This is because the linker documentation was written with the concept of an OS original copy (LMA) and an MMU-mapped address for the application (VMA). However, these names don't make sense for Flash/ROM and RAM applications. So the correct answer is: don't think of it that way... referring to GNU ld. - Ambiversion

The answer is found in the GNU linker manual under the topics VMA and LMA, which stand for 'virtual memory address' and 'load memory address'. For initialized (non-zero) data, we need a copy to initialize from. This copy is placed in flash by the linker script with the following stanza,

  /* Used by the startup to initialize data */
  _sidata = LOADADDR(.data);

  /* Initialized data sections into "RAM" Ram type memory */
  .data : 
  {
    . = ALIGN(4);      /* NOT Alignment, this does nothing. */
    _sdata = .;        /* create a global symbol at data start */
    *(.data)           /* .data sections */
    *(.data*)          /* .data* sections */

    . = ALIGN(4);      /* Alignment of section size (and next section start). */
    _edata = .;        /* define a global symbol at data end */
    
  } >RAM AT> FLASH

The 'AT> FLASH' says that the initial data is placed in the FLASH memory region, and LOADADDR() is a function that returns this address (the LMA). The section is placed in >RAM so that all code references to these variables are fixed up to use the 'working' address (the VMA).
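The relationship between the two addresses comes down to simple arithmetic: a variable at some offset from the start of .data in RAM has its initial value at the same offset from LOADADDR(.data) in flash. A minimal C sketch of that, where the function name and the example addresses are illustrative assumptions rather than values from a real map file, and where _sdata is assumed to coincide with the start of the output section:

```c
#include <stdint.h>

/* Given the run-time (VMA) address of a variable in .data, return the
 * address of its initial value in the stored copy (LMA).
 * sdata  = start of .data in RAM  (the linker script's _sdata)
 * sidata = LOADADDR(.data)        (the linker script's _sidata) */
uintptr_t initializer_address(uintptr_t vma, uintptr_t sdata, uintptr_t sidata)
{
    return sidata + (vma - sdata);
}
```

For example, with the made-up values _sdata = 0x20000000 and _sidata = 0x08001234, a variable linked at 0x20000010 would take its initial value from 0x08001244.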

_sidata, _sdata and _edata are all symbols defined in the linker script. They are available as addresses in 'C' or assembler code.

"on the other hand I'm unable to find anything about data segment initializers"

Hopefully, the above explains that. Also, the 'RAM' version of the linker script defines these symbols too, so the startup code ends up copying the data onto itself. So, the STM32 authors also seem to be confused.


This code is very suspect. Let's start with,

 .word   _sidata

This emits a 32-bit word holding the global address _sidata, which is defined in the linker command file. To merely declare the symbol one would write .extern _sidata, but GNU as treats undefined symbols as external by default, so even that is unnecessary. Does the whole leading part of this file do nothing?


/* Copy the data segment initializers from flash to SRAM */  
  movs  r1, #0         /* r1 is a byte offset, counting up to the size */
  b  LoopCopyDataInit
CopyDataInit:
  ldr  r3, =_sidata    /* reload flash pointer */
  ldr  r3, [r3, r1]    /* add offset to flash pointer and get value */
  str  r3, [r0, r1]    /* store value to RAM */
  adds  r1, r1, #4     /* increment offset */

LoopCopyDataInit:
  ldr  r0, =_sdata     /* (re)load destination */
  ldr  r3, =_edata     /* (re)load end destination */
  adds  r2, r0, r1     /* add offset to start */
  cmp  r2, r3          /* are we at end? */
  bcc  CopyDataInit    /* loop */
 

This code is inefficient and convoluted: one pass through the loop executes nine instructions, continually re-calculates values, and makes five memory accesses. The GNU linker manual gives a formula for doing this in 'C', and the same symbols can easily be used in the GNU assembler.

.extern _sidata   /* Source of init data in flash. */
.extern _sdata    /* Target/start of init data in RAM. */
.extern _edata    /* End of init data in RAM. */

ldr r0, =_sidata
ldr r1, =_sdata
ldr r2, =_edata

/** Validate parameters. */
cmp   r1, r2       /* Zero size */
it    ne
cmpne r0, r1       /* Src is dest */
beq   2f           /* Skip it. */

1: /* init data copy loop */
ldr r3, [r0], #4  /* Load from flash and update source pointer. */
str r3, [r1], #4  /* Store to RAM and update dest pointer. */
cmp r1, r2
blo 1b
2:                /* exit */

It is quite easy to see that some padding and alignment would allow the inner loop (four instructions and two memory accesses) to be unrolled and/or converted to ldm and stm. The whole STM32 code set seems more Mickey Mouse the more I use it. The alternative above is what a compiler would produce from the loop in the GNU ld manual, but using uint32_t pointers instead of 'char' since the data is aligned. I.e., the loop,

#include <stdint.h>

extern uint32_t _etext, _data, _edata;
uint32_t *src = &_etext;
uint32_t *dst = &_data;
/* ROM has data at end of text; copy it.  */
while (dst < &_edata)
  *dst++ = *src++;

The general topic would be ARM memcpy() optimization. As we have control of the linker script, guarantees about source alignment and size can be enforced via the linker script to avoid head/tail alignment issues. In the linker script I have, this alignment is four bytes.
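As a sketch of what such a guarantee buys, suppose the linker script padded the section so that its size is a multiple of 16 bytes (an assumption for illustration; the script shown earlier only aligns to 4). The copy loop can then move four words per iteration with no tail handling, which is the kind of loop a compiler may be able to turn into ldm/stm pairs. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy .data initializers four words at a time.
 * Assumes (end - dst) is a multiple of 4 words (16 bytes), which the
 * linker script would have to guarantee, e.g. with ALIGN(16).
 * Returns the number of 32-bit words copied. */
size_t copy_data_init_x4(const uint32_t *src, uint32_t *dst,
                         const uint32_t *end)
{
    size_t words = (size_t)(end - dst);

    assert(words % 4 == 0);  /* host-side check of the assumed padding */

    while (dst < end) {
        /* Load four words, then store four words. */
        uint32_t a = src[0], b = src[1], c = src[2], d = src[3];
        dst[0] = a;
        dst[1] = b;
        dst[2] = c;
        dst[3] = d;
        src += 4;
        dst += 4;
    }
    return words;
}
```

Because the size guarantee comes from the linker script, the loop needs no head or tail cases; without that padding the last iteration would read and write past the end of the section.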

Ambiversion answered 12/4, 2022 at 22:50 Comment(0)
