I'm writing C code and compiling it for the PowerPC architecture. The C code contains floating-point constants which I want to be placed in the .text section instead of .rodata, so that the function code is self-contained.
The problem with this is that on PowerPC, the only way to move a floating-point value into a floating-point register is by loading it from memory. It is an instruction set restriction.
To convince GCC to help me, I tried declaring the floats as static const. No difference. Using pointers, same results. Using __attribute__((section(".text"))) for the function, same results, and using it for each floating-point variable individually gives:
error: myFloatConstant causes a section type conflict with myFunction
I also tried disabling optimizations via #pragma GCC push_options, #pragma GCC optimize("O0") and #pragma GCC pop_options. That, plus pretending I have an unsigned int, worked:
unsigned int *myFloatConstant = (unsigned int *) (0x11000018);
*myFloatConstant = 0x4C000000;
Using the float:
float theActualFloat = *(float *) myFloatConstant;
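The snippet above writes through a hard-coded address and reads an unsigned int object through a float pointer, which breaks C's aliasing rules and may be miscompiled at higher optimization levels. A minimal sketch of the same "build the float from its integer bit pattern" idea using memcpy is shown below. The helper name float_from_bits is hypothetical (not from the original code), and the sketch assumes a 32-bit IEEE 754 float. Whether the compiler then materializes the value through registers and the stack rather than .rodata depends on the GCC version and flags, so the generated assembly should still be inspected.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Assumption: float is a 32-bit IEEE 754 single-precision value. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper (not part of the original post):
 * reinterpret a 32-bit pattern as a float without pointer casts.
 * memcpy is the well-defined way to copy the object representation. */
static inline float float_from_bits(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

With this helper, the line above could be written as float theActualFloat = float_from_bits(0x4C000000u); which yields 33554432.0f (2^25) under the IEEE 754 assumption.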
I would still like to keep -O3, but then it uses .rodata again. A potential answer would therefore include which optimization flag causes the floats to be placed in .rodata, since this happens from -O1 onwards.
Best case scenario would be that I can use floats "normally" in the code with maximum optimizations and they never get placed in .rodata at all.
What I imagine GCC could do is place the float constant in between the code by mixing data and code, load from that place into a floating-point register and continue. I believe this is possible to write manually, but how do I make GCC do that? Forcing the attribute per variable causes the error from above, but technically this should be feasible.
man gcc and the POWER -msdata option in particular. On the GCC dev mailing list, someone mentioned that adding -G 0 to gcc options "fixes" this; could you try that and report whether that makes gcc do what you prefer? – Catachresis
.text section near your function isn't inherently bad, and doesn't waste anything if they're in separate cache lines. If it's in the cache-line after, maybe L2 prefetch will even bring in the data before it's demand-loaded. – Exmoor