What does GDB return when you do print &"some string literal" for C++ source?

In GDB, I can do the following and get address of a string literal:

(gdb) p &"aaa"
$3 = (char (*)[4]) 0x614c20

As I understand it, a string literal is an rvalue with no symbol bound to it, so why can I get its address?

What kind of address is it? Is there somewhere that all string literals are stored? But there are infinitely many possible string literals, so does that mean the string literal is "created and put in memory" on demand?

Mapping answered 23/9 at 1:26 Comment(11)
String literal is lvalue. - Cherise
Many thanks for your comment, mate; I will edit the question to correct it. - Mapping
Maybe because a debugger isn’t a C++ compiler? - Ensilage
Adding to the question: There is also no guarantee that there aren't multiple string literal objects in the program that all have the value "aaa". So does gdb in that case fail to evaluate the expression, or what does the address mean in that case? - Rhesus
Also, to make the question clearer: p &"aaa" in gdb produces a result even in a program that doesn't contain any "aaa" literal, and it seems that each time it is evaluated it produces a different address. - Rhesus
Maybe GDB is calling the program’s malloc under the hood to allocate a new string. The address looks like a heap address, so this is plausible. - Leastways
String literals: Where do they go? and Where in memory are string literals ? stack / heap? malloc etc and Storage of String Literals in memory c++ - Tayyebeb
@Tayyebeb This is a question about gdb's behavior in C++ programs, not about the C++ language. None of the suggested duplicates are relevant at all. - Rhesus
"there are infinite string literals..." No there are not. - Tayyebeb
@Rhesus "is it somewhere all the string literals are stored?..." is answered by the dupes. - Tayyebeb
@Tayyebeb GDB will happily let you do print &"blah blah this string is definitely not in the program blah blah". The resulting address will contain that string, as you can check with x/s. The real answer is that GDB is creating the string literals by writing to the debuggee’s memory directly. - Leastways

GDB calls malloc inside the debuggee to allocate space for the string literal, writes the literal into the freshly allocated chunk, and returns the chunk's address.
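
As a rough in-process analogy of what that evaluation amounts to, here is a minimal C++ sketch. It is only an illustration: GDB does this from outside the process by injecting a call, and the helper name materialize_literal is hypothetical, not anything that exists in GDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of GDB): copies a string into freshly
// malloc'ed memory, which is the net effect described in the answer.
// The caller owns the returned chunk and should release it with std::free.
char* materialize_literal(const char* text) {
    assert(text != nullptr);
    const std::size_t size = std::strlen(text) + 1;  // characters plus '\0'
    void* chunk = std::malloc(size);                 // the call GDB makes in the debuggee
    if (chunk == nullptr) {
        return nullptr;                              // allocation failed
    }
    std::memcpy(chunk, text, size);                  // write the literal's bytes
    return static_cast<char*>(chunk);
}
```

Calling materialize_literal("aaa") twice returns two distinct pointers, which matches the observation in the comments that each evaluation of p &"aaa" seems to produce a different address.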

It achieves this by building a fake stack frame which calls into malloc() and which is set up to return to a breakpoint instruction. It then silently resumes the program, letting the call to malloc complete, and then retakes control of the program when the program returns to the breakpoint.

You can prove that this is happening by setting a breakpoint on malloc itself, then running print &"foo". GDB will hit the breakpoint with the following message:

The program being debugged stopped while in a function called from GDB.
Evaluation of the expression containing the function
(malloc) will be abandoned.
When the function is done executing, GDB will silently stop it.

If you’re curious, you can check the value of the first argument ($rdi on x86-64) and verify that it’s equal to 4 (length of the string literal + 1). You can also check the stack trace to see the fake breakpoint return address.
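
The value 4 matches how C++ itself sizes the literal, and it also explains the char (*)[4] type GDB printed for &"aaa" in the question. A minimal compile-time check, assuming a C++11 or later compiler (the helper name literal_size is made up for illustration):

```cpp
#include <cstddef>
#include <type_traits>

// "foo" has type const char[4]: three characters plus the terminating '\0'.
static_assert(sizeof("foo") == 4, "literal size includes the terminator");

// A string literal is an lvalue, so its address can be taken; the result
// is a pointer to the whole array. GDB displays the type without const,
// while in C++ it is const char (*)[4].
static_assert(std::is_same<decltype(&"aaa"), const char (*)[4]>::value,
              "address of a 3-character literal points to a 4-element array");

// Hypothetical helper: the number of bytes a literal occupies, deduced
// from its array type rather than by scanning for the terminator.
template <std::size_t N>
constexpr std::size_t literal_size(const char (&)[N]) {
    return N;
}
```

With this, literal_size("foo") is the same number that shows up as malloc's first argument when GDB evaluates print &"foo".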

This behaviour might be surprising; after all, one doesn’t normally expect print &"foo" to execute code in the debuggee. To disable it, run set may-call-functions off. After that, an expression like print &"foo" fails with the error "Cannot call functions in the program: may-call-functions is off." Print expressions that don’t require executing debuggee functions continue to work normally.

Leastways answered 23/9 at 2:24 Comment(0)
