How does the Linux kernel avoid the stack overwriting the text (instructions)?

I was curious about how the kernel prevents the stack from growing too big, and I found this Q/A:

Q: how does the linux kernel enforce stack size limits?

A: The kernel can control this thanks to virtual memory. The virtual memory map (also known as the memory mapping) is basically a list of virtual memory areas (base + size), each with a target physical memory area, that the kernel can manipulate and that is unique to each program. When a program tries to access an address that is not on this list, an exception happens. This exception causes a switch into kernel mode, where the kernel can look up the fault. If the memory is supposed to be valid, it is put into place before the program continues (for instance, pages that were swapped out, or mmap'd data not yet read from disk); otherwise a segfault is generated.

In order to decide the stack size limit, the kernel simply manipulates the virtual memory map. - Stian Skjelstad

But I didn't quite find this answer satisfactory. "When a program tries to access an address that is not on this list, an exception happens." - But wouldn't the text section (instructions) of the program be part of the virtual memory map?

Counteraccusation answered 10/1, 2022 at 18:8 Comment(12)
As a small side-question about terminology: When I say "text section", that does literally mean "the machine code from the instructions of the base program", right? And does that include the code from shared libraries, since those are mapped in by the runtime linker? - Counteraccusation
The .text section is mapped read-only in normal user-space processes, and at the opposite end of virtual address space, so there are lots of unused pages in the way before a stack clash could happen. (Some of those pages are intentionally left unmapped as guard pages below the stack-growth region.) See "Linux process stack overrun by local variables (stack guarding)" for more about stack-growth limits and stuff. - Royo
Is the question about the kernel (e.g. the stack on interrupts) or about user programs? In any case, the kernel can set the stack limit (the stack is in the SS segment, so you can set different limits). You get a fault (or double fault) if you go beyond the limits. - Halfbound
You tagged this [linux-kernel]; are you talking about the kernel's own stack? (Actually it uses per-task kernel stacks, separated with at least one guard page.) - Royo
@GiacomoCatenazzi: x86 Linux doesn't use SS segment limits, it uses paging. In long mode, the SS segment limit is hard-wired to no limit, because OSes weren't using it so there was no need to find a way to extend it to 64-bit. - Royo
Sorry everyone, to clarify: I'm asking about how the kernel enforces the stack size of user programs. - Counteraccusation
By refusing to allocate new stack memory past the growth limit set by ulimit -s. See "How is Stack memory allocated when using 'push' or 'sub' x86 instructions?". For thread stacks (not the main thread), it's just a normal mmap allocation with no growth; the only lazy allocation is of physical pages to back the virtual ones. - Royo
"Is there a limit of stack size of a process in linux" mentions the existence of a stack growth limit. But I think my answer that I linked in the last comment covers that, as well as how the limit manifests in practice. - Royo
@PeterCordes the 2 answers you just linked, as usual, taught me far more than I realized I didn't know. But basically, if I understand correctly, trying to write to a read-only part of memory (.text) would throw an error (if one were to exceed the limit), but also guard pages allow the kernel to lazily allocate more memory. So it's not only that the stack grows upward (well, downward to lower memory addresses, I suppose); it also gets expanded downward as more pages of memory are dynamically allocated to support stack growth beyond the original allocation. - Counteraccusation
Yeah, more or less. The key point for this question is that there's a growth limit that will stop the stack from getting anywhere near .text. (And the guard pages make sure there's a segfault if the stack does overflow past the growth limit.) Only if you set ulimit -s unlimited could you maybe grow the stack into some other mapping, if Linux truly does allow unlimited growth in that case without reserving a guard page as you approach another mapping. - Royo
@PeterCordes any reason to not just copy-paste your comments into the answer section, or close with those dupe targets? - Neile
@MarcoBonelli: Yeah, thought about doing that. I sometimes answer in comments on a naive question or one that could be answered many different ways, to see if that's what the querent was looking for, or if they need to clarify what they're asking. (Especially if I only briefly skimmed the question and am not sure I'm answering what they're trying to ask.) In this case probably yeah, it looks like the dust has settled. - Royo

I'm asking about how the kernel enforces the stack size of user programs.

There's a growth limit, set with ulimit -s for the main stack, that will stop the stack from getting anywhere near .text. (The guard pages below that make sure there's a segfault if the stack does overflow past the growth limit.) See "How is Stack memory allocated when using 'push' or 'sub' x86 instructions?". For thread stacks (not the main thread), stack memory is just a normal mmap allocation with no growth; the only lazy allocation is of physical pages to back the virtual ones.
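To see the limit that ulimit -s controls from inside a process, something like the following should work on Linux. It's only a small sketch using Python's standard resource module; the helper names stack_limit and describe are made up for this example.

```python
import resource

def stack_limit():
    """Return the (soft, hard) RLIMIT_STACK values in bytes.

    This is the same limit that `ulimit -s` shows (in KiB) in a shell.
    """
    return resource.getrlimit(resource.RLIMIT_STACK)

def describe(limit):
    """Format a limit in KiB, or 'unlimited' for RLIM_INFINITY."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"

soft, hard = stack_limit()
print("soft stack limit:", describe(soft))
print("hard stack limit:", describe(hard))
```

The soft value is the one that applies to main-stack growth; the hard value is the ceiling up to which an unprivileged process may raise the soft one.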

Also, .text is a read+exec mapping of the executable, so there's no way to modify it without calling mprotect first. (It's a private mapping, so doing so would only affect the pages in memory, not the actual file. This is how text relocations work: runtime fixups for absolute addresses, to be fixed up by the dynamic linker.)
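You can check the permissions yourself by looking at /proc/self/maps. The sketch below parses that file and prints the executable mappings plus the [stack] line. The Mapping type and the function names are just illustrative, and when run as-is it shows the Python interpreter's own mappings, not those of some other program.

```python
from typing import NamedTuple

class Mapping(NamedTuple):
    start: int
    end: int
    perms: str   # e.g. "r-xp": read, no write, exec, private
    path: str    # file name, "[stack]", "[heap]", or "" for anonymous

def parse_maps(text):
    """Parse the contents of a /proc/<pid>/maps file into Mapping tuples."""
    mappings = []
    for line in text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(start, end, fields[1], path))
    return mappings

def writable_and_executable(mappings):
    """Mappings that are both writable and executable (normally none)."""
    return [m for m in mappings if "w" in m.perms and "x" in m.perms]

if __name__ == "__main__":
    with open("/proc/self/maps") as f:
        for m in parse_maps(f.read()):
            if "x" in m.perms or m.path == "[stack]":
                print(f"{m.start:#x}-{m.end:#x} {m.perms} {m.path}")
```

On a typical system the executable's code shows up as r-xp and the stack as rw-p, with the two very far apart in the address space.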

The actual mechanism for limiting growth is simply that the kernel declines to extend the mapping and allocate a new page when the process triggers a hardware page fault with the stack pointer below the existing stack area and growth would exceed the limit. The page fault is thus an invalid one, instead of a soft (aka minor) fault as in the normal stack-growth case, so a SIGSEGV is delivered.
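As a rough illustration of that decision, here is a toy model in Python. It is not the kernel's code: the function name, its parameters, and the 4 KiB page size are all assumptions for the example, and it ignores the guard gap and the other checks the real fault handler does.

```python
PAGE = 4096  # assumed page size for this toy model

def on_stack_fault(fault_addr, stack_top, stack_low, limit_bytes):
    """Toy model of the grow-or-segfault decision for the main stack.

    Returns the new low end of the stack mapping if the fault is handled,
    or None if the process would get SIGSEGV.
    """
    if fault_addr >= stack_low:
        # Address is already inside the stack mapping: nothing to extend.
        return stack_low
    # Round the faulting address down to the start of its page.
    new_low = fault_addr - (fault_addr % PAGE)
    if stack_top - new_low > limit_bytes:
        # Growing this far would exceed the ulimit -s style limit:
        # the mapping is not extended, so the fault is invalid.
        return None
    return new_low

top = 0x8000_0000
low = top - 4 * PAGE          # stack currently spans 4 pages
limit = 8 * PAGE              # pretend the growth limit is 8 pages

print(hex(on_stack_fault(low - 1, top, low, limit)))   # grows by one page
print(on_stack_fault(top - 9 * PAGE, top, low, limit)) # past the limit
```

The first call models normal growth (a soft fault that extends the mapping); the second models an access beyond the limit, where nothing gets mapped and the result is a segfault.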


If a program used alloca or a C99 VLA with an unchecked size, malicious input could make it jump over any guard pages and into some other read/write mapping such as .data or stuff that's dynamically allocated.

To harden buggy code against that so it segfaults instead of actually allowing a stack clash attack, there are compiler options that make it touch every intervening page as the stack grows, so it's certain to set off the "tripwire" in the form of an unmapped guard page below the stack-growth limit. See "Linux process stack overrun by local variables (stack guarding)".
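The idea behind that kind of probing (GCC and Clang call the option -fstack-clash-protection) can be sketched with a small simulation. This only models addresses, it doesn't touch real memory, and the function names, the 4 KiB page size, and the example addresses are assumptions for illustration.

```python
PAGE = 4096  # assumed page size

def probe_addresses(sp, size, page=PAGE):
    """Addresses touched when moving sp down by `size` one page at a time."""
    touched = []
    remaining = size
    while remaining >= page:
        sp -= page
        remaining -= page
        touched.append(sp)
    if remaining:
        sp -= remaining
        touched.append(sp)
    return touched

def first_guard_hit(addresses, guard_low, guard_high):
    """First touched address inside the guard region, or None if it's jumped."""
    for a in addresses:
        if guard_low <= a < guard_high:
            return a
    return None

sp = 0x100000
guard = (0xF0000, 0xF1000)   # a single unmapped guard page
size = 0x20000               # a 128 KiB alloca/VLA with an unchecked size

# Without probing, only the final stack pointer location gets touched.
print(first_guard_hit([sp - size], *guard))
# With probing, every intervening page is touched on the way down.
print(first_guard_hit(probe_addresses(sp, size), *guard))
```

In this model the unprobed allocation lands below the guard page without ever touching it, while the probed one hits the guard page partway down, which is where a real process would take the SIGSEGV.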

Only if you set ulimit -s unlimited could you maybe grow the stack into some other mapping, and only if Linux truly does allow unlimited growth in that case without reserving a guard page as you approach another mapping.

Royo answered 10/1, 2022 at 21:25 Comment(0)
