How does TLB differentiate between entries of different Page tables?

Since different processes have their own Page table, How does the TLB cache differentiate between two page tables? Or is the TLB flushed every time a different process gets CPU?

Spinner answered 1/12, 2021 at 15:59 Comment(1)
Related: "Is that TLB contains only entries for a single process?" has a brief answer that mentions process context IDs. But not quite a duplicate. - Thuggee

Yes, setting a new top-level page-table physical address (such as x86 mov cr3, rax) invalidates all existing TLB entries (except global ones, see Footnote 1). On other ISAs, software might need to use additional instructions to ensure safety. (I'm guessing about that; I only know how x86 does it.)
Some ISAs do purely software management of TLBs, in which case it would definitely be up to software to flush all or at least the non-global TLB entries on context switch.

A more recent CPU feature allows us to avoid full invalidations in some cases. A context ID gives some extra tag bits with each TLB entry, so the CPU can keep track of which page-table they came from and only hit on entries that match the current context. This way, frequent switches between a small set of page tables can keep some entries valid.
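To make the tagging idea concrete, here's a toy model in Python. It's only a sketch of the concept, not how any real hardware is organized: the TaggedTLB class and its method names are made up for illustration, and real TLBs are set-associative structures with limited capacity.

```python
class TaggedTLB:
    """Toy TLB whose entries carry a context-ID tag (hypothetical model)."""

    def __init__(self):
        # (context_id, virtual page number) -> physical frame number
        self.entries = {}
        self.current_ctx = 0

    def switch_context(self, context_id):
        # No flush here: entries from other contexts stay cached,
        # they just can't hit while a different context is current.
        self.current_ctx = context_id

    def insert(self, vpn, pfn):
        self.entries[(self.current_ctx, vpn)] = pfn

    def lookup(self, vpn):
        # A hit needs both the page number and the context tag to match.
        return self.entries.get((self.current_ctx, vpn))


tlb = TaggedTLB()
tlb.switch_context(1)
tlb.insert(0x10, 0xAAA)      # context 1 maps virtual page 0x10
tlb.switch_context(2)
miss = tlb.lookup(0x10)      # same virtual page, other context: no hit
tlb.insert(0x10, 0xBBB)      # context 2 has its own translation
tlb.switch_context(1)
hit = tlb.lookup(0x10)       # context 1's entry survived both switches
```

Here miss is None and hit is 0xAAA: two page tables can map the same virtual page differently without either one having to be flushed.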

On x86, the relevant feature is PCID (Process Context ID): When the OS sets a new top-level page-table address, it's associated with a context ID number. (The PCID field is 12 bits: CR3 bits 11:0, used this way when CR4.PCIDE is set.) It's passed in the low bits of the page-table address. Page-tables have to be page aligned so those bits are actually unused; this feature repurposes them to be a separate bitfield, with CR3 bits above the page-offset used normally as the physical page-number.
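As a rough illustration of that bit layout, here's a Python sketch that packs and unpacks such a value. Intel's manual defines the PCID field as CR3 bits 11:0 (12 bits) when CR4.PCIDE is set; the helper names make_cr3 and split_cr3 and the example address are my own, and real kernel code would of course do this in C or asm.

```python
PCID_BITS = 12                      # CR3 bits 11:0 hold the PCID when CR4.PCIDE = 1
PCID_MASK = (1 << PCID_BITS) - 1    # 0xFFF


def make_cr3(table_phys, pcid):
    """Combine a page-aligned top-level table address with a PCID."""
    if table_phys & PCID_MASK:
        raise ValueError("top-level page table must be 4 KiB aligned")
    if not 0 <= pcid <= PCID_MASK:
        raise ValueError("PCID must fit in 12 bits")
    return table_phys | pcid


def split_cr3(value):
    """Return (table physical address, PCID) from a packed value."""
    return value & ~PCID_MASK, value & PCID_MASK


cr3 = make_cr3(0x1234000, 5)        # hypothetical table address, PCID 5
table_phys, pcid = split_cr3(cr3)
```

Because the table address always has its low 12 bits clear, OR-ing in the PCID loses nothing, and both fields can be recovered with a mask.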

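And the OS can tell the CPU whether or not to flush the TLB when it loads a new page table, for either switching back to a previous context, or recycling a context-ID for a different task. (By setting the high bit, bit 63, of the new CR3 value: with it set, the CPU isn't required to invalidate entries for that PCID; with it clear, it invalidates them, except global ones. See the manual entry for mov cr, reg.)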
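Here's a toy Python model of those two cases (switching back vs. recycling an ID). It's a sketch under simplifying assumptions: ToyCPU, its methods, and the TASK_* addresses are hypothetical, global pages are ignored, and the "keep" case is modelled as always keeping entries even though real hardware is merely allowed to.

```python
NO_FLUSH = 1 << 63     # bit 63 of the value written to CR3
PCID_MASK = 0xFFF      # CR3 bits 11:0


class ToyCPU:
    """Toy model of PCID-tagged TLB behaviour on a CR3 write."""

    def __init__(self):
        self.tlb = {}      # (pcid, virtual page number) -> physical frame
        self.pcid = 0

    def load_cr3(self, value):
        self.pcid = value & PCID_MASK
        if not value & NO_FLUSH:
            # Bit 63 clear: drop translations tagged with the new PCID.
            self.tlb = {k: v for k, v in self.tlb.items() if k[0] != self.pcid}

    def fill(self, vpn, pfn):
        self.tlb[(self.pcid, vpn)] = pfn

    def lookup(self, vpn):
        return self.tlb.get((self.pcid, vpn))


# Hypothetical page-table addresses for three tasks.
TASK_A, TASK_B, TASK_C = 0x1000, 0x2000, 0x3000

cpu = ToyCPU()
cpu.load_cr3(TASK_A | 1)               # task A runs with PCID 1
cpu.fill(0x10, 0xAAA)
cpu.load_cr3(TASK_B | 2)               # task B runs with PCID 2
cpu.load_cr3(TASK_A | 1 | NO_FLUSH)    # back to A: keep its old entries
after_switch_back = cpu.lookup(0x10)
cpu.load_cr3(TASK_C | 1)               # recycle PCID 1 for task C: must flush
after_recycle = cpu.lookup(0x10)
```

after_switch_back is 0xAAA (task A's entry is still usable), while after_recycle is None: without the flush, task C would have hit on task A's stale translation.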

x86 PCID was new in 2nd-gen Nehalem (Westmere): https://www.realworldtech.com/westmere/ has a brief description of it from a CPU-architecture PoV.

Similar support I think extends to HW virtualization / nested page tables, to reduce the cost of hypervisor switches between guests.

I expect other ISAs that have any kind of page-table context mechanism work broadly similarly, with it being a small integer that the OS sets along with / as part of a new top-level page-table address.


Footnote 1: Except for "global" ones where the PTE indicates that this page will be mapped the same in all page tables. This lets OSes optimize by marking kernel pages that way, so those TLB entries can stay hot when the kernel context-switches user-space tasks. Both page tables should actually have valid entries for that page that do map to the same phys address, of course. On x86 at least, there is a bit in the PTE format that lets the CPU know it can assume the TLB entry is still valid across different page directories.
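A small Python sketch of what that global bit buys you. On x86 the G flag is bit 8 of the PTE (honoured when CR4.PGE is set); everything else here is a simplification: the function name is made up, the page numbers and frame addresses are arbitrary, and the TLB is modelled as just a dict of cached PTEs.

```python
PTE_PRESENT = 1 << 0
PTE_GLOBAL = 1 << 8     # G bit in an x86 page-table entry


def flush_on_cr3_write(tlb):
    """Return the cached entries that survive a plain (non-PCID) CR3 reload."""
    return {vpn: pte for vpn, pte in tlb.items() if pte & PTE_GLOBAL}


# virtual page number -> cached PTE (arbitrary example values)
tlb = {
    0x400: 0x1234000 | PTE_PRESENT,                  # user page, per-process
    0xFFFF8: 0x5678000 | PTE_PRESENT | PTE_GLOBAL,   # kernel page, marked global
}
survivors = flush_on_cr3_write(tlb)
```

Only the entry with the G bit set remains in survivors, which is why kernel mappings could stay hot across context switches.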

Thuggee answered 1/12, 2021 at 16:36 Comment(6)
Re Footnote 1, I suppose that Meltdown put a damper on the global TLB entry feature. With KPTI I think the kernel now gets its own PCID. Or maybe every process gets two PCIDs, one for userspace and one for kernel? Haven't checked. - Hoofbeat
It's interesting that Intel's manuals don't seem to contain any kind of a warning that the global TLB feature might be dangerous. - Hoofbeat
@NateEldredge: Yes, right, on CPUs without a HW fix for Meltdown, the kernel can't safely use global pages anymore. (Hopefully it still does on non-Intel CPUs, although some non-x86 ISAs have some affected CPUs.) And yeah, it uses a PCID for the kernel. Not sure exactly how it manages copy_from_user / copy_to_user; if it was a single PCID across all tasks, would it need to invalidate those user pages before and/or after using them? Hmm, that might explain some small-buffer read benchmarks I was playing with a while ago. - Thuggee
@NateEldredge: I'm sure there are published errata for Meltdown on all CPUs affected by it. Unlike Spectre, it's easy to fix in new HW, so it's not an ongoing problem. (anandtech.com/show/13450/… shows CFL-refresh with HW mitigation). It's major enough that it would be worth warning about in the manual, though, at least while existing CPUs with it are still widespread. In general they don't clutter the general ISA manual with per-CPU errata stuff, but OTOH most errata aren't as serious or security-relevant as that. - Thuggee
(At least classic Meltdown should be an easy fix: just force the load result to 0 as well as marking it as fault-if-reaching-retirement.) The other Meltdown-related vulns (MDS in general) that aren't dependent on loads that should fault are I think orthogonal to using global page table entries, so the kernel using global pages again shouldn't make them more dangerous. (Or maybe I'm forgetting something; been a while since I looked at those vulnerabilities and the newest varieties.) - Thuggee
Thanks a lot! Your footnote also answered my future question about shared physical addresses. - Spinner
