x86-64 page table Global bit

Every PTE (page table entry) in this setting has a G-bit (G = Global), which controls the scope of the physical page mapped by this entry.

If the G-bit is set, then the entry is global to all processes and they can all access the physical page it maps, subject to other access rights. If the G-bit is zero, then the entry is not global but private to a process. [The kernel sets the G-bit for its pages, but prevents user-mode access by clearing the U-bit (U = user-mode) on its pages.]
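For reference, here is a minimal sketch of the flag bits involved. The bit positions (P = 0, R/W = 1, U/S = 2, G = 8, NX = 63) are the architectural ones for an entry that maps a 4 KiB page; the helper name `describe_pte` and the two sample entries are hypothetical and only there for illustration.

```python
# Architectural bit positions in an x86-64 PTE that maps a 4 KiB page.
PTE_P = 1 << 0    # Present
PTE_RW = 1 << 1   # Writable
PTE_US = 1 << 2   # User/Supervisor: 1 = user-mode access allowed
PTE_G = 1 << 8    # Global: only honoured when CR4.PGE = 1
PTE_NX = 1 << 63  # No-execute: only honoured when EFER.NXE = 1

ADDR_MASK = 0x000F_FFFF_FFFF_F000  # bits 51:12, physical frame address


def describe_pte(pte: int) -> dict:
    """Decode the flags discussed here from a raw 64-bit PTE value."""
    return {
        "present": bool(pte & PTE_P),
        "writable": bool(pte & PTE_RW),
        "user": bool(pte & PTE_US),
        "global": bool(pte & PTE_G),
        "no_execute": bool(pte & PTE_NX),
        "frame": pte & ADDR_MASK,
    }


# Hypothetical kernel-style entry: global, supervisor-only, writable.
kernel_pte = 0x1000 | PTE_P | PTE_RW | PTE_G
# Hypothetical entry with both G and U set, the case asked about below.
shared_pte = 0x2000 | PTE_P | PTE_US | PTE_G
```

Nothing in the encoding itself stops G and U from being set together: they are independent bits, so `describe_pte(shared_pte)` reports both `user` and `global` as true.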

If the G-bit is set on a user-mode PTE (one that has its U-bit set), isn't that a security breach, since every process on the system can now access the page the PTE maps?

Am I missing something? Is there a way to set the G-bit on a user-mode PTE but make it global only among a group of trusted processes, and not all processes on the system? Can a PTE have both the G and U bits set?

Abolish answered 1/2, 2017 at 18:19 Comment(1)
Update on this: the Meltdown vulnerability defeats the U/S bit for reading on some CPUs, and various other speculative-execution and microarchitectural vulnerabilities make it a good idea not to keep the kernel pages mapped while user-space is running. So modern Linux doesn't, even on HW without Meltdown vulnerabilities. See the discussion in comments on "How does TLB differentiate between entries of different Page tables?" for more. (Linux uses 2 PCIDs per process on x86 to switch between the with/without kernel version of the page tables for a process without [...] - Malefactor

Yes, on x86 the G-bit is only useful when there is some other form of access control (such as restricting the page to ring 0, which is what the kernel does) or on an unprotected operating system [1].

Think of the G-bit as an optimization for system calls: the kernel maps its pages as global so no TLB flush needs to happen. You still need TLB flushes on context switches between processes, but these are often a couple of orders of magnitude less common than kernel<->usermode switches.

You could imagine a scenario where G pages are useful for user processes, such as shared memory: switching between two processes wouldn't need to invalidate the TLB entries for the shared memory if the kernel was aware of this and used a G==1 mapping for both processes. TLB refills aren't actually that bad these days, though, because modern x86 CPUs cache many of the paging-structure entries beyond the TLB itself, which allows quick refills.
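To make the shared-memory idea concrete, here is a toy model of a TLB without PCIDs, where a CR3 write drops every non-global entry and keeps the global ones. It is a sketch under simplifying assumptions, not a description of real hardware: the class `ToyTLB`, the dictionary-based page tables, and all page and frame numbers are made up for illustration.

```python
class ToyTLB:
    """Toy TLB without PCIDs: a CR3 write keeps only global entries."""

    def __init__(self):
        self.page_table = {}  # current address space: vpage -> (frame, is_global)
        self.entries = {}     # cached translations, same layout
        self.misses = 0

    def load_cr3(self, page_table):
        """Model a CR3 write: switch page tables, flush non-global entries."""
        self.page_table = page_table
        self.entries = {vp: e for vp, e in self.entries.items() if e[1]}

    def translate(self, vpage):
        if vpage not in self.entries:  # TLB miss: walk the page table
            self.misses += 1
            self.entries[vpage] = self.page_table[vpage]
        return self.entries[vpage][0]


SHARED = (0x99, True)  # hypothetical shared-memory frame mapped with G = 1
proc_a = {0x10: SHARED, 0x20: (0x01, False)}
proc_b = {0x10: SHARED, 0x20: (0x02, False)}

tlb = ToyTLB()
tlb.load_cr3(proc_a)
tlb.translate(0x10)                   # miss
tlb.translate(0x20)                   # miss
tlb.load_cr3(proc_b)                  # context switch from A to B
frame_shared = tlb.translate(0x10)    # hit: the global entry survived
frame_private = tlb.translate(0x20)   # miss: the private entry was flushed
```

After the switch only the private page costs a refill, so the run ends with three misses instead of four. The saving is one page walk per shared page per context switch, which is why the benefit is modest when refills are cheap.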

I don't think that setting both the G and U bits is disallowed, but the kernel isn't going to actually set it up that way.

As a final note, you could actually imagine a read-only global mapping being useful, for something like the vdso mechanism. All processes would map that page, but couldn't modify it, and the kernel updates it as needed. Of course, I can't see how to actually make this work, since the kernel would need write access, and there doesn't seem to be a way to express "readonly for ring 3, r/w for ring 0" in the page table. Perhaps the kernel could use another mapping for this page, but I'm not sure if that's legal: having a mapping that overrides a "G" mapping (since if the G mapping is in the TLB, the CPU may never see the overriding mapping).
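The "another mapping" idea can be tried out from user space as an analogy: map the same backing page twice, once writable and once read-only, and check that writes through one alias are visible through the other. This is only a sketch using Python's standard `mmap` module on a temporary file; it says nothing about how a kernel actually sets up its own aliases, and the "kernel"/"user" labels in the comments are purely illustrative.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

with tempfile.TemporaryFile() as f:
    f.truncate(PAGE)  # one page of backing storage

    # "Kernel" view: read/write alias of the page.
    rw_view = mmap.mmap(f.fileno(), PAGE, access=mmap.ACCESS_WRITE)
    # "User" view: read-only alias of the same page at a different address.
    ro_view = mmap.mmap(f.fileno(), PAGE, access=mmap.ACCESS_READ)

    rw_view[0:5] = b"clock"     # update through the writable alias
    seen = bytes(ro_view[0:5])  # the read-only alias observes the update

    try:
        ro_view[0:1] = b"x"     # writing through the read-only alias fails
        write_blocked = False
    except TypeError:
        write_blocked = True

    ro_view.close()
    rw_view.close()
```

Here the two views are distinct virtual mappings with different permissions backed by the same page, which is the property the vdso scenario needs.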


[1] Technically it could be useful on a single-user operating system where all user-mode processes have the same privileges but the kernel is still protected from user mode; AFAIK that model doesn't really exist in contemporary OSes.

Diastole answered 1/2, 2017 at 18:44 Comment(9)
The kernel can map the VDSO data page into kernel space and point it at the same physical memory as the page mapped into user space. The two mappings can have different restrictions, giving the kernel read/write access to the data while restricting user mode to read-only access. - Goutweed
Oh yeah, of course, the kernel just uses a different virtual address to access it and everything is good. Anyway, the VDSO use case is probably not a super interesting one since it's only one page per process. @MichaelPetch - Diastole
I guess one interesting question is: can processes running under the same user read each other's memory space today in Linux? If not, would it be against the security model if they could? If it was OK, then you could imagine an optimization where G pages were used for userspace processes, which wouldn't have to be cleared when the context switch was to another process from the same user. I guess it could be useful for some applications that use a ton of distinct worker processes and shared memory... - Diastole
Prior to the CR4.PCIDE = 1 era, the G bit was just a way to prevent flushing some TLB entries when moving to CR3. A process could access every page mapped (if it had the rights): global or non-global. No security breach. Even with CR4.PCIDE = 1, PCIDs are a software-controlled way to manage the TLB caches; they are related to process isolation only indirectly: it is more a feature that lets the OS avoid flushing the TLBs on every context switch. So the G bit is more related to caching than to security. Any stale mapping left in the TLBs, global or not, is enough to make an OS flawed. - Agley
@MargaretBloom - when you say "moving to CR3", do you mean "when changing the value of CR3"? Yes, a process could "access every page mapped", but my understanding of the G bit is that a process could access such a page even if not mapped! Since the entry stays in the TLB, the CPU won't even check if the page is mapped since it hits in the TLB. So effectively it allows use of pages which aren't mapped for the current process. I don't see that much of a distinction between "isolation" and "TLB management" - they really go hand in hand. - Diastole
@Diastole Yes, that's basically the point :) With CR4.PCIDE the semantics of the G bit changed a bit (no pun intended). I was pointing that out because I believe the OP thought that the CPU is somewhat aware of the OS concept of "process" and that process isolation is achieved with G = 0 in the PTE, while it's more about caching than that. Anyway, it's just wording and interpretation. - Agley
I think BeeOnRope is right in saying "process could access such a page even if not mapped". Check "4.10.3.2 Using the Paging-Structure Caches to Translate Linear Addresses" in "Intel 64 and IA-32 Architectures ... Volume 3". It says: "If the processor finds a TLB entry that is for the page number of the linear address and that is associated with the current PCID (or which is global), it may use the physical address, access rights, and other attributes from that entry." That could be a memory access to a physical page that is not referenced in the address translation structures of the current PCID. - Martell
@AlexP. - thanks for finding the reference in the doc. I assumed it must work that way, because once you leave an entry in the TLB the processor is just going to use it. The fast path is a TLB hit, so there will not be any slow checks of permissions, etc. That's why the OS has to carefully manage the TLB when scheduling processes, and why any mappings with G=1 would have to be very carefully managed. - Diastole
"Think of the G-bit as an optimization for system calls": Nope, without Meltdown mitigation, the kernel doesn't have to write CR3 on system calls, so no TLB invalidation is necessary. The U/S bit allows the kernel to leave kernel pages mapped while user-space is running. (And in fact kernel entry points have to be mapped or else an interrupt handler or syscall would page-fault.) The G bit is an optimization for context switches, avoiding TLB misses right after writing CR3 while still in kernel space, by letting the kernel keep itself globally mapped across all user-space processes. - Malefactor
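The lookup rule quoted from the Intel manual in the comments above (an entry hits if it is tagged with the current PCID or is global) can be sketched as a small model. This is an illustration only: `TLBEntry`, `tlb_lookup`, and the cached contents are hypothetical names and values, and a real TLB also caches and checks access rights, which the model leaves out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TLBEntry:
    pcid: int        # PCID that was current when the entry was cached
    frame: int       # physical frame number
    is_global: bool  # copied from the G bit of the PTE


def tlb_lookup(cached, vpage, current_pcid):
    """Return the frame on a TLB hit, or None on a miss."""
    for entry in cached.get(vpage, []):
        # Hit rule: the entry is global, or tagged with the current PCID.
        if entry.is_global or entry.pcid == current_pcid:
            return entry.frame
    return None


# Hypothetical TLB contents, both cached while PCID 1 was current.
cached = {
    0x10: [TLBEntry(pcid=1, frame=0xAA, is_global=False)],
    0x20: [TLBEntry(pcid=1, frame=0xBB, is_global=True)],
}

private_from_other = tlb_lookup(cached, 0x10, current_pcid=2)  # None: miss
global_from_other = tlb_lookup(cached, 0x20, current_pcid=2)   # 0xBB: hit
```

Under PCID 2 the global entry still hits without any page walk, regardless of what PCID 2's own page tables contain. That is the behaviour that makes a G = 1 user mapping something the OS would have to manage very carefully.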
