When exactly does a TLB shootdown happen?
It happens when the operating system or hypervisor requests it.
At the ISA level, certain operations can perform TLB shootdowns (see the Intel manual V3 4.10.4 and AMD manual V2 5.5.2), thereby invalidating one or more TLB entries in one or more local or remote TLB caches (those of other logical cores of the same CPU and all other kinds of processors that have TLBs and share the same physical memory address space).
Note also that any paging structure entry can be cached even if it has not been accessed by any retired instruction. This can happen due to speculative execution or MMU prefetching. Therefore, in general, any entry can be cached or invalidated at any time. Of course, there are specific guarantees given so that the MMU caches can be managed and kept coherent with in-memory paging structures.
Who performs the actual TLB shootdown? Is it the kernel(if so, where
can I find the code that performs the flushing?) or is it the CPU(if
so, what triggers the action) or is it both(the kernel executes an
instruction which causes an interrupt, which in turns causes the CPU
to perform the TLB shootdown)
As I said before, the CPU itself can invalidate any entry at any time. In addition, software running at current privilege level (CPL) 0 can perform any of the operations related to TLB management.
An Introduction to TLB Invalidation in the Linux Kernel
The Linux kernel defines its TLB-invalidation functions in architecture-specific files (for x86, /arch/x86/mm/tlb.c and /arch/x86/include/asm/tlbflush.h). That's because different architectures offer wildly different mechanisms for managing the TLBs. To see some examples of when the Linux kernel performs TLB invalidations, refer to the tlb_flush_reason enum (comments are mine):
enum tlb_flush_reason {
    // The memory descriptor structure mm of the current process is about to change.
    // This occurs when switching between threads of different processes.
    // Note that when mm changes, the address-space identifier changes as well
    // (the PCID in CR3[11:0]).
    // I'd rather not discuss when context switches occur because it's a whole different topic.
    // The TLB shootdown only occurs for the current logical core.
    // The kernel can sometimes optimize away TLB flushes on a process-context switch.
    TLB_FLUSH_ON_TASK_SWITCH,

    // Another logical core has sent a request to the current logical core
    // to perform a TLB shootdown on its TLB caches.
    // The request normally arrives as an inter-processor interrupt (IPI);
    // a paravirtualized guest may go through the hypervisor instead.
    // See TLB_REMOTE_SEND_IPI.
    TLB_REMOTE_SHOOTDOWN,

    // Occurs when one or more pages have been recently unmapped.
    // Affects only the local TLBs.
    TLB_LOCAL_SHOOTDOWN,

    // This occurs when making changes to the paging structures.
    // Affects only the local TLBs.
    TLB_LOCAL_MM_SHOOTDOWN,

    // Occurs when the current logical core asks other logical cores to perform
    // TLB shootdowns on their respective TLBs, normally by sending them an IPI
    // (or, in a paravirtualized guest, through the hypervisor).
    TLB_REMOTE_SEND_IPI,

    // This equals the number of reasons. Currently not used.
    NR_TLB_FLUSH_REASONS,
};
There are other cases where the kernel flushes TLBs. It's hard to make a complete list and I don't think anyone has made a list like that.
The Linux kernel implements a lazy TLB flushing technique. The basic idea is that when the paging structures of a process are modified, the kernel attempts to delay TLB shootdowns until a thread from that process is about to be scheduled to execute in user mode.
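One way to picture the deferral is with generation counters. The following is only a toy user-space model, not kernel code: the types toy_mm and toy_cpu, their fields, and both functions are hypothetical names I made up for illustration. Each change to the paging structures bumps a counter, and a core only "flushes" when it is about to run a thread of that process and finds its own view stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-address-space state: bumped on every paging-structure change. */
struct toy_mm {
    uint64_t tlb_gen;
};

/* Hypothetical per-core state: the generation this core's TLB is known to reflect. */
struct toy_cpu {
    uint64_t seen_gen;
    unsigned flushes;   /* number of flushes this core has actually performed */
};

/* Record a paging-structure change without touching any TLB. */
static void toy_modify_page_tables(struct toy_mm *mm)
{
    mm->tlb_gen++;
}

/*
 * Called just before a thread of `mm` runs on `cpu`.
 * Flushes only if the core's view is stale; returns true if it flushed.
 */
static bool toy_switch_in(struct toy_cpu *cpu, const struct toy_mm *mm)
{
    if (cpu->seen_gen == mm->tlb_gen)
        return false;             /* nothing changed since the last flush */

    cpu->flushes++;               /* stand-in for a real TLB flush */
    cpu->seen_gen = mm->tlb_gen;  /* the core is now up to date */
    return true;
}
```

In this model, three consecutive calls to toy_modify_page_tables followed by one toy_switch_in cost a single flush instead of three, which is the whole point of being lazy.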
The Linux kernel currently uses one of the following four methods to flush the TLBs associated with the current logical core when required:
- Write to CR3 the current value of CR3. While this does not change the value in CR3, it instructs the logical core to flush all non-global TLB entries that have the same PCID as the one in CR3.
- Toggle CR4.PGE: write CR4 with the PGE bit cleared, then write back the original CR4 value with PGE set. This has the effect of flushing all TLB entries for all PCIDs, including global entries. This method is not used if INVPCID is supported.
- Invalidate TLB entries for a given PCID and virtual address using the INVPCID instruction type 0.
- Invalidate all TLB entries including globals and all PCIDs using the INVPCID instruction type 2.
Other types of INVPCID are currently not used.
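To make the INVPCID types more concrete, here is a minimal user-space sketch of the operand the instruction takes. INVPCID reads a 128-bit descriptor from memory: bits 11:0 hold the PCID, bits 63:12 are reserved and must be zero, and bits 127:64 hold the linear address. The struct and the helper make_invpcid_desc are hypothetical names of mine, and the enumerator names are only illustrative. The sketch builds the descriptor but does not execute INVPCID, because that requires CPL 0.

```c
#include <assert.h>
#include <stdint.h>

/* The four INVPCID invalidation types defined by the ISA. */
enum invpcid_type {
    INVPCID_TYPE_INDIV_ADDR      = 0, /* one linear address in one PCID */
    INVPCID_TYPE_SINGLE_CTXT     = 1, /* all non-global entries of one PCID */
    INVPCID_TYPE_ALL_INCL_GLOBAL = 2, /* all PCIDs, including global entries */
    INVPCID_TYPE_ALL_NON_GLOBAL  = 3  /* all PCIDs, but keep global entries */
};

/* 128-bit in-memory descriptor: low quadword = PCID, high quadword = address. */
struct invpcid_desc {
    uint64_t pcid;  /* bits 11:0 used, bits 63:12 must be zero */
    uint64_t addr;  /* linear address, only meaningful for type 0 */
};

/* Build a descriptor, zeroing the fields that the given type ignores. */
static struct invpcid_desc make_invpcid_desc(enum invpcid_type type,
                                             uint64_t pcid, uint64_t addr)
{
    struct invpcid_desc d = { 0, 0 };

    assert(pcid < 4096);  /* PCIDs are 12 bits wide */

    if (type == INVPCID_TYPE_INDIV_ADDR) {
        d.pcid = pcid;
        d.addr = addr;
    } else if (type == INVPCID_TYPE_SINGLE_CTXT) {
        d.pcid = pcid;
    }
    /* Types 2 and 3 ignore both fields, so they stay zero. */
    return d;
}
```

Of the two types mentioned above, type 0 needs both fields filled in, while type 2 is passed an all-zero descriptor.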
Related: Do the terms tlb shootdown and tlb flush refer to the same thing.
Other than software-initiated invalidations of TLB entries, the Intel manual Volume 3 Section 4.10.2.2 states the following for the P6 microarchitecture and most later microarchitectures:
Processors need not implement any TLBs. Processors that do implement
TLBs may invalidate any TLB entry at any time. Software should not
rely on the existence of TLBs or on the retention of TLB entries.
There is no such statement in the AMD manual as far as I know. However, it gives no guarantees regarding the retention of TLB entries either, so we can draw the same conclusion for AMD processors.