TLB misses vs cache misses?
L

3

26

Could someone please explain the difference between a TLB (Translation lookaside buffer) miss and a cache miss?

I believe the TLB has something to do with virtual memory addresses, but I am not clear on what that actually means.

As I understand it, a block of memory (the size of a cache line) is loaded into the (L3?) cache, and if a required address is not held within the current cache lines, that is a cache miss.

Loathe answered 4/5, 2012 at 9:34 Comment(1)
Understand paging https://mcmap.net/q/13918/-how-does-x86-paging-work , and see the TLB section.Vagrom
H
33

Well, all modern operating systems use something called virtual memory. Every address generated by the CPU is virtual. Page tables map these virtual addresses to physical addresses, and a TLB is just a cache of page table entries.

The L1, L2 and L3 caches, on the other hand, cache main memory contents.

A TLB miss occurs when the mapping of virtual memory address => physical memory address for a virtual address requested by the CPU is not in the TLB. That entry must then be fetched from the page table into the TLB.
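To make the lookup concrete, here is a minimal sketch in Python. It is only a toy model, not how any real MMU is implemented: the 4 KiB page size, the contents of `page_table`, and the `translate` helper are all assumptions made up for illustration.

```python
OFFSET_BITS = 12                 # assumed 4 KiB pages
PAGE_SIZE = 1 << OFFSET_BITS

# Hypothetical page table: virtual page number -> physical frame number.
page_table = {0: 5, 1: 9, 2: 3}
tlb = {}                         # the TLB: a small cache of page table entries


def translate(vaddr):
    """Return (physical address, "hit" or "miss") for a virtual address."""
    vpn = vaddr >> OFFSET_BITS           # virtual page number
    offset = vaddr & (PAGE_SIZE - 1)     # offset within the page
    if vpn in tlb:                       # TLB hit: no page table access needed
        return (tlb[vpn] << OFFSET_BITS) | offset, "hit"
    pfn = page_table[vpn]                # TLB miss: consult the page table
    tlb[vpn] = pfn                       # install the entry in the TLB
    return (pfn << OFFSET_BITS) | offset, "miss"


first = translate(0x1234)    # first touch of virtual page 1
second = translate(0x1238)   # same page again
```

The first access to a page misses in the TLB and goes to the page table; the second access to the same page hits, even though it is a different address within that page.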

A cache miss occurs when the CPU requires something that is not in the cache. The data is then looked for in primary memory (RAM). If it is not there, the data must be fetched from secondary memory (hard disk); that last case is a page fault, handled by the operating system.
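A cache miss can be sketched the same way. This is a toy direct-mapped cache that tracks only which line is resident, not the data itself; the 64-byte line size, the four-line capacity, and the `DirectMappedCache` class are assumptions chosen for illustration, and real caches are usually set-associative and far larger.

```python
LINE_SIZE = 64   # assumed bytes per cache line
NUM_LINES = 4    # deliberately tiny so that evictions are easy to see


class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES   # one tag per cache line slot
        self.hits = 0
        self.misses = 0

    def access(self, paddr):
        """Return True on a hit, False on a miss (the line is then refilled)."""
        line_addr = paddr // LINE_SIZE   # which memory line holds this address
        index = line_addr % NUM_LINES    # the slot that line must occupy
        tag = line_addr // NUM_LINES     # identifies the line within that slot
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag           # refill from memory, evicting the old line
        self.misses += 1
        return False


cache = DirectMappedCache()
results = [cache.access(a) for a in (0, 8, 64, 256, 0)]
```

Address 8 hits because it shares a line with address 0. Address 256 maps to the same slot as address 0 and evicts it, so the final access to 0 misses again.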

Hutcheson answered 4/5, 2012 at 9:39 Comment(4)
What is the purpose of the virtual memory address and what does it actually refer to, still the main memory (RAM)?Loathe
Well, earlier 32-bit CPUs generated 32-bit addresses, giving ~4GB of addressable memory, but the amount of RAM was often less than 4GB. Now, if you have 1GB of RAM, you cannot run a program that needs more than 1GB of addressable memory. So why not fool the program into believing you have 4GB of RAM, where only a fraction of the program's data resides in RAM and the rest on the hard drive? That is what virtual memory does. You can read all about it in this Wikipedia article. Current 64-bit CPUs also use virtual memory; only the virtual address length is now typically 48 bits on x86-64.Hutcheson
Do page tables and TLB store mappings for cache, too? Or just the main memory?Furnary
Again, the page table resides in RAM and hence can be cached. So handling a TLB miss can also access the cache.Decarbonize
P
2

The following sequence, which starts once the first instruction's address (a virtual address) has been loaded into the PC, makes the concepts of a TLB miss and a cache miss very clear.

Accessing the first instruction

  • Take the starting PC
  • Access iTLB with the VPN extracted from PC: iTLB miss
  • Invoke iTLB miss handler
  • Calculate PTE address
  • If PTEs are cached in L1 data and L2 caches, look them up with PTE address: you will miss there also
  • Access page table in main memory: PTE is invalid: page fault
  • Invoke page fault handler
  • Allocate page frame, read page from disk, update PTE, load PTE in iTLB, restart fetch

Now you have the physical address

  • Access Icache: miss
  • Send refill request to higher levels: you miss everywhere
  • Send request to memory controller (north bridge)
  • Access main memory
  • Read cache line
  • Refill all levels of cache as the cache line returns to the processor
  • Extract the appropriate instruction from the cache line with the block offset

This is the longest possible latency in an instruction/data access
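The sequence above can be condensed into a rough Python sketch. It is a simplification under stated assumptions, not the behaviour of any particular processor: the `fetch` function, the 4 KiB page and 64-byte line sizes, and the addresses are all hypothetical. The cache hierarchy is collapsed into a single set of resident lines, and the lookup of PTEs in the data caches is left out.

```python
OFFSET_BITS = 12   # assumed 4 KiB pages
LINE_SIZE = 64     # assumed 64-byte cache lines


def fetch(vaddr, tlb, page_table, cache, free_frames, log):
    """Fetch from a virtual address, appending each miss or fault to log."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if vpn not in tlb:
        log.append("iTLB miss")
        if vpn not in page_table:        # PTE invalid: page fault
            log.append("page fault")
            # Allocate a frame; reading the page from disk is only implied here.
            page_table[vpn] = free_frames.pop()
        tlb[vpn] = page_table[vpn]       # load the PTE into the TLB
    paddr = (tlb[vpn] << OFFSET_BITS) | offset
    line = paddr // LINE_SIZE
    if line not in cache:                # miss in every cache level
        log.append("cache miss")
        cache.add(line)                  # refill as the line returns from memory
    return paddr


tlb, page_table, cache, log = {}, {}, set(), []
first = fetch(0x400000, tlb, page_table, cache, [7], log)
cold_log = list(log)     # events recorded for the very first fetch

log.clear()
second = fetch(0x400004, tlb, page_table, cache, [], log)
warm_log = list(log)     # events recorded for the next instruction
```

The first fetch records an iTLB miss, a page fault and a cache miss in that order. The next instruction lies in the same page and the same cache line, so it records nothing.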

source https://software.intel.com/en-us/articles/recap-virtual-memory-and-cache

Paramount answered 2/5, 2014 at 17:26 Comment(0)
T
0

The other answers already cover how both processes work, so here is a note on performance. A cache miss does not necessarily stall the CPU: a small number of cache misses can be tolerated using algorithmic prefetching techniques. A TLB miss, however, causes the CPU to stall until the TLB has been updated with the new address. In other words, prefetching can mask a cache miss but not a TLB miss.

Tosspot answered 20/6, 2014 at 7:54 Comment(1)
This is not strictly true. With out-of-order execution and hardware page table walking (x86, ARM, some MIPS Release 5, etc.) a TLB miss may not immediately stall the processor. Furthermore, academic papers have proposed hardware prefetch for TLBs. It would also be possible for a processor to prefetch TLB entries with an ordinary memory prefetch instruction.Sundog
