TLB structure in Intel

I started with the Patterson & Hennessy book for the basic definitions and then followed the Intel programming reference documents for more information about the TLB.

From the Intel documents I got to know the high-level design of the TLB, such as line size, associativity and levels of caching. But I need a detailed explanation of how TLB caching works with respect to misses and its replacement mechanisms in modern CPUs. Which pages move from the L1 TLB to the L2 TLB? How many pages can a single TLB entry address? How many entries are present in the TLB (in particular the DTLB)?

Any information or references would be of great help to me. (If this is not the proper forum for this question, please suggest the right one.)

Thank you.

Braze answered 21/2, 2016 at 23:9 Comment(5)
You might enjoy lwn.net/Articles/379748 even if it doesn't answer all the questions you posed.Ouse
A TLB doesn't have "lines", it has entries. One entry maps one virtual page to one physical page. TLB misses are separate from L1 cache misses. (There's no obvious reason why it would be impossible for a line to still be hot in L1 D cache even though the translation for that line has been evicted from the TLB.)Mithgarthr
David Kanter's writeup of Haswell mentions the TLBs a bit, but doesn't go into the replacement policy for TLB entries. I think the L2 DTLB (8-way associative) is a victim cache for entries evicted from the L1 DTLB. In his SnB writeup, he said L1 DTLB was fully associative, but now he says 4-way? Entries move from L2 DTLB to L1 DTLB when there's an L1 DTLB miss and it's present in the L2 DTLB. A few of the links on the x86 tag wiki might be useful, but prob. only David Kanter's Haswell writeup.Mithgarthr
What exactly does the L1 DTLB cache? To be specific, does each entry in the L1 DTLB cache a data line from cache memory (the line fetched), or the translation for the entire page containing the fetched data? And what does the L2 TLB cache? Is it page entries?Braze
TLBs cache page-table entries, from the page tables defined by the OS.Mithgarthr
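The comments above describe an L1/L2 DTLB arrangement in which an L1 miss that hits in the L2 installs the entry into the L1, and the entry evicted from the L1 is kept in the L2 (victim-cache behaviour). Below is a minimal C sketch of that fill-and-evict flow; the sizes, the direct-mapped placement and the fake page walk are assumptions for illustration, not the actual Intel organisation or policy.

```c
#include <stdint.h>
#include <stdbool.h>

#define L1_ENTRIES 64      /* made-up sizes, just for the sketch */
#define L2_ENTRIES 1024

typedef struct { uint64_t vpn, pfn; bool valid; } tlb_entry;

/* Direct-mapped for simplicity; real DTLBs are set-associative. */
static tlb_entry l1[L1_ENTRIES], l2[L2_ENTRIES];

/* Stand-in for the hardware page walker that reads the OS page tables. */
static uint64_t page_walk(uint64_t vpn) { return vpn ^ 0x5000; /* fake PFN */ }

/* Translate one virtual page number, modelling the L1/L2 DTLB flow. */
uint64_t translate(uint64_t vpn)
{
    tlb_entry *e1 = &l1[vpn % L1_ENTRIES];
    if (e1->valid && e1->vpn == vpn)          /* L1 DTLB hit */
        return e1->pfn;

    tlb_entry victim = *e1;                   /* entry the L1 fill will evict */

    tlb_entry *e2 = &l2[vpn % L2_ENTRIES];
    uint64_t pfn;
    if (e2->valid && e2->vpn == vpn) {        /* L1 miss, L2 hit */
        pfn = e2->pfn;
        e2->valid = false;                    /* exclusive: the entry moves up to L1 */
    } else {                                  /* miss in both levels: page walk */
        pfn = page_walk(vpn);
    }

    *e1 = (tlb_entry){ vpn, pfn, true };      /* install the translation in L1 */
    if (victim.valid)                         /* victim-cache behaviour: the entry */
        l2[victim.vpn % L2_ENTRIES] = victim; /* evicted from L1 is kept in the L2 */
    return pfn;
}
```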

A TLB can be thought of as a translation cache, and so it functions much like the on-chip caches: the tradeoffs of an exclusive vs. inclusive hierarchy, multi-level vs. single-level, and private vs. shared are the same as for caches. The same goes for associativity, page size, etc.
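To make that cache analogy concrete, here is a sketch of a small 4-way set-associative TLB with LRU replacement within each set. The set count, way count and 4 KB page size are assumptions chosen for the example, not any particular Intel design.

```c
#include <stdint.h>
#include <stdbool.h>

#define SETS 16            /* assumed sizes, for illustration only */
#define WAYS 4             /* 4-way set-associative */
#define PAGE_SHIFT 12      /* 4 KB pages */

typedef struct { uint64_t vpn, pfn; unsigned age; bool valid; } way_t;
static way_t tlb[SETS][WAYS];

/* Look up a virtual address. On a miss, install fill_pfn (which hardware
 * would obtain from a page walk) and evict the least-recently-used way. */
bool tlb_lookup(uint64_t vaddr, uint64_t fill_pfn, uint64_t *paddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    uint64_t off = vaddr & (((uint64_t)1 << PAGE_SHIFT) - 1);
    way_t *set = tlb[vpn % SETS];           /* low VPN bits select the set */

    for (int w = 0; w < WAYS; w++) {
        if (set[w].valid && set[w].vpn == vpn) {
            for (int o = 0; o < WAYS; o++)  /* age the other ways ... */
                set[o].age++;
            set[w].age = 0;                 /* ... and mark this one as MRU */
            *paddr = (set[w].pfn << PAGE_SHIFT) | off;
            return true;                    /* hit */
        }
    }

    /* Miss: pick an empty way if there is one, otherwise the oldest (LRU). */
    int victim = 0;
    for (int w = 0; w < WAYS; w++) {
        if (!set[w].valid) { victim = w; break; }
        if (set[w].age > set[victim].age) victim = w;
    }
    set[victim] = (way_t){ vpn, fill_pfn, 0, true };
    *paddr = (fill_pfn << PAGE_SHIFT) | off;
    return false;
}
```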

One TLB entry maps only one virtual page to one physical page, but the page size can vary: instead of 4 KB, a processor can use, e.g., 2 MB or 1 GB pages, which are called superpages or hugepages. A processor can also support multiple page sizes at once.
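To make the "one entry per page" point concrete, the short program below shows how a virtual address splits into a virtual page number and a page offset for 4 KB versus 2 MB pages; the address is made up for illustration. Because a single TLB entry covers one page, a 2 MB entry covers 512 times as much address space as a 4 KB one.

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t vaddr = 0x7f3a12345678ULL;   /* arbitrary example address */

    /* 4 KB page: 12 offset bits, so one TLB entry covers 4096 bytes. */
    printf("4KB page: VPN=%#llx offset=%#llx\n",
           (unsigned long long)(vaddr >> 12),
           (unsigned long long)(vaddr & 0xfffULL));

    /* 2 MB page: 21 offset bits, so one TLB entry covers 2 MiB,
     * i.e. 512 times the reach of a 4 KB entry. */
    printf("2MB page: VPN=%#llx offset=%#llx\n",
           (unsigned long long)(vaddr >> 21),
           (unsigned long long)(vaddr & 0x1fffffULL));
    return 0;
}
```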

Since you are asking for references, see my survey paper on TLBs, which addresses these questions and reviews 85+ papers. Specifically, Section 2 of the paper cites papers that discuss TLB designs in commercial processors.

Footle answered 3/11, 2016 at 5:12 Comment(0)
