What is paging?

Paging is explained here, on slide #6:

http://www.cs.ucc.ie/~grigoras/CS2506/Lecture_6.pdf

in my lecture notes, but I cannot for the life of me understand it. I know it's a way of translating virtual addresses to physical addresses. So the virtual addresses, which are on disk, are divided into chunks of 2^k. I am really confused after this. Can someone please explain it to me in simple terms?

Taxis answered 11/5, 2011 at 23:37 Comment(2)
I inherently distrust anything written in Comic Sans; my inclination would be to find better documentation in the first place. Wikipedia's Paging article is pretty good, and has links to more articles about the topic. It might be a little disorganized (why are there separate paging and virtual memory articles in the first place, rather than one article covering both? Oh well.) but it is written in full sentences with examples (in contrast to PowerPoint slide decks). - Albatross
I agree, the notes leave a lot to be desired... not least the Comic Sans. I think I understand the concept mostly now: the virtual memory is split into pages of 2^k, then an address is created. The lower n-k bits of the address are used to point to a page table entry, and the upper n-k bits are an offset to the page. So you concatenate the two of them together to get mapped on to the physical memory. You can also split the lower n-k bits to form a page table index, which can then be indexed into with the next split of the n-k bits... ahh... it all makes sense now! I think? - Taxis

Paging is, as you've noted, a type of virtual memory. To answer the question raised by @John Curtsy: it's covered separately from virtual memory in general because there are other types of virtual memory, although paging is now (by far) the most common.

Paged virtual memory is pretty simple: you split all of your physical memory up into blocks, mostly of equal size (though having a selection of two or three sizes is fairly common in practice). Making the blocks equal-sized makes them interchangeable.

Then you have addressing. You start by breaking each address up into two pieces. One is an offset within a page. You normally use the least significant bits for that part. If you use (say) 4K pages, you need 12 bits for the offset. With (say) a 32-bit address space, that leaves 20 more bits.
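To make the split concrete, here is a minimal Python sketch, assuming the 4K page size and 32-bit addresses used above. The names `split_address`, `OFFSET_BITS` and `OFFSET_MASK` are made up for illustration; real hardware does this with wiring, not code.

```python
PAGE_SIZE = 4096                           # 4K pages, as in the example above
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12 bits are needed for a 4K page
OFFSET_MASK = PAGE_SIZE - 1                # 0xFFF selects the low 12 bits


def split_address(addr):
    """Return (page_number, offset) for a 32-bit linear address."""
    return addr >> OFFSET_BITS, addr & OFFSET_MASK


page, offset = split_address(0x12345678)
print(hex(page), hex(offset))   # 0x12345 0x678
```

The upper 20 bits (`0x12345`) pick the page, and the lower 12 bits (`0x678`) pick the byte within it.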

From there, things are really a lot simpler than they initially seem. You basically build a small "descriptor" to describe each page of memory. This will have a linear address (the address used by the client application to address that memory), and a physical address for the memory, as well as a Present bit. There will (at least usually) be a few other things like permissions to indicate whether data in that page can be read, written, executed, etc.
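As a rough illustration, such a descriptor could be modeled like this in Python. The field names are my own invention for the sketch; a real CPU packs the same kind of information into the bits of a single page table entry, and the exact layout differs from one architecture to the next.

```python
from dataclasses import dataclass


@dataclass
class PageDescriptor:
    linear_page: int          # upper 20 bits of the linear address
    physical_page: int        # upper 20 bits of the physical address
    present: bool = True      # is the page currently in physical memory?
    readable: bool = True     # permission bits (hypothetical selection)
    writable: bool = False
    executable: bool = False


# A writable data page: linear page 0x12345 lives in physical page 0x00ABC.
desc = PageDescriptor(linear_page=0x12345, physical_page=0x00ABC, writable=True)
print(desc)
```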

Then, when client code uses an address, the CPU starts by splitting the page offset off from the rest of the address. It then takes the rest of the linear address, and looks through the page descriptors to find the physical address that goes with that linear address. Then, to address the physical memory, it uses the upper 20 bits of the physical address with the lower 12 bits of the linear address, and together they form the actual physical address that goes out on the processor pins and gets data from the memory chip.
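The lookup described above can be sketched in a few lines of Python. This assumes 4K pages and uses a plain dictionary as a stand-in for the page descriptors; the mapping values are invented for the example.

```python
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1

# Hypothetical mapping: linear page number -> physical page number.
page_table = {0x12345: 0x00ABC, 0x12346: 0x00123}


def translate(linear_addr):
    """Turn a linear address into a physical address."""
    page = linear_addr >> OFFSET_BITS      # upper 20 bits of the linear address
    offset = linear_addr & OFFSET_MASK     # lower 12 bits pass through unchanged
    physical_page = page_table[page]       # a KeyError stands in for "not mapped"
    return (physical_page << OFFSET_BITS) | offset


print(hex(translate(0x12345678)))   # 0xabc678
```

Note that the offset (`0x678`) is never translated; only the page number is swapped for a physical one.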

Now, we get to the part where we get "true" virtual memory. When programs are using more memory than is actually available, the OS takes the data in the pages that some of those descriptors describe, and writes it out to the disk drive. It then clears the "Present" bit for each of those pages. The physical page of memory is now free for some other purpose.

When the client program tries to refer to that memory, the CPU checks that the Present bit is set. If it's not, the CPU raises an exception (a page fault). When that happens, the OS's exception handler frees up a block of physical memory as above, reads the data for the current page back in from disk, and fills in the page descriptor with the address of the physical page where it's now located. When it's done all that, it returns from the exception, and the CPU restarts execution of the instruction that caused the exception to start with -- except now, the Present bit is set, so using the memory will work.
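Here is a toy Python simulation of that swap-out / fault / swap-in cycle. Everything in it (the `TinyVM` class, dictionaries for RAM and disk, evicting the first resident page) is an assumption made to keep the sketch short; a real kernel uses much smarter replacement policies, and the handler is OS code, not part of the CPU.

```python
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1


class PageFault(Exception):
    """Raised by the simulated CPU when a page's Present bit is clear."""


class TinyVM:
    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        self.table = {}   # page -> {"frame": int or None, "present": bool}
        self.ram = {}     # frame -> page contents
        self.disk = {}    # page -> contents saved while the page is swapped out
        self.faults = 0

    def _evict(self):
        # Pick the first resident page as the victim (real kernels are smarter).
        victim = next(p for p, d in self.table.items() if d["present"])
        desc = self.table[victim]
        self.disk[victim] = self.ram.pop(desc["frame"])   # write data to disk
        self.free_frames.append(desc["frame"])
        desc["frame"], desc["present"] = None, False      # clear the Present bit

    def _handle_fault(self, page):
        # OS side: find a frame, reload the page, fix up the descriptor.
        self.faults += 1
        if not self.free_frames:
            self._evict()
        frame = self.free_frames.pop()
        self.ram[frame] = self.disk.pop(page, bytearray(1 << OFFSET_BITS))
        self.table[page] = {"frame": frame, "present": True}

    def _cpu_access(self, addr):
        # CPU side: check the Present bit, then form the physical location.
        page, offset = addr >> OFFSET_BITS, addr & OFFSET_MASK
        desc = self.table.get(page)
        if desc is None or not desc["present"]:
            raise PageFault(page)
        return desc["frame"], offset

    def _access(self, addr):
        try:
            return self._cpu_access(addr)
        except PageFault as fault:
            self._handle_fault(fault.args[0])
            return self._cpu_access(addr)   # restart the faulting access

    def write(self, addr, value):
        frame, offset = self._access(addr)
        self.ram[frame][offset] = value

    def read(self, addr):
        frame, offset = self._access(addr)
        return self.ram[frame][offset]


vm = TinyVM(num_frames=2)
vm.write(0x1000, 11)   # page 1
vm.write(0x2000, 22)   # page 2
vm.write(0x3000, 33)   # page 3: no free frame, so page 1 is written to disk
print(vm.read(0x1000), vm.faults)   # 11 4
```

With only two physical frames, the program still uses three pages and gets its data back intact; the cost is the extra page faults.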

There is one more detail that you probably need to know: the page descriptors are normally arranged into page tables, and (the important part) you normally have a separate set of page tables for each process in the system (and another for the OS kernel itself). Having separate page tables for each process means that each process can use the same set of linear addresses, but those get mapped to different sets of physical addresses as needed. You can also map the same physical memory to more than one process by just creating two separate page descriptors (one for each process) that contain the same physical address. Most OSes use this so that, for example, if you have two or three copies of the same program running, it'll really only have one copy of the executable code for that program in memory -- but it'll have two or three sets of page descriptors that point to that same code so all of them can use it without making separate copies for each.
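A small Python sketch of per-process page tables, again assuming 4K pages. The process names, page numbers and frame numbers are all invented; the point is only that the same linear address can resolve to shared or private physical memory depending on what each process's table contains.

```python
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1

SHARED_CODE_FRAME = 0x00100   # hypothetical frame holding the program's code

# One page table per process: linear page number -> physical frame number.
page_tables = {
    "process_a": {0x00400: SHARED_CODE_FRAME, 0x00800: 0x00200},
    "process_b": {0x00400: SHARED_CODE_FRAME, 0x00800: 0x00300},
}


def translate(process, linear_addr):
    """Translate using the page table that belongs to the given process."""
    table = page_tables[process]
    frame = table[linear_addr >> OFFSET_BITS]
    return (frame << OFFSET_BITS) | (linear_addr & OFFSET_MASK)


# Same linear address for code -> same physical memory (shared).
print(hex(translate("process_a", 0x00400010)),
      hex(translate("process_b", 0x00400010)))   # 0x100010 0x100010

# Same linear address for data -> different physical memory (private).
print(hex(translate("process_a", 0x00800010)),
      hex(translate("process_b", 0x00800010)))   # 0x200010 0x300010
```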

Of course, I'm simplifying a lot -- quite a few complete (and often fairly large) books have been written about virtual memory. There's also a fair amount of variation among machines, with various embellishments added, minor changes in parameters made (e.g., whether a page is 4K or 8K), and so on. Nonetheless, this is at least a general idea of the core of what happens (and it's still at a high enough level to apply about equally to an ARM, x86, MIPS, SPARC, etc.)

Bibliography

[Note: I haven't read all of these in their entirety, so I can't really vouch for their necessarily being great books.]

General References

  • Virtual Memory: a Clear and Concise Reference; Gerardus Blokdyk; ISBN 0655330224
  • Architectural and Operating System Support for Virtual Memory; Abhishek Bhattacharjee and Daniel Lustig; ISBN 1627056025

OS References

  • What Makes It Page? The Windows 7 Virtual Memory Manager; Enrico Martignetti; ISBN 1479114294
  • The Design and Implementation of the FreeBSD Operating System; Marshall McKusick, George Neville-Neil and Robert Watson; ISBN 0321968972

CPU References

[Note: these links are likely to go stale. Sorry, but not much help for that.]

Martel answered 12/5, 2011 at 0:20 Comment(2)
Could you please provide the names of a few of those complete books on virtual memory, or otherwise list other references? - Bicorn
@YuhaoHanHarry: I've added a short bibliography. Of those, What Makes It Page? is probably the "lightest" introduction to the subject area, and even though the specifics it gives are about Windows 7, most of what it covers is more broadly applicable. The FreeBSD book covers a lot outside of virtual memory, but has good coverage of virtual memory along with the rest. - Martel

Simply put, it's a way of holding far more data than your address space would normally allow. I.e., if you have a 32-bit address and a 4-bit page number, you can address 2^32 × 2^4 = 2^36 locations (far more than a 32-bit address space).

Billingsley answered 11/5, 2011 at 23:52 Comment(4)
I understand this, and thanks very much for the response. Can you tell me if my summary (above) is a correct understanding of the situation? - Taxis
Basically yes. It might help (for simplicity's sake) to imagine a virtual address composed of two parts: an address and a page number, where the address is limited to a certain number of bits, and the address is lengthened by the number of bits in the page number. - Billingsley
Thank you very much! So, to summarise, the virtual address space is divided into pages of 2^k. The upper n-k bits form the page number, and the lower k bits are an offset into the page. The page number is used as an index to get the Page Table Entry, which is combined with the offset to map to the physical address (page frame). To make it more efficient, you can split the page number into two, and have the first 10 bits index into a page directory, and the second 10 point to the PTE in the page table that the directory entry selects. Sorry for restating this, it's just with all the pages it can be very confusing. Is this correct? - Taxis
@John, I think you've got it correct, but actual implementations are very complicated; the standard Linux virtual memory implementation uses a tree of depth four to hierarchically manage virtual memory -> physical memory lookups. I know that article goes into way too much depth for most people :) but it surely won't leave you wanting for details. - Albatross
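For anyone following the two-level scheme discussed in these comments, here is a minimal Python sketch assuming a 32-bit address split 10/10/12 (directory index, table index, offset). The nested dictionaries and the mapped values are invented for illustration; note that the frame number is concatenated with the offset rather than arithmetically added to it.

```python
def split_two_level(addr):
    """Split a 32-bit address into (directory index, table index, offset)."""
    directory_index = (addr >> 22) & 0x3FF   # top 10 bits
    table_index = (addr >> 12) & 0x3FF       # next 10 bits
    offset = addr & 0xFFF                    # low 12 bits
    return directory_index, table_index, offset


# Hypothetical sparse structure: only the page tables actually in use exist.
page_directory = {0x048: {0x345: 0x00ABC}}


def translate(addr):
    directory_index, table_index, offset = split_two_level(addr)
    page_table = page_directory[directory_index]   # first-level lookup
    frame = page_table[table_index]                # second-level lookup (the PTE)
    return (frame << 12) | offset                  # frame bits, then offset bits


print(hex(translate(0x12345678)))   # 0xabc678
```

The saving comes from sparseness: a process that touches only a few regions needs only a few second-level tables, instead of one flat table with 2^20 entries.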

Paging is a storage mechanism that allows the OS to retrieve processes from secondary storage into main memory in the form of pages. In the paging method, the main memory is divided into small fixed-size blocks of physical memory, which are called frames. The size of a frame should be kept the same as that of a page to have maximum utilization of the main memory and to avoid external fragmentation.

Danseuse answered 27/3, 2021 at 13:12 Comment(0)
