On x86, does enabling paging cause an "unconditional jump" (since EIP is now a virtual address)?

When paging is enabled by setting the paging bit in CR0 to 1, all pointers (including EIP) are now interpreted as virtual rather than physical addresses. Unless the region of memory which the CPU is currently executing from is "identity mapped" (virtual addresses are mapped to identical physical addresses), it seems that this would cause the CPU to do what amounts to an "unconditional jump" -- it should start executing code from a different (physical) address.

Does this actually happen? It seems it would be very tricky to get OS startup code to work reliably with this behavior. Or do all protected-mode OSs identity-map their own kernel code?

Omnipotence answered 22/6, 2015 at 5:37 Comment(2)

I don't really get why it would cause any jumps or how it would be a problem to get the startup code to work. Paging is in the memory controller, OS sets it up as it pleases and it can map the page it's currently running to have the same physical and virtual address. Or not, if it so chooses, in which case it will "jump" in physical memory but not virtual. – Consolute 22/6, 2015 at 5:54

With help from the link provided by @YannVernier, I have found that before turning on paging, the Linux kernel builds a page table which maps the kernel code into 2 different ranges of virtual memory addresses: one identical to the physical addresses where it is loaded, and one where it eventually wants to run from. Slightly later in the startup process, it switches to different page tables with only the 2nd range of mappings, not the 1st. This avoids the problem which I asked about. I wonder how NT does it? – Omnipotence 22/6, 2015 at 14:38

Yes and No

Yes, in the informal sense: once paging is on, the MMU translates every linear (virtual) address to a physical address, and the CPU fetches instructions by linear address. Suppose we switch on paging while executing an instruction at address 4000h, and the next instruction is at 4003h. If 4003h is translated to 8003h, execution effectively jumps from physical 4000h to physical 8003h. So we have to map the page we are currently executing from, or we won't know where the CPU will fetch its next instruction.

No, in the technical sense: this is not a jump, because the CPU does not see a jump instruction with all its side effects (like discarding OoO instructions). Furthermore, the CPU accesses memory only after the whole cache hierarchy has missed, so you could still be executing instructions from 4003h even if the page is mapped to a different address.

So, do or don't we need an identity map?

Yes, we need it, but not a full identity map. For example, I usually identity map only pages 7 and 8 (corresponding to the linear range 7000h-8FFFh).

Comparing enabling paging with enabling protected mode shows how different they are. Paging takes effect immediately, so you need to create all the page tables before you activate it, and you need at least one identity-mapped page to cover the code that is currently running without relying on the caches.
Enabling protected mode is easier: you can even create the GDT entries after you have entered protected mode, and you control when it is first used by loading a segment register (usually CS, with a far jump).

Actually, you don't strictly need an identity page if you know what you are doing (say, by duplicating your code or by using some hardware memory aliasing), but this is very context specific. In the general case it just makes things needlessly complicated.

Leroi answered 22/6, 2015 at 20:44 Comment(4)

Thanks! When you say "I usually only identity map pages 7 and 8", does that mean that you write x86 operating systems on a regular basis? – Omnipotence 23/6, 2015 at 15:15

@AlexD When I want to experiment with hardware (CPU new features, system components) I usually write a small boot program. More than developing whole OSes (which is a big task) I write small parts of OS to consolidate theory, to discover or just for fun :) – Leroi 23/6, 2015 at 15:20

That is awesome. I would like to see your GitHub projects, if you have any. – Omnipotence 23/6, 2015 at 15:21

@AlexD ahaha I started when GitHub didn't exist and I only had a 56 kbps PSTN connection (which limited my internet access to a few hours a week). I have a lot of files/projects on an HD, with few comments and strange names :) Back then I used to write a text (and only text) file with my annotations. I'd give you all of it, but I wrote everything in Italian – Leroi 23/6, 2015 at 15:25

This does not require a full identity map; for instance, Linux drops the startup code completely once it has finished running. The relevant code is in pmjump.S, where flat mode (32-bit identity map) is used and a jump is performed immediately after enabling protected mode. Notably, that jump is written in machine-code form because of the switch into 32-bit mode. From there, it proceeds via startup_32 to set up page tables. I'm not certain whether the unconditional jumps after status changes are strictly required (for instance, 32-bit real mode, often called "unreal mode", was an unplanned side effect of not doing things like this as expected).

Greaten answered 22/6, 2015 at 6:48 Comment(3)

This was interesting. But after studying pmjump.S, I can see it enables memory protection, but not paging. You can say it runs in "flat mode", in that all the segment registers have a base of 0. But this question is about what happens when paging is enabled. – Omnipotence 22/6, 2015 at 7:20

duartes.org/gustavo/blog/post/kernel-boot-process should have a better overview. I'm not intimately familiar with this section. – Greaten 22/6, 2015 at 7:47

On kernel 4.2, the identity map is set at: arch/x86/kernel/head_64.S – Chromatic 29/10, 2015 at 15:16

Empirical answer: comment out the paging identity map setup on this minimal paging example: https://github.com/cirosantilli/x86-bare-metal-examples/blob/24988411adf10cf9f6afd1566e35472eb8ae771a/paging.S#L79 and watch the OS break. So yes, it "jumps".

Chromatic answered 27/10, 2015 at 8:44 Comment(0)
