You need to set up several things before you attempt to enter protected mode:
Initialize a GDT in memory
You need a global descriptor table in memory. Entry 0 is the mandatory null descriptor; beyond that, it needs room for at least these descriptors:
- You need a ring0 32-bit code descriptor
- You need a ring0 32-bit data descriptor
- You do not need descriptors for the GDT and IDT themselves; the CPU locates those tables through the GDTR and IDTR
- You need a TSS descriptor
- You probably want an LDT descriptor (if every process has its LDT at the same linear address, then one LDT descriptor can serve every process, and paging will handle the switching).
In protected mode, a selector is an index into the GDT or LDT (its low three bits hold a table-indicator bit and a requested privilege level). Code and data descriptors tell the CPU the base address and length of the memory to use when a segment register is loaded with that selector.
The LGDT instruction sets the GDTR.
Initialize a TSS in memory
A TSS descriptor tells the CPU where you are going to store the TSS. Some of the functionality originally built into the TSS is only marginally useful, since a context switch is faster if you do it manually. It is essential for one thing, though: it stores the stack for the kernel to use when a process transitions from ring3 to ring0. The kernel cannot trust the caller at all. It cannot assume that the caller did not go crazy and corrupt the stack pointer. When transitioning from ring3 to ring0, the CPU loads the stack segment and stack pointer from the TSS, and pushes the caller's stack segment and offset onto the kernel stack, before pushing the flags and then the code segment and offset of the return address.
The LTR instruction loads the task register with a TSS selector.
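If you only use the TSS for the ring3 to ring0 stack switch, the setup can be quite small. The sketch below lays out the 32-bit TSS as a C struct and fills in just the kernel stack fields. The names tss32 and tss_make are hypothetical, and the stack address and selector you pass in are whatever your kernel chose.

```c
#include <stdint.h>

/* 32-bit TSS layout; segment fields use only their low 16 bits. */
struct tss32 {
    uint32_t prev_task_link;
    uint32_t esp0;          /* kernel stack pointer used on ring3 -> ring0 */
    uint32_t ss0;           /* kernel stack segment selector */
    uint32_t esp1, ss1, esp2, ss2;
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt_selector;
    uint16_t trap;          /* bit 0: debug trap on task switch */
    uint16_t iomap_base;    /* offset of the I/O permission bitmap */
};

_Static_assert(sizeof(struct tss32) == 104, "32-bit TSS must be 104 bytes");

/* Build a TSS that only provides the ring0 stack (hypothetical helper). */
static struct tss32 tss_make(uint32_t kernel_stack_top,
                             uint16_t kernel_data_selector)
{
    struct tss32 tss = {0};
    tss.esp0 = kernel_stack_top;
    tss.ss0 = kernel_data_selector;
    /* Pointing the bitmap offset past the end of the TSS means "no I/O bitmap". */
    tss.iomap_base = (uint16_t)sizeof(struct tss32);
    return tss;
}
```

The TSS descriptor in the GDT would then use the address of this structure as its base and sizeof(struct tss32) - 1 as its limit.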
Initialize an IDT in memory
The IDT lets the CPU look up what to do when various events occur. The essential purpose is exception handling. The CPU implements exceptions as interrupts. The operating system must set up handlers for all of the exceptions.
The LIDT instruction loads the IDTR.
Hardware interrupts are covered below.
If an exception occurs while processing an exception, a double fault exception occurs. If an exception occurs when processing a double fault (a triple fault), the CPU translates that into a shutdown message to the motherboard. Typical motherboards will reset the CPU when that happens; the BIOS will see in its bootstrap start-up code that the reset was unexpected, and it will do a reboot.
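Each IDT entry in 32-bit protected mode is an 8-byte gate that splits the handler address into two halves around a code selector. Here is a minimal sketch of building one; idt_gate, idt_make_gate, and the constant name are invented for this example, and the handler address is assumed to be an offset within the code segment named by the selector.

```c
#include <stdint.h>

/* One 8-byte IDT entry (32-bit interrupt or trap gate). */
struct idt_gate {
    uint16_t offset_low;   /* handler offset bits 15:0 */
    uint16_t selector;     /* code segment selector for the handler */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* present bit, DPL, gate type */
    uint16_t offset_high;  /* handler offset bits 31:16 */
};

_Static_assert(sizeof(struct idt_gate) == 8, "IDT gate must be 8 bytes");

/* Present, DPL 0, 32-bit interrupt gate (hypothetical constant name). */
enum { IDT_INTERRUPT_GATE_RING0 = 0x8E };

static struct idt_gate idt_make_gate(uint32_t handler,
                                     uint16_t code_selector,
                                     uint8_t type_attr)
{
    struct idt_gate g;
    g.offset_low  = (uint16_t)(handler & 0xFFFFu);
    g.selector    = code_selector;
    g.reserved    = 0;
    g.type_attr   = type_attr;
    g.offset_high = (uint16_t)(handler >> 16);
    return g;
}
```

An interrupt gate clears the interrupt flag on entry, which is usually what you want for exception and hardware interrupt handlers while you are still bringing things up.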
Initialize the interrupt controller
Hardware devices also provide hardware interrupts (as opposed to the software interrupts mentioned earlier). Hardware interrupts occur when devices need service.
If you intend to support old machines, then you need code to use and handle the 8259 interrupt controller.
You need code to handle the interrupt, save the context, acknowledge the interrupt, and somehow call a driver or queue a work item somewhere to service the hardware.
The interrupt controller is set up to make the CPU process an interrupt when a hardware device asserts its interrupt line (on ancient systems), or when an MSI message reaches the CPU (on modern systems that support MSI and are configured to use it).
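For the legacy 8259 pair, the usual first step is to move the hardware IRQs away from the vectors the CPU uses for exceptions. The sketch below only builds the list of port writes for that initialization so it can be inspected on any machine; on real hardware you would replay each entry with your own port-output routine. The names port_write, pic_init_plan, and pic_plan_remap are hypothetical, and the vector bases you pass (0x20 and 0x28 are a common choice) are an assumption, not something the text above prescribes.

```c
#include <stdint.h>

struct port_write { uint16_t port; uint8_t value; };
struct pic_init_plan { struct port_write writes[10]; };

/* Standard I/O ports of the master and slave 8259. */
enum { PIC1_CMD = 0x20, PIC1_DATA = 0x21, PIC2_CMD = 0xA0, PIC2_DATA = 0xA1 };

/* Build the ICW1..ICW4 sequence that remaps both controllers. */
static struct pic_init_plan pic_plan_remap(uint8_t master_base,
                                           uint8_t slave_base)
{
    struct pic_init_plan p = {{
        { PIC1_CMD,  0x11 },        /* ICW1: start init, ICW4 will follow */
        { PIC2_CMD,  0x11 },
        { PIC1_DATA, master_base }, /* ICW2: vector base for IRQ 0-7 */
        { PIC2_DATA, slave_base },  /* ICW2: vector base for IRQ 8-15 */
        { PIC1_DATA, 0x04 },        /* ICW3: slave attached to IRQ2 */
        { PIC2_DATA, 0x02 },        /* ICW3: slave cascade identity 2 */
        { PIC1_DATA, 0x01 },        /* ICW4: 8086 mode */
        { PIC2_DATA, 0x01 },
        { PIC1_DATA, 0xFF },        /* mask every IRQ until handlers exist */
        { PIC2_DATA, 0xFF },
    }};
    return p;
}
```

Once the handlers are in the IDT, you unmask the lines you care about and acknowledge each interrupt by writing the end-of-interrupt command (0x20) to the command port of the controller that raised it, and to the master as well when it came through the slave.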
If you want maximum capabilities and need to support multiple processors, then you must...
Initialize the APIC
The APIC is exactly what the name says: Advanced Programmable Interrupt Controller.
The APIC allows complex control over prioritization, masking, and interprocessor communication. It is too large and complex to really cover it properly here.
Initialize paging
Paging (in the classic 32-bit mode without PAE) is a two-level lookup. The top level is called the page directory. The second level is called a page table.
Every page table consists of 1024 32-bit page descriptors. The high 20 bits of each descriptor are the high 20 bits of the physical address of the page it maps. The lower bits contain flags for permissions and for letting the OS detect usage of memory, so it can be intelligently swapped, discarded, or kept.
Each page directory entry holds the base address of one 4KB page table for that range of memory. Each page table can map up to 4MB of memory.
Each page descriptor of the page table describes the permissions for, the access history of, and the base address of a 4KB range of memory.
So the operating system must allocate at least one 4KB page for the page directory, and at least one 4KB page for every 4MB of memory committed. Note that you may have sparse mappings, with large regions where no memory is mapped and where an access causes a page fault.
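To make the two-level lookup concrete, here is a small sketch of how a 32-bit linear address splits into a page directory index, a page table index, and an offset, and how an entry combines a physical address with flags. All function and constant names are made up for illustration, and it assumes plain 4KB pages without PAE.

```c
#include <stdint.h>

/* Low flag bits shared by page directory and page table entries. */
enum { PAGE_PRESENT = 0x1, PAGE_WRITABLE = 0x2, PAGE_USER = 0x4 };

/* Top 10 bits select one of 1024 page directory entries. */
static uint32_t pd_index(uint32_t linear) { return linear >> 22; }

/* Next 10 bits select one of 1024 entries in that page table. */
static uint32_t pt_index(uint32_t linear) { return (linear >> 12) & 0x3FFu; }

/* Low 12 bits are the byte offset inside the 4KB page. */
static uint32_t page_offset(uint32_t linear) { return linear & 0xFFFu; }

/* High 20 bits hold the physical page address, low 12 bits hold flags. */
static uint32_t make_entry(uint32_t phys, uint32_t flags)
{
    return (phys & 0xFFFFF000u) | (flags & 0xFFFu);
}

/* Page tables needed for a contiguous, 4MB-aligned range of this size. */
static uint32_t page_tables_needed(uint32_t bytes)
{
    return (uint32_t)(((uint64_t)bytes + 0x3FFFFFu) >> 22);
}
```

For example, the linear address 0xC0101234 uses page directory entry 0x300, page table entry 0x101, and offset 0x234, and mapping 16MB of contiguous memory takes four page tables plus the page directory.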
You enable paging with the PG bit of CR0. The PDBR control register (CR3) tells the CPU the physical address of the page directory.
Order
Initialize the GDT, IDT, and TSS in memory (and allocate kernel stack memory and, if needed, user stack memory).
Whack a GDT code and data entry at index 1 and 2 of the GDT memory (selectors 0x08 and 0x10), and set them to have zero base address, 4GB limit, ring0. Disable interrupts and load the GDTR with LGDT.
Set CR0 bit 0, the PE or protection-enable bit.
The big jump
Immediately do a far jump to 0x08:next-instruction, where 0x08 is the selector of the code descriptor at GDT index 1 and next-instruction is probably resolved in the linker to a label on the next line. (You can also push a far pointer on the stack and far jump indirect through it.) Because the new code segment has a zero base, the jump offset must be the linear address of the target: if your labels are assembled relative to the real-mode cs, you need to add (cs << 4) to the label's offset.
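The address arithmetic for the jump target is easy to get backwards, so here it is spelled out. This sketch assumes the flat, zero-base code descriptor described above, in which case the offset part of the far jump is simply the linear address of the target. The function name is hypothetical.

```c
#include <stdint.h>

/* Linear address of a real-mode cs:offset pair: segment * 16 + offset. */
static uint32_t real_mode_linear(uint16_t cs, uint16_t offset)
{
    return ((uint32_t)cs << 4) + offset;
}
```

So if your code runs with cs = 0x07C0 and the label sits at offset 0x0010 in that segment, the protected-mode jump target is 0x7C10. If your image is assembled with cs = 0 and an origin equal to its load address, the labels are already linear addresses and no adjustment is needed.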
You must reload all of the segment registers after entering protected mode, because each load makes the CPU check permissions and cache the descriptor's base, limit, and attributes internally, and the values left over from real mode are not valid selectors.
Tell the assembler!
Note that after that branch target, you suddenly need to start assembling instructions differently. Before the far jump, you were in real mode, but as soon as cs is loaded with a 32-bit code descriptor, a whole lot of things change in the CPU, including the way it decodes instructions. It assumes 32-bit operands and addresses, and the operand-size and address-size prefixes tell it to use 16-bit.
In real mode, it was the other way around: the address-size or operand-size prefix told it to use 32-bit. Therefore you need an assembler directive (for example BITS 32 in NASM or .code32 in GNU as) to tell the assembler to reverse the usage of those prefixes and generate code for 32-bit mode.
Obviously you need to set up the stack. You have had to deal with linear addresses several times already, when setting up the descriptor addresses for the LDT, IDT, etc.
Now you can set up the page directory and page tables, and load the PDBR.
Page mappings can be flagged as global so that they are not flushed from the TLB when you switch page directories (this needs the PGE bit in CR4). Typically kernel mode has the same mapping for every process.
Typically each process gets its own page directory, which shares the kernel page tables. User-mode allocations go into the process's own private page tables for the user memory range.
Although paging is not required, it enables a lot of really cool capabilities and protections. You probably want it.
After you load the PDBR and enable paging, you are, by every definition, completely in protected mode, and you have written a chunk of the core code of an operating system for the x86 architecture.