bootloader - switching processor to protected mode

Asked 6/3, 2011 at 15:55 Answered 18/2, 2012 at 13:57

Solved assembly operating-system x86 bootloader protected-mode

I'm having difficulty understanding how a simple boot loader works. The boot loader I'm talking about is the one from MIT's course "Operating Systems Engineering".

First, let me show you a piece of assembly code the BIOS executes:

[f000:fec3]    0xffec3: lidtw  %cs:0x7908
[f000:fec9]    0xffec9: lgdtw  %cs:0x7948
[f000:fecf]    0xffecf: mov    %cr0,%eax
[f000:fed2]    0xffed2: or     $0x1,%eax
[f000:fed6]    0xffed6: mov    %eax,%cr0
[f000:fed9]    0xffed9: ljmpl  $0x8,$0xffee1

From the looks of it, this code sets up the interrupt descriptor table and the global descriptor table and then turns on protected mode.

Why do we go into protected mode in the BIOS? Shouldn't the bootloader run in real mode (btw - why does it need to run in real mode?)
I searched but couldn't find anywhere exactly how the ljmpl instruction works, or what the difference is between it, ljmp, and a regular jmp - I would appreciate it if someone would point me in the right direction.
Why do we perform the jump? What is the purpose of this instruction?

Moving on to the boot loader code -

# Switch from real to protected mode, using a bootstrap GDT
# and segment translation that makes virtual addresses
# identical to their physical addresses, so that the
# effective memory map does not change during the switch.
lgdt    gdtdesc
movl    %cr0, %eax
orl     $CR0_PE_ON, %eax
movl    %eax, %cr0

# Jump to next instruction, but in 32-bit code segment.
# Switches processor into 32-bit mode.
ljmp    $PROT_MODE_CSEG, $protcseg

The comment says that the processor is in real mode - but we just saw that the BIOS switches to protected mode... I'm confused - how can this be possible?
How do we switch to 32-bit mode? What causes the processor to magically go into 32-bit mode because of the ljmp instruction?

And another thing that I don't understand - when I track the execution of the bootloader with gdb I see the following instruction being executed (that's the ljmp instruction from the bootloader code):

ljmp   $0x8,$0x7c32

But when I looked in the .asm file I saw the following:

ljmp   $0xb866,$0x87c32

Totally lost here - How come the instruction written in the .asm file and the instruction executed are different? I have a hunch this has to do with the protected mode and how it translates the addresses but I don't really get it.

I would appreciate any help!

Netti answered 6/3, 2011 at 15:55 Comment(1)

Voting to close as too broad: too many questions in one. – Humectant 18/10, 2015 at 17:28

1. Some BIOS implementations enter protected mode before handing control to the bootloader, but most don't. It is possible that the BIOS switches to protected mode for a short period and switches back before jumping to the bootloader, which would let it use some of the benefits of protected mode (such as 32 bits being the default address size). The bootloader should run in real mode because most BIOS functions only work in real mode, so you need to be in real mode to use them.
2. ljmp specifies a code segment to switch to in addition to the address to jump to. The two are so similar that (at least in GAS) the assembler will turn a jmp with two operands into an ljmp for you.
3. ljmp is one of the few ways to change the cs register. This has to be done to finish activating protected mode, because cs needs to contain the selector for a code segment in the GDT. (In case you want to know, the other ways to change cs are a far call, a far return, and an interrupt return.)
4. See item 1. Either the BIOS switched back to real mode, or this bootloader will not work with this BIOS.
5. See item 3. The jump changes cs to select a 32-bit code segment, so the processor starts executing in 32-bit mode.
6. The .asm file shows the instruction decoded as if it were 32-bit code, while GDB decoded it as 16-bit code. The bytes at the address of the instruction would be EA 32 7C 08 00 66 B8. EA is the far jump opcode. In 32-bit code the offset is given by the next four bytes, for an address of 0x87C32; in 16-bit code only two bytes are used, for an address of 0x7C32. The two bytes after the offset specify the requested code segment, which comes out as 0xB866 in the 32-bit decoding and 0x0008 in the 16-bit decoding. The 66 B8 is really the start of the next instruction, which moves a 16-bit immediate value into the ax register, probably to set up the data segments for protected mode.
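
The two decodings described in the last item can be reproduced with a few lines of Python. This is only a sketch: the helper name decode_far_jmp is made up for illustration, the byte string is the one quoted above, and only the direct far jump opcode 0xEA is handled.

```python
import struct

# Bytes at the ljmp instruction, as quoted in the answer above.
CODE = bytes([0xEA, 0x32, 0x7C, 0x08, 0x00, 0x66, 0xB8])

def decode_far_jmp(code, operand_bits):
    """Decode a direct far jump (opcode 0xEA) for a given operand size.

    Hypothetical helper for illustration; returns (selector, offset, length).
    """
    if code[0] != 0xEA:
        raise ValueError("not a direct far jump (opcode 0xEA)")
    if operand_bits == 16:
        # 16-bit code: 2-byte offset followed by 2-byte selector
        offset, selector = struct.unpack_from("<HH", code, 1)
        length = 5
    elif operand_bits == 32:
        # 32-bit code: 4-byte offset followed by 2-byte selector
        offset, selector = struct.unpack_from("<IH", code, 1)
        length = 7
    else:
        raise ValueError("operand_bits must be 16 or 32")
    return selector, offset, length

for bits in (16, 32):
    sel, off, n = decode_far_jmp(CODE, bits)
    print(f"{bits}-bit decode: ljmp ${sel:#x},${off:#x} ({n} bytes)")
```

The 16-bit decoding consumes five bytes and leaves 66 B8 for the next instruction, while the 32-bit decoding swallows those two bytes as the selector, which is where the odd-looking $0xb866 comes from.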

Leftist answered 6/3, 2011 at 22:30 Comment(0)

Why do we go into protected mode in the BIOS? Shouldn't the bootloader run in real mode (btw - why does it need to run in real mode?)

Protected mode simply offers a lot more features than real mode: essentially the Intel CPU's protection ring privilege mechanism (http://en.wikipedia.org/wiki/Ring_(computer_security)), 32-bit mode execution, etc.

I searched but didn't find anywhere exactly how the ljmpl instruction works, and is the difference between it and ljmp and regular jmp - I would appreciate if someone would point in the right direction.

ljmpl and ljmp are the same in this context.

Why do we perform the jump? What is the purpose of this instruction?

This is required, as documented in the Intel manual and in the inline comments of the code shown below.

The real-to-protected transition is implemented in the GRUB stage2 bootloader here:

http://src.illumos.org/source/xref/illumos-gate/usr/src/grub/grub-0.97/stage2/asm.S#real_to_prot

/* load the GDT register */
DATA32  ADDR32  lgdt    gdtdesc

/* turn on protected mode */
movl    %cr0, %eax
orl     $CR0_PE_ON, %eax
movl    %eax, %cr0

/* jump to relocation, flush prefetch queue, and reload %cs */
DATA32  ljmp    $PROT_MODE_CSEG, $protcseg

As you can see, each part of the code has a function. The ljmp essentially flushes the prefetch queue (and, as the code comment notes, reloads %cs), which is required by the Intel manual, although I cannot remember exactly where.

Taperecord answered 18/2, 2012 at 13:57 Comment(1)

The ljmp loads cs with a selector, which typically selects a descriptor in the GDT which is DPL=0, 32-bit code segment. Before that ljmp executes, you are still in a 16 bit code segment, regardless of whether PE is set. PE being set influences the behaviour of loading segment registers. It is the cs selector loading a descriptor that really changes the mode. – Waterhouse 19/3, 2018 at 6:49
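
As a rough illustration of the comment above, the sketch below splits a selector such as the 0x8 used by the ljmp into its fields and reads the DPL and default-size (D) bit out of an 8-byte segment descriptor. The function names are hypothetical, and the two descriptor values are assumptions: a commonly used flat code segment and a 16-bit variant, not the actual GDT entries of this bootloader or BIOS.

```python
def decode_selector(sel):
    """Split a 16-bit segment selector into index, table, and RPL."""
    return {
        "index": sel >> 3,                     # entry number in the table
        "table": "LDT" if sel & 0x4 else "GDT",  # TI bit
        "rpl": sel & 0x3,                      # requested privilege level
    }

def describe_code_descriptor(desc):
    """Read DPL and the D bit from a 64-bit segment descriptor value."""
    dpl = (desc >> 45) & 0x3   # descriptor privilege level, bits 45-46
    d_bit = (desc >> 54) & 0x1  # D/B bit: 1 = 32-bit default, 0 = 16-bit
    return {"dpl": dpl, "default_size": 32 if d_bit else 16}

# Assumed example descriptors (not taken from the question's GDT):
FLAT_CODE32 = 0x00CF9A000000FFFF  # base 0, 4 GiB limit, D = 1
CODE16 = 0x00009A000000FFFF       # base 0, 64 KiB limit, D = 0

print(decode_selector(0x8))
print(describe_code_descriptor(FLAT_CODE32))
print(describe_code_descriptor(CODE16))
```

Under these assumptions, selector 0x8 points at GDT entry 1, and it is the D bit of whatever descriptor sits there that decides whether code is decoded as 32-bit or 16-bit once cs has been reloaded.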
