Transition from real to protected mode in the Linux kernel

I am currently studying the low-level organization of operating systems. To that end, I am trying to understand how the Linux kernel is loaded.

A thing that I cannot comprehend is the transition from 16-bit (real mode) to 32-bit (protected mode). It happens in this file.

The protected_mode_jump function performs various auxiliary calculations for the 32-bit code that is executed later, then sets the PE bit in the CR0 register:

    movl    %cr0, %edx
    orb $X86_CR0_PE, %dl    # Protected mode
    movl    %edx, %cr0

and after that performs long jump to 32-bit code:

    # Transition to 32-bit mode
    .byte   0x66, 0xea      # ljmpl opcode
2:  .long   in_pm32         # offset
    .word   __BOOT_CS       # segment
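For illustration, the eight bytes that those three directives emit can be sketched in C. This is only a model of the encoding, not kernel code: the helper name encode_ljmpl is hypothetical, and the offset and selector are whatever values the caller passes in. The 0x66 prefix selects a 32-bit operand while the CPU is still executing 16-bit code, and 0xEA is the direct far jump opcode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the bytes produced by
 *     .byte 0x66, 0xea ; .long offset ; .word selector
 * into out[] and returns the number of bytes written. */
size_t encode_ljmpl(uint8_t out[8], uint32_t offset, uint16_t selector)
{
    size_t n = 0;

    out[n++] = 0x66;                              /* operand-size prefix */
    out[n++] = 0xEA;                              /* far jump, direct pointer */
    out[n++] = (uint8_t)(offset & 0xFF);          /* 32-bit offset, little-endian */
    out[n++] = (uint8_t)((offset >> 8) & 0xFF);
    out[n++] = (uint8_t)((offset >> 16) & 0xFF);
    out[n++] = (uint8_t)((offset >> 24) & 0xFF);
    out[n++] = (uint8_t)(selector & 0xFF);        /* 16-bit selector, little-endian */
    out[n++] = (uint8_t)(selector >> 8);
    return n;
}
```

The 4-byte field starting at index 2 is the part that label 2 refers to in the assembly.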

As far as I understand in_pm32 is the address of the 32-bit function which is declared right below the protected_mode_jump:

    .code32
    .section ".text32","ax"
GLOBAL(in_pm32)
    # some code
    # ...
    # some code
ENDPROC(in_pm32)

The __BOOT_CS segment base is 0 (the GDT is set up beforehand here), so the offset should effectively be the absolute address of the in_pm32 function.

That's the issue. When the machine code is generated, the assembler/linker cannot know the absolute address of the in_pm32 function, because it does not know where the code will be loaded in memory in real mode (different bootloaders occupy different amounts of space, and the real mode kernel is loaded just after the bootloader).

Moreover, the linker script (setup.ld in the same folder) sets the origin of the code to 0, so it seems that the in_pm32 address will be an offset from the beginning of the real mode kernel. That should work just fine with 16-bit code because the CS register is set properly, but when the long jump happens the CPU is already in protected mode, so a relative offset should not work.
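The apparent mismatch can be sketched in C. This is a simplified model with hypothetical function names: it ignores real mode address wrap-around and protected mode limit checks, and only shows how the same offset resolves to different linear addresses in the two modes.

```c
#include <stdint.h>

/* Real mode: linear address = segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Protected mode: linear address = descriptor base + offset.
 * For a flat segment such as __BOOT_CS the base is 0. */
uint32_t protected_mode_linear(uint32_t base, uint32_t offset)
{
    return base + offset;
}
```

With made-up values CS = 0x1000 and an offset of 0x0400, real_mode_linear gives 0x10400, while protected_mode_linear with a base of 0 gives 0x400, which is not where the code was loaded.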

So my question: Why does the long jump in protected mode (.byte 0x66, 0xea) land at the proper code position if the offset (.long in_pm32) is relative?

Seems like I am missing something really important.

Thill answered 21/1, 2017 at 10:24 Comment(6)
I am about to head to bed as it is 3:30am. I only saw the part about the JMP. That jump is a 32-bit FAR JMP that sets the CS selector. Setting the Protected Mode bit in CR0 actually puts you into a quasi 16-bit protected mode. To get into 32-bit protected mode you need to do a FAR JMP that takes a selector (__BOOT_CS) and an offset and jumps to it. __BOOT_CS should be a selector that points at a 32-bit code segment descriptor (likely with a base of 0 and limit of 0xffffffff) in the GDT. When that FAR JMP is complete, the CPU will be in 32-bit protected mode. - Jessalin
In essence that FAR JMP is required to get from quasi 16-bit protected mode to 32-bit protected mode. - Jessalin
Indeed, the __BOOT_CS selector base is 0. I suppose that means the FAR JMP should take the absolute address of the desired function/label (because 0 + offset will be just offset). But the FAR JMP here is given a relative offset (.long in_pm32 is the address of the in_pm32 function from the beginning of the real mode kernel binary), and I don't understand why in_pm32 ends up being executed. The far jump should be off by 0x7C00 + bootloader_size bytes. - Thill
After reviewing that code, I realized there is a step you are overlooking. First of all, when assembled, 2: .long in_pm32 # offset will compute in_pm32 relative to the beginning of the section in the linker script (which you seem to understand). That value changes at runtime with these instructions: movw %cs, %bx; shll $4, %ebx; addl %ebx, 2f. These instructions take the current CS register and convert it into a linear address, and that is added to the value AT label 2f. 2f is in fact this: 2: .long in_pm32. At runtime that offset is adjusted before the JMP is made. - Jessalin
All that code was designed to be loaded into any real mode segment (relocatable). At runtime the CS register is used to find the linear address (physical in this case) where the code is running, and the address in that JMP is adjusted so that it is no longer relative to the beginning of the segment, but relative to the bottom of memory (which in turn makes it an absolute address). The calculation from real mode segment:offset to linear address is (segment<<4)+offset. In this case it is (CS<<4)+in_pm32, and that value is saved back to label 2 in memory, which is part of the FAR JMP itself. - Jessalin
You are right! I overlooked that step. Rather, I thought it did something for the next stages of the 32-bit code. I am not at all used to self-modifying code, so that trick is quite amazing to me. I would be glad to accept your answer if you post it as a separate answer. - Thill

It appears that your question really is about how the offset stored at the following line can possibly work since it is relative to the start of the segment, not necessarily the start of memory:

 2:  .long   in_pm32         # offset

It is true that in_pm32 is relative to the origin the linker script uses. In particular the linker script has:

. = 0;
.bstext     : { *(.bstext) }
.bsdata     : { *(.bsdata) }

. = 495;
.header     : { *(.header) }
.entrytext  : { *(.entrytext) }
.inittext   : { *(.inittext) }
.initdata   : { *(.initdata) }
__end_init = .;

.text       : { *(.text) }
.text32     : { *(.text32) } 

The virtual memory address (VMA) is set to zero (and subsequently 495), so one would think that the code in the .text32 section only works if it is loaded at a fixed location in low memory. This would be a correct observation had it not been for these instructions in protected_mode_jump:

    xorl    %ebx, %ebx
    movw    %cs, %bx
    shll    $4, %ebx
    addl    %ebx, 2f

[snip]

    # Transition to 32-bit mode
    .byte   0x66, 0xea      # ljmpl opcode
2:  .long   in_pm32         # offset
    .word   __BOOT_CS       # segment

There is a manually encoded FAR JMP at the end that is used to set the CS selector to a 32-bit code descriptor to finalize the transition to 32-bit protected mode. But the key thing to observe is in these lines specifically:

    xorl    %ebx, %ebx
    movw    %cs, %bx
    shll    $4, %ebx
    addl    %ebx, 2f

This takes the value in CS and shifts it left by 4 bits (multiply by 16) and then adds it to the value stored at label 2f. This is the way you take a real mode segment:offset pair and convert it into a linear address (which is the same as a physical address in this case). Label 2f is in fact the offset in_pm32 in this line:

2:  .long   in_pm32         # offset

When those instructions are complete, the long word value in_pm32 in the FAR JMP will have been adjusted (at run time) by adding the linear address of the current real mode code segment to the value in_pm32. This .long (DWORD) value will be replaced with (CS<<4)+in_pm32.
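The effect of that run-time fix-up can be modelled in C as follows. This is a sketch under stated assumptions, not the kernel's code: the far jump is represented as an 8-byte array (0x66, 0xEA, a 4-byte little-endian offset, a 2-byte selector), and the names patch_far_jump, load_le32 and store_le32 are hypothetical.

```c
#include <stdint.h>

enum { LJMP_OFFSET_POS = 2 };   /* position of label 2 inside the instruction */

/* Read a 32-bit little-endian value, independent of host byte order. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit little-endian value, independent of host byte order. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Model of: xorl %ebx,%ebx ; movw %cs,%bx ; shll $4,%ebx ; addl %ebx,2f */
void patch_far_jump(uint8_t insn[8], uint16_t cs)
{
    uint32_t seg_base = (uint32_t)cs << 4;               /* CS * 16 */
    uint32_t offset = load_le32(insn + LJMP_OFFSET_POS); /* link-time in_pm32 */

    store_le32(insn + LJMP_OFFSET_POS, offset + seg_base);
}
```

For example, with a made-up link-time offset of 0x0400 and CS = 0x1000, the operand bytes change from 00 04 00 00 to 00 04 01 00, that is, to the linear address 0x10400. The selector bytes are left untouched.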

This code was designed to be relocatable to any real mode segment. The final linear address is computed at run time before the FAR JMP. This is in effect self-modifying code.

Jessalin answered 21/1, 2017 at 18:38 Comment(0)
