Why do I get a triple fault when trying to handle an exception on a 286, but not on a modern CPU or in Bochs?

I'm trying to initialize protected mode with exception handling on an AMD 286 system. I've debugged the code below in Bochs, and it works fine there. It also works when run on a Pentium 4 machine. But on the 286 it simply triple faults when it gets to the int3 instruction. The observable behavior is: if I comment out int3, "OK" stays on the screen indefinitely, while with the code as is, the system reboots.

The code is meant to be assembled with FASM, and the resulting binary written to the boot sector of an HDD or FDD. I'm actually running it from a 1.4M floppy.

 org 0x7c00
 use16

 CODE_SELECTOR     = code_descr - gdt
 DATA_SELECTOR     = data_descr - gdt

    ; print "OK" on the screen to see that we've actually started
    push     0xb800
    pop      es
    xor      di,di
    mov      ax, 0x0700+'O'
    stosw
    mov      ax, 0x0700+'K'
    stosw
    ; clear the rest of the screen
    mov      cx, 80*25*2-2
    mov      ax, 0x0720
    rep stosw

    lgdt     [cs:gdtr]
    cli
    smsw     ax
    or       al, 1
    lmsw     ax
    jmp      CODE_SELECTOR:enterPM
enterPM:
    lidt     [idtr]
    mov      cx, DATA_SELECTOR
    mov      es, cx
    mov      ss, cx
    mov      ds, cx

    int3     ; cause an exception
    jmp      $

intHandler:
    jmp      $

gdt:
    dq       0
data_descr:
    dw       0xffff     ; limit
    dw       0x0000     ; base 15:0
    db       0x00       ; base 23:16
    db       10010011b  ; present, ring0, non-system, data, extending upwards, writable, accessed
    dw       0          ; reserved on 286
code_descr:
    dw       0xffff     ; limit
    dw       0x0000     ; base 15:0
    db       0x00       ; base 23:16
    db       10011011b  ; present, ring0, non-system, code, non-conforming, readable, accessed
    dw       0          ; reserved on 286

gdtr:
    dw       gdtr-gdt-1
 gdtBase:
    dd       gdt

idt:
 rept 14
 {
    dw       intHandler
    dw       CODE_SELECTOR
    db       0
    db       11100111b    ; present, ring3, system, 16-bit trap gate
    dw       0            ; reserved on 286
 }
idtr:
    dw       idtr-idt-1
 idtBase:
    dd       idt

finish:
    db       (0x7dfe-finish) dup(0)
    dw       0xaa55

I suppose I'm using some CPU feature that the 286 doesn't support, but what exactly and where?

Trondheim asked 17/8, 2019 at 7:20
  • In your protected mode code you have:

    lidt     [idtr]
    mov      cx, DATA_SELECTOR
    mov      es, cx
    mov      ss, cx
    mov      ds, cx
    

    This relies on DS having been 0x0000 before entering protected mode (so that the base address held in the DS descriptor cache is 0) at the point where lidt [idtr] executes. That instruction uses DS implicitly for its memory operand. Place the lidt instruction after you load the segment registers with the protected-mode selectors, not before.

  • Although it didn't manifest itself as a bug on your hardware, in real mode your code also relies on CS being 0x0000 for the instruction lgdt [cs:gdtr]. CS being 0x0000 isn't guaranteed, as it is quite possible for some BIOSes to use a nonzero CS to reach your bootloader. For example, 0x07c0:0x0000 also reaches physical address 0x07c00 ((0x07c0<<4)+0x0000=0x07c00). In the real mode code I'd recommend setting DS to zero and using lgdt [gdtr].

  • Once in protected mode, and before using the stack, you should set SP. Interrupts and exceptions require the stack pointer to be somewhere valid. Initializing it to 0x0000 makes the stack grow down from the top of the 64KiB segment. You shouldn't rely on SP happening to point somewhere harmless once in protected mode; otherwise the stack could end up on top of your bootloader code/data.

  • Before using any of the string instructions like STOS/SCAS/CMPS/LODS you should ensure that the Direction Flag is set as you expect it. Since you rely on forward movement you should clear the Direction Flag with CLD. You shouldn't assume that the Direction Flag is clear upon entry to your bootloader.
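
The four points above can be sketched as a revised start-up sequence. This is a minimal sketch and has not been tested on real 286 hardware. It assumes the gdt, gdtr, idt, idtr and intHandler definitions from the question are kept exactly as posted; the label start is hypothetical and added only for illustration.

```asm
; Sketch only: replaces everything from the top of the boot sector through
; the first "jmp $" in the question. Tables and intHandler stay as posted.
 org 0x7c00
 use16

 CODE_SELECTOR     = code_descr - gdt
 DATA_SELECTOR     = data_descr - gdt

start:                       ; hypothetical label, not in the original code
    cli                      ; keep interrupts off during the whole setup
    cld                      ; string instructions (STOSW) move forward
    xor      ax, ax
    mov      ds, ax          ; DS = 0, so [gdtr] matches the org 0x7c00 offsets

    ; (the "OK" printing code from the question goes here unchanged)

    lgdt     [gdtr]          ; DS-relative, no longer depends on the BIOS's CS
    smsw     ax
    or       al, 1           ; set the PE bit
    lmsw     ax
    jmp      CODE_SELECTOR:enterPM
enterPM:
    mov      cx, DATA_SELECTOR
    mov      ds, cx          ; load the selectors first ...
    mov      es, cx
    mov      ss, cx
    xor      sp, sp          ; stack grows down from the top of the 64KiB segment
    lidt     [idtr]          ; ... then LIDT, now that DS has a known base of 0

    int3                     ; cause an exception
    jmp      $
```

With DS zeroed in real mode, the DS-relative lgdt no longer depends on whatever CS the BIOS used, and moving lidt after the selector loads means its implicit DS reference goes through a descriptor whose base is known to be 0.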

Many of these issues are covered in my General Bootloader Tips in another Stack Overflow answer.

Guttate answered 17/8, 2019 at 8:2
Comments (2):
Actually yes, the code used to have jmp 0x0000:start at the beginning, but it got stripped when I was making an MCVE, and I failed to notice where I was relying on CS==0. That's why I have lgdt [cs:gdtr] there. - Trondheim
@Trondheim: In that case you are okay, since you have set CS to 0 with a FAR JMP; I could only go by the code shown. No problem. - Guttate
