Architecture and Design
From the point of view of protection, the x86 architecture is based on hierarchical rings, according to which all execution space delivered by processor is divided into four hierarchical protection domains, each of which have its own level of privileges assigned. This design assumes that most of the time code will be executed in the least privileged domain and sometimes services from the more privileged security domain will be requested and this services will preempt the less privileged activities onto the stack and then restore it in such a way that whole preemption will be invisible for the less privileged code.
The design of hierarchical protection domains state that the control can't be passed arbitrary between different security domains.
A gate is a feature of x86 architecture for control transfer from less privileged code segments to more privileged ones, but not vice versa. Furthermore, the point in the less privileged segment from where control will be passed can be arbitrary, but point in the more privileged segment to where control will be passed is strictly specified. Backward control passing to the less privileged segment is allowed only by means of IRET
instruction. In this regards Intel Software developer manual claims:
Code modules in lower privilege segments can only access modules operating at higher privilege segments by means of a tightly controlled and protected interface called a gate. Attempts to access higher privilege segments without going through a protection gate and without having sufficient access rights causes a general-protection exception (#GP
) to be generated.
In other words, a gate is a privileged domain entry point with required access rights and a target address. In that way all gates are similar and used for almost same purposes, and all gate descriptors contain DPL field, that used by processor to control access rights. But note, the processor checks the DPL of the gate only if the source of the call was a software CALL
, JMP
, or INT
instruction, and bypasses this check when the source of the call is a hardware.
Types of Gates
Despite the fact that all gates are similar, they have some differences because originally Intel engineers thought that different gates would be used for different purposes.
Task Gate
A Task Gate can be stored only in IDT and GDT and called by an INT
instruction. It is very special type of gate that differs significantly from the others.
Initially, Intel engineers thought that they would revolutionize multitasking by providing CPU based feature for task switching. They introduced TSS (Task State Segment) that hold registers state of the task and can be used for hardware task switching. There are two ways for triggering hardware task switching: by using TSS itself and by using Task Gate. To make hardware task switch you can use the CALL
or JMP
instructions. If I correctly understand, the main reason for the task gate introduction was to have the ability to trigger hardware task switches in reply to the interrupt arrival, because a hardware task switch can't be triggered by a JMP
to the TSS selector.
In reality, no one uses it nor the hardware context switching. In practice, this feature isn't optimal from the performance point of view and it isn't convenient to use. For example, taking into the account that TSS can be stored only in GDT and length of GDT can't be more then 8192, we can't have more then 8k tasks from the hardware point of view.
Trap Gate
A Trap Gate can be stored only in IDT and called by an INT
instruction. It can be considered as a basic type of gate. It just passes control to the particular address specified in the trap gate descriptor in the more privileged segment and nothing more.
Trap gates actively used for different purposes, that may include:
- system call implementation (for example Linux use
INT 0x80
and Windows use INT 0x2E
for this purposes)
- exception handling implementation (we haven't any reason to disable interrupts in the case of exception).
- interrupt handling implementation on machines with APIC (we can control kernel stack better).
Interrupt Gate
An Interrupt Gate can be stored only in IDT and called by an INT
instruction. It is the same as trap gate, but in addition interrupt gate call additionally prohibits future interrupt acceptance by automatic clearing of the IF flag in the EFLAGS register.
Interrupt gates used actively for interrupt handling implementation, especially on PIC based machines. The reason is a requirement to control stack depth. PIC doesn't have the interrupt sources priorities feature. Due to this by default, PIC disable only the interrupt that already on handling in processor. But another interrupts still can arrive in the middle and preempt the interrupt handling. So there is can be 15 interrupt handlers on the kernel stack in the same moment. As result kernel developers forced either to increase kernel stack size significantly that leads to the memory penalty or be ready to faced with sporadic kernel stack overflow. Interrupt Gate can provide guarantee that only one handler can be on the kernel stack in the same time.
Call Gate
A Call Gate can be stored in GDL and LDT and called by CALL
and JMP
instructions. Similar to trap gate, but in addition can pass number of parameters from the user mode task stack to kernel mode task stack. Number of parameters passed is specified in the call gate descriptor.
Call gates were never popular. There are few reasons for that:
- They can be replaced by trap gates (Occam's Razor).
- They not portable a lot. Other processors have no such features which means that support of call gates for system calls is a burden when porting the operating system as those calls must be rewriten.
- They're not too flexible, due to the fact that amount of the parameters that can be passed between stacks is limited.
- They're not optimal from the performance point of view.
At the end of 1990s Intel and AMD introduced additional instructions for system calls: SYSENTER
/SYSEXIT
(Intel) and SYSCALL
/SYSRET
(AMD). In contrast to the call gates, the new instructions provide a performance benefits and have found adoption.
Summary
I disagree with Michael Foukarakis. Sorry, but there are no any differences between interrupts and traps except affecting IF
flag.
In theory, each type of gate can serve as interface pointing to a segment with any level of privileges. On practice, in modern operating system in use only interrupt and trap gates, that is used in IDT for system calls, interrupts and exception handling and due to this all they serve as kernel entry point.
Any type of gate (including interrupt, trap and task) can be invoked in software by using an INT
instruction. The only feature that can prohibit user mode code access to particular gate is DPL. For example, when operating system builds IDT, regardless of the types of the particular gates, kernel setup DPL of the gates that will be used for hardware event handling to 0 and according to this access to this gates will be allowed only from the kernel space (that runs at most privileged domain), but when it setup gate for the system call, it set DPL to 3 to allow access to that gate from any code. In result, user mode task is able to make system call using gate with DPL = 3, but will catch General Protection Fault on attempt to call keyboard interrupt handler, for example.
Any type of gate in IDT can be invoked by hardware. People use interrupt gates for this hardware events handling only in cases when they want to achive some synchronization. For example to be sure that kernel stack overflow is impossible. For example, I have successfull experience of trap gates usage for hardware interrupt handling on the APIC based system.
Similar way, gate of any type in IDT can be called in software. The reason for the using trap gates for system call and exceptions is simple. No any reasons to disable interrupts. Interrupt disabling is a bad thing, because it increases interrupt handling latencies and increase probability of interrupt lost. Due to this no one won't disable them without any serious reason on the hands.
Interrupt handler usually written in strict reentrant style. In this way interrupt handlers usually share no data and can transparently preempt each other. Even when we need to mutually exclude concurrent access to the data in interrupt handler we can protect only the access to the shared data by using cli and sti instructions. No any reason to consider an whole interrupt handler as a critical section. No any reason to use interrupt gates, except desire to prevent possible kernel stack overflow on the PIC based systems.
The trap gates is a default solution for kernel interfacing. Interrupt gate can be used instead of trap gate if there is some serious reason for that.