How does a processor handle division by zero?
P

3

17

I'm curious what a processor does in general (or, let's say, an Intel CPU running Linux) when it executes an instruction that divides by zero. Also, how is the error relayed to the application so that it can log the error or notify the developer?

Thank you!

Peel answered 26/5, 2014 at 22:20 Comment(2)
Which processor? Which OS? - Emileeemili
@OliCharlesworth Ah, just edited the question. Basically, I just want to get a rough idea of how the whole process works in general. We can assume it is Linux on an Intel CPU. - Peel
P
22

I'll answer in general terms rather than going into the gory details for Linux on x86_64, which are likely to obscure the concepts.

CPUs tend to raise an exception (a kind of interrupt) on things like division by zero or dereferencing a NULL pointer. These exceptions are trapped in the same way as hardware interrupts: execution of the current program halts and control returns to the OS, which then handles the event. While the actions are very environment-dependent, commonly the program is terminated, all its resources (memory, open files) are freed, and optionally a core dump or stack trace is generated for debugging on a developer's system.
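
To make the "program is terminated" case concrete, here is a minimal Python sketch, assuming a POSIX system such as Linux. Python never executes a trapping divide instruction itself, so the child process below sends itself SIGFPE (the signal Linux uses for arithmetic faults) as a stand-in for what the kernel does after the CPU exception. The name run_faulting_child is hypothetical.

```python
import os
import signal
import subprocess
import sys

# Child program: send SIGFPE to ourselves. This stands in for the kernel
# delivering SIGFPE after a real divide fault (an assumption of this sketch).
CHILD_CODE = "import os, signal; os.kill(os.getpid(), signal.SIGFPE)"


def run_faulting_child():
    """Run the child and return the signal that killed it, or None."""
    result = subprocess.run([sys.executable, "-c", CHILD_CODE])
    # On POSIX, a negative return code -N means "terminated by signal N".
    if result.returncode < 0:
        return signal.Signals(-result.returncode)
    return None


if __name__ == "__main__":
    killed_by = run_faulting_child()
    print("child terminated by:", killed_by.name if killed_by else "nothing")
```

The parent learns about the fault only through the child's exit status, which is roughly how a shell or supervisor process would find out that a program died this way.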

A runtime might be able to configure things so that an exception handler is called. Perhaps a scripting language wants to catch integer division by 0 or integer overflow, and then throw a programming-language exception or generate diagnostics to help the programmer understand where and why it happened. Raising a signal, which may be caught and handled by the application or lead to termination, is another traditional possibility; on Linux, a division fault is delivered to the process as SIGFPE.
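
As one illustration of a runtime turning the condition into a language-level exception: CPython checks the divisor in software and raises ZeroDivisionError, which the application can catch and log. This is only a sketch; the function name safe_divide and the logger name are hypothetical.

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("divdemo")  # hypothetical logger name


def safe_divide(numerator, denominator):
    """Integer-divide, logging and returning None on division by zero."""
    try:
        return numerator // denominator
    except ZeroDivisionError as exc:
        # The interpreter raised this itself; no CPU fault was involved.
        log.error("cannot divide %r by %r: %s", numerator, denominator, exc)
        return None


if __name__ == "__main__":
    print(safe_divide(7, 2))
    print(safe_divide(1, 0))
```

Returning None is just one possible policy; a real application might re-raise, return an error value, or abort after logging.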

On some RISC CPUs, software traps in the OS would run to fix up misaligned data accesses, so reading memory would work, but with a performance penalty. In the past, traps were sometimes used to emulate instructions that were defined but not implemented in hardware by a particular CPU model. I've also seen hardware memory errors logged as the OS initiates an ECC memory recovery operation, though that is handled differently on x86.

System calls actually use the same mechanism to jump from a user-space application into the OS kernel, which then handles the event; hence the common term trap.

Petal answered 26/5, 2014 at 23:4 Comment(3)
Right, I was trying to differentiate them from interrupts generated by peripherals; quick explanations can be so tricky. - Petal
@Petal "These interrupts are trapped, like when hardware interrupts, halting execution of current program and return control to the OS". So all of this is handled by the CPU, i.e. it is implemented in hardware or firmware (since at this point the OS has no control), right? - Peel
Yes, it's the hardware. The OS scheduler can regain control by scheduling a timer interrupt even if no other interrupts occur (from peripherals or system calls), so really it just lends the CPU to the user program. - Petal
A
6

Let me try to answer this a little differently. Every processor I have worked with defines an interrupt vector structure. On Intel chips this structure is called the interrupt descriptor table (IDT). Conceptually, the interrupt vector table is an array of pointers to functions. Each entry in the array corresponds to a specific event: an interrupt or an exception (fault or trap).

The operating system sets up the functions (interrupt handler, exception handler) for each event. When a divide by zero occurs, it triggers an exception. The CPU responds by invoking the exception handler in the interrupt vector corresponding to a divide by zero. On the Pentium, this is the very first entry in the table.
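
The "array of pointers to functions" idea can be sketched as a toy model in Python. Only the vector number (0 for the divide error) and the table size (256 entries) come from x86; the handler names and the cpu_divide function are hypothetical, and real IDT entries are gate descriptors rather than plain function pointers.

```python
# Toy model of an interrupt vector table: a list indexed by vector number.
# This is a simplification for illustration, not how a real kernel is written.
DIVIDE_ERROR = 0  # x86 vector number for the divide error exception


def unhandled(vector, context):
    return f"vector {vector}: no handler installed"


def on_divide_error(vector, context):
    # A real OS would typically signal or terminate the faulting process.
    return f"vector {vector}: divide error in {context}"


# "OS boot": fill every slot with a default, then install specific handlers.
vector_table = [unhandled] * 256
vector_table[DIVIDE_ERROR] = on_divide_error


def cpu_divide(dividend, divisor, context="user program"):
    """Pretend CPU divide: on a zero divisor, dispatch through the table."""
    if divisor == 0:
        handler = vector_table[DIVIDE_ERROR]
        return handler(DIVIDE_ERROR, context)
    return dividend // divisor


if __name__ == "__main__":
    print(cpu_divide(6, 3))
    print(cpu_divide(1, 0))
```

The point of the table is that the hardware only needs the vector number; which code runs is decided entirely by whoever filled in the table, normally the OS at boot.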

Astound answered 27/5, 2014 at 0:49 Comment(0)
C
0

When a DIV0 error happens "in the CPU", the application itself will not be able to log anything (unless some other process is managing it).

A DIV0 almost never reaches the CPU; it is usually caught earlier in software, for example by the shell:

$  echo $(( 1/0 ))
bash: 1/0 : division by 0 (error token is "0 ")

This is not proof (it could just be a log message), but the message starts with bash: and even preserves the space in the offending token "0 ", which suggests that bash itself detected the error.

"On the Pentium, this is the very first entry in the table."

User33 finishes off his tight description with this. I landed here because of scheduler questions, where I used the example of a DIV0 in the CPU to illustrate a (sudden) "block" (better: halt) of that process. My point is: CPUs follow tradition and logic by refusing to divide by zero, and that rules out any cheap hacks. It is by definition a full stop. It shouldn't happen, but if it does, no further instruction of that process is executed. Instead, that very special first entry is used. It is special because it is the simplest reason for a CPU to raise a "can't continue" exception.

Without protected mode, a div0 in the CPU is a system crash. But thanks to these special stowed-away tables and functions, in protected mode the kernel/scheduler can regain control of the CPU and continue without the halted process. That process is dead and will be removed; another one, just like the first, has to be started ;)

Czarra answered 19/10, 2019 at 19:4 Comment(0)
