Why am I able to perform floating point operations inside a Linux kernel module?

I'm running on an x86 CentOS 6.3 (kernel v2.6.32) system.

I compiled the following function into a bare-bones character driver module as an experiment to see how the Linux kernel reacts to floating point operations.

static unsigned floatstuff(void){
    float x = 3.14;
    x *= 2.5;
    return x;
}

...

printk(KERN_INFO "x: %u", x);

The code compiled (which I wasn't expecting), so I inserted the module and checked the log with dmesg. The log showed: x: 7.

This seems strange; I thought you couldn't perform floating point operations in the Linux kernel -- save some exceptions such as kernel_fpu_begin(). How did the module perform the floating point operation?

Is this because I'm on an x86 processor?

Nadbus answered 8/4, 2013 at 16:9 Comment(6)
Why wouldn't a kernel be able to do floating-point operations?Unpeg
Why are you so surprised? A kernel module is, after all, just another piece of code to be executed by the CPU. As long as it can execute the opcodes you throw at it, you're fine.Doenitz
Also, it's quite possible that the arithmetic is performed during the compilation and all that remains is a return 7;.Humes
@DanielFischer You are correct: the floating point operations were being optimized out. I now get the following error when I try to perform the operations: Unknown symbol __mulsf3. Is this the error I was expecting (that the floating point multiplication can't be performed)?Nadbus
This is already answered here: stackoverflow.com/questions/13886338/… The question rests on a not-quite-correct premise, and the answer there explains why. And you can do FP in the kernel.Nomi
kernel_fpu_begin() / end is necessary to not break user-space FPU state. Without it, you can do FP in the kernel, but you will corrupt the FPU state of the current process. Linux does lazy FPU context saving, because some processes don't use the FPU or SSE registers at all. (More and more processes do use SSE, though.)Allophone
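The constant-folding point raised in the comments can be checked in user space. The sketch below is only an illustration, not the asker's module: `floatstuff` is copied from the question (minus `static`), while `floatstuff_runtime` is a hypothetical variant that uses `volatile` so the compiler has to emit a real multiplication at run time. An optimizing compiler is free to reduce the first function to a constant, which would explain why the module loaded without any floating point instructions or helper calls.

```c
#include <stdio.h>

/* Same body as in the question. With optimization enabled the compiler
 * may evaluate this at compile time and simply return 7. */
unsigned floatstuff(void)
{
    float x = 3.14;
    x *= 2.5;
    return x;               /* 7.85 truncates to 7 */
}

/* Hypothetical variant: volatile forces the loads and the multiply to
 * happen at run time, so real FP code (or a soft-float helper call)
 * has to be generated. */
unsigned floatstuff_runtime(void)
{
    volatile float x = 3.14f;
    volatile float k = 2.5f;
    return (unsigned)(x * k);
}

/* Convenience wrapper for trying both from a test program. */
void print_both(void)
{
    printf("x: %u\n", floatstuff());
    printf("x: %u\n", floatstuff_runtime());
}
```

Both functions return 7; the difference is only in what machine code the compiler has to produce to get there.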
A
19

I thought you couldn't perform floating point operations in the Linux kernel

You can't safely: failure to use kernel_fpu_begin() / kernel_fpu_end() doesn't mean FPU instructions will fault (not on x86 at least).

Instead it will silently corrupt user-space's FPU state. This is bad; don't do that.

The compiler doesn't know what kernel_fpu_begin() means, so it can't check / warn about code that compiles to FPU instructions outside of FPU-begin regions.
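A minimal sketch of the bracketing this answer refers to, under some assumptions: `scale_by_2_5` is a hypothetical helper, the header is `<asm/fpu/api.h>` on recent x86 kernels (older ones such as 2.6.32 declared these functions in `<asm/i387.h>`), and the file is built with flags that allow FP code generation. The `#else` branch provides user-space stand-ins purely so the sketch can be compiled and tried outside the kernel; they do not save or restore anything.

```c
#ifdef __KERNEL__
#include <asm/fpu/api.h>    /* kernel_fpu_begin() / kernel_fpu_end() */
#else
/* User-space stand-ins (assumption: for experimentation only). */
static int fpu_depth;
static void kernel_fpu_begin(void) { fpu_depth++; }
static void kernel_fpu_end(void)   { fpu_depth--; }
#endif

/* Hypothetical helper: multiply v by 2.5 and truncate to an integer. */
unsigned scale_by_2_5(unsigned v)
{
    unsigned result;

    kernel_fpu_begin();     /* user-space FPU state is preserved from here */
    {
        /* volatile keeps the arithmetic from being folded away */
        volatile float x = (float)v;
        x *= 2.5f;
        result = (unsigned)x;
    }
    kernel_fpu_end();       /* FPU may be handed back to user space */

    return result;          /* only integer state leaves the region */
}
```

Since the compiler knows nothing about these calls, it may in principle move FP instructions across them; keeping the FP work in its own small function or file, and not sleeping inside the region, reduces the room for surprises.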

There may be a debug mode where the kernel does disable SSE, x87, and MMX instructions outside of kernel_fpu_begin / end regions, but that would be slower and isn't done by default.

It is possible, though: setting CR0::TS = 1 makes x87 instructions fault, so lazy FPU context switching is possible, and there are other bits for SSE and AVX.


There are many ways for buggy kernel code to cause serious problems. This is just one of many. In C, you pretty much always know when you're using floating point (unless a typo results in a 1. constant or something in a context that actually compiles).


Why is the FP architectural state different from integer?

Linux has to save/restore the integer state any time it enters/exits the kernel. All code needs to use integer registers (except for a giant straight-line block of FPU computation that ends with a jmp instead of a ret (ret modifies rsp).)

But kernel code avoids FPU generally, so Linux leaves the FPU state unsaved on entry from a system call, only saving before an actual context switch to a different user-space process or on kernel_fpu_begin. Otherwise, it's common to return to the same user-space process on the same core, so FPU state doesn't need to be restored because the kernel didn't touch it. (And this is where corruption would happen if a kernel task actually did modify the FPU state. I think this goes both ways: user-space could also corrupt your FPU state).

The integer state is fairly small, only 16x 64-bit registers + RFLAGS and segment regs. FPU state is more than twice as large even without AVX: 8x 80-bit x87 registers, and 16x XMM or YMM, or 32x ZMM registers (+ MXCSR, and x87 status + control words). Also the MPX bnd0-3 registers are lumped in with "FPU". At this point "FPU state" just means all non-integer registers. On my Skylake, dmesg says x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.

See Understanding FPU usage in linux kernel; modern Linux doesn't do lazy FPU save/restore by default on context switches (only for kernel/user transitions). (But that article explains what "lazy" means.)

Most processes use SSE for copying/zeroing small blocks of memory in compiler-generated code, and most library string/memcpy/memset implementations use SSE/SSE2. Also, hardware supported optimized save/restore is a thing now (xsaveopt / xrstor), so "eager" FPU save/restore may actually do less work if some/all FP registers haven't actually been used. e.g. save just the low 128b of YMM registers if they were zeroed with vzeroupper so the CPU knows they're clean. (And mark that fact with just one bit in the save format.)

With "eager" context switching, FPU instructions stay enabled all the time, so bad kernel code can corrupt the FPU state at any time.

Allophone answered 1/11, 2017 at 17:39 Comment(2)
The article you referenced is a little outdated. In particular, support for lazy mode was completely removed from the kernel. So the default eager mode is the only mode now.Langland
https://patchwork.kernel.org/patch/9362413/.Langland
D
8

Don't do that!

In kernel space, FPU mode is disabled for several reasons:

  • It allows Linux to run on architectures that do not have an FPU
  • It avoids saving and restoring the whole set of registers on every kernel/user-space transition (which may double the time of a context switch)
  • Basically all kernel functions use integers, even for representing decimal numbers -> you probably don't need floating point
  • In Linux, preemption is disabled while kernel space is running in FPU mode
  • Floating point numbers are evil and may produce unexpected and very bad behaviour
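The point about using integers for decimal numbers can be sketched with fixed-point arithmetic. This is an illustration under stated assumptions: the scale of 1000 (three decimal digits) and the names `FP_SCALE`, `fixed_mul` and `fixed_format` are hypothetical, and the code is written as plain user-space C. It reproduces the question's 3.14 * 2.5 without touching any FP register, and formats the result using only integer conversions.

```c
#include <stddef.h>
#include <stdio.h>

#define FP_SCALE 1000ULL    /* hypothetical scale: values are in thousandths */

/* Multiply two values given in thousandths; the result is in thousandths.
 * The product of two scaled values carries the scale twice, so divide once. */
unsigned long long fixed_mul(unsigned long long a_milli,
                             unsigned long long b_milli)
{
    return (a_milli * b_milli) / FP_SCALE;
}

/* Format a thousandths value as "integer.fraction" with integer
 * conversions only (no %f needed). Returns what snprintf returns. */
int fixed_format(char *buf, size_t len, unsigned long long v_milli)
{
    return snprintf(buf, len, "%llu.%03llu",
                    v_milli / FP_SCALE, v_milli % FP_SCALE);
}
```

For example, `fixed_mul(3140, 2500)` yields 7850, which `fixed_format` renders as `7.850`. Inside the kernel the same idea applies, though 64-bit division on 32-bit targets typically goes through the kernel's own division helpers rather than the `/` operator.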

If you really want to use FP numbers (and you should not), you must use the kernel_fpu_begin and kernel_fpu_end primitives to avoid breaking user-space registers, and you should take into account all of the possible problems (security included) in dealing with FP numbers.

Din answered 1/11, 2017 at 13:39 Comment(0)
F
2

Not sure where this perception is coming from. But the kernel executes on the same processor as the user mode code, and therefore has access to the same instruction set. If the processor can do floating point (directly or by a co-processor), the kernel can too.

Maybe you are thinking of cases where floating point arithmetic is emulated in software. But even so, it would be available in kernel (well, unless disabled somehow).

I am curious, where is this perception coming from? Maybe I am missing something.

Found this. Seems to be a good explanation.

Fenestra answered 8/4, 2013 at 16:57 Comment(4)
Perhaps my question is a bit misleading. I understand that the FPU can execute these floating point instructions (i.e. the machine code itself is system agnostic), but I'm confused about how to get my C code to compile without the GCC errors about undefined symbols such as __fixunssfsi when I'm compiling the kernel module. I suspect this is just GCC depending on helper routines in a library the kernel excludes, so how do I get around this so that the correct machine code is generated -- the processor supports floating point after all.Nadbus
Let me add that I am aware floating point registers are not saved; I don't particularly care about this trashing a userland program since I'm purely experimenting with code to get a better understanding of the behavior.Nadbus
I figured it out; I needed to pass this compiler flag to GCC: -mhard-float.Nadbus
The point is that the floating point state is not correctly saved & restored from inside the kernel, e.g. when scheduling tasks (it is saved only from inside application's point of view).Fechner
I
1

An OS kernel may simply turn the FPU off in kernel mode.

When a floating point operation is needed, the kernel turns the FPU on, and turns it off again afterwards.

But you cannot print the floating point value.

Incomprehensible answered 21/4, 2016 at 9:47 Comment(1)
Good answer: See this SO answer. It explains that the FPU may be disabled for performance reasons.Marhtamari
