Handling an undefined instruction in the kernel

So I'm playing around with reading system registers in the kernel and I've recently run into a bit of a roadblock.

In ARM64, certain system registers (e.g. OSECCR_EL1) are not always implemented. If they are implemented, then trying an mrs instruction is fine - nothing bad happens. But if they AREN'T implemented, then the kernel throws an Oops due to an undefined instruction.

This isn't unreasonable. However, since I'm inside a kernel module when I run this mrs instruction, I don't see an easy way to recover from the oops, or even to recognize that a particular system register read is going to fail in the first place.

Is there any easy way to identify beforehand whether a system register is valid, or at the very least, handle a kernel oops in a way that doesn't immediately stop my kernel module function execution?

Lisp answered 15/4, 2020 at 21:31 Comment(4)
Could you add the dmesg messages generated by the kernel oops that happens? That'd be helpful. - Theressa
Have you checked this already? - Brouwer
It looks like kernel exception handling tables may be the best way to do this, but that seems...difficult. - Lisp
I have never tried that myself, but isn't it possible to trap SIGILL signal in your module and do something if the exception triggers? - Brouwer

Since you say you are just "playing around", I'm going to suggest a kinda dirty, but pretty straightforward solution.

The Linux kernel for ARM has its own mechanism for handling undefined instructions so that it can emulate them. This is done through simple "undefined instruction hooks", defined in arch/arm64/include/asm/traps.h:

struct undef_hook {
    struct list_head node;
    u32 instr_mask;
    u32 instr_val;
    u64 pstate_mask;
    u64 pstate_val;
    int (*fn)(struct pt_regs *regs, u32 instr);
};

These hooks are added through the (unfortunately not exported) function register_undef_hook(), and removed through unregister_undef_hook().
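
For reference, the matching rule that those fields imply can be modelled in plain userspace C. This is only a sketch of my understanding of how the v5.6 arm64 trap code picks a hook: a masked compare on the instruction word and on pstate, with a non-zero "unhandled" result when nothing matches. The names hook_model, hook_matches, dispatch and skip_handler are made up for the illustration and are not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct undef_hook (list node omitted). Hypothetical. */
struct hook_model {
    uint32_t instr_mask;
    uint32_t instr_val;
    uint64_t pstate_mask;
    uint64_t pstate_val;
    int (*fn)(uint32_t instr);
};

/* Example handler: pretend the instruction was skipped, report "handled". */
static int skip_handler(uint32_t instr)
{
    (void)instr;
    return 0;
}

/* Catch-all hook, like the one in the example module: all masks are zero. */
static const struct hook_model catch_all = { 0, 0, 0, 0, skip_handler };

/* A hook applies only when both masked comparisons succeed. */
static int hook_matches(const struct hook_model *h, uint32_t instr,
                        uint64_t pstate)
{
    return (instr & h->instr_mask) == h->instr_val &&
           (pstate & h->pstate_mask) == h->pstate_val;
}

/* Returns the handler's result, or 1 ("not handled") if no hook matches. */
static int dispatch(const struct hook_model *hooks, size_t n,
                    uint32_t instr, uint64_t pstate)
{
    int (*fn)(uint32_t) = NULL;

    for (size_t i = 0; i < n; i++)
        if (hook_matches(&hooks[i], instr, pstate))
            fn = hooks[i].fn;

    return fn ? fn(instr) : 1;
}
```

With all four fields set to zero, every instruction and every pstate matches, which is why a catch-all hook sees any undefined instruction at all. A non-zero return from the handler means "not mine", and as far as I can tell the kernel then falls back to its default behaviour (an oops when the fault happened in kernel mode).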

To solve your problem, you have two options:

  1. Export both functions by modifying arch/arm64/kernel/traps.c to add the following two lines of code:

    // after register_undef_hook
    EXPORT_SYMBOL(register_undef_hook);
    
    // after unregister_undef_hook
    EXPORT_SYMBOL(unregister_undef_hook);
    

    Now recompile the kernel, and the functions will be exported and available to modules. You now have an easy way of handling undefined instructions however you want.

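  2. Use kallsyms_lookup_name() to look up the symbols at runtime directly from your module, without the need to recompile the kernel. A bit messier, but probably easier and certainly faster overall. Note that kallsyms_lookup_name() itself is no longer exported to modules as of v5.7, so this only works as-is on older kernels.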

For option #1, here's an example module that does exactly what you want:

// SPDX-License-Identifier: GPL-3.0
#include <linux/init.h>   // module_{init,exit}()
#include <linux/module.h> // THIS_MODULE, MODULE_VERSION, ...
#include <asm/traps.h>    // struct undef_hook, register_undef_hook()
#include <asm/ptrace.h>   // struct pt_regs

#ifdef pr_fmt
#undef pr_fmt
#endif
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

static void whoops(void)
{
    // Execute a known invalid instruction.
    asm volatile (".word 0xf7f0a000");
}

static int undef_instr_handler(struct pt_regs *regs, u32 instr)
{
    pr_info("*gotcha*\n");

    // Just skip over to the next instruction.
    regs->pc += 4;

    return 0; // All fine!
}

static struct undef_hook uh = {
    .instr_mask  = 0x0, // any instruction
    .instr_val   = 0x0, // any instruction
    .pstate_mask = 0x0, // any pstate
    .pstate_val  = 0x0, // any pstate
    .fn          = undef_instr_handler
};

static int __init modinit(void)
{
    register_undef_hook(&uh);

    pr_info("Jumping off a cliff...\n");
    whoops();
    pr_info("Woah, I survived!\n");

    return 0;
}

static void __exit modexit(void)
{
    unregister_undef_hook(&uh);
}

module_init(modinit);
module_exit(modexit);
MODULE_VERSION("0.1");
MODULE_DESCRIPTION("Test undefined instruction handling on arm64.");
MODULE_AUTHOR("Marco Bonelli");
MODULE_LICENSE("GPL");

For option #2, you can just modify the above code by adding the following:

#include <linux/kallsyms.h> // kallsyms_lookup_name()

// Define two global pointers.
static void (*register_undef_hook_ptr)(struct undef_hook *);
static void (*unregister_undef_hook_ptr)(struct undef_hook *);

static int __init modinit(void)
{
    // Look up the wanted symbols.
    register_undef_hook_ptr   = (void *)kallsyms_lookup_name("register_undef_hook");
    unregister_undef_hook_ptr = (void *)kallsyms_lookup_name("unregister_undef_hook");

    // Require both: a hook that cannot be unregistered would dangle after unload.
    if (!register_undef_hook_ptr || !unregister_undef_hook_ptr)
        return -EFAULT;

    // ...

    return 0;
}

static void __exit modexit(void)
{
    if (unregister_undef_hook_ptr)
        unregister_undef_hook_ptr(&uh);
}

Here's the dmesg output:

[    1.508253] testmod: Jumping off a cliff...
[    1.508781] testmod: *gotcha*
[    1.509207] testmod: Woah, I survived!

Some notes

  • The above example sets the undef_hook instruction/pstate masks/values to 0x0, which means that the hook will be called for any undefined instruction that is executed. You probably want to limit this to mrs XX,YY, and you should be able to do it like this:

    // didn't test these, you might want to double-check
    .instr_mask  = 0xfff00000,
    .instr_val   = 0xd5300000,
    

    Where 0xfff00000 matches everything except the operands (according to the manual, page 779 of the PDF). You can look at the source code to see how these values are checked to decide whether to call the hook or not; it's pretty straightforward. You can also check the instr value that is passed to the hook: pr_info("Instr: %x\n", instr).

    From the comments, the value I first suggested here (0xd5100000) did not match: that is the encoding of msr (a register write), whereas mrs (a register read) also has bit 21 set, which gives 0xd5300000. I don't know ARM well enough to vouch for these values off the top of my head, so double-check them against the instr value printed by the hook.

  • You can look at the struct pt_regs to see how it's defined. You probably only want to skip the instruction and maybe print something, in that case what I did in the above example should be enough. You could potentially change any register value though if you wanted to.

  • Tested on Linux kernel v5.6, qemu-system-aarch64.
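
To make the filter values from the first note easier to check, here is a small userspace sketch of how I understand an mrs instruction word to be laid out: bits [31:20] are fixed at 0xd53, followed by the low bit of op0, then op1, CRn, CRm, op2 and the destination register. The helper names encode_mrs and to_le_bytes are made up for this illustration, and the field values used for OSECCR_EL1 (op0=2, op1=0, CRn=0, CRm=6, op2=2) are my assumption, so verify them against the architecture manual.

```c
#include <stdint.h>

/* Fixed bits [31:20] of "mrs Xt, <sysreg>" and of "msr <sysreg>, Xt". */
#define MRS_VAL  0xd5300000u
#define MSR_VAL  0xd5100000u
#define SYS_MASK 0xfff00000u

/*
 * Build the word for "mrs x<rt>, S<op0>_<op1>_C<crn>_C<crm>_<op2>".
 * op0 is 2 or 3 for these registers; only its low bit is encoded (bit 19).
 */
static uint32_t encode_mrs(unsigned op0, unsigned op1, unsigned crn,
                           unsigned crm, unsigned op2, unsigned rt)
{
    return MRS_VAL |
           ((uint32_t)(op0 & 1u) << 19) |
           ((uint32_t)(op1 & 7u) << 16) |
           ((uint32_t)(crn & 15u) << 12) |
           ((uint32_t)(crm & 15u) << 8) |
           ((uint32_t)(op2 & 7u) << 5) |
           (uint32_t)(rt & 31u);
}

/* Lay out an instruction word the way a little-endian CPU stores it. */
static void to_le_bytes(uint32_t word, uint8_t out[4])
{
    out[0] = (uint8_t)(word & 0xffu);
    out[1] = (uint8_t)((word >> 8) & 0xffu);
    out[2] = (uint8_t)((word >> 16) & 0xffu);
    out[3] = (uint8_t)((word >> 24) & 0xffu);
}
```

Under those assumptions, encode_mrs(2, 0, 0, 6, 2, 0) gives 0xd5300640, whose in-memory bytes are 40 06 30 d5, the same bytes quoted in the comments. Masking that word with SYS_MASK yields MRS_VAL and not MSR_VAL, so a filter for register reads needs .instr_val = 0xd5300000.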

Theressa answered 16/4, 2020 at 6:49 Comment(10)
Yeah I discovered this technique awhile back - I was trying to avoid modifying the kernel, but maybe I just have to byte the bullet. - Lisp
Ah. Well. While we're in the process of "playing around" I think I might have another technique. While they're not exported by the kernel, those symbols ARE visible in /proc/kallsyms. I know this isn't an "official" way of handling this, but I could fetch the addresses from there, call them utterly manually, and thus avoid export while still getting the desired feature. - Lisp
@Lisp I thought about that but that seemed even weirder. I'll add that method to my answer though, you're totally right, why exclude that possibility? I am also looking if there's a way to do this with kprobes (there should be, though I've never used those) or similar stuff, in that case I'll create another answer. - Theressa
YAY that worked, although the instr_mask and instr_val are not working quite right somehow. Changing them to 0x0 "fixed" it (and probably broke something else). - Lisp
@Lisp can you post the disassembly of your instruction? Just objdump -d mymodule.ko and look for it. I'd like to see which bytes that gets compiled into. - Theressa
For anyone stumbling across this in the future who needs it, the solution was to use kallsyms_lookup_name to fetch the addresses of register_undef_hook and unregister_undef_hook, assign them to function pointers, then call them identically to how Marco did up above. - Lisp
Marco, the instruction should be compiling to 0x400630d5 - NOTE this is big endian. - Lisp
@Lisp ah I was assuming little endian, I almost got that right. You can try with the filter .instr_mask = 0xff, .instr_val = 0xd5. - Theressa
Okay I will try that - to be clear, the processor is an LE processor, I just typed the instruction into chat in BE format. Once you leave the realm of integers, I find LE to be more confusing than anything else. - Lisp
@Lisp Edited my answer to address all of the comments. Also, a good way to find out the right value is to just pr_info("Instr: %x\n", instr) in the hook function. - Theressa
