Isolate Kernel Module to a Specific Core Using Cpuset

From user space we can use cpusets to isolate a specific core in our system and run just one specific process on that core.

I'm trying to do the same thing with a kernel module: I want the module to execute on an isolated core. In other words: how do I use cpusets from inside a kernel module? *

Using linux/cpuset.h in my kernel module doesn't work. So, I have a module like this:

#include <linux/module.h>
#include <linux/cpuset.h>

...
#ifdef CONFIG_CPUSETS
    printk(KERN_INFO "cpusets is enabled!");
#endif
cpuset_init(); // this function is declared in cpuset.h
...

When I try to load this module, dmesg shows the message cpusets is enabled!, but I also get Unknown symbol cpuset_init (err 0).

Similarly, I tried using sched_setaffinity from linux/sched.h to move all running processes to a specific core and then run my module on an isolated core. I got the same kind of error message: Unknown symbol sched_setaffinity (err 0). I guess I got the "unknown symbols" because those functions have no EXPORT_SYMBOL in the kernel. So I tried to call the sys_sched_setaffinity system call (based on this question), but again got this message: Unknown symbol sys_sched_setaffinity (err 0)!

Furthermore, I am not looking for a solution that uses isolcpus, which is set at boot time. I would like to just load the module and have the isolation occur afterwards.

  • (More precisely, I want its kernel threads to execute on isolated cores. I know that I can use affinity to bind threads to specific cores, but this does not guarantee that those cores are isolated from other processes running on them.)
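One way to see how far affinity alone gets you is to look at the Cpus_allowed_list field that Linux reports in /proc/<pid>/status: if other tasks still list the core, it is not isolated. Below is a minimal user-space sketch that pulls that field out of the status text. The helper name cpus_allowed_list is made up for illustration, and reading the file itself is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of the "Cpus_allowed_list:" field out of the text of a
 * /proc/<pid>/status file (passed in as a NUL-terminated string).
 * Returns 0 on success, -1 if the field is missing or does not fit in `out`.
 * Hypothetical helper for illustration only.
 */
int cpus_allowed_list(const char *status_text, char *out, size_t outlen)
{
    static const char key[] = "Cpus_allowed_list:";
    const char *p = status_text;

    while (p && *p) {
        /* p always points at the start of a line here */
        if (strncmp(p, key, sizeof(key) - 1) == 0) {
            size_t len;

            p += sizeof(key) - 1;
            while (*p == ' ' || *p == '\t')
                p++;
            len = strcspn(p, "\n");
            if (len + 1 > outlen)
                return -1;
            memcpy(out, p, len);
            out[len] = '\0';
            return 0;
        }
        p = strchr(p, '\n');
        if (p)
            p++;
    }
    return -1;
}
```

Running this over every /proc/<pid>/status shows which tasks are still allowed on the core in question.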
Lehmbruck answered 29/3, 2016 at 15:39 Comment(6)
To bind a specific kernel thread to a specific core, you can use kthread_bind or set_cpus_allowed_ptr. To keep other processes off a specific core, you need to configure the scheduler somehow, e.g. as described here. Fragmental
To add to @Tsyvarev's comment: isolcpus (kernel parameter -- see Documentation/kernel-parameters.txt) "can be used to specify one or more CPUs to isolate from the general SMP balancing and scheduling algorithms. You can move a process onto or off an "isolated" CPU via the CPU affinity syscalls or cpuset." Road
@Fragmental and @Gil Hamlton, thanks for your answers. I'm aware of kthread_bind. Regarding the proposed solutions, isolcpus and maxcpus: from my understanding those parameters have to be set at boot. I would like to isolate without restarting; the isolation should happen only when the module is loaded, so I feel this is not the way to go. Lehmbruck
Hi! Could you please post the output you get when you compile your module? Thanks @foobar Jola
You said: "I guess I got the "unknown symbols" because those functions have no EXPORT_SYMBOL in the kernel". That makes sense, but I don't understand why the symbol shows up in cat /proc/kallsyms | grep cpuset_init. Even more confusing, cpuset_init is listed as undefined (U) in the output of nm module.ko. Jola
I've updated my answer with a working module; does it help solve your issue? If it doesn't and the bounty's over, ping me anyway. I love kernel-related stuff like this. Upcast

So I want the module to get executed in an isolated core.

and

actually isolate a specific core in our system and execute just one specific process to that core

This is working source code, compiled and tested on a Debian box running kernel 3.16. I'll first describe how to load and unload the module and what the parameter means.

All sources can be found on github here...

https://github.com/harryjackson/doc/tree/master/linux/kernel/toy/toy

Build and load the module...

make
insmod toy.ko param_cpu_id=2

To unload the module use

rmmod toy

I'm not using modprobe because it expects some configuration etc. The parameter we're passing to the toy kernel module is the CPU we want to isolate. Each device operation that gets called logs whether or not it is executing on that CPU.

Once the module is loaded you can find it here

/dev/toy

Simple operations like

cat /dev/toy

trigger file operations that the kernel module handles, producing some output. You can see the output using dmesg.

Source code...

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/smp.h>      /* get_cpu(), put_cpu(), smp_processor_id() */
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Harry");
MODULE_DESCRIPTION("toy kernel module");
MODULE_VERSION("0.1"); 
#define  DEVICE_NAME "toy"
#define  CLASS_NAME  "toy"

static int    param_cpu_id;
module_param(param_cpu_id    , int, (S_IRUSR | S_IRGRP | S_IROTH));
MODULE_PARM_DESC(param_cpu_id, "CPU ID that operations run on");

//static void    bar(void *arg);
//static void    foo(void *cpu);
static int     toy_open(   struct inode *inodep, struct file *fp);
static ssize_t toy_read(   struct file *fp     , char *buffer, size_t len, loff_t * offset);
static ssize_t toy_write(  struct file *fp     , const char *buffer, size_t len, loff_t *);
static int     toy_release(struct inode *inodep, struct file *fp);

static struct file_operations toy_fops = {
  .owner = THIS_MODULE,
  .open = toy_open,
  .read = toy_read,
  .write = toy_write,
  .release = toy_release,
};

static struct miscdevice toy_device = {
  .minor = MISC_DYNAMIC_MINOR,
  .name = "toy",
  .fops = &toy_fops
};

//static int CPU_IDS[64] = {0};
static int toy_open(struct inode *inodep, struct file *filep) {
  int this_cpu = get_cpu();
  printk(KERN_INFO "open: called on CPU:%d\n", this_cpu);
  if(this_cpu == param_cpu_id) {
    printk(KERN_INFO "open: is on requested CPU: %d\n", smp_processor_id());
  }
  else {
    printk(KERN_INFO "open: not on requested CPU:%d\n", smp_processor_id());
  }
  put_cpu();
  return 0;
}
static ssize_t toy_read(struct file *filep, char *buffer, size_t len, loff_t *offset){
  int this_cpu = get_cpu();
  printk(KERN_INFO "read: called on CPU:%d\n", this_cpu);
  if(this_cpu == param_cpu_id) {
    printk(KERN_INFO "read: is on requested CPU: %d\n", smp_processor_id());
  }
  else {
    printk(KERN_INFO "read: not on requested CPU:%d\n", smp_processor_id());
  }
  put_cpu();
  return 0;
}
static ssize_t toy_write(struct file *filep, const char *buffer, size_t len, loff_t *offset){
  int this_cpu = get_cpu();
  printk(KERN_INFO "write called on CPU:%d\n", this_cpu);
  if(this_cpu == param_cpu_id) {
    printk(KERN_INFO "write: is on requested CPU: %d\n", smp_processor_id());
  }
  else {
    printk(KERN_INFO "write: not on requested CPU:%d\n", smp_processor_id());
  }
  put_cpu();
  return 0;
}
static int toy_release(struct inode *inodep, struct file *filep){
  int this_cpu = get_cpu();
  printk(KERN_INFO "release called on CPU:%d\n", this_cpu);
  if(this_cpu == param_cpu_id) {
    printk(KERN_INFO "release: is on requested CPU: %d\n", smp_processor_id());
  }
  else {
    printk(KERN_INFO "release: not on requested CPU:%d\n", smp_processor_id());
  }
  put_cpu();
  return 0;
}

static int __init toy_init(void) {
  int cpu_id;
  int ret;
  /* Reject CPU IDs that are out of range or not currently online. */
  if(param_cpu_id < 0 || param_cpu_id >= nr_cpu_ids || !cpu_online(param_cpu_id)) {
    printk(KERN_INFO "toy: unable to load module without a valid cpu parameter\n");
    return -EINVAL;
  }
  printk(KERN_INFO "toy: loading to device driver, param_cpu_id: %d\n", param_cpu_id);
  /* misc_register() can sleep, so call it before preemption is disabled. */
  ret = misc_register(&toy_device);
  if(ret) {
    printk(KERN_ERR "toy: misc_register failed: %d\n", ret);
    return ret;
  }
  cpu_id = get_cpu(); /* disables preemption, see notes below */
  printk(KERN_INFO "toy init called and running on CPU: %d\n", cpu_id);
  put_cpu();
  //smp_call_function_single(1,foo,(void *)(uintptr_t) 1,1);
  return 0;
}

static void __exit toy_exit(void) {
    misc_deregister(&toy_device);
    printk(KERN_INFO "toy exit called\n");
}

module_init(toy_init);
module_exit(toy_exit); 

The code above contains the two methods you asked for, i.e. isolation of the CPU and running init on an isolated core.

On init, get_cpu disables preemption, i.e. anything that comes after it (until the matching put_cpu) will not be preempted by the kernel and will run on one core. Note that this was done using kernel 3.16; your mileage may vary depending on your kernel version, but I think these APIs have been around for a long time.

This is the Makefile...

obj-m += toy.o

all:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Notes. get_cpu is declared in linux/smp.h as

#define get_cpu()   ({ preempt_disable(); smp_processor_id(); })
#define put_cpu()   preempt_enable()

so you don't actually need to call preempt_disable before calling get_cpu. The preempt_disable call inside get_cpu is itself a wrapper around the following sequence of calls...

preempt_count_inc();
barrier();

and the preempt_enable call behind put_cpu is really doing this...

barrier();
if (unlikely(preempt_count_dec_and_test())) {
  __preempt_schedule();
}   

You can get as fancy as you like using the above. Almost all of this was taken from the following sources..

Google for... smp_call_function_single

Linux Kernel Development, book by Robert Love.

http://derekmolloy.ie/writing-a-linux-kernel-module-part-2-a-character-device/

https://github.com/vsinitsyn/reverse/blob/master/reverse.c

Upcast answered 2/4, 2016 at 23:36 Comment(1)
After kernel_thread, the thread can clamp itself to a given CPU using cpuset* calls. But forcing all other threads not to use this CPU would [probably] need a custom change to do_fork, which would be [hugely] problematic. A "good enough" workaround would be for the given thread to switch to the R/T scheduler and set an R/T priority of 11 [a safe value; go higher only if you dare]. In practice this yields the desired effect: almost nothing else will [be able to] run, because the thread semi-monopolizes the CPU with its high(er) priority. Orthopedic
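To make the real-time priority idea from the comment above concrete, here is a minimal user-space sketch using the POSIX sched_setscheduler(2) call. It is only an analogue: inside a module the kernel's own scheduler interfaces would be used instead, and those are not shown here. The helper names clamp_fifo_priority and try_set_fifo are made up for illustration.

```c
#include <errno.h>
#include <sched.h>

/* Clamp a requested priority into the range SCHED_FIFO accepts on this system. */
int clamp_fifo_priority(int requested)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (requested < lo)
        return lo;
    if (requested > hi)
        return hi;
    return requested;
}

/*
 * Switch the calling process to SCHED_FIFO at the given (clamped) priority.
 * Returns 0 on success or a negative errno value, typically -EPERM when the
 * caller lacks the privilege to use a real-time policy.
 */
int try_set_fifo(int requested)
{
    struct sched_param param = { .sched_priority = clamp_fifo_priority(requested) };

    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
        return -errno;
    return 0;
}
```

Calling try_set_fifo(11) as an unprivileged user will normally fail with -EPERM, so it usually has to run as root or with the appropriate capability.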

You pointed out in your question:

I guess I got the "unknown symbols" because those functions have no EXPORT_SYMBOL in the kernel

I think this is the key point of your problem. I see you're including the file linux/cpuset.h, which declares the function cpuset_init among others. However, both the build output and the nm command indicate that this function is not available to the module:

Compiling:

root@hectorvp-pc:/home/hectorvp/cpuset/cpuset_try# make
make -C /lib/modules/3.19.0-31-generic/build M=/home/hectorvp/cpuset/cpuset_try modules 
make[1]: Entering directory '/usr/src/linux-headers-3.19.0-31-generic'
  CC [M]  /home/hectorvp/cpuset/cpuset_try/cpuset_try.o
  Building modules, stage 2. 
  MODPOST 1 modules 
  WARNING: "cpuset_init" [/home/hectorvp/cpuset/cpuset_try/cpuset_try.ko] undefined!
  CC      /home/hectorvp/cpuset/cpuset_try/cpuset_try.mod.o
  LD [M]  /home/hectorvp/cpuset/cpuset_try/cpuset_try.ko
make[1]: Leaving directory '/usr/src/linux-headers-3.19.0-31-generic'

See the WARNING: "cpuset_init" [...] undefined!. And using nm:

root@hectorvp-pc:/home/hectorvp/cpuset/cpuset_try# nm cpuset_try.ko
0000000000000030 T cleanup_module
                 U cpuset_init
                 U __fentry__
0000000000000000 T init_module
000000000000002f r __module_depends
                 U printk
0000000000000000 D __this_module
0000000000000000 r __UNIQUE_ID_license0
000000000000000c r __UNIQUE_ID_srcversion1
0000000000000038 r __UNIQUE_ID_vermagic0
0000000000000000 r ____versions

(Note: U stands for 'undefined')

However, I've been exploring the kernel's symbols as follows:

root@hectorvp-pc:/home/hectorvp/cpuset/cpuset_try# cat /proc/kallsyms | grep cpuset_init
ffffffff8110dc40 T cpuset_init_current_mems_allowed
ffffffff81d722ae T cpuset_init
ffffffff81d72342 T cpuset_init_smp

So the symbol is present in the running kernel, but it isn't listed in /lib/modules/$(uname -r)/build/Module.symvers, which means it isn't exported to modules. So you're right.
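The Module.symvers check can also be scripted. Below is a minimal sketch, assuming the usual tab-separated layout of that file (CRC, symbol name, module, export type). Both function names are made up for illustration and are not part of any kernel tooling.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if `line` (one line of Module.symvers) describes `symbol`, else 0.
 * Assumed layout: "<CRC>\t<symbol>\t<module>\t<export type>...".
 */
int symvers_line_matches(const char *line, const char *symbol)
{
    char name[256];

    /* Skip the CRC field, then read the symbol name. */
    if (sscanf(line, "%*s %255s", name) != 1)
        return 0;
    return strcmp(name, symbol) == 0;
}

/*
 * Return 1 if `symbol` is listed in the Module.symvers file at `path`,
 * 0 if it is not listed, and -1 if the file cannot be opened.
 */
int symbol_is_exported(const char *path, const char *symbol)
{
    char line[1024];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    while (!found && fgets(line, sizeof(line), f))
        found = symvers_line_matches(line, symbol);
    fclose(f);
    return found;
}
```

A symbol that shows up in /proc/kallsyms but makes symbol_is_exported return 0 exists in the kernel yet cannot be linked against from a module, which is the situation with cpuset_init here.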

After further investigation I found it's actually defined in:

http://lxr.free-electrons.com/source/kernel/cpuset.c#L2101

This is the function you need to call as it is available in the kernel space. Thus you won't need access to the user space.

The workaround I found to let the module call this symbol is described in the second answer to this question. Notice that you don't need to include linux/cpuset.h anymore:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
//#include <linux/cpuset.h>
#include <linux/kallsyms.h>


int init_module(void)
{
        static void (*cpuset_init_p)(void);
        cpuset_init_p = (void*) kallsyms_lookup_name("cpuset_init");
        printk(KERN_INFO "Starting ...\n");
        #ifdef CONFIG_CPUSETS
            printk(KERN_INFO "cpusets is enabled!\n");
        #endif
        /* kallsyms_lookup_name() returns 0 if the symbol is not found. */
        if (!cpuset_init_p)
                return -ENOENT;
        (*cpuset_init_p)();
        /* 
         * A non 0 return means init_module failed; module can't be loaded. 
         */
        return 0;
}

void cleanup_module(void)
{
        printk(KERN_INFO "Ending ...\n");
}

MODULE_LICENSE("GPL");

I compiled it successfully and loaded it with insmod. Below is the output I got in dmesg:

[ 1713.738925] Starting ...
[ 1713.738929] cpusets is enabled!
[ 1713.738943] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 1713.739042] BUG: unable to handle kernel paging request at ffffffff81d7237b
[ 1713.739074] IP: [<ffffffff81d7237b>] cpuset_init+0x0/0x94
[ 1713.739102] PGD 1c16067 PUD 1c17063 PMD 30bc74063 PTE 8000000001d72163
[ 1713.739136] Oops: 0011 [#1] SMP 
[ 1713.739153] Modules linked in: cpuset_try(OE+) xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables x_tables nf_nat nf_conntrack br_netfilter bridge stp llc pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) aufs binfmt_misc cfg80211 nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek intel_rapl snd_hda_codec_generic iosf_mbi snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_hda_controller snd_hda_codec snd_hwdep coretemp kvm_intel amdkfd kvm snd_pcm snd_seq_midi snd_seq_midi_event amd_iommu_v2 snd_rawmidi radeon snd_seq crct10dif_pclmul crc32_pclmul snd_seq_device aesni_intel ttm aes_x86_64 drm_kms_helper drm snd_timer i2c_algo_bit dcdbas mei_me lrw gf128mul mei snd glue_helper ablk_helper
[ 1713.739533]  cryptd soundcore shpchp lpc_ich serio_raw 8250_fintek mac_hid video parport_pc ppdev lp parport autofs4 hid_generic usbhid hid e1000e ahci psmouse ptp libahci pps_core
[ 1713.739628] CPU: 2 PID: 24679 Comm: insmod Tainted: G           OE  3.19.0-56-generic #62-Ubuntu
[ 1713.739663] Hardware name: Dell Inc. OptiPlex 9020/0PC5F7, BIOS A03 09/17/2013
[ 1713.739693] task: ffff8800d29f09d0 ti: ffff88009177c000 task.ti: ffff88009177c000
[ 1713.739723] RIP: 0010:[<ffffffff81d7237b>]  [<ffffffff81d7237b>] cpuset_init+0x0/0x94
[ 1713.739757] RSP: 0018:ffff88009177fd10  EFLAGS: 00010292
[ 1713.739779] RAX: 0000000000000013 RBX: ffffffff81c1a080 RCX: 0000000000000013
[ 1713.739808] RDX: 000000000000c928 RSI: 0000000000000246 RDI: 0000000000000246
[ 1713.739836] RBP: ffff88009177fd18 R08: 000000000000000a R09: 00000000000003db
[ 1713.739865] R10: 0000000000000092 R11: 00000000000003db R12: ffff8800ad1aaee0
[ 1713.739893] R13: 0000000000000000 R14: ffffffffc0947000 R15: ffff88009177fef8
[ 1713.739923] FS:  00007fbf45be8700(0000) GS:ffff88031dd00000(0000) knlGS:0000000000000000
[ 1713.739955] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1713.739979] CR2: ffffffff81d7237b CR3: 00000000a3733000 CR4: 00000000001407e0
[ 1713.740007] Stack:
[ 1713.740016]  ffffffffc094703e ffff88009177fd98 ffffffff81002148 0000000000000001
[ 1713.740052]  0000000000000001 ffff8802479de200 0000000000000001 ffff88009177fd78
[ 1713.740087]  ffffffff811d79e9 ffffffff810fb058 0000000000000018 ffffffffc0949000
[ 1713.740122] Call Trace:
[ 1713.740137]  [<ffffffffc094703e>] ? init_module+0x3e/0x50 [cpuset_try]
[ 1713.740175]  [<ffffffff81002148>] do_one_initcall+0xd8/0x210
[ 1713.740190]  [<ffffffff811d79e9>] ? kmem_cache_alloc_trace+0x189/0x200
[ 1713.740207]  [<ffffffff810fb058>] ? load_module+0x15b8/0x1d00
[ 1713.740222]  [<ffffffff810fb092>] load_module+0x15f2/0x1d00
[ 1713.740236]  [<ffffffff810f6850>] ? store_uevent+0x40/0x40
[ 1713.740250]  [<ffffffff810fb916>] SyS_finit_module+0x86/0xb0
[ 1713.740265]  [<ffffffff817ce10d>] system_call_fastpath+0x16/0x1b
[ 1713.740280] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0c 53 58 31 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 1c 00 00 00 c0 92 2c 7d c0 92 2c 7d a0 fc 69 ee 
[ 1713.740398] RIP  [<ffffffff81d7237b>] cpuset_init+0x0/0x94
[ 1713.740413]  RSP <ffff88009177fd10>
[ 1713.740421] CR2: ffffffff81d7237b
[ 1713.746177] ---[ end trace 25614103c0658b94 ]---

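The oops is most likely because cpuset_init is marked __init: the kernel frees its init sections after boot, so the address found through kallsyms no longer points at executable code. Despite the errors, I'd say I've answered your initial question: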

How do I use cpuset's from inside a kernel module? *

Probably not in the most elegant way as I'm not an expert at all. You need to continue from here.

Regards

Leyba answered 7/4, 2016 at 15:40 Comment(0)

Using on_each_cpu() and filtering for the desired CPU works:

targetcpu.c

#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/smp.h>   /* on_each_cpu(), get_cpu(), put_cpu() */

static const int TARGET_CPU = 4;

static void func(void *info){
    int cpu = get_cpu();
    if(cpu == TARGET_CPU){
        printk("on target cpu: %d\n", cpu);
    }
    put_cpu();
}

int init_module(void) {
    printk("enter\n");
    on_each_cpu(func, NULL, 1);
    return 0;
}

void cleanup_module(void) {
    printk("exit\n");
}

Makefile

obj-m += targetcpu.o

all:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Weisburgh answered 17/12, 2019 at 16:21 Comment(0)

Did you try a work_struct together with

struct workqueue_attrs {
    cpumask_var_t cpumask;  /* allowed CPUs */
};

First of all, the CPU should be isolated (for example CPU 0x1) via

setenv bootargs isolcpus=\"0x1"\

and next

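struct lkm_sample {
    struct work_struct lkm_work_struct;
    struct workqueue_struct *lkm_wq_struct;
    ...
};
static struct lkm_sample lkm_smpl;

static void work(struct work_struct *work)
{
    struct lkm_sample *tmp = container_of(work, struct lkm_sample, lkm_work_struct);
    ....
    return;
}
static int __init lkm_init(void)
{
    //see: https://lwn.net/Articles/540999/
    lkm_smpl.lkm_wq_struct = create_singlethread_workqueue("you_wq_name");
    if (!lkm_smpl.lkm_wq_struct)
        return -ENOMEM;
    // INIT_WORK takes the work item, not the workqueue
    INIT_WORK(&lkm_smpl.lkm_work_struct, work);
    // queue the work item so it actually runs (added here for completeness)
    queue_work(lkm_smpl.lkm_wq_struct, &lkm_smpl.lkm_work_struct);
    return 0;
}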

If you would like to start the lkm (run its __init) on the isolated cpu:

  1. setenv bootargs isolcpus=\"0x1"\

  2. insmod helper_module.ko, a helper kernel module that uses call_usermodehelper_setup:

    struct subprocess_info *call_usermodehelper_setup(
        char *path,
        char **argv,   /* taskset 0x00000001 helper_application */
        char **envp,
        gfp_t gfp_mask,
        int (*init)(struct subprocess_info *info, struct cred *new),
        void (*cleanup)(struct subprocess_info *info),
        void *data);

    The helper module should run a userspace program (helper_application) via taskset, with the mask taken from isolcpus. It should only run its __init function and then return -1, because it has just one task: starting the userspace app on the isolated cpu.

  3. The userspace helper application should then just insmod goal_module.ko; goal_module should start on the same isolated cpu.

  4. Use workqueue to continue run isolated module on the isolated cpu.
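The steps above use two notations for the same CPU: a hex mask for taskset (0x00000001) and a value for isolcpus. As far as I know, the kernel documents isolcpus= as taking a CPU list (such as 1 or 0-2,4) rather than a mask, so the conversion is worth spelling out. Below is a minimal sketch of it. cpulist_to_mask is a made-up helper, not a kernel or libc function, and it only handles CPUs 0 to 63.

```c
#include <stdlib.h>

/*
 * Convert a CPU list such as "0-2,4" into a bitmask (bit n set = CPU n).
 * Returns 0 on success, -1 on malformed input or a CPU number above 63.
 * Hypothetical helper for illustration only.
 */
int cpulist_to_mask(const char *list, unsigned long long *mask)
{
    const char *p = list;

    *mask = 0;
    if (*p == '\0')
        return -1;

    while (*p) {
        char *end;
        unsigned long first, last;

        if (*p < '0' || *p > '9')
            return -1;
        first = strtoul(p, &end, 10);
        last = first;
        p = end;

        if (*p == '-') {            /* range "first-last" */
            p++;
            if (*p < '0' || *p > '9')
                return -1;
            last = strtoul(p, &end, 10);
            p = end;
        }
        if (last < first || last >= 64)
            return -1;

        for (unsigned long cpu = first; cpu <= last; cpu++)
            *mask |= 1ULL << cpu;

        if (*p == ',') {
            p++;
            if (*p == '\0')         /* trailing comma */
                return -1;
        } else if (*p != '\0') {
            return -1;              /* unexpected character, e.g. the 'x' in "0x1" */
        }
    }
    return 0;
}
```

For example, cpulist_to_mask("0-2,4", &m) leaves m equal to 0x17, while the string "0x1" is rejected because it is a mask, not a CPU list.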

Lona answered 4/4, 2016 at 5:56 Comment(3)
Check the comment section in my question. I'm not looking for a solution that uses isolcpus. I want to isolate only when I load the module. Lehmbruck
Would you like to isolate the CPU at runtime? For example, you could use for_each_process and, in the loop, get the PID and CPU for each task_struct, compare the CPU with the isolated one and, if they match, move that process to any other non-isolated CPU. Then run the work_struct on the isolated CPU. Lona
To move a task_struct from a cpu, look for example into: static void __migrate_swap_task(struct task_struct *p, int cpu) and void set_task_cpu(struct task_struct *p, unsigned int cpu). To set the cpu mask for a process, see cpumask_t cpus_allowed; in struct task_struct. Lona
