Porting module to newer Linux kernel: Cannot allocate memory

2

5

I have a fairly large driver module that I am trying to compile for a recent Linux kernel (3.4.4). The same module compiles and loads with insmod on a 2.6.27.25 kernel. The GCC versions also differ: 4.7.0 versus 4.3.0. Note that this module is quite complicated, so I cannot simply go through all the code and all the makefiles.

When inserting the module, I get a "Cannot allocate memory" error with the following trace:

vmap allocation for size 30248960 failed: use vmalloc=<size> to increase size.
vmalloc: allocation failure: 30243566 bytes
insmod: page allocation failure: order:0, mode:0xd2
Pid: 5840, comm: insmod Tainted: G           O 3.4.4-5.fc17.i686 #1
Call Trace:
 [<c092702a>] ? printk+0x2d/0x2f
 [<c04eff8d>] warn_alloc_failed+0xad/0xf0
 [<c05178d9>] __vmalloc_node_range+0x169/0x1d0
 [<c0517994>] __vmalloc_node+0x54/0x60
 [<c0490825>] ? sys_init_module+0x65/0x1d80
 [<c0517a60>] vmalloc+0x30/0x40
 [<c0490825>] ? sys_init_module+0x65/0x1d80
 [<c0490825>] sys_init_module+0x65/0x1d80
 [<c050cda6>] ? handle_mm_fault+0xf6/0x1d0
 [<c0932b30>] ? spurious_fault+0xae/0xae
 [<c0932ce7>] ? do_page_fault+0x1b7/0x450
 [<c093665f>] sysenter_do_call+0x12/0x28
-- clip --

The obvious answer seems to be that the module is allocating too much memory, however:

  • The old kernel loads the module without any problem, whatever its size.
  • If I prune parts of the module to greatly reduce its memory consumption, I still get the same error message with the new kernel.
  • Unloading many other modules has no effect (is that even relevant? Is there a global limit in Linux on the total memory used by modules?).

I therefore suspect a problem with the new kernel that is not directly related to limited memory.

The new kernel complains about a vmalloc() of about 30,000 KB, whereas lsmod on the old kernel reports a size of 4,800 KB. Should these figures be directly comparable? Is it possible that something went wrong during the build and that the module simply requests too much RAM? When I compare the section sizes of the two .ko files, I do not see big differences.

So I am trying to understand where the problem comes from. When I check the dumped stack, I cannot find the matching piece of code. The failing vmalloc() seems to be called from sys_init_module(), which is init_module() in kernel/module.c, but the code does not match. The init_module() object code in my .ko does not match either.

I am more or less stuck, as I do not know the kernel well enough, and the build system and the module loading are quite hard to understand. The error seems to occur before the module is actually loaded; I suspect that some functions are missing and that insmod does not report such errors at this stage.

Noam answered 19/7, 2012 at 12:27
5

I believe the allocation is done in layout_and_allocate, which is called by load_module. Both are static functions, so they may be inlined and therefore not appear in the stack trace.
So it is not an allocation done by your code, but one done by Linux in order to load your code.

If the module takes 4.8 MB under the old kernel and 30 MB under the new one, that would explain why loading fails.
So the question is why it is so large.

The size may come from the amount of code (it is unlikely to have grown that much) or from statically allocated data.
A likely explanation is that you have a large statically allocated array whose size is set by a constant defined in the kernel. If that constant has grown significantly, your array has grown with it.
A guess: an array whose size is NR_CPUS.

You should be able to use commands such as nm or objdump to find such an array, although I am not sure exactly how.
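One possible way to look for such an array is to sort the module's symbols by size. This is a minimal sketch assuming GNU binutils nm; the function name largest_symbols and the file name mydriver.ko are placeholders, not part of the original module.

```shell
# Print the largest symbols of an object file or module, biggest last.
# Usage: largest_symbols FILE [COUNT]   (COUNT defaults to 20)
largest_symbols() {
    module="$1"
    count="${2:-20}"
    # -S prints each symbol's size next to its value,
    # --size-sort orders the output by size in ascending order.
    nm -S --size-sort "$module" | tail -n "$count"
}

# Example (hypothetical file name):
#   largest_symbols mydriver.ko 10
```

A large entry of type b/B (bss) or d/D (data) near the end of the list would be a candidate for the oversized static array.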

Dingy answered 19/7, 2012 at 13:14
2

The problem was actually due to the debug sections in the module. The old kernel was able to ignore these sections, but the new one was counting them in the total size to allocate. However, when enabling the pr_debug() traces from module.c at loading time, these sections were not dumped with the others.
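To see how much of a .ko file the debug sections account for, their sizes can be summed. This is a rough sketch assuming GNU binutils size and a POSIX awk; the function name debug_section_bytes and the file name are placeholders.

```shell
# Print the total size, in bytes, of all .debug_* sections in a file.
# Usage: debug_section_bytes FILE
debug_section_bytes() {
    # size -A prints one line per section: name, size (decimal), address.
    # Sum the size column for every section whose name starts with .debug_
    size -A "$1" | awk '$1 ~ /^\.debug_/ { total += $2 } END { print total + 0 }'
}

# Example (hypothetical file name):
#   debug_section_bytes mydriver.ko
```

If the result is close to the difference between the size reported by lsmod on the old kernel and the failed vmalloc() request, the debug sections are the likely culprit.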

How to get rid of them and solve the problem:

objcopy -R .debug_aranges \
    -R .debug_info \
    -R .debug_abbrev \
    -R .debug_line \
    -R .debug_frame \
    -R .debug_str \
    -R .debug_loc \
    -R .debug_ranges \
    original.ko new.ko
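Instead of naming each section, objcopy can also drop all debugging sections in one step with --strip-debug. The following is a sketch assuming GNU binutils; the function name strip_debug_copy and the file names are placeholders, and I have not checked that it removes exactly the same set of sections as the command above.

```shell
# Copy a module without its debugging sections, then print how many
# .debug_* sections remain in the copy (ideally 0).
# Usage: strip_debug_copy SRC DST
strip_debug_copy() {
    src="$1"
    dst="$2"
    # --strip-debug removes debugging symbols and sections only.
    objcopy --strip-debug "$src" "$dst" || return 1
    # objdump -h lists the section headers; grep -c counts matching lines.
    # grep exits non-zero when the count is 0, hence the "|| true".
    objdump -h "$dst" | grep -c '\.debug_' || true
}

# Example (hypothetical file names):
#   strip_debug_copy original.ko new.ko
```

Working on a copy keeps the original .ko with its debug information available for later debugging.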

It is also possible that this project's build files add debug information "tailored" for the old kernel version. However, when I try with a dummy module, I find exactly the same kind of debug sections appended, so I rather suspect a policy change regarding module management in the kernel or in Fedora.

Any information regarding these changes is welcome.

Noam answered 20/7, 2012 at 11:58 Comment(1)
Thanks! I would never have thought of this without your answer. Where did you find this? (Crossed)
