Linking 32- and 64-bit code together into a single binary

In a comment on the question "Unexpected behaviour in simple pointer arithmetics in kernel space C code", Michael Petch wrote, "The 64-bit ELF format supports 32-bit code sections."

I have a working program that includes both 32- and 64-bit code and switches between them. I have never been able to figure out how to link compiler-generated 32- and 64-bit code together without a linker error, so all the 32-bit code is written in assembly. As the project has become more complex, maintenance of the 32-bit assembly code has become more onerous.

Here is what I have:

test32.cc is compiled with -m32.
All the other source files are compiled without that flag and with -mcmodel=kernel.

In the linker script:

OUTPUT_FORMAT("elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)

In the Makefile:

LD := ld
LDFLAGS := -Map $(TARGET).map -n --script $(LDSCRIPT)
$(LD) $(LDFLAGS) -b elf64-x86-64 $(OBJS64) -b elf32-i386 $(OBJS32) -o $@

I get the error:

ld: i386 architecture of input file 'test32.o' is incompatible with i386:x86-64 output

Changing OUTPUT_ARCH to i386 causes similar errors from all the 64-bit object modules.

I'm using:
gcc 5.4.1
GNU ld (GNU Binutils for Ubuntu) 2.26.1

Iphigenia answered 25/3, 2018 at 6:12 Comment(17)
I said 32-bit and 64-bit sections, not 32-bit and 64-bit ELF files. You don't link a 32-bit ELF file and a 64-bit ELF file. You use elf64 files to contain both 32-bit and 64-bit code. Normally in these OSDev questions the 32-bit and 64-bit sections of code are inside an assembly file where you can use the bits 32/bits 64 directives (with NASM) or the .code32/.code64 directives (with AS), but you still produce an elf64 object. - Actinochemistry
Is this a 64-bit OS with 32-bit components? How much 32-bit assembly code are you looking at? What type of functionality? I ask because I'm curious just how onerous maintaining your 32-bit assembly code is. When running the 32-bit code, do you switch the processor out of long mode? Or is all the 32-bit code used before entering long mode and not used after that? - Actinochemistry
I'm just curious what type of code you are running, if you are switching out of long mode to 32-bit protected mode (and then back to long mode?), that can't be done in long mode. Usually the 32-bit code is isolated to things done prior to entering long mode (are you using GRUB? GRUB2?). If I had an understanding of how the code is used I might give you some ideas. - Actinochemistry
If you are using Multiboot, and the 32-bit code is strictly for bootstrapping into long mode (including setting up paging etc.), then I would build the bootstrap/Multiboot code as an elf32 executable and then specify your 64-bit kernel as a module (the GRUB/Multiboot loader will read the module into memory as is). What I would do is write a simple 64-bit ELF parser (you don't even need relocation, which simplifies things). Then use that ELF parser to place the 64-bit ELF file (the kernel) into memory where you need it (much easier if you use the power of paging). Then transfer control to the 64-bit code. - Actinochemistry
This allows a 32-bit ELF executable to bootstrap everything (it can be written mostly in C), load the 64-bit ELF executable and transfer control to it. You can do this with a custom bootloader as well (if you aren't using a Multiboot loader). The entire idea is to keep the 32-bit bootstrap code in a 32-bit ELF executable, and the 64-bit code in an ELF64 executable. - Actinochemistry
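The "simple 64-bit ELF parser" suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the commenter's code: the function name elf64_load, the flat mem buffer standing in for physical memory, and the locally defined constants are all hypothetical. It assumes a little-endian host, copies PT_LOAD segments by their physical address (p_paddr), performs no relocation, and leaves paging and the jump to the entry point to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF64 layouts, defined locally because a freestanding bootstrap
   usually cannot include <elf.h>. */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version;
    uint64_t e_entry, e_phoff, e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf64_Ehdr;                               /* 64 bytes */

typedef struct {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr;
    uint64_t p_filesz, p_memsz, p_align;
} Elf64_Phdr;                               /* 56 bytes */

/* Hypothetical local names for the standard ELF constant values. */
enum { ELF_CLASS_64 = 2, ELF_DATA_LSB = 1, ELF_MACHINE_X86_64 = 62,
       ELF_PT_LOAD = 1 };

/* Copy every PT_LOAD segment of `image` into `mem`, a buffer that stands
   for physical memory starting at address `mem_base`. Returns 0 and stores
   the entry point on success, -1 if the image is malformed or out of range. */
int elf64_load(const uint8_t *image, size_t image_len,
               uint8_t *mem, uint64_t mem_base, size_t mem_len,
               uint64_t *entry)
{
    Elf64_Ehdr eh;
    assert(image != NULL && mem != NULL && entry != NULL);

    if (image_len < sizeof eh) return -1;
    memcpy(&eh, image, sizeof eh);          /* avoids unaligned access */
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0) return -1;
    if (eh.e_ident[4] != ELF_CLASS_64 || eh.e_ident[5] != ELF_DATA_LSB) return -1;
    if (eh.e_machine != ELF_MACHINE_X86_64) return -1;
    if (eh.e_phentsize != sizeof(Elf64_Phdr)) return -1;
    if (eh.e_phoff > image_len ||
        (uint64_t)eh.e_phnum * sizeof(Elf64_Phdr) > image_len - eh.e_phoff)
        return -1;

    for (uint16_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        memcpy(&ph, image + eh.e_phoff + (size_t)i * sizeof ph, sizeof ph);
        if (ph.p_type != ELF_PT_LOAD) continue;

        /* Reject segments that fall outside the file or the target buffer. */
        if (ph.p_filesz > ph.p_memsz) return -1;
        if (ph.p_offset > image_len || ph.p_filesz > image_len - ph.p_offset)
            return -1;
        if (ph.p_paddr < mem_base) return -1;
        uint64_t off = ph.p_paddr - mem_base;
        if (off > mem_len || ph.p_memsz > mem_len - off) return -1;

        /* File-backed bytes first, then zero the remainder (.bss). */
        memcpy(mem + off, image + ph.p_offset, (size_t)ph.p_filesz);
        memset(mem + off + ph.p_filesz, 0, (size_t)(ph.p_memsz - ph.p_filesz));
    }
    *entry = eh.e_entry;
    return 0;
}
```

In a real bootstrap, image would be the module that the boot loader placed in memory. A kernel linked for high virtual addresses (as with -mcmodel=kernel) would also need page tables mapping those addresses before control is transferred to the entry point.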
Using paging to place the 64-bit kernel into virtual address space with your own ELF loader would require ensuring that Multiboot/GRUB loads modules into memory on page (4 KiB) boundaries, which you request in the flags field of the Multiboot header (bit 0 of flags sets module alignment). - Actinochemistry
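As a small illustration of the flags field mentioned above, here is a sketch of the three mandatory fields of a Multiboot (version 1) header. The helper name make_multiboot_header is hypothetical. The magic value, the meaning of bits 0 and 1, and the rule that magic + flags + checksum must equal zero modulo 2^32 come from the Multiboot specification.

```c
#include <assert.h>
#include <stdint.h>

#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u
#define MULTIBOOT_PAGE_ALIGN   (1u << 0)  /* bit 0: load modules on 4 KiB boundaries */
#define MULTIBOOT_MEMORY_INFO  (1u << 1)  /* bit 1: ask the loader for memory information */

/* The three mandatory fields of a Multiboot 1 header. */
struct multiboot_header {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum;
};

/* Build a header whose three fields sum to zero modulo 2^32. */
struct multiboot_header make_multiboot_header(uint32_t flags)
{
    struct multiboot_header h;
    h.magic = MULTIBOOT_HEADER_MAGIC;
    h.flags = flags;
    h.checksum = 0u - (h.magic + h.flags);
    assert((uint32_t)(h.magic + h.flags + h.checksum) == 0u);
    return h;
}
```

In an actual bootstrap the header would normally be a constant emitted by the assembler or linker rather than built at run time, since the specification requires it to sit 4-byte aligned within the first 8192 bytes of the image. The function form is used here only to make the checksum rule explicit.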
Sorry, I misinterpreted your statement yesterday. Thanks for the interest and comments. The program is a hypervisor that boots from either Grub or EFI. Your description in your first comment is exactly how we do it now. - Iphigenia
There are about 2500 lines of assembly code in total. Most of it is startup code (BIOS interfacing, command-line parsing, memory layout) plus the code to start the APs. There is also some to shim between the x86-64 (System V) ABI and EFI. The only reason it's onerous is that I'm the only one on the team who can maintain it. - Iphigenia
The only time it runs 32-bit code after switching to long mode is to call the EFI runtime on a system with a 32-bit EFI BIOS. It uses a far call to compatibility mode. I don't think that code is used, because I don't think we've ever shipped such a platform. That would need to be in assembly anyway, so it's not germane to this question. - Iphigenia
The problem is that we keep adding features that have to be done before address space layout, which is done before relocation, which is done before setting up paging. Perhaps this startup sequencing could be changed, resulting in less assembly code. - Iphigenia
Thanks for your suggestions. I will consider them. The need to support both Grub and EFI makes it more complicated, I think. On EFI, I don't think it will be possible to load two separate binaries, because the first one won't have any way to locate the second one. It could be on a local disk or flash drive, or over the network. - Iphigenia
I realized while writing these comments that the problem isn't just 32-bit code. Even when booted from 64-bit EFI, the steps to set up paging and the stack, initialize BSS, and run static constructors all depend on things such as command-line options and the BIOS memory map, so those things are done in assembly even when the processor is already in 64-bit mode at startup. - Iphigenia
I'm not sure if this would work for you or not. Since you use mcmodel=small, and assuming you locate all your code and data below 2 GiB, I wonder if you could compile your 64-bit C files to 64-bit ELF object files and then use objcopy -O elf32-i386 to convert each 64-bit object to elf32. Generate your 32-bit C code with -m32. Then link everything together (ld -melf_i386) as a 32-bit ELF executable? Of course, calling functions directly between the two wouldn't work and would require a thunking layer (including dealing with mode changes, argument passing, ABI differences etc.). - Actinochemistry
Oops, most of the files are mcmodel=kernel (my mistake), so I'm not sure if the idea above would still have merit. - Actinochemistry
Maybe the opposite? Convert the 32-bit objects to elf64-x86-64? - Iphigenia
You could do that (with mcmodel=small it made sense the other way). Not sure if relocation would work, but objcopy will allow you to go from elf32 to elf64 as well. - Actinochemistry
That seems to work. (I mean, it links without errors and the code looks right.) Thanks! - Iphigenia
