How can linux boot code be written in C?

1

8

I'm new to OS development. The book I'm reading says that the first sector (the MBR) is copied to 0x7c00 and that execution starts from there in real mode.

Its examples start with 16-bit assembly code. But when I looked at today's Linux kernel, arch/x86/boot has 'header.S' and 'boot.h', while the actual code is implemented in main.c.

This seems useful because it avoids writing assembly. But how is this done specifically in Linux? I can roughly imagine that there might be special gcc options and a linking strategy, but I can't see the details.

Headachy answered 22/4, 2017 at 21:11 Comment(10)
GCC doesn't generate usable 16-bit code. It can generate usable 32-bit code that can run in 16-bit real mode as long as your processor is a 386 (or higher). .S files are assembly files (you will see a fair amount of assembly just to get to the point of calling main in C). header.S can be used during compilation to act as a real-mode bootloader. The real magic is in a linker script (in your case setup.ld) that properly builds the object with code and data in specific places. All of the C code has to be written to use BIOS routines or direct hardware access.Bursary
Most of the BIOS and hardware related functionality is in boot.hBursary
The code you're looking at isn't meant to be booted directly by the BIOS, which is what loads the MBR into 0000:7c00 and jumps to it. It's meant to be loaded and executed by something like GRUB, which will have its own code in the MBR. If the code you're looking at is actually loaded by the BIOS it won't boot; it will tell you to use a boot loader instead.Azzieb
I have an example of a basic bootloader that is done entirely in C and inline assembly but, like the Linux code, will only run on 386 (or later) processors. Most of the magic is in the linker script and the header files that provide some basic BIOS and port access. That can be found here: capp-sysware.com/misc/ircasm/gccboot-2stage . I do not recommend doing this unless you know exactly what you are doing. OpenWatcom's C compilers are much nicer since they still support 16-bit real mode code generation (you just need an appropriate linker to generate the final binary image).Bursary
Since you are doing OS development, I assume you are looking to write your own and were drawing inspiration from the Linux boot code. If you intend to write your own bootloader and want to do it in C, I'd recommend OpenWatcom C as it can generate proper 16-bit real mode code. Since simple bootloaders aren't generally all that big, writing them in pure assembly isn't all that cumbersome either. If you don't want to write a bootloader and want to use something like GRUB, then it can be done with a minimum of assembly language code.Bursary
@MichaelPetch I'm not sure what the purpose of your comments that were made in response to mine are. If you're not saying you disagree with my claim that the code isn't meant to be booted by the BIOS directly then what are you trying to say?Azzieb
@RossRidge : I have removed the earlier comments except the ones to the OP. Here is what I will say to you: There is in fact real mode code in main.c (in the Linux code base) that can be used to create realmode bootloaders (in general), however modern Linux kernels will toss up an error message when created as an MBR. I believe from the wording of this question that it is really an X-Y problem. I believe (and I hope the OP clarifies) that he is really asking whether a bootloader for OS development (which he does say he is doing) can be written with C.Bursary
He mentions GCC specifically. I surmise that he was trying to draw inspiration from the Linux code to create his own bootloader for his own OS. I think he was using Linux as an example code that seemed to be doing just that. I hope they try to clarify what they are really trying to askBursary
If you're starting your own OS development today, it will take several man-years to get anywhere, really. That means you shouldn't really bother with BIOS booting in the first place and go for UEFI booting, as BIOS boot mechanisms are mostly for legacy support and will likely be phased out by the time your OS "gets there". One of the effects of UEFI is that you don't need any 16-bit bootloader code in the first place.Slush
I appreciate all the comments. They were all constructive. Honestly, I only have college-level OS class experience, but I wanted to understand Linux better. I started from boot, but obviously I underestimated the difficulty of the subject. What I am looking for is a simpler version of the Linux kernel for educational purposes. Anyway, I will rethink the boot process. Thank you to all who answered/commented.Headachy
9

I'm reading this question more as an X-Y problem. It seems to me the question is really about whether you can write a bootloader (boot code) in C for your own OS development. The simple answer is YES, but it is not recommended. Modern Linux kernels are probably not the best source of information for creating bootloaders written in C unless you already understand what their code is doing.

If using GCC there are restrictions on what you can do with the generated code. In newer versions of GCC there is an -m16 option that is documented this way:

The -m16 option is the same as -m32, except for that it outputs the ".code16gcc" assembly directive at the beginning of the assembly output so that the binary can run in 16-bit mode.

This is a bit deceptive. Although the code can run in 16-bit real mode, the code generated by the back end uses 386 address and operand prefixes to make normally 32-bit code execute in 16-bit real mode. This means the code generated by GCC can't be used on processors earlier than the 386 (like the 8086/80186/80286 etc). This can be a problem if you want a bootloader that can run on the widest array of hardware. If you don't care about pre-386 systems then GCC will work.

Bootloader code that uses GCC has another downside. The address and operand prefixes that get added to many instructions add up and can make a bootloader bloated. The first stage of a bootloader is usually very constrained in space, so this could potentially become a problem.

You will need inline assembly or assembly language objects with functions to interact with the hardware. You don't have access to the Linux C library (printf etc) in bootloader code. For example if you want to write to the video display you have to code that functionality yourself either writing directly to video memory or through BIOS interrupts.
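
As a minimal sketch of the "write directly to video memory" approach: in the standard VGA color text mode each screen cell is a 16-bit value, with the character in the low byte and the attribute in the high byte. The function names here (`vga_cell`, `vga_write`) are my own, and the buffer is passed in as a parameter so the logic can be tried on an ordinary array. In a real bootloader the buffer would be the text-mode memory at physical address 0xB8000, and how you obtain a usable pointer to it depends on your segment setup, which this sketch deliberately leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VGA_COLS 80
#define VGA_ROWS 25

/* Build one text-mode cell: low byte = character, high byte = attribute. */
static inline uint16_t vga_cell(char ch, uint8_t attr)
{
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)ch);
}

/* Write string s into the text buffer starting at cell index pos.
 * Stops at the end of the string or the end of the 80x25 screen.
 * Returns the cell index just past the last character written. */
size_t vga_write(volatile uint16_t *buf, size_t pos, const char *s, uint8_t attr)
{
    assert(buf != NULL && s != NULL);
    while (*s != '\0' && pos < (size_t)VGA_COLS * VGA_ROWS) {
        buf[pos++] = vga_cell(*s++, attr);
    }
    return pos;
}
```

Attribute 0x07 is light grey on black, so `vga_write(buf, 0, "Hi", 0x07)` stores 0x0748 and 0x0769 in the first two cells.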

To tie it all together and place things in a binary file usable as an MBR you will likely need a specially crafted linker script. In most projects these linker scripts have an .ld extension. The script drives the process of taking all the object files and putting them together in a fashion that is compatible with the legacy BIOS boot process (code that runs in real mode at 0x07c00).
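
The layout the linker script has to produce can be sketched in plain C. This is not a replacement for a linker script, just a host-side illustration of the constraints: the sector is 512 bytes, the last two bytes must be the boot signature 0x55 0xAA, and real-mode addresses are formed as segment * 16 + offset. The function names (`real_mode_linear`, `make_boot_sector`) are hypothetical. The sketch treats all 510 bytes before the signature as code space; a partitioned disk also keeps its partition table at offset 446, which leaves less room.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define SIG_OFFSET  510   /* boot signature occupies bytes 510 and 511 */

/* Real-mode address translation: linear = segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Fill a 512-byte sector image: code at the start, zero padding, then
 * the 0x55 0xAA signature. Returns 0 on success, -1 if the code does
 * not fit in front of the signature. */
int make_boot_sector(uint8_t sector[SECTOR_SIZE], const uint8_t *code, size_t len)
{
    assert(sector != NULL);
    if (len > SIG_OFFSET || (len > 0 && code == NULL)) {
        return -1;
    }
    memset(sector, 0, SECTOR_SIZE);
    if (len > 0) {
        memcpy(sector, code, len);
    }
    sector[SIG_OFFSET]     = 0x55;
    sector[SIG_OFFSET + 1] = 0xAA;
    return 0;
}
```

Note that 0000:7c00 and 07c0:0000 both translate to linear address 0x7c00, which is one reason boot code should not assume a particular segment value on entry.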

There are so many pitfalls in doing this that I recommend against it. If you intend to write a 32-bit or 64-bit kernel, I'd suggest not writing your own bootloader and using an existing one like GRUB instead. Linux versions from the 1990s had their own bootloader that could be executed from a floppy. Modern Linux relies on third-party bootloaders to do most of that work now. In particular it supports bootloaders that conform to the Multiboot specification.

There are many tutorials on the internet that use GRUB as a bootloader. The OS Dev Wiki is an invaluable resource. They have a Bare Bones tutorial that uses the original Multiboot specification (supported by GRUB) to bootstrap a basic kernel. A kernel can easily be made Multiboot-compliant with a minimum of assembly language code. Multiboot-compatible bootloaders will automatically place the CPU in protected mode, enable the A20 line, can be used to get a memory map, and can be told to place you in a specific video mode at boot time.


Last year someone on the #Osdev chat asked about writing a 2-stage bootloader located in the first 2 sectors of a floppy disk (or disk image) developed entirely in GCC and inline assembly. I don't recommend this as it is rather complex and inline assembly is very hard to get right. It is very easy to write bad inline assembly that seems to work but isn't correct.

I have made available some sample code that uses a linker script and C with inline assembly to call the BIOS interrupts that read from the disk and write to the video display. If anything, this code should be an example of why it's non-trivial to do what you are asking.

Bursary answered 22/4, 2017 at 23:17 Comment(0)
