©1994-2004 Kevin Boone
Understanding the Linux boot process
Last modified: Thu Jul 8 11:41:10 2004

This document explains in moderate detail what happens when a Linux system starts up. As far as possible, I have tried to separate features which are specific to the various Linux distributions from those that are generic. Where this isn't possible -- because the explanation would be too convoluted -- I have used the RedHat set-up as an example. In addition, I have tended to focus on the Intel/PC platform, for the same reason.

To make the process manageable, I have broken it into four stages: the `firmware' stage, the `bootloader' stage, the `kernel' stage, and the `init' stage. These are my names, and they aren't necessarily used by other Linux users. Moreover, it isn't always easy to separate the `firmware' stage from the initial operations of the bootloader. On the PC platform, the firmware is so unintelligent that a separate (software) bootloader is required. On other platforms, notably Sparc machines, the firmware is quite sophisticated, and may be able to load a kernel directly.

Stage 1 (firmware stage)
The purpose of a bootloader is to get at least part of the operating system kernel into memory and running. After that, the kernel can take over the process. However, unless the bootloader is in firmware, to run the bootloader we must first retrieve it, from disk or wherever else it is stored. The purpose of the firmware stage, therefore, is to get a bootloader into memory and run it.

On the Intel/PC platform, the firmware stage (which does not depend on the operating system) is governed by the BIOS. Most modern PCs (and other types of computer, of course) can boot from floppy disk, hard disk, or CD-ROM. It is common for Sparc-based systems to have built-in network bootloaders in firmware but, at present, this is unusual in the PC world.
The BIOS typically provides a mechanism by which the operator can choose the devices that will be used to boot, and it will probably be prepared to try more than one if necessary. The process is slightly different for the different media types.

Bootloader on floppy disk or hard disk
This is usually the simplest situation. On a floppy disk, the first sector is reserved as the boot sector. It must contain executable program code. The BIOS loads the boot sector into memory and then runs it. This process is largely the same whatever the hardware platform.

The situation is similar for PC hard disks, except that it is conventional to divide the hard disk into partitions, and to provide a boot sector for each partition. In the world of DOS, the boot sector was, and remains, combined with the partition table; the partition table controls how much space is allocated to each partition. In addition to the partition boot sectors there is an overall boot sector/partition table called the `master boot record' (MBR). When booting from a hard disk formatted this way, the PC BIOS loads the MBR and executes it as a boot sector; the code in the MBR will then find which partition to boot from, and load and run the boot sector from that partition.

Linux has no need to follow the partitioning convention that is meaningful to DOS/Windows, but if the hard disk is to be used with more than one operating system then it is a good idea to do so. So, when booting from a hard disk the Linux bootloader can be placed in the MBR, or in a partition boot sector. In the latter case, it won't be the BIOS that loads the Linux bootloader; it will be the code in the master boot record.

Whether the boot disk is a hard disk or a floppy disk, the first stage of the boot process finds a boot sector, which will contain the Linux bootloader, and runs it.

Bootloader on CD-ROM
The ability to boot from a CDROM has been commonplace on most platforms for some years.
On some platforms a bootable CDROM has the same structure as a bootable hard disk: a boot sector followed by a load of data. A structure like this is unworkable for PCs, owing to limitations in the BIOS specification. Most modern PCs are, however, able to boot from a CDROM formatted according to the El Torito specification. This process is far more complex than it ought to be.

Because the BIOS can't cope with a full-sized bootable Linux filesystem on a CDROM, El Torito requires that the CDROM be provided with an additional bootable filesystem. This filesystem is considered to be `outside' the normal data area of the CDROM, and won't be visible if the CDROM is mounted as a filesystem in the usual way. In fact, although the CDROM itself will normally be formatted with an ISO9660 filesystem, the El Torito bootable image can be of any filesystem type. In practice, the bootable image will be formatted as a floppy disk: a boot sector followed by a filesystem.

When booting from the CDROM, the BIOS finds the bootable filesystem image, loads the boot sector, and makes the rest of the image available through BIOS calls just as it does for a floppy disk. As far as the bootloader is concerned, therefore, the BIOS treats a bootable CDROM as an ordinary CDROM with an `embedded' bootable floppy disk. Booting from CDROM is therefore just like booting from a floppy disk in practice.

With Linux, this embedded floppy disk is usually formatted with an ext2 filesystem. As with a floppy disk,
this filesystem will either become the root filesystem for
the next phase of the boot process, or will supply a
new, compressed filesystem which will be loaded into
memory as a `ramdisk' (see below).
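To make the El Torito arrangement a little more concrete, here is a minimal Python sketch that checks whether a CD image carries an El Torito boot record and, if so, reports where its boot catalog lives. It relies on the published layout (boot record in sector 17, 2048-byte sectors, catalog pointer at byte offset 71); the function name `find_boot_catalog` is my own invention for illustration, and the sketch does not go on to parse the catalog or the embedded floppy image.

```python
import struct

SECTOR_SIZE = 2048          # logical sector size of an ISO9660 CD-ROM
BOOT_RECORD_SECTOR = 17     # El Torito places its boot record in this sector
EL_TORITO_ID = b"EL TORITO SPECIFICATION"


def find_boot_catalog(image: bytes):
    """Return the boot catalog's sector number, or None if no boot record is found.

    `image` is the raw content of the CD (for example, an .iso file read
    into memory). This is an illustrative helper, not part of any library.
    """
    start = BOOT_RECORD_SECTOR * SECTOR_SIZE
    record = image[start:start + SECTOR_SIZE]
    if len(record) < SECTOR_SIZE:
        return None
    # Byte 0 is the descriptor type (0 = boot record); bytes 1-5 are "CD001".
    if record[0] != 0 or record[1:6] != b"CD001":
        return None
    # Bytes 7-38 hold the boot system identifier, padded with zero bytes.
    if record[7:39].rstrip(b"\x00") != EL_TORITO_ID:
        return None
    # Bytes 71-74 hold the little-endian sector number of the boot catalog.
    (catalog_sector,) = struct.unpack_from("<I", record, 71)
    return catalog_sector
```

The boot catalog in turn points at the embedded bootable image, which is the part the BIOS presents to the bootloader as if it were a floppy disk.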
The diagram below shows the structure of a typical Linux bootable
CD-ROM (but this isn't the only way to do it). The areas aren't to
scale, of course: the volume descriptors, etc., are only one sector
in length, but the filesystems will be many thousands of sectors.
Notice that there is a complete
Bootloader retrieved from network
The problem with booting from a network is that the functionality must be supplied in firmware, because if there is no hard disk, there is no practical place to load network-boot software from. Most PCs do not contain firmware this sophisticated, although some network adaptors have this functionality. Sparc-based workstations generally do have network boot functionality, in the OpenBoot firmware, and it is quite comprehensive. Note that there is nothing to stop a PC getting a bootloader with network capabilities from, say, a hard disk or CDROM and then using this to complete the boot process over the network. However, this is not network booting in the sense I am describing here.

To get a bootloader via the network, the workstation must first of all decide where to get it from. This may be configurable at the firmware level or, more often, the workstation will issue a broadcast, and then select a boot server from the replies. Sun Sparc systems typically make a RARP request, broadcasting their hardware MAC address (`Ethernet address'). The reply from the server will contain the IP address assigned to the workstation, and that of the server itself. The workstation then uses the server's IP address as the target for a TFTP download. Whether this download retrieves a network-aware bootloader, or a whole kernel, varies from one system to another. Some Sparc systems are able to TFTP a Linux kernel and load it; others require the retrieval of a network-aware bootloader which then retrieves the kernel (this is how Linux can be made to run on the Sun Javastation network appliance, which has somewhat stunted firmware).

Stage 2 (bootloader stage)
So we've got a bootloader into memory, from disk or network, and it can be executed. Its job will be to get the kernel into memory, again either from disk or network, and execute it.
The bootloader will have to supply various vital pieces of information to the kernel, crucially the location of its root filesystem.

There are a number of bootloaders available for Linux: on the Intel/PC platform we have LILO and GRUB; on Sparc we have SILO. LILO is probably the best known, and has existed since the earliest days of Linux. SILO is essentially the Sparc port of LILO. GRUB is a much more sophisticated proposition.

LILO
LILO is a very rudimentary, single-stage bootloader. It has little or no knowledge of Linux, and does not understand the structure of any filesystem. Instead, it reads from the disk using BIOS calls, supplying numerical values for the locations on disk of the files it needs. Where does it get these values from? It has no way to figure them out at run-time, so the LILO installer has to supply them in the form of a `map' file. The LILO installer is a utility called lilo ; this utility reads
a configuration file and builds the map file from
it. The location of the map file is then supplied to
the boot sector that lilo installs.
The bootloading process with LILO thus looks something like this.
lilo ).
The LILO configuration file (usually /etc/lilo.conf )
takes the names of files and devices as its inputs, but
these names are never passed through to the boot sector being
created. The files and devices referenced are simply analysed
for their numerical offsets. For example, if lilo.conf
contains the line
    root=/dev/cdrom
and /dev/cdrom is a symbolic link to the
real device file (perhaps /dev/hdc ), it
is important to understand that all lilo
will store is the major and minor device identifiers
of /dev/hdc . It is easy to imagine that
if the bootable filesystem you are building contains
a file called /dev/cdrom , and that is
a link to, say, /dev/hdd , then the root
filesystem will be found on /dev/hdd.
But it won't; LILO does not understand filesystems,
and the names in the configuration file are simply
rendered down to device IDs and file sector locations.
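The point about names being rendered down to device IDs can be seen from user space. The following is a rough Python sketch, not anything taken from lilo itself: it resolves a device name in the filesystem that is mounted when it runs, following symbolic links, and reports the major and minor numbers that remain once the name has been discarded. The helper names are hypothetical.

```python
import os


def split_device_number(rdev: int):
    """Split a combined device number into its (major, minor) parts."""
    return os.major(rdev), os.minor(rdev)


def device_ids(path: str):
    """Return (major, minor) for the device node that `path` finally refers to.

    os.stat() follows symbolic links, so a link such as /dev/cdrom is
    resolved to whatever it points at in the filesystem mounted now,
    not in the filesystem that will be mounted at boot time.
    """
    return split_device_number(os.stat(path).st_rdev)


if __name__ == "__main__":
    # /dev/null exists on any Linux system; a path such as /dev/cdrom
    # may or may not exist on yours.
    print(device_ids("/dev/null"))
```

Only the pair of numbers survives into the boot sector, which is why a different /dev/cdrom link inside the bootable filesystem has no effect.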
GRUB
GRUB is a very different bootloader from LILO. It has a two-stage or three-stage operation, and has network boot capabilities (of course, the network boot facilities don't give you a way to get GRUB itself loaded: you'll still need network boot firmware).

The additional sophistication of GRUB means that it can't easily fit into a single boot sector. It therefore uses a multiple-stage process to load successively larger amounts into memory. In so doing it becomes able to understand filesystems, so the kernel itself, and the other files GRUB uses, can be specified dynamically at boot time; there is no need for explicit numerical maps such as the ones that LILO uses. In brief, the GRUB boot process looks like this.
Multiple-boot machines
Because Linux was designed to be able to co-exist with other operating systems, the bootloader should be able to boot other operating systems on a hard disk as well as Linux. In practice this is relatively straightforward, as each of the other operating systems will have its own boot sector. All the Linux boot loader has to do is to locate the appropriate boot sector, and execute it. After that, the process will be under the control of the other system's bootloader. LILO, GRUB, and SILO all have this functionality.

Stage 3 (kernel stage)
By the time this stage begins, the bootloader will have loaded the kernel into memory, configured it with the location of its root filesystem, and loaded the initial ramdisk, if supplied. How we proceed from here depends to a large extent on whether we are using an initial ramdisk or not.

So why is an initial ramdisk such a big deal? Well, the concept arose from attempts to solve the problem of fitting a fully bootable Linux system onto a single floppy disk. The problem is that a Linux system that will boot as far as giving a shell, and offering a few basic utilities, needs about 8 MB -- far too much to fit onto a floppy. However, such a system will in practice compress down to about 2 MB using gzip compression, so if the root filesystem could be compressed, we could get a working system on two standard floppies, or a single 2.88 MB floppy.

Another problem that had to be solved was that of booting from a floppy disk and then mounting a root filesystem from a device other than an IDE drive. SCSI drives were particularly problematic: if the kernel was compiled to include all the necessary drivers, it would not fit onto a floppy disk. However, the initial ramdisk technique allows the drivers to be supplied as loadable modules, which can be compressed.

In outline, an initial ramdisk is a root filesystem that is unpacked from a compressed file.
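As a loose illustration of the `compressed file' idea, and nothing more, the Python sketch below packs a filesystem image with gzip and unpacks it again, recognising the compressed form by the two-byte gzip signature. The function names are made up for this example, the kernel's own unpacking code is not written this way, and how much a real root filesystem shrinks depends entirely on what is in it.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"    # every gzip stream starts with these two bytes


def pack_image(raw_image: bytes) -> bytes:
    """Compress a raw filesystem image, as one would when building a ramdisk."""
    return gzip.compress(raw_image)


def unpack_image(blob: bytes) -> bytes:
    """Return the raw image, decompressing only if the blob looks gzipped."""
    if blob[:2] == GZIP_MAGIC:
        return gzip.decompress(blob)
    return blob


if __name__ == "__main__":
    raw = bytes(1024 * 1024)            # a dummy 1 MB image full of zeros
    packed = pack_image(raw)
    print(len(raw), len(packed))        # a zero-filled image compresses very well
    assert unpack_image(packed) == raw
```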
The boot loader will load the compressed version into memory, then the kernel uncompresses it and mounts it as the root filesystem. In this way we can get an 8 MB root filesystem onto a 2.88 MB file. Initial ramdisks are also useful on bootable CDROMs, because the bootable part of the CDROM is typically implemented as an `embedded' floppy disk.

Stage 3a (common kernel stage)
Whether or not we are using an initial ramdisk, the kernel will begin initializing itself and the hardware devices for which support is compiled in. The process will typically include the following steps.
kswapd and its
associates are not processes, they are kernel threads).
Conventionally this process is /sbin/init , although the
choice can be overridden by supplying the init= parameter
to the kernel at boot time. The init process runs with
uid zero (i.e., as root ) and will be the parent of all
other processes.
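The choice of first process can be sketched in a few lines of Python. This is only an illustration of the rule, assuming the kernel's `init=` command-line parameter and the conventional /sbin/init default; the function name `choose_init` is hypothetical, and the real kernel also tries a few fallback locations that are ignored here.

```python
def choose_init(cmdline: str, default: str = "/sbin/init") -> str:
    """Pick the program to run as the first process from a kernel command line."""
    chosen = default
    for word in cmdline.split():
        # Only a word that begins with "init=" counts; "rdinit=" does not.
        if word.startswith("init="):
            chosen = word[len("init="):]
    return chosen


if __name__ == "__main__":
    print(choose_init("root=/dev/hda1 ro"))               # /sbin/init
    print(choose_init("root=/dev/hda1 ro init=/bin/sh"))  # /bin/sh
```

On a running Linux system the command line the kernel was actually given can be read from /proc/cmdline and passed to this function.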
Note that
Stage 3b (ramdisk kernel stage)
This stage is only relevant if we are using an initial ramdisk. In this case, the kernel won't invoke init ,
but will proceed as follows.
/linuxrc need not mount a new root filesystem over
the top of the ramdisk root, nor need it load init .
These activities are simply conventions. For example, in order
to boot a full Linux system from a CDROM, a workable proposition
is to retain the initial ramdisk as the root filesystem, and
have /linuxrc mount the CDROM at, say, /usr .
This allows the root filesystem to be read-write; if we mounted
the CDROM at / , the root filesystem would be read-only,
and we would have
to create a separate ramdisk and have a bunch of symbolic links
from the CDROM to parts of that ramdisk.
Similarly, a `rescue' disk -- floppy or CDROM -- would probably not
want to invoke
If we are using
Stage 4 (init stage)
By now the kernel is loaded, memory management is running, some hardware is initialized, and the root filesystem is in place. All subsequent operations are invoked -- directly or indirectly -- by init .
This process takes its instructions -- again
by default -- from the file /etc/inittab.
inittab specifies at least three important pieces
of information.
rc.sysinit ) is run first, then the runlevel
scripts. The division of work between rc.sysinit
and the runlevel scripts is entirely a convention. If you are
building a custom Linux system you don't have to follow
this convention. In fact, you don't even have to run init
if it doesn't do what you need.
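To show how little structure an inittab entry has, here is a small Python sketch that splits entries into their four colon-separated fields (identifier, runlevels, action, process) and finds the default runlevel. The sample text is an assumption modelled on a RedHat-style inittab, not a copy of any real file, and the names `InittabEntry`, `parse_inittab` and `default_runlevel` are invented for this example.

```python
from typing import List, NamedTuple, Optional


class InittabEntry(NamedTuple):
    ident: str       # short unique label for the entry
    runlevels: str   # runlevels in which the entry applies, e.g. "5" or "2345"
    action: str      # e.g. initdefault, sysinit, wait, respawn
    process: str     # command line that init will run


def parse_inittab(text: str) -> List[InittabEntry]:
    """Parse inittab text, skipping blank lines, comments and malformed lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":", 3)   # the process field may itself contain colons
        if len(fields) == 4:
            entries.append(InittabEntry(*fields))
    return entries


def default_runlevel(entries: List[InittabEntry]) -> Optional[str]:
    """Return the runlevel named by the initdefault entry, if there is one."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None


# Assumed sample, in the style of a RedHat inittab.
SAMPLE = """\
# default runlevel
id:5:initdefault:
si::sysinit:/etc/rc.d/rc.sysinit
l5:5:wait:/etc/rc.d/rc 5
x:5:respawn:/etc/X11/prefdm -nodaemon
"""

if __name__ == "__main__":
    for entry in parse_inittab(SAMPLE):
        print(entry)
    print("default runlevel:", default_runlevel(parse_inittab(SAMPLE)))
```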
Stage 4a (rc.sysinit)
This script or executable is responsible for all the one-off initialization of the system. Linux distributions differ in the division of work between this script and the runlevel scripts but, in general, the following initialization steps are likely to be carried out here.
Stage 4b (runlevel scripts)
Let's assume that we will be entering runlevel 5 which, by convention, gives us a graphical login prompt under the X server. A typical inittab will have entries
like this:
    l5:5:wait:/etc/rc.d/rc 5
    x:5:respawn:/etc/X11/prefdm -nodaemon
The first line says that on entry to runlevel 5, invoke a script called rc , passing the argument `5'.
The second line says that on entry to runlevel 5, run the
script /etc/X11/prefdm -nodaemon .
This latter script is somewhat beyond the scope of this
article, being in the realm of X display management. In
outline, prefdm is a script inserted by
the RedHat installer. It contains code that will launch
the X display manager selected by the user, either at install
time or using a configuration utility. The reason it works this
way is so that configuration utilities don't have to mess about
with inittab , which is a bad file to mess up if
you want your system to keep working. The X display manager
will typically invoke the X server (i.e., the graphical display)
on the local machine and give you a login prompt.
But back to the `real' boot process...
The script
So, for example, when entering runlevel 5, somewhere near the
beginning of the
    S12syslog start
On shutdown, somewhere towards the end of the shutdown process, we will do
    K12syslog stop
which is, in fact, an invocation of
    S12syslog stop
Inside the script S12syslog -- and most
of the other scripts in that directory -- you will find both
initialization and finalization code.
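The naming convention can be summarised in a short Python sketch. It assumes the usual SysV arrangement, in which the rc script runs the K-prefixed entries of a runlevel directory with the argument `stop' and then the S-prefixed entries with `start', each group in name order; it only works out the plan and runs nothing. The function name `plan_runlevel` and the entry K20example are hypothetical.

```python
def plan_runlevel(script_names):
    """Return (script, argument) pairs in the order rc would invoke them.

    Names that start with neither K nor S are ignored. Sorting by name is
    what makes the two-digit number in each name act as a priority.
    """
    kills = sorted(name for name in script_names if name.startswith("K"))
    starts = sorted(name for name in script_names if name.startswith("S"))
    return [(name, "stop") for name in kills] + [(name, "start") for name in starts]


if __name__ == "__main__":
    # K20example is a made-up entry; S12syslog and S99local are the names
    # used in the text.
    for script, argument in plan_runlevel(["S99local", "K20example", "S12syslog"]):
        print(script, argument)
```

Because a K entry is normally just a link to the same file as the corresponding S entry, one script ends up handling both arguments.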
So what do these scripts do? Well, this depends on the runlevel,
and the distribution, and any customizations you have made.
A typical set of operations will include the following:
S99local . This is the conventional place to put
machine-specific initialization.
It is considered bad manners to customize any of the initialization
scripts that are supplied as part of a Linux distribution, simply
because other people who may have to manage the system will
have expectations about what is in them. Making arbitrary changes
here will defeat these expectations. However, everybody expects to
see machine-specific configuration in S99local .
Gotchas
It should be clear that the boot process on a fully-featured Linux system is fairly complex. You can simplify it a great deal if you are building a custom Linux system, or if you just want your machine to start up faster. However, there are a few things to watch out for when constructing a custom boot process.