The Miracle of Booting

No, not that kind of booting.  I’m talking about the process by which a modern computer transforms from a lifeless hunk of sand and metal to an electronic servant ready to bow to your every whim.

This is highly PC-centric but I’ll try to throw in some old SPARC stuff too (and some VERY old AT&T mini stuff as well).

Step 0:  The system gets power

Plugged In

This is step 0 because in older systems, “getting power” meant zip – nothing happened until you moved the switch from “off” to “on” (or “diagnostics” as was the case in AT&T minis).  In modern (ATX) PCs and servers, a lot happens when the power supply(-ies) get(s) power.  When the (a) power supply senses power on its AC-input, it does an internal self-test and then begins supplying 5V on a standby rail to the motherboard.  Once the motherboard senses this 5V rail, it does a pre-startup check to make sure that it has a CPU and at least one stick of RAM, even if that RAM may be misconfigured.  In servers and high-end workstations, the motherboard checks for the presence of a second (third, fourth, etc.) power supply and initializes it in standby mode as well.  Battery-backed SCSI and RAID controllers begin hot-scanning their buses for any cold-boot devices.  The motherboard powers up the 5V standby rail on the USB controller, allowing any sensitive USB devices to accomplish their pre-init checks as well.  This is all before anyone pushes a power button.

Step 1:  Power-on

On

Back in the Day, systems didn’t have fancy “hot power on” buttons like they do now, thanks to ATX power.  They just had breakers that controlled all the system power.  To turn on the system, you simply moved the switch from “off” to “on.”

Nowadays, with ATX power supplies and motherboards, pushing the power button temporarily closes a circuit on the motherboard, signalling that you wish the system to change power state.  The power management controller then checks to see what state the system is currently in, then checks the information in the flash EEPROM to see what the desired state is.  In other words, it has to figure out what you intend to happen when you press the power button.  Since it’s in the “warm standby” state, the power management controller initiates the power-on logic.
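The decision the power management controller makes on a button press can be sketched as a tiny lookup. This is only an illustration: the state names and the `BUTTON_ACTION` table are made up for the example (standing in for whatever is stored in the flash EEPROM), not taken from any real firmware.

```python
from enum import Enum


class PowerState(Enum):
    STANDBY = "warm standby"
    ON = "on"
    SUSPENDED = "suspended"


# Hypothetical stand-in for the configuration kept in the flash EEPROM:
# for each current state, the state a button press should lead to.
BUTTON_ACTION = {
    PowerState.STANDBY: PowerState.ON,
    PowerState.ON: PowerState.STANDBY,
    PowerState.SUSPENDED: PowerState.ON,
}


def press_power_button(current: PowerState) -> PowerState:
    """Return the state the system should move to after a button press."""
    return BUTTON_ACTION[current]


print(press_power_button(PowerState.STANDBY))
```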

Step 2:  POST

64Kx8 text, 8Kx1 RAM Flash EEPROM

My favorite AT&T mini ever, the 3B2 had a 3-way power switch:  “Off,” “Diagnostics,” and “On.”  If you went straight to “On” from “Off,” the machine would get power and spin up all the disks and fans but wouldn’t boot (or even initialize any logic).  You had to move the switch to “Diagnostics” first and let it light the pretty orange “Diagnostics” LED.  It would then check the ALU, service processors, math coprocessors, FPU (if present), disk controllers, terminal controllers, modems, and finally probe for a system console.  When it all finished, it would print “Diagnostics Passed” on the system console and turn off the orange “Diagnostics” LED.  If all went well.  If not, it would print any error output to the terminal and blink the orange LED until someone did something about it.  But that almost never happened and 99% of the time we got the “Diagnostics Passed” message.  We were then able to move the switch to the “On” position, which would begin the boot process.

It took the 3B2 (at 22MHz) about 5-7 minutes to complete a POST (Power-On Self Test).  It takes modern systems less than 20 seconds (except servers or high-end workstations with multiple SCSI or RAID controllers).  The POST is a critical time for a system.  It’s when the system checks itself to see what kind of hardware it has and makes sure everything it found is in working condition.  The 3B2 and most other old minis had a separate POST controller (actually a separate daughterboard from the system backplane) that took care of the diagnostics.  Modern systems have the POST logic split between the power management controller and the system BIOS, both of which are built into the motherboard.

POSTing a modern desktop PC is pretty complicated but boils down to a few simple steps:

  1. Check system bus components (RAM, DMA Controller, Northbridge, power-management controller)
  2. The Northbridge’s own POST sequence includes detecting and initializing video adaptors, as well as sending a power-on signal to the disk controllers (which, in turn, send a power-up signal to the disks attached to them, which then do their own internal POSTs).
  3. The DMA Controller’s POST sequence involves writing a zero to all of the physical addresses in RAM.
  4. Read the BIOS configuration from the flash EEPROM.
  5. Configure the DMA Controller, disk controllers, USB controller, PCI bus, etc. based on the values in the system BIOS settings.
  6. Probe for disk drives (if the BIOS config says to).
  7. Probe for network cards and set them up (if the BIOS config says to).

During the POST (and until the kernel image is loaded), x86 CPUs are in Real Mode (behaving as an 8086 with 20-bit memory addressing).
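For the curious, Real Mode addressing on the original 8086 combines a 16-bit segment and a 16-bit offset into a 20-bit physical address. Here is a minimal sketch of that arithmetic; the function name is mine, and the wrap-around at 1MB reflects the 8086 itself (later CPUs can behave differently depending on the A20 line).

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address on an 8086: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs can name the same physical byte.
print(hex(real_mode_address(0x0000, 0x7C00)))  # 0x7c00
print(hex(real_mode_address(0x07C0, 0x0000)))  # 0x7c00
# One byte past the top of the 1MB address space wraps back to zero.
print(hex(real_mode_address(0xFFFF, 0x0010)))  # 0x0
```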

UltraSPARC machines (32- and 64-bit) also have the concept of “Standby Power” but use an off-chip, on-board system management controller to handle the power-on signal.  Their CPUs have no equivalent of the x86 split between memory addressing modes; MMU flags control how data in memory is seen, such as its byte order (SPARC operates big-endian but supports little-endian operations for PCI buses).  On a Sun SPARC machine, the POST happens pretty much the same way as on a PC, except that the BIOS on a PC is far simpler than the OpenBoot PROM on a Sun, which is almost like a mini-OS in itself, reminiscent of the good old days of UNIX-in-firmware on AT&T minis.

Step 3:  BIOS Steps Aside

And she put sweet nothings in all my .conf files.  It'll take me forever to get X working again.

The System BIOS checks the settings saved in the Flash EEPROM to see what order it should try to boot the attached storage devices.  Back in the day, on my favorite old AT&T 3B2, it read the SCSI tape drive first automatically, then the SCSI floppy disk drive, then the first SCSI hard drive.  Even as late as, say, 1998, systems checked the first floppy drive, then the CD-ROM, then the first hard drive unless some other sequence had been configured in BIOS.  Nowadays, systems tend to default to the first hard drive most often, but any order of bootable devices is possible, including USB thumb drives, external hard drives, ethernet devices, and so on.  Most times, desktop PCs boot from hard disk.
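The boot-order logic boils down to “try each device in the configured order and take the first one that looks bootable.” A rough sketch, with made-up device names and a caller-supplied `is_bootable` check standing in for whatever the firmware really does:

```python
from typing import Callable, Optional, Sequence


def pick_boot_device(boot_order: Sequence[str],
                     is_bootable: Callable[[str], bool]) -> Optional[str]:
    """Return the first device in boot_order that reports itself bootable."""
    for device in boot_order:
        if is_bootable(device):
            return device
    return None  # nothing bootable: the firmware would complain here


# Hypothetical late-1990s default order: floppy, CD-ROM, first hard drive.
order = ["floppy0", "cdrom0", "hd0"]
# Pretend only the hard drive has bootable media in it.
print(pick_boot_device(order, lambda dev: dev == "hd0"))
```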

Sun SPARC machines don’t auto-boot by default unless no system console is detected during POST.  If that’s the case, it checks what the boot file is in OBP and tries to load that.  If it fails, it just gives up.  If a system console (or monitor and keyboard) is detected during POST, it displays a firmware prompt on the console.  OBP being pretty complex, you can do lots of things like set the system clock, specify additional screens, configure network devices, specify a default boot file, order additional or more extensive hardware tests, or anything else a system firmware would be expected to do.  But most of the time, we only use OBP to tell the system to load a program and begin executing it.

BIOS’s Last Stand in POST mode is to locate the Master Boot Record (MBR) of the desired boot device and load it into memory.  On a hard drive, the MBR is always Sector 0 and is 512 bytes long.  The MBR holds the Stage 1 Bootloader (along with the disk’s partition table), which is just enough code to locate the next boot stage and load it into memory.  After the MBR is loaded into memory at a fixed address (0x7c00 for PCs), the CPU is given the instruction to begin executing the code at that address.  At this point, the BIOS has finished executing its boot logic but has not yet been unloaded from memory.
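To make those 512 bytes a little less abstract, here is a sketch that pulls the partition table out of a classic PC MBR: 446 bytes of boot code, four 16-byte partition entries, and a 2-byte signature. The sector built at the bottom is fake data made up for the example, not read from a real disk.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446
# status, CHS of first sector, type, CHS of last sector, first LBA, sector count
ENTRY_FORMAT = "<B3sB3sII"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16 bytes


def parse_partition_table(sector: bytes) -> list:
    """Return the non-empty entries of a classic MBR partition table."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    entries = []
    for i in range(4):
        start = BOOT_CODE_SIZE + i * ENTRY_SIZE
        status, _chs_first, ptype, _chs_last, first_lba, count = \
            struct.unpack_from(ENTRY_FORMAT, sector, start)
        if ptype != 0:  # type 0 marks an unused slot
            entries.append({"active": status == 0x80, "type": ptype,
                            "first_lba": first_lba, "sectors": count})
    return entries


# Fake sector: empty boot code, one active Linux (0x83) partition, signature.
entry = struct.pack(ENTRY_FORMAT, 0x80, b"\0\0\0", 0x83, b"\0\0\0", 2048, 204800)
sector = bytes(BOOT_CODE_SIZE) + entry + bytes(3 * ENTRY_SIZE) + b"\x55\xaa"
print(parse_partition_table(sector))
```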

Step 4:  Primary Boot

A Primary Boot in the Real World

Before handing over control, the BIOS checks that what it loaded really is a boot sector.  It looks at the address 510 bytes after the load address (0x7dfe on a PC) for the two magic bytes 0x55 and 0xAA, which mark the end of the MBR.  If it finds these, it assumes that the MBR is valid and was copied correctly.  The MBR code’s only real job is to find the Secondary Bootloader, load it into memory, and start its execution.  Linux’s GRUB bootloader has a Stage 1 (primary boot) that does exactly this.
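The signature check and the address arithmetic behind it are simple enough to sketch. On disk the two bytes are 0x55 followed by 0xAA; read as a little-endian 16-bit word, they come out as 0xAA55, which is why you will see both spellings around. The constant and function names here are just for illustration.

```python
LOAD_ADDRESS = 0x7C00      # where a PC BIOS places the boot sector
SIGNATURE_OFFSET = 510     # last two bytes of the 512-byte sector
SIGNATURE = b"\x55\xaa"


def has_boot_signature(sector: bytes) -> bool:
    """True if the sector is at least 512 bytes and ends with the boot signature."""
    return len(sector) >= 512 and sector[510:512] == SIGNATURE


print(hex(LOAD_ADDRESS + SIGNATURE_OFFSET))        # 0x7dfe
print(hex(int.from_bytes(SIGNATURE, "little")))    # 0xaa55
print(has_boot_signature(bytes(510) + SIGNATURE))  # True
```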

Step 5:  Secondary Boot

A real life secondary boot

The main job of a secondary bootloader is to find the operating system (kernel), load it into memory, and start it executing.

The Secondary Bootloader can do neat things like display a nice fancy colored background (by interacting with the BIOS VESA driver) with a menu of possible operating systems to load, accept arguments to pass to the OS kernel, and other things.  GRUB, Windows, Solaris, and OpenSolaris take full advantage of this.  LILO and the BSD bootloaders don’t.

Once the user has made a choice (if given one), Stage 2 finds the compressed kernel image file on the disk (or in LILO and BSD cases, just goes to the specified disk sector) and loads it into memory, making sure to put the entire decompression routine and the compressed image header within the first 512K of memory.  It also loads the initial RAMDisk image (if specified) into memory and records its address in the kernel’s setup header, where the kernel will look for it later.

On a PC, the kernel image starts at 0x8000 and this is where Stage 2 jumps as its last instruction.

Step 6:  Decompress the Kernel

A Decompressed Kernel of the Organic Kind

Since Windows’ kernel code is still shrouded in mystery, I’ll focus on Linux here.  Linux systems in ancient times used a raw (uncompressed) kernel image called “vmlinux” that fit entirely below the 512K limit in memory and could just be jumped into directly.  Then, when kernels got more options and hardware drivers, they got to be too big to do this anymore.  So the developers started using compressed kernel images called “zImage.”  (The z refers to the gzip compression used, the same algorithm zlib implements.)  That way, the compressed image and decompressor code fit below the 512K limit.  Eventually, it was decided that 512K would not be enough to hold a compressed kernel image anymore and they wrote some code that allows part of the compressed image to be loaded below the 512K limit and the rest above.  This kernel image format is called “bzImage” (big zImage) and is the most common Linux kernel format used today.

The Linux Kernel (for an x86 machine) begins at line 107 of the file header.S.  It moves itself forward by two bytes (why? – line 111), sets some version values and signatures, checks for an initial RAMDisk image pointer, reads the command line arguments, resets the disk controller(s), zeros the base stack segment (why? – line 276), puts the CPU into Protected Mode, and calls the image decompressor at compressed/header.S.

The compressed header startup checks to see where in memory it’s loaded and decides how far to jump to the decompressor code.  It also checks to see if the compressed image is a 64 or 32 bit kernel if the CPU supports 64-bit mode.  If the kernel image is a 64 bit image, the CPU is put into Long Mode right away.  If not, it’s left in Protected Mode.  It then checks for the header of the compressed kernel image below the 512K limit and begins to copy it to the far end of the 1MB mark, working backwards.  It figures out where the kernel is supposed to start and how big it should end up.  It then calls the actual decompression routine in the file compressed/misc.c to decompress the kernel in place.  It decompresses front to back so that the end of the decompressed image overwrites the compressed image without screwing up the decompression logic.
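The “park the compressed image at the far end, then decompress front to back in the same memory” trick can be illustrated in a few lines. This is a toy sketch using Python’s zlib, not the kernel’s actual routine; the `slack` and `chunk` values are arbitrary assumptions, and the function raises an error rather than corrupting data if the output ever catches up with unread input.

```python
import zlib


def decompress_in_place(compressed: bytes, output_size: int,
                        slack: int = 4096, chunk: int = 256) -> bytes:
    """Decompress inside one buffer: read from the far end, write at the front."""
    buf = bytearray(output_size + slack)
    read_pos = len(buf) - len(compressed)
    if read_pos < 0:
        raise ValueError("buffer too small for the compressed image")
    buf[read_pos:] = compressed  # park the compressed image at the far end

    inflater = zlib.decompressobj()
    write_pos = 0
    while read_pos < len(buf):
        # Consume a piece of input first...
        piece = bytes(buf[read_pos:read_pos + chunk])
        read_pos += len(piece)
        out = inflater.decompress(piece)
        # ...so the output may safely reuse everything before read_pos.
        if write_pos + len(out) > read_pos:
            raise RuntimeError("output caught up with unread input; need more slack")
        buf[write_pos:write_pos + len(out)] = out
        write_pos += len(out)
    return bytes(buf[:write_pos])


# Made-up "kernel image" for the demonstration.
image = b"kernel " * 5000 + bytes(range(256)) * 20
restored = decompress_in_place(zlib.compress(image), len(image))
print(restored == image)
```

By the time the write position reaches the region where the compressed data was parked, that data has already been consumed, which is the whole point of the layout.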

Once the kernel is completely decompressed, it sets the base pointer to the address of the start_kernel() function in the newly-decompressed kernel image and jumps there, starting the actual Linux kernel.

Step 7:  Boot the Kernel

You see one Tux image for each CPU during Linux kernel startup

The kernel starts by loading all its compiled-in device drivers and probing the hardware.  If an initial RAMDisk was provided, it loads that into memory and uncompresses it.  It loads the loadable kernel modules from the initial RAMDisk image (or from the disk, if the hardware and filesystem drivers were compiled in), and looks for the program called /bin/init (or /sbin/init or /etc/init).  If it fails to find this program, the kernel panics and the system halts.  If the kernel wasn’t passed in the “S” or “1” parameters that would indicate a single-user startup, the kernel also writes disk check marker files onto all the filesystems for later.
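The hunt for init amounts to trying a short list of paths and panicking if none of them exists. A sketch, with the caveat that the exact order of candidates is my assumption and the `exists` parameter is only there so the example can run without a real root filesystem:

```python
import os

# Candidate paths from the text; the order shown is an assumption.
INIT_CANDIDATES = ("/sbin/init", "/etc/init", "/bin/init")


def find_init(candidates=INIT_CANDIDATES, exists=os.path.exists) -> str:
    """Return the first init program that exists, or 'panic' if there is none."""
    for path in candidates:
        if exists(path):
            return path
    raise RuntimeError("kernel panic: no init found")


# Pretend only /bin/init is present on this root filesystem.
print(find_init(exists=lambda path: path == "/bin/init"))
```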

Step 8:  Calling init

I don’t have a good picture for init – it’s a program that the kernel starts.  In fact, it’s THE program.  The kernel only starts a single userland process – init.  It’s responsible for controlling the state of the machine, launching and handling other processes (programs), and handling communication between processes.  It’s like the traffic cop of the OS.  Init reads the arguments passed to it by the kernel, reads the system startup scripts, and sets the appropriate environment variables.  If the system is destined for single-user mode (or interactive startup), init spawns a root shell (/bin/sh) on the system console and waits for it to exit before continuing.  Otherwise, it checks the disks that were scheduled for checking earlier by the kernel, starts all the userland programs scheduled to run at boot, starts a getty (TTY handler) on each of the 8 virtual terminals on the system console, opens the system to users, and waits for a login.
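The single-user decision is really just a scan of the arguments handed over at boot. A minimal sketch, assuming the “S” and “1” flags mentioned earlier and an invented example command line:

```python
SINGLE_USER_FLAGS = {"S", "1"}


def wants_single_user(kernel_args: str) -> bool:
    """True if any whitespace-separated argument asks for single-user mode."""
    return any(arg in SINGLE_USER_FLAGS for arg in kernel_args.split())


# Hypothetical command lines as a bootloader might pass them.
print(wants_single_user("root=/dev/sda1 ro S"))
print(wants_single_user("root=/dev/sda1 ro quiet"))
```

Note that whole arguments are compared, so the “1” inside “root=/dev/sda1” does not count.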

The system is now at a normal and fully-functional level.  I hope you’ve enjoyed this bewildering journey into the maze of twisty little passages, all alike, that is the boot process of a modern computer system.

  1. #1 by Chadwick on February 16, 2010 - 8:52 PM

    I lol’d at the decompressed kernel.

    • #2 by Joshua on February 17, 2010 - 1:59 PM

      Well, it makes sense – the kernel is compressed inside its hard shell and the shell cracks, allowing the steam to pull the protein and starch outward.

  2. #3 by Phillip on February 17, 2010 - 9:18 AM

    It was informative, but I prefer to just call it magic.

    • #4 by Matt on February 17, 2010 - 5:47 PM

      Magic, as brought to you by Phoenix

      • #5 by Phillip on February 18, 2010 - 2:35 PM

        If someone needs to be the source, it may as well be them. And besides, AMI just doesn’t sound right.

        • #6 by Matt on February 19, 2010 - 3:57 PM

          And it imprints a giant bird born of fire into your memory!

          Cause if any two things should mix, it’s your boot process and fire.

          • #7 by Phillip on February 20, 2010 - 3:54 PM

            Oh, undoubtedly. I enjoy having to sometimes prod my boot process along with fire. The PC doesn’t, but if it knew what was good for it, it wouldn’t make me prod it.
