How to Switch to the LILO Boot Loader in Debian GNU/Linux

The most current version of this document can be found at http://users.wowway.com/~zlinuxman/lilo.htm.

Contents

Disclaimer
Maintenance Log
Introduction to the Boot Process
A Brief History of Linux Boot Loaders for i386/amd64
Checking the Capabilities of the BIOS
Installing and Configuring LILO
Booting Non-Linux Operating Systems with LILO
MBR versus VBR or EBR
Creating Boot Floppies
Special Considerations for USB Floppy Drives
Using USB Memory Sticks Instead of a Boot Floppy
The vga Option and Linux Fonts
Disk Ids
Common Problems and Their Solutions
     MBR Program Can't Chain Load LILO from a VBR or EBR
     LILO's Installer Fails Due to "Corrupted" Partition Table Entry
     USB Keyboard Doesn't Work
     Duplicate Disk Ids
     Only Part of the Word "LILO" Written During Boot (And Other Boot Errors)
Conclusion

Disclaimer

This is not an official Debian site.  The author is not a lilo Debian package maintainer.  The author is not a lilo upstream maintainer.  This document details the author's experiences and recommendations with regard to installing and using lilo under Debian GNU/Linux on the i386 and amd64 architectures.  All opinions expressed are those of the author and do not necessarily represent the opinions or the official positions of Debian, lilo upstream, or any other organization or individual.  This information is presented in the hope that it will be useful, but without any warranty or guarantee of any kind.  This information is presented free of charge, free of support, free of service, and free of liability.  Take this information with as many grains of salt as you think it's worth; and use it, if you choose to do so, entirely at your own risk.  The author hereby explicitly places this material in the public domain.  All trademarks, registered trademarks, service marks, etc. are the property of their respective owners.

Maintenance Log

Introduction to the Boot Process

On the i386 platform, the boot process has historically been divided into two stages when booting from a partitioned device, such as a hard disk.  (When booting from a non-partitioned device, such as a floppy diskette, only one stage is involved.)  After the BIOS (Basic Input/Output System) has finished initializing, it reads the MBR (Master Boot Record) boot-loader program into storage and then transfers control to it.  This is stage one.  The MBR is the first sector of the hard disk, which has an LBA (Logical Block Address) value of 0 and a CHS (Cylinder:Head:Sector) value of 0:0:1, which means cylinder 0, head 0, sector 1.  Each sector is 512 bytes long.  Cylinders are numbered from 0 to 1,023 (0x0 to 0x3ff), heads (tracks) are numbered from 0 to 254 (0x0 to 0xfe), and sectors are numbered from 1 to 63 (0x1 to 0x3f).  LBAs are numbered from 0 to 4,294,967,295 (0x0 to 0xffffffff).  At a maximum, only the first 8.4G (8.4 gigabytes) of a hard disk is addressable by means of a CHS value.  With LBAs, one can address up to 2T (two terabytes).  Methods of handling larger disks will be discussed later.
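
The 8.4G figure follows directly from the CHS ranges given above.  Here is a rough sketch of the arithmetic in Python.  The function and constant names are made up for illustration, and the conversion formula is the conventional one for a fixed geometry (assumed here to be 255 heads and 63 sectors per track); it is not taken from LILO itself.

```python
SECTOR_SIZE = 512      # bytes per sector
MAX_CYLINDERS = 1024   # cylinders are numbered 0..1023
MAX_HEADS = 255        # heads are numbered 0..254
MAX_SECTORS = 63       # sectors are numbered 1..63 (note: they start at 1)


def chs_to_lba(c, h, s, heads=MAX_HEADS, sectors=MAX_SECTORS):
    """Convert a CHS triple to an LBA for the given (assumed) geometry."""
    if not (0 <= c < MAX_CYLINDERS and 0 <= h < heads and 1 <= s <= sectors):
        raise ValueError("CHS value out of range")
    return (c * heads + h) * sectors + (s - 1)


# Largest number of bytes reachable with a CHS value.
chs_limit_bytes = MAX_CYLINDERS * MAX_HEADS * MAX_SECTORS * SECTOR_SIZE

print(chs_to_lba(0, 0, 1))   # the MBR: CHS 0:0:1 is LBA 0
print(chs_limit_bytes)       # 8422686720, i.e. about 8.4G
```

The last addressable sector, CHS 1023:254:63, works out to LBA 16,450,559 under this geometry.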

Near the end of the MBR is the PT (Partition Table), which contains four entries, each of which is sixteen bytes long.  An entry consisting of all zeros is an unused entry.  Each used entry describes either a primary partition or an extended partition.  (There is a flag byte in each PT entry which identifies the partition type.)  There can be multiple primary partitions, but there can be at most one extended partition.  An extended partition is further subdivided into logical partitions, but information about logical partitions is not contained in the MBR.  Information about logical partitions is contained in a linked list of EBRs (Extended Boot Records), beginning with the first sector of the extended partition.  The general format of an MBR looks like this:
 

                 ┌───────────────────────────┐
           0x000 │         boot code         │
                 ├───────────────────────────┤
           0x1b8 │ disk signature (optional) │
                 ├───────────────────────────┤
           0x1bc │   usually nulls; 0x0000   │
                 ├───────────────────────────┤
           0x1be │      Partition Table      │
                 ├───────────────────────────┤
           0x1fe │   magic number (0xaa55)   │
                 └───────────────────────────┘

The "magic number" is the boot signature.  The boot signature is what tells the BIOS that an MBR boot-loader program is installed in the MBR.  The magic number is stored in little-endian format, in accordance with the way binary integers are stored on i386 hardware.  That is, the byte at offset 0x1fe is 0x55; and the byte at offset 0x1ff is 0xaa.  If the BIOS does not find this signature, a boot failure occurs.  For more detailed information about the format of an MBR, including the embedded PT, click here.  For information about the format of an EBR, click here.
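
To make the little-endian point concrete, here is a minimal sketch in Python that pulls the disk signature and the boot signature out of a 512-byte sector.  The offsets come from the diagram above.  The function name is hypothetical, and reading the four-byte disk signature as a little-endian integer is my assumption.

```python
import struct


def read_mbr_fields(sector):
    """Return (disk_signature, has_boot_signature) for a 512-byte sector."""
    if len(sector) != 512:
        raise ValueError("an MBR is exactly 512 bytes long")
    # "<I": little-endian unsigned 32-bit integer at offset 0x1b8
    # (assumption: the disk signature is read as one little-endian value)
    (disk_signature,) = struct.unpack_from("<I", sector, 0x1B8)
    # "<H": little-endian unsigned 16-bit integer at offset 0x1fe
    (magic,) = struct.unpack_from("<H", sector, 0x1FE)
    return disk_signature, magic == 0xAA55


# Build a synthetic sector rather than reading a real disk.
sector = bytearray(512)
sector[0x1B8:0x1BC] = struct.pack("<I", 0x12345678)
sector[0x1FE] = 0x55   # low-order byte is stored first ...
sector[0x1FF] = 0xAA   # ... high-order byte second

print(read_mbr_fields(bytes(sector)))
```

On a real system the sector would have to be read from the disk device as root; the synthetic sector avoids that.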

Each MBR PT entry has a "boot flag", which indicates whether or not that partition is "active", that is, marked to be booted.  A logical partition defined in an EBR PT entry also has a "boot flag".  The MBR boot-loader program searches for a partition that is marked active.  Historically, the MBR boot-loader program searches only the MBR's PT: it does not search the EBR chain for logical partitions which are marked active.  An exception will be discussed later.  If no active partition is found, a boot failure occurs.  Depending on which MBR boot-loader program is being used, a boot failure may also occur if more than one partition (of those searched) is marked active.  In other cases, the first partition found which is marked active is selected; and other partitions which are marked active are ignored.
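
The search for an active partition can be sketched as follows.  This is an illustration in Python, not the code of any real MBR boot-loader program.  The entry layout used here (boot flag in byte 0, with 0x80 meaning active; partition type in byte 4; starting LBA and sector count as little-endian 32-bit values in bytes 8-15) is the conventional MS-DOS layout and is an assumption as far as this page is concerned.  All names are made up.

```python
import struct

PT_OFFSET = 0x1BE   # the PT starts here in the MBR
ENTRY_SIZE = 16     # each of the four entries is sixteen bytes long
ACTIVE = 0x80       # assumed boot-flag value for an "active" partition


def parse_partition_table(sector):
    """Return the used PT entries of a 512-byte MBR as a list of dicts."""
    entries = []
    for i in range(4):
        start = PT_OFFSET + i * ENTRY_SIZE
        raw = bytes(sector[start:start + ENTRY_SIZE])
        if raw == bytes(ENTRY_SIZE):
            continue                      # all zeros: unused entry
        # boot flag, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # starting LBA, sector count
        flag, ptype, lba, count = struct.unpack("<B3xB3xII", raw)
        entries.append({"number": i + 1, "active": flag == ACTIVE,
                        "type": ptype, "start_lba": lba, "sectors": count})
    return entries


def find_active(sector, strict=False):
    """Pick the partition to boot; strict=True rejects multiple actives."""
    active = [e for e in parse_partition_table(sector) if e["active"]]
    if not active:
        raise LookupError("no active partition: boot failure")
    if strict and len(active) > 1:
        raise LookupError("more than one active partition: boot failure")
    return active[0]                      # first active entry wins


def make_entry(flag, ptype, lba, count):
    """Build one synthetic 16-byte PT entry (CHS bytes left as zeros)."""
    return struct.pack("<B3xB3xII", flag, ptype, lba, count)


mbr = bytearray(512)
mbr[0x1BE:0x1CE] = make_entry(0x00, 0x07, 63, 1000)     # not active
mbr[0x1CE:0x1DE] = make_entry(0x80, 0x83, 1063, 2000)   # active
print(find_active(mbr)["number"])   # prints 2
```

The strict flag models the two behaviors described above: some MBR programs fail when more than one partition is marked active, while others simply take the first one found.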

Once the active partition has been found, the MBR boot-loader program reads the first sector of that partition into storage and transfers control to it.  This process is known as chain loading.  This is stage two.  Although it is possible for a boot program to reside in the first EBR (the first sector of the extended partition), this was not historically done.  The first sector of a primary or logical partition is a VBR (Volume Boot Record).  Historically, each operating system was expected to be installed in its own volume; and each operating system was responsible for providing its own boot-loader program in the VBR of that volume that knew how to boot that operating system.  The MBR boot-loader program simply chain loads the VBR boot-loader program of whichever partition is marked active.

A VBR is the first sector of a non-partitioned device, such as a floppy diskette, or the first sector of a primary or logical partition on a partitioned device, such as a hard disk.  The general format of a VBR looks like this:
 

                 ┌──────────────────────────┐
           0x000 │  jump to the boot code   │
                 ├──────────────────────────┤
           0x003 │   BIOS Parameter Block   │
                 ├──────────────────────────┤
     0x02c/0x03e │        boot code         │
                 ├──────────────────────────┤
           0x1fe │  magic number (0xaa55)   │
                 └──────────────────────────┘

The BIOS Parameter Block (BPB) is sometimes present and sometimes not present, depending on which type of file system is used on the volume.  For file systems which use it, the BPB is present regardless of whether or not the volume is bootable.  That is, even if there is no VBR boot-loader program installed in the VBR, the BPB is still there.  The format of the BPB is given here.  The BPB is used by the FAT12, FAT16, FAT32, HPFS, and NTFS file systems.  When a boot-loader program is installed in a VBR containing one of these file systems, care must be taken not to overlay the BPB or the file system on the volume will become unusable.  Traditional Linux file systems, such as ext2, ext3, ext4, reiserfs, etc., do not use the BPB; so boot-loader programs may safely overlay this portion of the VBR for volumes containing such filesystems.  On the other hand, Linux swap partitions and the XFS filesystem use the VBR as a superblock; so no boot-loader program can safely be installed in the VBR of a volume containing such a filesystem.

The "magic number" (0xaa55) is the "boot signature", stored in little-endian format.  It is present only if a VBR boot-loader program is installed in the VBR.  If the VBR does not contain a boot-loader program, the "magic number" field will contain something other than 0xaa55, usually zeros.  A boot failure will occur if the BIOS (for a non-partitioned device) or the MBR boot-loader program (for a partitioned device) does not detect this boot signature in the VBR of the volume it is being asked to boot.

To summarize, an MBR is the first sector of a partitioned device, such as a hard disk.  It does not contain a BPB, but it does contain a PT.  A VBR is the first sector of a non-partitioned device, such as a floppy diskette, or the first sector of a primary or logical partition on a partitioned device, such as a hard disk.  It may contain a BPB, depending on the file system used on the volume; but it does not contain a PT.  An EBR describes a logical partition.  The first EBR is the first sector of the extended partition.  If there is more than one logical partition, additional EBRs will be chained together in a linked list, with the first EBR as the head of the chain.  Each EBR describes one, and only one, logical partition.
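
Walking the EBR chain can be sketched in Python as shown below.  The details are my assumptions, based on the conventional EBR layout rather than on anything stated on this page: the first PT entry of each EBR describes the logical partition, with a starting sector relative to that EBR, and the second PT entry, if used, points to the next EBR, with a starting sector relative to the start of the extended partition.  The "disk" is just a Python dictionary, and all names are hypothetical.

```python
import struct

ENTRY = "<B3xB3xII"   # boot flag, CHS (skipped), type, CHS (skipped), start, count


def pt_entry(sector, index):
    """Return (boot_flag, type, start, count) for PT entry `index` (0-3)."""
    return struct.unpack_from(ENTRY, sector, 0x1BE + 16 * index)


def walk_ebr_chain(read_sector, extended_start):
    """Follow the EBR chain; return (start_lba, sectors, active) tuples."""
    logicals = []
    seen = set()                 # guards against a corrupted, looping chain
    ebr_lba = extended_start     # the first EBR is the head of the chain
    while ebr_lba not in seen:
        seen.add(ebr_lba)
        ebr = read_sector(ebr_lba)
        flag, _type, rel_start, count = pt_entry(ebr, 0)
        if count:
            # assumption: start is relative to the EBR that describes it
            logicals.append((ebr_lba + rel_start, count, flag == 0x80))
        _flag, _type, next_rel, next_count = pt_entry(ebr, 1)
        if next_count == 0:
            break                # no link entry: end of the chain
        # assumption: link is relative to the start of the extended partition
        ebr_lba = extended_start + next_rel
    return logicals


def make_ebr(first, second=(0, 0, 0, 0)):
    """Build a synthetic EBR from two (flag, type, start, count) tuples."""
    sector = bytearray(512)
    sector[0x1BE:0x1CE] = struct.pack(ENTRY, *first)
    sector[0x1CE:0x1DE] = struct.pack(ENTRY, *second)
    return bytes(sector)


# A made-up extended partition starting at LBA 10000 with two logicals.
disk = {
    10000: make_ebr((0x00, 0x83, 63, 5000), (0x00, 0x05, 5063, 3063)),
    15063: make_ebr((0x80, 0x83, 63, 3000)),
}
logicals = walk_ebr_chain(disk.__getitem__, 10000)
print(logicals)
```

Each EBR contributes exactly one logical partition to the result, which matches the one-EBR-per-logical-partition rule stated above.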

The MBR, VBRs, and the first EBR may contain a boot-loader program; but historically, the purposes of these boot-loader programs are different.  Historically, the purpose of the MBR boot-loader program is to chain load the VBR boot-loader program of whichever partition is marked active.  Historically, the purpose of the VBR boot-loader program is to boot the operating system installed on that volume.  Historically, the first EBR never actually contained a boot-loader program.  In the early days, DOS provided the MBR boot-loader program, as well as a VBR boot-loader program that knew how to boot DOS.  But Linux did not provide a VBR boot-loader program, so a solution had to be found.

A Brief History of Linux Boot Loaders for i386/amd64

The primary job of any boot-loader program is to get the operating system kernel loaded into memory and then to transfer control to it.  Once the boot-loader program has transferred control to the kernel, the boot loader's job is done.  Some operating systems, such as DOS® and Windows®, include the boot-loader program as part of the operating system.  Linux does not.  Linux relies on an external boot-loader program to get it loaded and started.

One of the oldest boot-loader programs for Linux is LOADLIN, which stands for LOAD LINux.  LOADLIN is actually a DOS program.  To load Linux with LOADLIN, you first have to boot DOS; then you run the LOADLIN program under DOS to load the Linux kernel into storage and to transfer control to it.  Once the Linux kernel gets control, it completely takes over the machine.  DOS never gets control back.  This method works, but the Linux user community did not like having to boot DOS first in order to load Linux.  They wanted a boot-loader program that could be booted by the hardware, was capable of loading and starting the Linux kernel, and had no dependencies on DOS (other than possibly its MBR boot-loader program).

Such a program was eventually developed; and its author decided to call it LILO, which stands for LInux LOader.  LILO was written in such a way that it can be installed in the first sector of a partition, in accordance with the design of DOS; or it can be installed in the MBR, where it replaces the MBR boot-loader program provided by DOS (and later, Windows).  Of course, if it replaces the MBR boot-loader program, then it not only needs to know how to load Linux, but also needs to be capable of chain loading VBR boot-loader programs, such as those provided by DOS and Windows.  Those capabilities were added to LILO.  Also, when replacing the MBR, one must be careful not to wipe out the PT, which is embedded in the MBR.  When installing to the MBR, lilo is careful to preserve the PT.  (LILO, upper case, is used to refer to the boot loader itself; lilo, lower case, is used to refer to the lilo command which runs at a Linux shell prompt or to the Debian package by the same name.)

What if you want to install LILO to the first sector of a partition, the way DOS works; but you don't have DOS (or Windows) installed?  (And therefore, you don't have a generic MBR boot-loader program from DOS or Windows installed.)  That is not a problem: lilo can install one of two generic MBR boot-loader programs.  (They merely chain load the partition which is marked active, which is what the DOS or Windows MBR boot-loader program does.)  One of them, the standard version, works the same way that the DOS or Windows MBR boot-loader program does in that it searches only the MBR's PT to look for an active partition.  The other one, the extended version, in addition to searching the MBR's PT, also searches the PT entries of the EBR chain to see if there are any logical partitions marked active.  When using this MBR boot-loader program, LILO can be installed in the first sector of a logical partition.

LILO reigned supreme for a long time as the boot-loader program of choice for Linux on the i386 platform.  (By extension, it also works on the amd64 platform, since an amd64 machine boots up in an i386-compatible mode.)  But LILO is platform specific.  The boot-loader program itself (as opposed to the boot-loader installer) is written in i386 assembly language, runs in real mode, and uses BIOS calls to get its job done.  As Linux was ported to other platforms, each platform developed its own boot loader, typically patterned in design after LILO.  For example, when Linux was ported to the s390/s390x platform, the zIPL boot loader (part of the s390-tools package) was developed to do the hardware-specific boot-loading chore for Linux on the s390/s390x platform.

LILO was designed to load the Linux kernel; but the GNU (GNU's Not Unix) project had been working on their own kernel, called the Hurd kernel.  Unlike the monolithic Linux kernel, the Hurd kernel is modular; and LILO really wasn't designed to load such kernels.  (In other words, the Linux kernel consists of a single disk file; but the Hurd kernel consists of multiple disk files.)  Rather than enhance LILO to enable it to load the Hurd kernel, the GNU project decided to create a new boot loader called GRUB (GRand Unified Boot loader).  This new boot loader was intended to unify boot loading across i386 operating systems, including implementation of the new multiboot standard, which would be needed for modular kernels such as the GNU Hurd.  But for various reasons, GRUB grew and developed to the point that it became almost unmaintainable.  This program, now called GRUB Version 1, is no longer being enhanced.

A project to replace GRUB was started, and the new program they came up with is called GRUB Version 2.  Despite the similarity in name, GRUB Version 2 is not a modified version of GRUB Version 1: it is a completely new program written from scratch.  As an IBM salesman that I know frequently says, "Don't confuse marketing with reality."  GRUB Version 2, in addition to the goals for GRUB Version 1, added a lot of extra stuff.  But as Chief Engineer Montgomery Scott says in the motion picture Star Trek III: The Search for Spock, "The more they overthink the plumbing, the easier it is to stop up the drain."

Despite all the nice features of GRUB2, it has its problems.  For one thing, all the extra smarts have to go somewhere.  A single 512-byte sector, such as the MBR or a VBR, isn't enough room to hold all the code; so GRUB (either version) tends to use unallocated sectors to store extra code when it is installed to the MBR.  (By unallocated sectors, I mean sectors which are outside of a partition.  Strictly speaking, the MBR itself qualifies as being outside of a partition; but that is a well-known and accounted-for exception case known to all backup programs.)  Backup programs may not back up those unallocated sectors; and when the disk is wiped clean and restored from a backup (or restored to a different disk), the restored disk is unbootable.  Such is the case with my employer's backup and restore software.

Since there are no rules for the use of unallocated sectors (in theory they aren't supposed to be used at all), GRUB may conflict with other programs which store information in unallocated sectors, such as disk management software, BIOS extension programs, etc.  The unallocated sectors that GRUB wants to use are the unused sectors between the MBR and the start of the first partition.  Historically, fixed disk partitioning programs aligned partitions on a track boundary.  Therefore, there were always some unallocated sectors between the MBR and the start of the first partition, so that the first partition could start on a track boundary.  However, track alignment is not always enforced by modern fixed disk utilities; and there may not be any unallocated sectors between the MBR and the start of the first partition.  This can cause problems for GRUB installation.  Plus, with added complexity come added bugs.

Fortunately, LILO is still available.  It can't boot the Hurd kernel.  But I don't run the Hurd kernel: I run the Linux kernel.  And I have no plans to switch to the Hurd kernel.  So I decided to switch back to LILO.  As one of the laws of programming says, "Inside every large program is a small program struggling to get out."  In this case, I see GRUB2 as the large program and LILO as the small program struggling to get out.  I wish the GNU project well in their efforts with GRUB2.  But for those of you who are having problems with GRUB2, or for those of you who wish to avoid problems, or for those of you who just like LILO and want to stick with something tried and true, this web page is for you.  LILO works well, but there are a number of "gotchas" that you need to watch out for when installing and configuring it.  Once it is properly installed, you should have many years of trouble-free service from it.

Having said that, LILO is not for everyone.  LILO only supports the traditional MS-DOS disk partitioning scheme, which uses 512-byte logical and physical sectors.  The maximum size of the disk is 2T.  (The exact maximum size is 2,199,023,255,552 bytes, which is 2**41.  This comes from the product of the number of bytes per sector, 512 (2**9), times the number of unique four-byte LBA values, 4,294,967,296 (2**32).)  In order to access disks with a capacity greater than 2T or with a sector size greater than 512, you need to use the newer GPT partitioning mechanism.  If your disk is partitioned with the newer GPT partitioning mechanism, LILO will not work for you.  (For more information about the GPT partitioning mechanism, click here.)  If you want to run the Hurd kernel, LILO will not work for you.  And of course, if you're not running on the i386 or amd64 architecture, LILO will not work for you.
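
The 2T ceiling is easy to verify.  The following Python fragment is only a worked version of the arithmetic in the paragraph above; the function names are made up, and the check is a simplification that looks at nothing but the disk size and sector size.

```python
def max_msdos_disk_bytes(sector_size=512, lba_bits=32):
    """Bytes addressable with `lba_bits`-bit LBAs and the given sector size."""
    return sector_size * 2**lba_bits


def fits_msdos_partitioning(disk_bytes, sector_size=512):
    """True if the whole disk is reachable with four-byte LBAs and
    512-byte sectors, the only combination described above for LILO."""
    return sector_size == 512 and disk_bytes <= max_msdos_disk_bytes()


print(max_msdos_disk_bytes())                 # 2199023255552
print(max_msdos_disk_bytes() == 2**41)        # True
print(fits_msdos_partitioning(500 * 10**9))   # True: a 500G disk is fine
print(fits_msdos_partitioning(3 * 10**12))    # False: a 3T disk is too big
```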

Finally, the machine must have a BIOS.  Many of the newest motherboards use a UEFI (Unified Extensible Firmware Interface), which is intended as a replacement for the traditional BIOS.  In practice, most UEFI motherboards also have a BIOS for backward compatibility.  The BIOS included with UEFI-based motherboards is known as a Compatibility Support Module (CSM).  However, if the Connected Standby feature is enabled, this disables the Compatibility Support Module.  If you have a UEFI-based machine with no BIOS for backward compatibility, or if the BIOS has been disabled, you must use another boot loader designed for use in UEFI-based systems, such as elilo or grub-efi-amd64.  For more information about UEFI, click here.

Note: LILO may work on a GPT-partitioned disk if it is installed to the MBR; if it only loads Linux; if the kernel image, initial RAM file system image, and the map file reside within the first 2T of the disk; and if the disk uses, or emulates via the BIOS, 512-byte sectors.  But officially, GPT-partitioned disks are not supported.  Also, there is no problem with having one or more GPT-partitioned disks in your system if LILO is not installed on them and is not asked to boot anything from them.

Checking the Capabilities of the BIOS

LILO relies heavily on the BIOS to get its job done.  And not all BIOS chips are created equal.  There are a number of different BIOS manufacturers (IBM, Phoenix, Award, AMI, Dell, etc.), and the BIOS chips have been manufactured over a long period of time.  Furthermore, the BIOS capabilities are seldom advertised directly to the consumer.  BIOS manufacturers sell their products to motherboard manufacturers, and there's a relatively small number of motherboard manufacturers.  The end consumer who buys the final PC (Personal Computer) rarely has any information about the specific capabilities of the BIOS in his machine.  Yet, in order to optimally configure lilo, you must know this information.

BIOS routines are only callable by programs which run in real mode or virtual 8086 mode, two different operating modes of 80386 (and newer) processors.  At boot time, when the boot loader is running, the CPU (Central Processing Unit) is operating in real mode; so BIOS routines are callable at that point.  All processors in the i386 and amd64 processor family power up in real mode for backward compatibility with the 8088 processors used in the original IBM PC and PC/XT (eXtended Technology) computers.  Explicit action must be taken under program control to switch the CPU to another mode.

When the Linux kernel first gets control, it is also running in real mode.  (That is why the kernel has to be loaded at the 1M (one megabyte) line.  Programs running in real mode are limited to the address range 0x0-0x10ffef for executable code.  The portion of this range which is above 1M, 0x100000-0x10ffef, sixteen bytes short of a 64K segment, is known as the HMA (High Memory Area).  It is the only portion of extended memory which is directly addressable by programs running in real mode.)  One of the first things that the Linux kernel does when it gets control from the boot loader is to set its video mode (the vga option).  This is done by means of a BIOS call, so this must be done while the kernel is still running in real mode.  Once all operations which must be performed in real mode have been completed, the kernel switches the CPU into protected mode (i386) or 64-bit mode (amd64).  The Linux kernel then operates in that mode for the remainder of its life.  (There are exceptions made for some user-space programs, such as dosemu, which run in virtual 8086 mode.)
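
The odd-looking upper limit of 0x10ffef comes from the way real-mode addresses are formed: a 16-bit segment value is shifted left four bits and added to a 16-bit offset.  Here is a short Python illustration.  The function name is made up, and the calculation assumes the A20 address line is enabled, so that addresses above 1M do not wrap around to zero.

```python
ONE_MB = 0x100000


def linear_address(segment, offset):
    """Real-mode segment:offset to linear address (A20 assumed enabled)."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return (segment << 4) + offset


highest = linear_address(0xFFFF, 0xFFFF)
hma_bytes = highest - ONE_MB + 1

print(hex(highest))   # 0x10ffef
print(hma_bytes)      # 65520, sixteen bytes short of 64K (65536)
```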

The interface to the BIOS from a program running in real mode or virtual 8086 mode is the int (interrupt) instruction.  The int instruction carries with it an interruption code.  In most cases, a function code, usually passed in register AH (the high-order byte of AX), is also used.  The interruption code and the function code together tell the BIOS what is to be done.  (In some cases there is a subfunction code as well.)  The DOS kernel also runs in real mode.  The DOS kernel implements some interrupt functions also, in addition to those implemented by the BIOS.  Programs written in assembly language for execution under DOS can use these interrupts, whether implemented by the BIOS or by the DOS kernel, to accomplish various tasks.  These interrupts came to be known informally as "DOS interrupts", even though some of them are implemented by the BIOS and some of them are implemented by the DOS kernel.  These interrupts are documented in references such as Ralf Brown's Interrupt List.

DOS programmers don't really care which interrupts are implemented by the DOS kernel and which interrupts are implemented by the BIOS.  The maintainers of boot loaders do care, however, since only interrupts implemented by the BIOS are available in a boot-loader environment.  Here is an example of a reference which documents only interrupts implemented by the BIOS.  For more general information about BIOS interrupts, see this page.

If you are not sure about the capabilities of your BIOS, I recommend the LILO diagnostic diskette.  This utility tests all the BIOS interrupts that LILO might use and displays the test being made and the results on the screen.  It was originally written by the LILO developers primarily for debugging and trouble-shooting purposes.  (For example, maybe they had a problem report that LILO failed on a certain kind of BIOS.  They would ask the user to run the diagnostic utility and send in the results, which would provide information needed by the developers to fix the problem.)  However, this utility is also useful to find out in detail what the capabilities of a given BIOS are.

To use this utility, first format a double-sided, high-density, 3.5-inch diskette in MS-DOS format, 1.44M capacity.  (That's 512 bytes/sector, 18 sectors/track, 2 tracks/cylinder, and 80 cylinders, for a total of 1,474,560 bytes.)  If you don't have DOS or Windows to do the formatting, the mformat command of the mtools package will provide the function you need.  Make sure the diskette contains no bad sectors.  Then download the file (in binary of course), which is called bootdiagnostic.b.gz, via the above link.  Unzip the file via
 

     gunzip bootdiagnostic.b.gz

then dump it to the diskette via
 

     dd if=bootdiagnostic.b of=/dev/fd0

This assumes that your floppy drive is at /dev/fd0.  Change device names for your system as needed.  You normally need to be root to write directly to a device in this way.
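
Before writing an image to a diskette, it doesn't hurt to sanity-check it.  The following Python sketch is my own addition, not part of the LILO distribution.  It decompresses a gzipped image and checks that it fits on a 1.44M diskette and that its first sector carries the boot signature.  I am assuming that the diagnostic image carries the signature, as the earlier discussion of VBRs suggests it should; I have not verified this against the actual file.  The function names are hypothetical.

```python
import gzip

# 512 bytes/sector * 18 sectors/track * 2 tracks/cylinder * 80 cylinders
FLOPPY_BYTES = 512 * 18 * 2 * 80   # 1,474,560 bytes


def load_gzipped_image(path):
    """Read and decompress a .gz file, much as gunzip would."""
    with gzip.open(path, "rb") as f:
        return f.read()


def check_boot_image(image):
    """Return a list of problems found; an empty list means none found."""
    problems = []
    if len(image) > FLOPPY_BYTES:
        problems.append("image is larger than a 1.44M diskette")
    if len(image) < 512:
        problems.append("image is shorter than one sector")
    elif image[0x1FE:0x200] != b"\x55\xaa":
        # assumption: a bootable image has 0x55, 0xaa at offsets 0x1fe, 0x1ff
        problems.append("first sector lacks the boot signature")
    return problems
```

Typical use would be something like check_boot_image(load_gzipped_image("bootdiagnostic.b.gz")), run before the dd step.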

Now shut down your Linux system and boot from this floppy diskette.  (You may have to alter settings in your BIOS setup program to allow booting from a floppy drive.)  This utility must run in real mode.  That is why it must be directly booted.  It cannot run under Linux.  Press Enter to move through the screens of the diagnostic utility one screen at a time.  Press ^C (Ctrl+C) to halt the diagnostic utility early.  When you are finished running the diagnostic utility, remove the floppy diskette from the floppy drive; and boot your Linux system again.

Note: many newer systems, especially laptops and notebooks, don't have internal floppy drives anymore.  But with these machines you can usually attach an external USB (Universal Serial Bus) floppy drive.  The BIOS of newer machines can usually be configured to boot from it.  That is what I have done with my IBM ThinkPad X31, which doesn't come with a floppy drive.  You can also use dd to dump the floppy image to a USB memory stick, then boot from the memory stick.  Doing so will of course destroy all data currently on the USB memory stick.  Make sure your USB memory stick is in USB-FDD emulation mode.  Not all memory sticks support USB-FDD emulation.  Read the fine print before you buy.

Note: If you are using a USB keyboard, and you can't get it to work with the diagnostic utility, see the USB Keyboard Doesn't Work section.  The same BIOS setting which enables the use of USB keyboards is also often necessary to allow booting from a USB device!

The two most important things you need to know about your BIOS are whether or not it supports access to extended memory above 16M via BIOS Int 15h Function 87h (the large-memory option) and whether or not it supports EDD (Enhanced Disk Drive) packet addressing for your boot drive (the lba32 option).  As you work your way through the various screens displayed by the LILO diagnostic diskette, the first question will be answered by a screen which looks like this:
 

     Int 15h  Function 87h           [AT][PS/2]
     Move Extended Memory Block

     Call With:
         DS=1276  ES=1276  CS=1000  SS=1276
         AX=8700  BX=0000  CX=0001  DX=0000  SI=1AC8  DI=0000

     Returns:
         DS=1276  ES=1276  CS=1000  SS=1276
         AX=0000  BX=0000  CX=0001  DX=0000  SI=1AC8  DI=0000
         Carry = 0

     R/W test at address 0011FFFE successful
     R/W test at address 0111FFFE successful



     Hit <Enter> to continue, <^C> to quit ...

If you see a line which says, "R/W test at address 0111FFFE successful", as in the above example, then your BIOS supports access to extended memory above the 16M line via BIOS Int 15h Function 87h; and you may safely use the large-memory option.  Be sure that there is only one leading zero in the address.  (Note that the line above it tests an address with two leading zeros.  This address is below the 16M line.)  Of course, you need to have enough memory installed for 0111FFFE to be a valid free RAM address; or this test is not possible.  That's 17,536K, not taking into account memory holes, shadow RAM, memory used by the BIOS, etc.
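
The two test addresses differ only in their high-order byte, which is easy to misread on the screen.  Here is a small Python check of where they fall relative to the 16M line, and of the 17,536K figure.  The names are made up, and the assumption that the test touches a two-byte word at the given address is mine.

```python
SIXTEEN_MB = 16 * 1024 * 1024   # 0x01000000


def above_16m(address):
    """True if the address lies at or above the 16M line."""
    return address >= SIXTEEN_MB


def min_ram_kib(address, width=2):
    """Smallest amount of RAM, in K, that contains `width` bytes at
    `address` (assumption: the test reads and writes a 2-byte word)."""
    return -(-(address + width) // 1024)   # ceiling division


print(above_16m(0x0011FFFE))    # False: below the 16M line
print(above_16m(0x0111FFFE))    # True: above the 16M line
print(min_ram_kib(0x0111FFFE))  # 17536
```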

The 16M restriction originated from the Intel 80286 chip, which was first used in the IBM PC/AT (Advanced Technology) computer.  This chip was the first in common use to support "extended memory" (memory above the 1M addressing limit of the 8088 chips used in the PC and PC/XT).  However, direct access to extended memory (except for the HMA discussed earlier) is only possible when the CPU is operating in protected mode.  To make access to extended memory by real-mode programs easier, a new BIOS interface, "Move Extended Memory Block" (Int 15h Function 87h), was introduced along with the 80286 chip.  It allows copying up to 65536 contiguous bytes of memory from one location to another, and both the source location and the destination location may reside anywhere in memory (below or above the 1M line).  This BIOS routine switches the CPU into protected mode, performs the memory copy, then switches back to real mode and returns control to the invoking program.

Since the 80286 chip can address a maximum of 16M of memory, the original BIOS Int 15h Function 87h interface specified both the source address and the destination address as a three-byte (24-bit) value.  Linux has never been supported on the 80286 processor: an 80386 processor or later has always been required.  However, some members of the 80386 processor family, such as the 80386SX and the 80386EX, also have a 16M memory addressing limit.  Linux used to be supported on these processors.

The 80386DX chip was the first to introduce a 32-bit address bus, which allows accessing up to 4G of memory.  Accordingly, the BIOS interface (Int 15h Function 87h) was enhanced to allow specification of a 32-bit address for both source and destination fields.  The low-order 24 bits of the source and destination addresses are placed in the same locations as before, and the high-order 8 bits of the source and destination addresses are placed in separate areas of the control block passed to the BIOS routine which are not contiguous with the low-order 24 bits of their respective addresses.  These two bytes were formerly reserved fields and were formerly required to be zero.
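
The split of a 32-bit address into a 24-bit part and a separate 8-bit part can be sketched in Python as follows.  The eight-byte layout used here (two-byte limit, three-byte low address, access byte, one further byte, then the high address byte) is the standard 80286/80386 segment descriptor format as I understand it; treat the exact offsets and the access value of 0x93 as assumptions, and consult a reference such as Ralf Brown's Interrupt List before relying on them.  The function names are hypothetical.

```python
import struct


def split_address(address):
    """Split a 32-bit address into (low-order 24 bits, high-order 8 bits)."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    return address & 0xFFFFFF, address >> 24


def pack_descriptor(address, limit=0xFFFF, access=0x93):
    """Build one 8-byte source or destination descriptor (assumed layout)."""
    low24, high8 = split_address(address)
    return (struct.pack("<H", limit)          # bytes 0-1: segment limit
            + low24.to_bytes(3, "little")     # bytes 2-4: low-order 24 bits
            + bytes([access,                  # byte 5: access rights
                     0x00,                    # byte 6: zero in this sketch
                     high8]))                 # byte 7: high-order 8 bits


print(split_address(0x0111FFFE))          # (1179646, 1), i.e. (0x11FFFE, 0x01)
print(pack_descriptor(0x0111FFFE).hex())  # fffffeff11930001
```

For any address below 16M the high byte comes out as zero, so the block is byte-for-byte what the original 80286 interface expected.  That is why the enhancement did not break older callers.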

Starting with Debian Sarge (3.1), support for the 80386 processor family was dropped.  An 80486 processor or newer is now required, even though the Debian architecture designation is still called i386.  All models of the 80486 processor family and later have an address bus of at least 32 bits.  If the BIOS in your PC does not support access to extended memory above 16M via BIOS Int 15h function 87h, then your BIOS has a bug.  Some Soekris and Pcengines BIOS chips have been reported to contain this BIOS bug as late as 2005, even though the Intel 80386DX CPU was introduced in 1985!  (See this report and this report.)  You might want to check with the manufacturer to see if a BIOS update is available to fix this problem.  If not, then you will have to live with the restriction as documented in this web page.

Note: this BIOS interface was not enhanced again when CPUs with the PAE (Physical Address Extension) feature came out, nor was it enhanced when 64-bit processors came out.  Therefore, access to extended memory by this interface is limited to the first 4G of memory, even if more than 4G of memory is installed.  Also, although most BIOS interrupts can be used in virtual 8086 mode, as well as in real mode, this one is an exception.  For a number of technical reasons, this BIOS interrupt can usually be used only in real mode.

Continuing on through the screens displayed by the LILO diagnostic diskette, you will eventually come to one which looks like this:
 

     Int 13h  Function 08h           [PC][AT][PS/2]
     Get Drive Parameters  (device 80h)

     Call With:
         DS=1276  ES=0000  CS=1000  SS=1276
         AX=0800  BX=1234  CX=0103  DX=0080  SI=1AC8  DI=4321

     Returns:
         DS=1276  ES=0000  CS=1000  SS=1276
         AX=0000  BX=1234  CX=12FF  DX=7F01  SI=1AC8  DI=4321
         Carry = 0

     Disk geometry (C:H:S) = 787:128:63 (6,346,368 sectors) = 3.25G
     Fixed disks on system = 1



     Hit <Enter> to continue, <^C> to quit ...

This screen reports the drive geometry as seen by the classic Int 13h Function 08h BIOS call.  It is really of no consequence unless EDD packet addressing is not supported by your BIOS, but you don't know the answer to this question yet.  Make sure you are looking at the information for the boot drive, which is BIOS device 80h.  Hard drives have a BIOS device number of 80h and above.  The floppy drives (A: and B:, as DOS would call them) are usually 00h and 01h, respectively.  (We don't care about the geometry of the floppy drives!)  In this example, the classic Int 13h Function 08h drive geometry for the hard disk is 787 cylinders, 128 tracks (heads) per cylinder, and 63 sectors per track, for a total of 6,346,368 sectors (787*128*63).  At 512 bytes per sector, that is 3,249,340,416 bytes (6,346,368*512), or about 3.25G.  If EDD packet addressing is not supported by the BIOS, your /boot partition must be wholly contained within the first 3.25G of the hard disk (in this example).  Otherwise, LILO will not be able to read the map file, the kernel image file, or the initial RAM file system image file from the disk.
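The geometry line on that screen can be reproduced by hand from the returned registers.  The following is a minimal sketch, assuming the usual Int 13h Function 08h register layout: CH holds the low eight bits of the highest cylinder number, the top two bits of CL hold its high two bits, the low six bits of CL hold the highest sector number, DH holds the highest head number, and DL holds the number of drives.  decode_geometry is a hypothetical helper name used only for illustration; it is not part of LILO.  The final multiplication assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Decode the CX and DX values returned by Int 13h Function 08h.
# Usage: decode_geometry CX DX   (hex digits, without a 0x prefix)
decode_geometry() (
    cx=$(( 0x$1 ))
    dx=$(( 0x$2 ))
    # Highest cylinder number: CL bits 7-6 are the high bits, CH the low 8.
    # Cylinders are numbered from 0, so add 1 to get the count.
    cyls=$(( ( ((cx & 0xC0) << 2) | (cx >> 8) ) + 1 ))
    # DH is the highest head number; heads are also numbered from 0.
    heads=$(( (dx >> 8) + 1 ))
    # Sectors are numbered from 1, so the highest number is also the count.
    spt=$(( cx & 0x3F ))
    total=$(( cyls * heads * spt ))
    echo "$cyls:$heads:$spt $total $(( total * 512 ))"
)

# Register values from the sample screen (CX=12FF, DX=7F01).
decode_geometry 12FF 7F01
```

For the sample values this prints 787:128:63 6346368 3249340416, which agrees with the figures shown on the screen.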

Continuing on through the screens presented by the LILO diagnostic diskette, you will soon come upon this one:
 

     Int 13h  Function 41h           [EDD]
     Check EDD Extensions Present (device 80h)

     Call With:
         AX=41ED  BX=55AA  CX=0103  DX=0080  SI=1AC8  DI=4321

     Returns:
         AX=21ED  BX=AA55  CX=0005  DX=0080  SI=1AC8  DI=4321
         Carry = 0

     Enhanced Disk Drive Support: yes
     Drive locking and ejecting: no
     Device access using packet calls: yes
     EDD extensions version 1.1 (hex code 21h)



     Hit <Enter> to continue, <^C> to quit ...

Again, make sure you are looking at the information for the boot drive (device 80h), not a floppy drive or some other device.  The thing to look for is a line which says, "Device access using packet calls: yes".  If this is the case, then your BIOS supports the "Fixed disk access subset" of the Int 13h extensions.  This subset includes the functions that LILO needs to access the disk using packet calls, such as function 42h (Extended read) and function 43h (Extended write).  If so, you may safely use the lba32 option.  (lba32 is now a default option.)  In this case, there are no restrictions on where your /boot partition may be located: it may be anywhere on the hard disk (up to 2T).  If "Device access using packet calls: no" is displayed, or if the "Carry" flag is not 0, then you may not use the lba32 option.  Use the linear option instead (a non-default option).  In this case, the location of your /boot partition is restricted, as explained above.
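The decision just described can be written down as a small rule.  This is only a sketch, assuming that bit 0 of the returned CX value is the flag reported as "Device access using packet calls" (which is consistent with the sample screen, where CX=0005).  boot_addressing_option is a hypothetical helper name, not a LILO command.

```shell
#!/bin/sh
# Choose between lba32 and linear from the Function 41h results.
# Usage: boot_addressing_option CX CARRY   (CX as hex digits, CARRY as 0 or 1)
boot_addressing_option() (
    cx=$(( 0x$1 ))
    carry=$2
    # lba32 requires Carry = 0 and the packet-call bit (bit 0 of CX) set.
    if [ "$carry" -eq 0 ] && [ $(( cx & 1 )) -ne 0 ]; then
        echo lba32
    else
        echo linear
    fi
)

# Values from the sample screen: CX=0005, Carry = 0.
boot_addressing_option 0005 0
```

With the sample values the function prints lba32.  Any other combination yields linear, in which case the location restriction on /boot applies.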

Note: EDD packet addressing was introduced to eliminate the 8.4G maximum disk size of the original Int 13h Functions 02h/03h.  This comes from the limits of 1,024 cylinders * 255 tracks/cylinder * 63 sectors/track * 512 bytes/sector.  The exact limit is 8,422,686,720 bytes.  For more information about the EDD extensions, see this document.

Installing and Configuring LILO

OK, let's get started.  I'm assuming that you have already run the LILO diagnostic diskette and therefore know the capabilities of your BIOS.  I'm also assuming that your current boot loader is GRUB (either GRUB Version 1 or GRUB Version 2).  For Squeeze (6.0) and later releases, issue (as root of course):
 

     apt-get --purge install lilo grub-pc-

Note the hyphen character appended to the grub-pc package name.  This installs the lilo package and removes the grub-pc package.  Because the --purge option is specified, the grub-pc package's configuration files are deleted as well, not just the package itself.  (All Linux commands in this document are assumed to be issued by the root user unless otherwise noted.)  Substitute grub-legacy for grub-pc if you are running GRUB Version 1.  Now you need to create /etc/lilo.conf, the lilo configuration file.  There are scripts included with the most recent versions of lilo which automate the creation of the configuration file (liloconfig) or which convert an existing one to use UUIDs (lilo-uuid-diskid), but I prefer to configure lilo manually.  Here is a sample configuration file:
 

     # /etc/lilo.conf
     #
     # global options
     #
     append="acpi=off notsc clocksource=pit"
     boot=/dev/disk/by-id/ata-IBM-DBCA-203240_HP0HPL43952
     compact
     default=Linux
     delay=40
     install=text
     large-memory
     lba32
     root="UUID=04db5929-51e6-424a-ac5b-a592b96b9d04"
     read-only
     vga=normal
     #
     # per-image options
     #
     image=/boot/vmlinuz
          label=Linux
          initrd=/boot/initrd.img
     #
     image=/boot/vmlinuz.old
          label=LinuxOld
          initrd=/boot/initrd.img.old
          optional

Don't copy this file verbatim: you have to think.  You have to know your system.  The global options apply to all kernel images; the per-image options apply to a single kernel image.

The append option is used to specify kernel boot parameters that are to be supplied to the kernel at boot time.  Don't blindly copy these kernel boot parameters; you have to use options appropriate to your system.  These options used as an example are from my IBM ThinkPad 600.  They probably aren't optimal for your computer.  Most people don't need them at all.  Some kernel boot parameters, such as root, ro, and vga, are handled by separate LILO configuration options.  Do not include these kernel boot parameters in the character string specified by append.

The boot option specifies where LILO's first-stage boot-loader program is to be written.  To install the first-stage boot-loader program to the MBR of the boot drive, one used to use specifications such as /dev/hda or /dev/sda, depending on whether your hard disk was IDE (Integrated Drive Electronics) or SCSI (Small Computer Systems Interface), respectively.  This is still supported, but it is no longer recommended.  The reasons for this are twofold.

First of all, with newer Linux kernels, the assignment of devices to kernel device numbers and user-space device names is no longer predictable, as it once was.  For example, if you have two SCSI disks in your system, one of them will be assigned the device name /dev/sda and the other one will be assigned the device name /dev/sdb.  But which disk is assigned to which device name may vary from one boot to the next.  Even if you only have one SCSI disk, you're still not completely safe.  On one of my systems, my hard disk is normally /dev/sda and my CD-ROM drive is normally /dev/sr0.  That's if I boot from the hard disk.  But if I boot the system from a rescue CD, such as the Debian installer operating in rescue mode, the CD-ROM drive is /dev/sda and the hard disk is /dev/sdb!  External USB drives, floppy or hard, can also cause a similar problem, since they show up as SCSI disks.  USB memory sticks also show up as SCSI disks.  Given this problem, how are you going to specify something in the boot configuration record which will always point to the MBR of your boot drive?

The second problem is that the naming convention for devices depends on which driver is being used.  Newer Linux kernels use the libata SCSI emulation drivers for IDE hard disks.  Older Linux kernels use the traditional IDE drivers.  Let's say, for example, that you have only one hard disk; and it is IDE (also known as Parallel AT Attachment, or PATA).  If you boot the 2.6.32-3-686 kernel (or older), the hard disk may be called /dev/hda.  But if you boot the 2.6.32-5-686 kernel (or newer), it may be called /dev/sda.  If you have both kernels installed and you switch back and forth between them regularly, then how do you specify something in the boot configuration record which will always point to the MBR of your boot drive?

The solution to this problem is to use a udev-created symbolic link.  In the example configuration file above, /dev/disk/by-id/ata-IBM-DBCA-203240_HP0HPL43952 is a udev-created symbolic link to the MBR of the (normal) boot drive, regardless of which driver is used for IDE hard disks and regardless of how many disk drives are in the system, the order in which they are discovered, or what the boot device is.  List the udev-created symbolic links in /dev/disk/by-id to determine an appropriate value to use for your system.  For example:
 

     cd /dev/disk/by-id
     ls -Al

If you have a traditional IDE (PATA) hard disk, you want to use a symbolic link which starts with "ata" because it will be present regardless of whether the kernel drivers are the traditional IDE drivers or the newer libata SCSI emulation drivers.  The symbolic links which start with "scsi" are only present when the libata SCSI emulation drivers are being used.  It is best to use a udev-created symbolic link which will always point to a specific physical disk every time, regardless of its device name in the currently-booted session.
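To narrow the listing down to candidates for the boot option, the partition links can be filtered out.  This is a minimal sketch, assuming that udev names partition links by appending "-partN" to the name of the whole-disk link and that readlink supports the -f option (as the GNU version does).  is_whole_disk_link is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds if a /dev/disk/by-id name refers to a whole disk rather than a
# partition.  Partition links are assumed to carry a "-partN" suffix.
is_whole_disk_link() {
    case $1 in
        *-part[0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Print each whole-disk "ata" link together with the device it points to.
for link in /dev/disk/by-id/ata-*; do
    [ -e "$link" ] || continue          # no match, or a dangling link
    is_whole_disk_link "${link##*/}" || continue
    printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
done
```

On a system with no matching links, the loop simply prints nothing.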

If you're using a standard generic MBR boot-loader program in the MBR, such as the DOS/Windows MBR boot-loader program, the MBR boot-loader program from the Linux mbr package, or LILO's standard generic MBR boot-loader program, or if you plan to install one there, then you can install LILO's first-stage boot-loader program to the first sector of a primary or extended partition.  For a primary partition, that would normally be the /boot partition (or the / partition, if the /boot directory is part of the same partition as /).  Again, you should use a udev-created symbolic link (such as /dev/disk/by-uuid/...) rather than a traditional block special file name (such as /dev/sda2).  For example, /dev/disk/by-uuid/04db5929-51e6-424a-ac5b-a592b96b9d04 is a udev-created symbolic link to the second primary partition on the boot disk for the sample system listed above.  List the symbolic links in the /dev/disk/by-uuid directory for an appropriate value to use on your system.

If you have assigned disk labels, a symbolic link from /dev/disk/by-label may be used instead.  Or perhaps you would prefer a symbolic link to the partition listed in /dev/disk/by-id.  In any case, make sure that the partition which will contain LILO's first-stage boot-loader program (and only that partition) is marked active in the PT.

Warning: reformatting a partition normally changes the UUID, unless the existing UUID is explicitly specified as an option when formatting.  Thus, commands such as mke2fs and mkswap will normally change the UUID, which in turn will change the udev-created symbolic link in /dev/disk/by-uuid, which in turn may necessitate changes to /etc/lilo.conf.  Changes to /etc/fstab, /etc/initramfs-tools/conf.d/resume, etc., may also be needed.  Reformatting will also change the disk label, unless the old label is explicitly specified as an option during formatting.  This will change the udev-created symbolic link in /dev/disk/by-label and may also necessitate changes in other files as in the above case.
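As an illustration of specifying the old values explicitly, the sketch below only prints a format command; it does not run it.  It assumes versions of mke2fs and mkswap that accept the -U (UUID) and -L (label) options.  The helper name mkfs_keeping_ids, the device /dev/sdXN, and the label "root" are placeholders, not values from any real system.

```shell
#!/bin/sh
# Print (do not run) a format command that keeps a given UUID and label.
# Usage: mkfs_keeping_ids TYPE DEVICE UUID LABEL
#   TYPE is ext2, ext3, ext4, or swap.
mkfs_keeping_ids() {
    case $1 in
        ext2|ext3|ext4) echo "mke2fs -t $1 -U $3 -L $4 $2" ;;
        swap)           echo "mkswap -U $3 -L $4 $2" ;;
        *)              echo "unsupported type: $1" >&2; return 1 ;;
    esac
}

# /dev/sdXN is a placeholder; substitute your own partition.
# The current values can be read beforehand with, for example:
#   blkid -s UUID -o value /dev/sdXN
#   blkid -s LABEL -o value /dev/sdXN
mkfs_keeping_ids ext3 /dev/sdXN 04db5929-51e6-424a-ac5b-a592b96b9d04 root
```

Printing the command first gives you a chance to check the device name before anything destructive happens.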

Also, please note that if you're going to install LILO's first-stage boot-loader program to the first sector of a partition and you're using a standard generic MBR boot-loader program in the MBR, LILO's first-stage boot-loader program may not be installed to a Linux swap partition, a partition containing an XFS file system, or a logical partition.  Ideally, the partition should contain a file system which does not use the BPB, such as ext2, ext3, ext4, reiserfs, etc.  Older versions of LILO's first-stage boot-loader program could not be installed in the VBR of a partition which contains a file system that uses the BPB, such as FAT12, FAT16, FAT32, HPFS, or NTFS.  Newer versions of lilo are smart enough to detour the code around the BPB, but you will get a warning message about it.  Also, you don't want to install LILO's first-stage boot-loader program to the VBR of a partition which already contains a VBR boot-loader program, such as the "C:" drive of DOS or Windows.  If you do, you will never again be able to boot the operating system which is installed in that partition.

Personally, I like to create a separate /boot primary partition, which can be relatively small, and install LILO's first-stage boot-loader program there.  I like to create this partition as close to the beginning of the disk as possible.  This is especially useful if your machine has an old BIOS which does not support EDD packet addressing.  The relatively small /boot partition can be kept within the addressing limits of CHS values in the disk geometry as reported by BIOS Int 13h Function 08h, while allowing other Linux partitions, such as /, /home, swap partitions, etc., to be outside this range.

I also like to make all partitions used by Linux (except /boot) logical partitions.  This includes /.  With a maximum of three primary partitions (assuming that an extended partition exists), you may be forced to do this anyway.  For example, the three primary partitions may be a Windows partition, a utility partition, and the /boot partition.  You're already at the limit for primary partitions.  If you create one more primary partition, for a total of four, then you won't be able to create an extended partition; and therefore, you won't be able to create any logical partitions either.  Additional partitions (/, /home, swap, etc.) now have to be logical partitions.  For SCSI disks, or for disks handled by the SCSI emulation driver, you are limited to a maximum of eleven logical partitions, numbered from 5 to 15, inclusive.  If you have fewer than three primary partitions, that does not increase the number of logical partitions that you may have, since logical partitions are always numbered beginning with 5.

If you're using LILO's extended generic MBR boot-loader program in the MBR, i.e.
 

     lilo -M /dev/sda ext

this will allow LILO's first-stage boot-loader program to be installed in the first sector of a logical partition, in addition to a primary or extended partition.  However, I would try to avoid such non-standard usage if I were you.  Even so, a logical partition in which LILO's first-stage boot-loader program is installed must otherwise qualify based on usage.  For example, it can't be a Linux swap partition, a partition containing an XFS filesystem, or a partition which already contains a VBR boot-loader program that must be preserved, such as OS/2 installed in a logical partition.

compact tells LILO to issue read requests for consecutive sectors as a group, instead of one by one.  This makes the map file much smaller and greatly decreases load time.  Up to 127 contiguous sectors can be read into storage with a single BIOS call when this option is used.  For some very old BIOSes this option may not work properly due to a BIOS bug, and you may have to refrain from using it, but that is very unlikely.  (Unfortunately, this BIOS bug is not tested for by the LILO diagnostic diskette.)  Sometimes, the compact option works for hard disks, but not for floppy disks, or vice versa.  You may have to do some trial-and-error experimentation.  If LILO works without the compact option, but fails with the compact option, then your BIOS probably has a bug.  You should check with the manufacturer to see if a BIOS update is available which fixes this bug.  If not, then don't use the compact option.

default=Linux specifies the kernel to be booted by default.  "Linux", in this case, is a label name.  It matches up with label=Linux, which is specified later on.

delay=40 specifies that LILO will wait for 40 tenths of a second (4 seconds) for an interrupt before booting the default kernel.  If you want to boot something other than the default kernel, press the Shift key (by itself) within 4 seconds of seeing LILO written to the screen.

install=text tells lilo to install the traditional text-mode interface.  (There is also a menu-based interface for you people that like that stuff.)

large-memory tells LILO that it may use memory above the 16M line.  The BIOS must have support for memory addressing above 16M in Int 15h Function 87h for this option to work.  All BIOSes are supposed to support this; but in practice, some don't due to a bug.  Without the large-memory option, both the compressed kernel image and the compressed initial RAM file system image are loaded below the 15M line.  The kernel is free to allocate the uncompressed version of the initial RAM file system image anywhere it wants, but the kernel has to be decompressed in place.  And kernel decompression occurs before the initial RAM file system image is decompressed (which means that the memory occupied by the compressed version of the initial RAM file system image is still allocated and cannot be overlaid).  To allow as much room as possible for kernel image decompression, the compressed kernel image is loaded at as low an address as possible (never below 1M) and the compressed initial RAM file system image is loaded at as high an address as possible, but no portion of it is allocated above 15M.  (Some computers have a "memory hole" between 15M and 16M.  That is what LILO is trying to avoid.)

If your compressed initial RAM file system image does not fit in its traditional location (i.e. there is not enough room for the compressed kernel image to decompress itself in place) and your BIOS does not support memory addressing above 16M via Int 15h Function 87h, something has to give.  For example, you can specify MODULES=dep instead of MODULES=most in either /etc/initramfs-tools/initramfs.conf or /etc/initramfs-tools/conf.d/driver-policy.  If the latter file exists, it overrides the former.  I recommend using driver-policy, since that makes it possible to avoid changing initramfs.conf; and thus initramfs.conf can be replaced by a newer version of initramfs-tools.  I recommend using MODULES=dep, even when large-memory is used, because it makes the initial RAM file system image smaller and allows the initial RAM file system image to load faster.  Another option is to build a custom kernel which eliminates unneeded function to reduce the size of the kernel and its initial RAM file system.  (See my kernel-building web page for more information about how to build a custom kernel.)

Note: when using MODULES=dep, you should not "cross-build" your initial RAM file system images but always build them while running the corresponding kernel.  For example, if your computer has a traditional IDE hard disk (PATA), the 2.6.32-3-686 and 2.6.32-5-686 kernels use different drivers for this hard disk.  If you build the initial RAM file system image for the 2.6.32-3-686 kernel while running the 2.6.32-5-686 kernel (or vice versa) and use MODULES=dep, the initial RAM file system image will not contain the right drivers; and the next time you try to boot the other kernel, it will not boot.
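The driver-policy override and the rebuild for the running kernel might look like the following.  This is a sketch only: set_driver_policy is a hypothetical helper name, and the directory argument exists so that the function can be tried against a scratch directory before touching the real system.

```shell
#!/bin/sh
# Write a MODULES= setting to the driver-policy override file.
# Usage: set_driver_policy [CONF_DIR] [POLICY]
#   CONF_DIR defaults to /etc/initramfs-tools/conf.d, POLICY to dep.
set_driver_policy() {
    conf_dir=${1:-/etc/initramfs-tools/conf.d}
    policy=${2:-dep}
    printf 'MODULES=%s\n' "$policy" > "$conf_dir/driver-policy"
}

# As root, on the real system:
#   set_driver_policy
#   update-initramfs -u -k "$(uname -r)"   # rebuild for the running kernel only
```

Naming the running kernel explicitly with -k follows the advice in the note: each image is rebuilt only while its own kernel is booted.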

If you are running lilo 23.0 or later and you know (because of running the LILO diagnostic diskette) that your BIOS does not support access to extended memory above 16M via Int 15h Function 87h, you can explicitly specify the small-memory option (which of course is mutually exclusive with the large-memory option).  If you do this, lilo will warn you at map installer time if the decompressed kernel image and the compressed initial RAM file system image taken together will not fit between the 1M line and the 15M line.  This will allow you to do something about it before you attempt to shut down and reboot!

lba32 tells LILO to use 32-bit logical block addressing to read the kernel image file and the initial RAM file system image file from the disk.  (This also applies to the map file and to the VBRs of any non-Linux operating systems that LILO may be called upon to chain load.)  This is valid if your BIOS supports EDD packet addressing.  This is the case for most modern systems.  If this is not the case, use the linear option instead.  In this case, your /boot partition is restricted in location to that portion of the disk which is addressable according to the disk geometry returned by BIOS Int 13h Function 08h.

root="UUID=04db5929-51e6-424a-ac5b-a592b96b9d04" specifies which disk partition is to be mounted as the permanent root file system.  Again, you have to know your system to know what to specify here.  An alternate form is root="LABEL=xxx...".  This should match what is specified for the / mount point in /etc/fstab.  (The syntax is a little different, but the UUID or LABEL should be the same.)  The use of a block special file name, such as root=/dev/sda2, or a symbolic link to a block special file name, such as root=/dev/disk/by-uuid/04db5929-51e6-424a-ac5b-a592b96b9d04, is supported; but it is not recommended.  If the root device is specified as a block special file name or a symbolic link to a block special file name, lilo translates this specification into a hexadecimal number containing the kernel major and minor device numbers for that partition (as currently in use when the lilo command is run); and LILO passes that hexadecimal number (with leading zeros suppressed) to the kernel at boot time.  The full format of this 16-hex-digit number is MMMMMmmmmmmMMMmm, where the upper-case "M"s represent the hex digits of the major device number and the lower-case "m"s represent the hex digits of the minor device number.  For example, what the kernel sees at boot time might be something like
 

     root=802

This would mean major device number 0x8 and minor device number 0x2, which would be the second partition of the first SCSI disk. 

Note: at the time of this writing, initramfs-tools can only correctly handle such numbers if the major number, in decimal, is less than or equal to 4095 and the minor number, in decimal, is less than or equal to 255.  In other words, the first eleven digits of the sixteen-digit hexadecimal number must be zero in order for initramfs-tools to correctly handle the number.  I anticipate this bug eventually being fixed.
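To see how the major and minor numbers fall out of such a value, the format given above can be unpacked with a few shifts and masks.  This is a sketch under that stated MMMMMmmmmmmMMMmm layout; decode_root_dev and fits_initramfs_tools are hypothetical helper names, and the shell is assumed to have 64-bit arithmetic.

```shell
#!/bin/sh
# Split a hexadecimal root= device number into "major minor" (decimal).
# Layout assumed: MMMMMmmmmmmMMMmm, leading zeros optional.
decode_root_dev() (
    dev=$(( 0x$1 ))
    # Major: low 12 bits from hex digits 3-5 (counting from the right),
    # remaining bits from the five leftmost digits.
    major=$(( ((dev >> 8) & 0xFFF) | ((dev >> 32) & 0xFFFFF000) ))
    # Minor: low 8 bits from the two rightmost digits, remaining bits
    # from the six digits after the leading major digits.
    minor=$(( (dev & 0xFF) | ((dev >> 12) & 0xFFFFFF00) ))
    echo "$major $minor"
)

# Succeeds if the first eleven hex digits are zero, that is, if the
# major number is at most 4095 and the minor number is at most 255.
fits_initramfs_tools() {
    [ $(( 0x$1 >> 20 )) -eq 0 ]
}

decode_root_dev 802      # major 8 (SCSI), minor 2
decode_root_dev 301      # major 3 (traditional IDE), minor 1
```

The two calls print "8 2" and "3 1" respectively.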

Since the SCSI drivers (including the libata SCSI emulation drivers) and the traditional IDE drivers use different major numbers (8 for SCSI, 3 for traditional IDE), it will not work as expected if the major number used at map installer time does not match the major number used at boot time.  For example, if the lilo command is run under kernel 2.6.32-5-686, the major number used at map installer time is 8 (SCSI).  If kernel 2.6.32-3-686 is subsequently booted and the hard disk is IDE, the device major number will be 3 (IDE), not 8; and the root partition will not be found.  Further difficulties are encountered if the devices are not discovered in the same order at boot time as they were discovered in the boot session when the lilo command was run.  Only the root="UUID=xxx..." and root="LABEL=xxx..." forms can be used for a driver-independent and discovery-order-independent specification.  Note that both of these forms require quotation marks, since the parameter value contains an equal sign.

Using a root specification of the form root="UUID=xxx..." or the form root="LABEL=xxx..." causes LILO to pass that literal string to the kernel at boot time in the kernel boot parameters.  For example,
 

     root=UUID=xxx...

(Note that the quotes are removed by the time the value is passed to the kernel.)  However, in order for the kernel to be able to successfully translate this into a specific partition, such as /dev/sda2, your kernel must have an initial RAM file system; udev and its dependencies must be in the initial RAM file system; and udev must have been previously started with enough of a profile to enable it to find all the disk devices and their partitions, to have read all the UUIDs and LABELs from all the partitions, and to create symbolic links for them in the directories under /dev/disk.  Otherwise, the kernel will not be able to find the root file system.  Stock Debian kernels are set up this way; but if you have built a custom kernel without an initial RAM file system, you will not be able to use this method to identify the permanent root file system in the "root" parameter in /etc/lilo.conf.  This is yet another good reason to use an initial RAM file system.

While we're on the subject of the root file system, there are actually three stages of the root file system in a typical Debian boot.  Stage one is the initial RAM file system.  This is specified by the initrd configuration file statement.  LILO loads this file into storage along with the kernel image file and passes the address of the initial RAM file system to the kernel in a special way when it passes control to the kernel.  After the kernel decompresses itself and performs some other preliminary initialization, it decompresses the initial RAM file system image and mounts the decompressed image as the root file system during the early stages of the boot process.  A number of kernel modules are loaded from the initial RAM file system at this stage, including those required to do I/O to the disks.  The udev user-space process is launched from the initial RAM file system to find all the disk labels and UUIDs and create symbolic links in the various /dev/disk directories at this stage.

Stage two mounts the permanent root file system read only.  This is specified by the root and read-only configuration file statements.  fsck is run on the root file system at this stage, if needed; and repairs, if needed, are made.  After mounting the permanent root file system read only, the storage formerly occupied by the initial RAM file system is freed.  Also at this stage, the file /etc/fstab is read into storage from this stage two root file system.  Following that, swap partitions which are specified in /etc/fstab are activated.  Additional kernel modules may be loaded at this stage, and another instance of udev may be started to recognize other devices.

Stage three mounts the permanent root file system read/write.  This is specified by the /etc/fstab file, which was read into storage from the stage two root file system.  Other file systems are also checked and mounted at this stage.  Another instance of udev may be started at this stage to initialize a third group of devices.  It is important that the partition identified by the root configuration file statement in /etc/lilo.conf and the root file system partition identified in /etc/fstab be the same partition.
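A quick consistency check between the two files can be sketched as follows.  The helper names lilo_root, fstab_root, and same_root are hypothetical.  The comparison is purely textual, so it only gives a meaningful answer when both files use the same UUID= or LABEL= form; it cannot tell that /dev/sda2 and a UUID refer to the same partition.

```shell
#!/bin/sh
# Print the value of the first root= statement in a lilo.conf-style file,
# with any surrounding double quotes removed.
lilo_root() {
    sed -n 's/^[[:space:]]*root[[:space:]]*=[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Print the device field of the / entry in an fstab-style file.
fstab_root() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1; exit }' "$1"
}

# Succeeds only if both files name a root device and the names are identical.
# Usage: same_root [LILO_CONF] [FSTAB]
same_root() {
    l=$(lilo_root "${1:-/etc/lilo.conf}")
    f=$(fstab_root "${2:-/etc/fstab}")
    [ -n "$l" ] && [ "$l" = "$f" ]
}

# On the real system:
#   same_root && echo "root specifications match"
```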

read-only specifies that the permanent root file system identified by the root configuration file statement be mounted read only during stage two.  (See the description of root above.)  Debian init scripts expect this.  This results in the "ro" token being passed to the kernel in the kernel boot parameters.

The vga option tells the kernel what hardware text video mode is desired at boot time.  "normal" means the default 80x25 text mode set during power on.  Adjust this as desired.  For documentation on this, see Documentation/svga.txt in the Linux kernel source code.  See also The vga Option and Linux Fonts in this document.

The image, label, and initrd combinations specify a kernel which can be booted.  Note the use of the standard symbolic link names.  The default kernel (label=Linux) specifies /boot/vmlinuz as the kernel image name and /boot/initrd.img as its initial RAM file system image name.  This tells you two things: (a) symbolic links will be needed, and (b) they must be maintained in the /boot directory.  This ties in with /etc/kernel-img.conf, as we will see shortly.  The backup kernel image (label=LinuxOld) specifies /boot/vmlinuz.old as its kernel image name and /boot/initrd.img.old as its initial RAM file system image name.  These are the standard backup symbolic link names.  This menu item has the optional attribute, which means that lilo will not flag an error if these symbolic links do not exist (or are broken links).  The idea here is that you may have only one kernel installed; and if there is no backup kernel, that is not an error.

To boot the backup kernel, press the Shift key (by itself) within four seconds after you see "LILO" displayed on the screen.  This will cause a "boot:" prompt to be displayed.  You can then type "LinuxOld" (without the quotes) at the boot prompt and press Enter, and it will boot the backup kernel (if there is one).  If you can't remember the labels of your kernels, press the Tab key at a boot prompt.  LILO will display the names of the valid labels and issue another boot prompt.  You can then type the label of the kernel you want and press Enter.  If you want to add boot parameters (in addition to those specified in the append option), you can add them after the label name.  For example, if you type "Linux single" (without the quotes) at a boot prompt and press Enter, LILO will boot the default kernel in single-user mode. 

When supplying additional boot parameters or overriding boot parameters using the "boot:" prompt, you may supply parameters which are not valid for inclusion in the append option, such as root, ro (override with rw), or vga.  The "boot:" prompt supports a number of line-editing keys as well, such as Backspace, ^H, ^I, ^U, and ^X.  Backspace and ^H both delete the previously-typed character.  ^I is the same as pressing the Tab key: it displays a list of the valid label names.  ^U and ^X both delete the whole line and allow you to start over.

Once you are finished creating/editing the /etc/lilo.conf file, save the changes and exit the editor.  Now edit /etc/kernel-img.conf.  If you're running Squeeze (6.0) or later releases, it should look like this:
 

     # Kernel image management overrides
     # See kernel-img.conf(5) for details
     do_symlinks = yes
     relative_links = yes
     link_in_boot = yes

The combination of do_symlinks = yes, relative_links = yes, and link_in_boot = yes causes the symbolic links needed by lilo to be created and maintained in the /boot directory, which is where lilo will be looking for them.  The links will use relative path names instead of absolute path names.

In Squeeze (6.0) and later releases, the lilo package installs hook scripts in /etc/kernel/postinst.d, /etc/kernel/postrm.d, and /etc/initramfs/post-update.d which take care of getting the boot-loader installer run when it needs to be run.  These hook scripts can be improved upon, however.  See Debian bug report 599934 for details.  Although this "bug" has been fixed and is now incorporated into Wheezy (7.x), this fix might not ever appear in a Squeeze (6.0) stable point release; so you might want to think about fixing it yourself if you run Squeeze (6.0).

For Squeeze (6.0) and later releases, do_symlinks = yes only works for stock kernel image packages.  If you are using a custom kernel image package created by make-kpkg or by "make deb-pkg", do_symlinks = yes will have no effect.  In order to maintain the symbolic links when using a custom kernel, you will need to install your own hook scripts to maintain the symbolic links.  I have provided some that you can use for this purpose on my kernel-building web page.  (See the "Customizing the Squeeze (6.0) Environment" sub-section of "Customizing the Kernel Installation Environment".)

If you install these hook scripts, which I call zy-symlinks, they will be in effect for both custom kernels and stock kernels; so you should set do_symlinks = no in /etc/kernel-img.conf to avoid duplication of effort.  Of course, if do_symlinks = no is specified, the values specified for relative_links and link_in_boot are meaningless, so you might as well delete them.  In summary, all you really need in /etc/kernel-img.conf for Squeeze (6.0) and later releases is do_symlinks = no, provided that the zy-symlinks hook scripts have been installed.  After you are finished editing /etc/kernel-img.conf, save the changes and exit the editor.

Now for a not-so-obvious step, which is needed for all releases.  We need to set the initial symbolic links manually.  They will be maintained automatically after this, but we have to define them correctly the first time.  For example:
 

     cd /
     rm vmlinuz
     rm initrd.img
     rm vmlinuz.old
     rm initrd.img.old
     cd boot
     ln -s vmlinuz-2.6.32-5-686 vmlinuz
     ln -s initrd.img-2.6.32-5-686 initrd.img
     ln -s vmlinuz-2.6.32-3-686 vmlinuz.old
     ln -s initrd.img-2.6.32-3-686 initrd.img.old

Use kernel image file names and initial RAM file system image file names that are appropriate for your system, depending on what kernel version(s) are installed.  Of course, if there is no previous kernel, you don't have to remove or create the "old" links.
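
The link-creation steps above can also be expressed as a small shell function.  This is only a sketch: set_boot_links is a hypothetical helper name, not something provided by lilo or Debian, and it handles only the links in the boot directory (the stale links in / still have to be removed as shown above).  It refuses to create a link whose target file does not exist.

```shell
# set_boot_links BOOTDIR NEWVERSION [OLDVERSION]
# Hypothetical helper: (re)creates relative symbolic links in BOOTDIR.
# The body runs in a subshell, so the caller's working directory is unchanged.
set_boot_links() (
    cd "$1" || exit 1
    # Refuse to create dangling links for the current kernel.
    for f in "vmlinuz-$2" "initrd.img-$2"; do
        [ -e "$f" ] || { echo "missing: $1/$f" >&2; exit 1; }
    done
    ln -sfn "vmlinuz-$2" vmlinuz
    ln -sfn "initrd.img-$2" initrd.img
    # The "old" links are only created when a previous version is given.
    if [ -n "${3:-}" ]; then
        for f in "vmlinuz-$3" "initrd.img-$3"; do
            [ -e "$f" ] || { echo "missing: $1/$f" >&2; exit 1; }
        done
        ln -sfn "vmlinuz-$3" vmlinuz.old
        ln -sfn "initrd.img-$3" initrd.img.old
    fi
)
```

For the example above, the equivalent call (as root) would be "set_boot_links /boot 2.6.32-5-686 2.6.32-3-686".  Omit the last operand if there is no previous kernel.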

Now remove the residual grub files that didn't get purged by issuing the command "rm -r /boot/grub".  On Squeeze (6.0) and later releases, manually delete any hook scripts which were installed by your old boot-loader package that didn't get purged when the package was purged.  In particular, check /etc/kernel/postinst.d, /etc/kernel/postrm.d, /etc/kernel/preinst.d, /etc/kernel/prerm.d, and /etc/initramfs/post-update.d for files such as zz-update-grub, etc.  If files like these are found, delete (rm) them.  Make sure that the postinst_hook = update-grub and postrm_hook = update-grub lines, if they were present, have been deleted from /etc/kernel-img.conf.
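
If you would rather search for leftover hook scripts than inspect each directory by eye, something like the following sketch may help.  The function name list_grub_hooks and its optional root-prefix operand are my own inventions for illustration; the directory list is the one given above.  It only lists candidate files; it deletes nothing.

```shell
# list_grub_hooks [ROOT]
# Hypothetical helper: prints hook scripts whose names contain "grub".
# ROOT is an optional prefix (empty means the real root file system).
list_grub_hooks() {
    root=${1:-}
    for d in etc/kernel/postinst.d etc/kernel/postrm.d \
             etc/kernel/preinst.d etc/kernel/prerm.d \
             etc/initramfs/post-update.d; do
        # Skip hook directories that do not exist on this system.
        [ -d "$root/$d" ] || continue
        find "$root/$d" -type f -name '*grub*'
    done
}
```

Review the output of "list_grub_hooks" before removing anything, since a file name containing "grub" is only a hint that the file belongs to your old boot-loader package.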

Finally, make sure everything works by issuing "dpkg-reconfigure linux-image-$(uname -r)".  The maintainer script will rerun depmod, which really isn't necessary in this case; then it will rebuild the initial RAM file system image, which may or may not be needed depending on whether or not you changed any files under /etc/initramfs-tools; then it will maintain the symbolic links (for Squeeze (6.0) and later releases if the zy-symlinks hook scripts have been installed) or issue informational messages indicating that the maintenance of the symbolic links is being skipped; then finally it will run lilo, which will write out LILO's first-stage boot loader for the first time.

The first time lilo rewrites a boot sector (MBR, VBR, or first EBR) to write out its first-stage boot loader, it will make a backup copy of the old version of the sector for backout purposes in the /boot directory.  The naming convention of the backup copy is boot.xxxx, where xxxx is the kernel composite device number of the partition or disk whose boot sector is being rewritten.  (See the description of the root configuration file record for more information about kernel composite device numbers.)
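
To predict what the backup file will be called, you can build the name from the major and minor device numbers shown by "ls -l" for the device.  The sketch below assumes that the composite device number is the major number followed by the minor number, each printed as two hexadecimal digits, which only holds when both numbers fit in eight bits.  The function name lilo_backup_name is hypothetical, and the use of upper-case hexadecimal letters is also an assumption; check the actual file names in /boot on your system.

```shell
# lilo_backup_name MAJOR MINOR
# Hypothetical helper: prints the expected backup file name, assuming
# a composite device number of the form (major * 256 + minor).
lilo_backup_name() {
    printf 'boot.%02X%02X\n' "$1" "$2"
}
```

For example, /dev/sda normally has major number 8 and minor number 0, so "lilo_backup_name 8 0" prints boot.0800.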

If you formerly had GRUB (either version) installed in the MBR and you elected to install LILO's first-stage boot-loader program somewhere other than the MBR, then GRUB still may be installed in the MBR.  Or perhaps removing GRUB restored whatever was in the MBR before GRUB was installed, whatever that was.  In either case, you want to make sure that a functional generic MBR boot-loader program exists in the MBR before you shut down your system.  Otherwise, you probably won't be able to boot anything.  The LILO generic MBR boot-loader program will meet this requirement.  For example:
 

     lilo -M /dev/sda mbr

        or

     lilo -M /dev/sda ext

depending on whether you want the standard (mbr) or extended (ext) generic MBR boot-loader program installed.  (Specify the appropriate device name for your boot disk.)

The first time that lilo writes a generic MBR boot-loader program to the MBR, it will make a backup copy of the old MBR in the /boot directory first, for backout purposes.  The name of the file will be boot.xxxx, where xxxx is the kernel composite device number of the boot disk.  (See the description of the root configuration file record for more information about kernel composite device numbers.)

Also, make sure that the partition which contains the LILO first-stage boot loader is marked active.  This can be done with the "activate" (a) subcommand of the Linux "fdisk" command, but take care: marking a partition active with the "activate" (a) subcommand does not automatically deactivate the partition which was previously marked active.  Therefore, if you aren't careful, you can easily end up with two partitions marked active at the same time.  In this case, some MBR boot-loader programs will chain load the first partition they find which is marked active.  And in most cases, that will be the wrong one.  Other MBR boot-loader programs will refuse to boot anything under these conditions.  If you're using fdisk, deactivate the active partition first with the "activate" (a) subcommand; then activate the new partition with the "activate" (a) subcommand.  Use the "print" (p) subcommand to verify that only one partition is marked active.  (Active partitions will have an asterisk (*) in the "Boot" column.)  Then use the "write" (w) subcommand to apply changes and exit fdisk.
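
As a cross-check on the "print" (p) output, the boot flags of the four primary partition table entries can be read directly from the first sector.  The sketch below relies on the standard MBR layout: the partition table starts at byte offset 446, each entry is 16 bytes long, and the first byte of an entry is 0x80 when the partition is marked active.  The function name count_active is hypothetical.  Flags of logical partitions live in the EBRs and are not examined here.

```shell
# count_active DEVICE_OR_IMAGE
# Hypothetical helper: counts primary partition table entries whose
# boot flag byte is 0x80.  Reading a real disk requires root.
count_active() {
    n=0
    for off in 446 462 478 494; do
        # Read the single flag byte at this offset as two hex digits.
        flag=$(od -A n -t x1 -j "$off" -N 1 "$1" | tr -d ' \n')
        if [ "$flag" = "80" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

On a correctly set up disk, "count_active /dev/sda" should print 1.  (Specify the appropriate device name for your boot disk.)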

You can also use lilo to mark a partition active.  For example:
 

     lilo -A /dev/sda 2

will mark the second partition on the first SCSI disk as active.  Unlike the Linux "fdisk" command, using lilo to mark a partition active will also automatically mark all other partitions inactive.  However, if you plan to mark a logical partition active with lilo, make sure that you install the extended version of the generic MBR boot-loader program first.  A logical partition has a partition number of five or higher.  For a partition number higher than four, lilo will not mark it active (or inactive) unless the extended version of the generic MBR boot-loader program is already installed.  You can also specify the above command without a partition number to determine which is the currently active partition on a disk.  However, a partition number higher than four will not be searched for an active flag unless the extended version of the generic MBR boot-loader program is already installed.

If you use the Linux fdisk command to mark a logical partition active (or inactive), fdisk does not check to see if the extended version of the LILO generic MBR boot-loader program is installed.  fdisk will let you mark a logical partition active or inactive regardless of which boot loader, if any, is installed in the MBR.  However, chain loading LILO's first-stage boot loader from a logical partition will not actually work at boot time unless the extended version of the LILO generic MBR boot-loader program has been installed to the MBR prior to booting.

If you changed a file in /etc/initramfs-tools and you have a backup kernel, shut down and boot your old kernel; then run "update-initramfs -u" on it to rebuild the initial RAM file system image for the backup kernel too.  For example:
 

     update-initramfs -u -k $(uname -r)

This will rebuild its initial RAM file system image, which is necessary; and lilo will also get run again.  (Remember not to "cross-build" your initial RAM file system images if you are using MODULES=dep.)  If there is no backup kernel, rebuild the initial RAM file system image for the running kernel, just to make sure that lilo gets run when "update-initramfs -u" is issued.  This is an important test.

Booting Non-Linux Operating Systems with LILO

If you decided to install the LILO first-stage boot loader to the MBR and you have other operating systems on your hard disk that you need to be able to boot as well, such as Windows, you should know that the LILO first-stage boot loader ignores the partition boot flags.  LILO, apart from user intervention within the delay time, will always boot the kernel identified by the default option in /etc/lilo.conf, regardless of which partition is marked active.  Therefore, if you decide to install LILO to the MBR, you will need to add the non-Linux Operating System(s) to LILO's boot menu.  (Actually, you can add the non-Linux Operating System(s) to LILO's boot menu even if you didn't install LILO to the MBR, if you wish to do so.  This will result in a "double chain load" boot.  This is not the typical setup, but it works.)

The man page for lilo.conf will tell you how to do this, but here is an example for the most common case: DOS or Windows in a primary partition on the boot drive:
 

     other=/dev/disk/by-id/ata-IBM-DBCA-203240_HP0HPL43952-part1
          label=Windows

Add something like the above to the end of the per-image section of /etc/lilo.conf, save the changes, and exit the editor.  Then run the "lilo" command with no operands.  (Anytime you make a change to /etc/lilo.conf, you must run the lilo command afterwards for the change to take effect.  And of course, you won't see the effect of the changes until the next boot.)  "Windows" will now be a valid name to type at a LILO "boot:" prompt, though it cannot be passed any arguments.  "Windows" will also now show up as a valid name to type if you press the Tab key at a "boot:" prompt, along with "Linux" and "LinuxOld".

Note that DOS or Windows partitions will probably not have a UUID; so you may need to specify a udev-created symbolic link from /dev/disk/by-id rather than from /dev/disk/by-uuid, as was done above.  Again, if you have a traditional IDE hard disk (PATA), use a symbolic link which starts with "ata" rather than one which starts with "scsi", as the ones which start with "scsi" will not be present if you boot an older kernel that does not use the libata SCSI emulation drivers.  A Windows partition might have a label symbolic link in /dev/disk/by-label.  You can check to see if you like.  The point is that you want to use a udev-created symbolic link rather than a direct block special file name, such as /dev/sda1.

MBR versus VBR or EBR

Most people tend to install LILO to the MBR, since it is the path of least resistance.  But if you have another operating system besides Linux installed on your hard disk, especially if it is a recent release of Windows, it is usually safest to install LILO to a VBR or the first EBR rather than the MBR.  The reason for this is that Windows thinks it owns the MBR, and applying maintenance to Windows may cause Windows to replace the currently-installed MBR boot-loader program (LILO's first-stage boot loader) with the latest Windows version of the MBR boot-loader program.  Installing Windows on your machine after Linux is already installed tends to do the same thing: replace LILO's first-stage boot loader with the Windows MBR boot-loader program.  If it does this, your Linux system is now unbootable!

Another reason to install LILO to a VBR or the first EBR, rather than the MBR, is that BIOS utilities might work better with this configuration.  For example, I have a number of Dell machines that contain a "utility partition".  The hardware vendor intended for the utility partition to be bootable from the BIOS boot menu.  For example, if one presses the F12 key during POST (Power On Self Test), the BIOS boot menu is displayed.  One of the choices in the BIOS boot menu is "Boot to Utility Partition".  This only works if a standard MBR boot-loader program is installed to the MBR (i.e. one which simply chain loads the boot loader installed in the active partition).  If LILO's first-stage boot-loader program is installed to the MBR, the utility partition cannot be booted in this manner.  (It doesn't work if either version of GRUB is installed to the MBR either.)  The utility partition can be added to LILO's boot menu, of course; but the hardware technicians are not used to booting the utility partition via the LILO "boot:" prompt.  Installing LILO's first-stage boot-loader program to a VBR or the first EBR, marking that partition active, and using a generic MBR boot-loader program in the MBR solves the problem.

Note: If you install LILO to a VBR or the first EBR in order to avoid a conflict with the Windows MBR boot-loader program, make sure that the Windows FDISK program (or DISKPART in newer releases) is installed in your Windows system, so that you can mark the LILO partition for booting while running Windows.  Otherwise, if Windows gets marked as the active partition during the application of Windows maintenance, you won't be able to switch back to Linux again without first booting from a rescue system to alter the boot flags.  Many non-Linux fixed-disk utilities, such as the FDISK command of DOS or older versions of Windows, do not believe in booting from an extended partition (i.e. the first EBR) and will refuse to mark it active.  They may also refuse to mark a logical partition active, or it may not even be possible to make such a request with this tool.  If this is where LILO's first-stage boot-loader program was installed, you will first need to boot some type of rescue system in order to alter the boot flags in the PT, so that the partition containing LILO's first-stage boot-loader program is once again marked active (and all other partitions are marked inactive).  This is another reason why installing LILO's first-stage boot-loader program to a primary partition is preferred: every fixed-disk utility program ever written allows a primary partition to be marked active.

Creating Boot Floppies

You may wish to create a boot floppy to boot your system in case something goes wrong with your normal hard disk boot mechanism.  For example, maybe you elected to install LILO to the MBR and you want a backup method of booting your Linux system in case Windows replaces the MBR during the application of Windows maintenance.  Or maybe you are a programmer making changes to LILO and you don't know if your changes are going to work or not.  Or maybe you want to test LILO before wiping out grub.  Whatever the reason you want to create one, here's how to do it.

First, insert a formatted floppy diskette in the floppy drive.  It can either be DOS formatted or formatted with a Linux file system, such as ext2.  It should be a diskette with the default density of the drive.  Make sure it is write enabled.  Then, issue a command something like this:
 

     lilo -b /dev/fd0 -m /boot/floppy_map

This assumes that the LILO configuration file, /etc/lilo.conf, has already been created.  The two options specified on the command line override settings in the configuration file.  (By default, this is /etc/lilo.conf; but the name of the configuration file can also be overridden by means of the "-C" option.  If the "-b" and "-m" options are specified in conjunction with the "-C" option, they will override settings in the file specified by the "-C" option.)  The -b option overrides the boot configuration file statement.  It specifies that the LILO first-stage boot-loader program is to be written to /dev/fd0 instead of where it would normally be written.  (This assumes that /dev/fd0 is the block special file name or a symbolic link to the block special file name of your floppy drive.  Change this as appropriate for your system.)  The -m option overrides the map configuration file statement.  It specifies that the map file is to be written to /boot/floppy_map instead of where it would normally be written.  (In the sample /etc/lilo.conf file above, we did not use a map configuration file statement.  If one is not provided, the default value is /boot/map.)  You can safely ignore warnings such as
 

     Warning: Ignoring entry 'boot'
     Warning: Ignoring entry 'map'
     Warning: boot record relocation beyond BPB is necessary: /dev/fd0
     Warning: The boot sector and map file are on different disks.

(The third warning, the one about boot record relocation beyond the BPB, will only be present if you use a DOS-formatted floppy.)  It is very important to override both the boot option and the map option when creating a boot floppy.  If you override only the boot option when creating a boot floppy, you will mess up the correctly-working boot mechanism of your hard disk.  (Or if LILO has not been installed to the hard disk yet, then when you do install LILO to the hard disk you will mess up the correctly-working boot mechanism of your boot floppy.)  Why is this the case?  Well, the first-stage boot loader points to the map file that was created along with it.  This map file contains the second-stage boot loader, as well as pointers to the blocks of the kernel image files and the initial RAM file system image files that were referenced in the configuration file at the time the lilo command was run.

If lilo is run with the "-b" option to direct output to a floppy, but without the "-m" option to override the map file name, then a new map file with the same name as before will be created; and the boot floppy will point to it.  The old version of the map file will be deleted.  This means that the physical blocks on disk which contained the old version of the map file are now unallocated blocks in the file system and are eligible for re-use.  The first-stage boot loader which was earlier installed to the hard disk still points to these blocks.  It may continue to work for a time, but eventually one or more of these blocks will be reused for another file, and the contents of the blocks will change.  And now the hard disk boot mechanism no longer works.

As an alternative to overriding the boot and map specifications with command line options, you can edit the configuration file to alter the boot and map specifications temporarily; run lilo; then change the configuration file back the way it was.  Or you can have a separate configuration file that you only use for creating boot floppies and override the configuration file by using the "-C" option.  The important thing to remember is that each copy of a LILO first-stage boot-loader program created by running the lilo command must have its own unique map file associated with it.  Multiple copies of the LILO first-stage boot-loader program mean multiple map files.
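
If you choose the separate-configuration-file approach, the floppy version of the file can be derived from the normal one instead of being maintained by hand.  The following is a sketch only: the function name make_floppy_conf is hypothetical, and the values /dev/fd0 and /boot/floppy_map are simply the ones used in the earlier example.  It drops any existing boot and map statements and puts the floppy-specific ones at the top of the new file, ahead of the rest of the global section.

```shell
# make_floppy_conf SOURCE_CONF FLOPPY_CONF
# Hypothetical helper: copies SOURCE_CONF to FLOPPY_CONF with the boot
# and map statements replaced by floppy-specific values.
make_floppy_conf() {
    {
        echo "boot=/dev/fd0"
        echo "map=/boot/floppy_map"
        # Delete the original boot= and map= statements, keep the rest.
        sed -e '/^[[:space:]]*boot[[:space:]]*=/d' \
            -e '/^[[:space:]]*map[[:space:]]*=/d' "$1"
    } > "$2"
}
```

After something like "make_floppy_conf /etc/lilo.conf /etc/lilo.floppy.conf" (the second file name is an arbitrary choice), the boot floppy would be written with "lilo -C /etc/lilo.floppy.conf".  Regenerate the floppy configuration file whenever /etc/lilo.conf changes.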

Of course, to actually boot from the floppy diskette, your BIOS must be configured to allow booting from the floppy drive.  Also, if you want to keep your boot floppy current, you must manually reissue the above command as needed for maintenance reasons.  For example, if a kernel image file is updated or an initial RAM file system image file is rebuilt, you will need to rerun both the normal lilo command (which will normally be done for you by Debian) as well as the special lilo command to update the boot floppy (which you will need to do manually).
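
One rough way to notice that a boot floppy needs to be refreshed is to compare modification times: if a kernel image file or initial RAM file system image file is newer than the floppy's map file, the map file almost certainly predates it.  The sketch below is a heuristic, not a guarantee, and map_is_stale is a hypothetical name.  It uses the "-nt" (newer than) test, which bash and dash support but which is not required by POSIX.

```shell
# map_is_stale MAPFILE FILE...
# Hypothetical helper: succeeds (exit status 0) if MAPFILE is missing
# or if any of the listed files is newer than MAPFILE.
map_is_stale() {
    map=$1
    shift
    [ -e "$map" ] || return 0
    for f in "$@"; do
        if [ "$f" -nt "$map" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, "map_is_stale /boot/floppy_map /boot/vmlinuz /boot/initrd.img && echo rerun lilo for the floppy" prints a reminder when the floppy map file looks out of date.  The symbolic links are followed, so the times compared are those of the real image files.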

Boot floppies have their uses, but they are not a panacea for all boot problems with your hard disk boot mechanism.  For example, if you have installed LILO to the MBR and applying Windows maintenance caused the MBR boot-loader program to be replaced by the latest version of the Windows MBR boot-loader program, you can overcome this problem by booting your Linux system via a previously-created boot floppy, if you have one.  Then simply running the lilo command will reinstall the LILO first-stage boot-loader program to the MBR.

On the other hand, if you updated your kernel or you updated its initial RAM file system and lilo did not get rerun for some reason, then both your hard disk boot copy and your floppy diskette boot copy are out of sync with the hard disk.  If you can't boot from the hard disk, then chances are you can't boot from the boot floppy either.  If you have a backup kernel, you may still be able to boot that.  But if you don't have a backup kernel, or if the backup kernel or its initial RAM file system is also out of sync with the map files, you will need a rescue CD, or something of that nature, that has a completely self-contained boot mechanism.  A completely self-contained boot floppy would have to contain not only LILO's first-stage boot-loader program, but also the map file, a kernel image file, and its initial RAM file system image file.  And that is way too much data to fit on a floppy diskette when using modern Linux kernels and their initial RAM file system images.

If you do put anything on the floppy diskette other than the boot sector, you will need to mount the floppy diskette's file system.  Here's an example that puts the map file on the floppy disk:
 

     [insert formatted floppy diskette, enabled for writing]
     mke2fs -L LILO -t ext2 /dev/fd0
     mount /dev/fd0 -t ext2 /media/floppy0
     lilo -b /dev/fd0 -m /media/floppy0/floppy_map
     umount /media/floppy0
     [remove floppy diskette]

Do not remove the floppy diskette until after a successful umount, the floppy disk activity light has gone out, and the diskette has come to a complete stop. 

Sometimes, the udisks daemon will give you trouble.  The udisks daemon is designed to work with certain desktop environments, such as GNOME, to automatically mount disks inserted into removable-media drives.  This is more often a problem with CD-ROM drives, but it can be a problem for floppy drives too.  If this is the case, you can disable polling while you do your work.  For example, in one terminal session, issue the command
 

     udisks --inhibit-polling /dev/fd0

The command will not terminate, so don't wait for it.  Switch to another terminal session to do your work with the floppy diskette.  When you are finished working with the floppy diskette, and you have removed the floppy diskette from the floppy drive, switch back to the terminal session which is running the udisks command and type ^C (Ctrl+C) to cancel it.

The superformat utility from the fdutils package has the capability to format a floppy diskette with a higher-than-normal capacity by using various tricks.  For example, a floppy diskette with a nominal capacity of 1440KB can be formatted with a capacity of up to 1992KB.  Some of these disk formats are usable as a LILO boot disk and some of them are not.  As a general rule, disk formats which use a sector size other than 512 bytes, use a non-standard data transfer rate, or require switching back and forth between sides in order to read logically contiguous sectors cannot be used as a LILO boot disk.  Of the formats described in the documentation, the following formatting techniques can, at least in theory, be used to format a LILO boot disk: "More sectors per track", "Using interleave", "Sector skewing", and "More cylinders per disk".  The following formatting techniques or formats cannot be used for a LILO boot disk: "Larger sectors", "Mixed sector sizes", "Smart use of the data transfer rate", "2M formats", "XDF format", and "XXDF format".

For a 3.5-inch high-density diskette in a 3.5-inch high-density drive, the maximum capacity that can be obtained for use as a LILO boot diskette is 1743KB, using 21 sectors per track, interleave factor 2, sector skewing, 83 cylinders, the standard sector size of 512 bytes, and the standard data transfer rate.  The throughput is 26KB/s.  Whether you can actually achieve this depends on your hardware.  Use these techniques at your own risk.

Special Considerations for USB Floppy Drives

Traditional floppy disk controllers and their attached floppy disk drives are often not included in newer systems anymore, especially in laptops and notebooks, where space is at a premium.  However, these systems often accept an externally-attached USB floppy disk drive as an accessory.  The BIOS is set up (or can be set up) to recognize the USB floppy drive as BIOS device number 0x00, the traditional "first floppy disk" BIOS device number, so that the machine can boot from this floppy drive.  Such is the case, for example, with my IBM ThinkPad X31.

There are special considerations for using a USB-attached floppy drive in Linux and with lilo.  Linux does not recognize such a device as /dev/fd0, the traditional Linux device name for the first floppy drive.  Thanks to the SCSI emulation drivers used by modern Linux kernels, the USB floppy drive shows up as a SCSI disk, with a name like /dev/sda, /dev/sdb, etc.  Like all other SCSI disks in the system, the actual device name can vary from one boot to the next, as has been previously discussed.  The trick is to get a symbolic link defined so that /dev/fd0 is a symbolic link to the actual block special file name of the floppy drive, regardless of what that name is in the current boot.  This does not happen automatically: you have to write your own udev rule to accomplish this.

Here's an example from my IBM ThinkPad X31.  I have created a file called /etc/udev/rules.d/99-local.rules.  Its contents are as follows:
 

     SUBSYSTEM=="block", SUBSYSTEMS=="scsi", ATTRS{vendor}=="TEAC", ATTRS{model}=="FD-05PUB", SYMLINK+="fd0"

(For older releases of udev, use SYSFS instead of ATTRS.)  Obviously this rule is geared to match on a specific make and model of USB floppy drive.  Change these values as appropriate for your specific floppy drive.  The "lsusb -v" command can help determine what those values should be.  A reboot is normally needed after creating this file for the new rule to take effect.  After the reboot, you should see a symbolic link called /dev/fd0 which points to the actual block special file name of the floppy drive, such as /dev/sdb.  The floppy drive does not need to be plugged in at boot time: you can hot-plug the drive.  However, the new udev rule must be in effect at the time the device is discovered in order for the symbolic link to be created.

The second thing you need to do is to tell lilo that the BIOS device number for this device is 0x00.  In the typical scenario, /dev/fd0 points to /dev/sdb, and lilo will generally assume that the BIOS device number for this disk is 0x81, which is incorrect.  Add the following lines to the global section of /etc/lilo.conf:
 

     disk=/dev/fd0
             bios=0x00

These lines must be in the file at the time that the lilo command is run.  Once you have done these two things, you should be able to use USB floppy drives with Linux and lilo.  If you use mke2fs to create a file system on a floppy disk mounted in an external USB floppy drive, you will probably get a warning message from mke2fs that you are trying to make a file system on the entire disk, rather than on a partition of the disk.  Linux normally treats external USB disk drives, whether floppy or hard, as though they were hard drives.  Go ahead and respond with a "y" to the "are you sure?" prompt.  It is actually a floppy drive, not a hard drive; and you want to treat the diskette as a non-partitioned device, not a partitioned device. 

Many of the utilities in the fdutils package, including superformat, do not work with USB floppy drives, since USB floppy drives do not allow direct access to the floppy controller.  However, once formatted in a standard floppy drive, it may be possible to use these extended format diskettes in USB floppy drives, provided the restrictions named above are observed.  It all depends on your hardware.  Again, use at your own risk.

Using USB Memory Sticks Instead of a Boot Floppy

USB memory sticks can generally operate in one of three modes: USB-FDD emulation mode (floppy drive emulation), USB-ZIP emulation mode (ZIP drive emulation), or USB-HDD emulation mode (hard drive emulation).  When booting from one of these devices, the BIOS will assign BIOS device number 0x00 to a USB-FDD device or a USB-ZIP device, the same BIOS device number traditionally assigned to the first floppy drive; but a USB-HDD device will get BIOS device number 0x80, the same BIOS device number traditionally assigned to the first hard drive.  You don't want to disturb BIOS device number assignments to your hard drives; therefore, you should use a USB memory stick in USB-FDD emulation mode or USB-ZIP emulation mode.

USB-ZIP emulation mode is preferable, since that allows for more usable disk space.  (In USB-FDD mode you are limited to 1.44M of disk space, regardless of how much memory the device has.)  In USB-ZIP emulation mode you will have enough disk space to put the map file, a kernel, and its initial RAM file system all on the ZIP drive, in addition to the boot sector, which will allow you to have a completely self-contained boot disk.  However, if your BIOS does not support booting from a USB-ZIP device, USB-ZIP mode will do you no good.  Make sure that your BIOS supports booting from a USB device in the operational mode that you are going to use.  Using a USB memory stick in USB-FDD emulation mode or USB-ZIP emulation mode with Linux and LILO has the same considerations as using an external USB floppy drive, as documented above.

The vga Option and Linux Fonts

The vga option is the only kernel option which LILO sets by zapping the kernel image boot sector (in memory) after the kernel image has been loaded into memory, but before control is transferred to the kernel.  (LILO zaps the value set during kernel compilation by the SVGA_MODE variable in arch/x86/boot/Makefile.)  All other options are passed in character form via the kernel command line, but the vga option cannot be handled by passing a string of characters on the kernel command line because the kernel sets the video mode before it parses the command line!  Therefore, if you are perusing the boot messages with
 

     dmesg|less

you will never see the vga option in the "Kernel command line:" line.

Changing this option may necessitate selecting a font of a different pixel height than the default pixel height of 16.  For example, let's say that you want an 80x34 text video mode (80 columns by 34 lines) instead of the default 80x25 text video mode.  You would code vga=3846 to get this video mode.  (lilo also now supports specifying the video mode in hexadecimal, such as vga=0xf06.)  But this video mode requires a font that is 14 pixels high, and the default font in Linux is 16 pixels high.  The boot messages will look OK initially, since initially the font provided by the video BIOS will be used; but once Linux loads its default font, everything will be messed up.
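
Since video mode numbers are usually documented in hexadecimal but are often coded in decimal, it is handy to be able to convert between the two forms.  The shell's printf command can do this; the two function names below are hypothetical wrappers added only for illustration.

```shell
# Hypothetical helpers for converting a vga= mode number.
# vga_to_decimal accepts a hexadecimal value such as 0xf06.
vga_to_decimal() {
    printf '%d\n' "$1"
}

# vga_to_hex accepts a decimal value such as 3846.
vga_to_hex() {
    printf '0x%x\n' "$1"
}
```

Here "vga_to_decimal 0xf06" prints 3846 and "vga_to_hex 3846" prints 0xf06, which are the two spellings of the 80x34 mode used in the example above.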

For Squeeze (6.0) and later releases, console-setup is the controlling package for fonts; and you would edit the file /etc/default/console-setup.  You would either specify the font via the FONTFACE and FONTSIZE variables (FONTFACE=VGA, FONTSIZE=14) or set FONTFACE and FONTSIZE to the null string and specify a font via the FONT variable.  For example, you might specify FONT="lat1u-14.psf.gz".  The valid values for the FONT variable are the file names in the directory /usr/share/consolefonts.  There may be a few fonts in this directory if the console-data package is not installed; but if the console-data package is installed, you will have a large selection of fonts to choose from.  In any case, make sure that you specify a font of the correct height for the video mode you have selected.

If you run an X server on this machine, keep in mind that the X drivers for a number of chipsets under Squeeze (6.0) and later releases now either require Kernel Mode Setting (KMS) in order to work or at least enable KMS by default.  One side effect of this is that the non-X virtual consoles (by default, vt1-vt6) operate in a frame-buffer environment, which will make the initial text video mode selected by the vga option obsolete very soon after booting.  In this case, you should leave "vga=normal" specified (or don't specify the vga option at all).

At the time of this writing, the free nv driver, which has been dropped from the distribution in Wheezy (7.x), but is still present in Squeeze (6.0), and the proprietary nvidia driver are notable exceptions to the general trend of using KMS-based X drivers in Squeeze (6.0) and later releases.  If you are using either of these drivers for your X server, your text consoles (by default, vt1-vt6) are still operating in a true hardware text video mode; and you can use the vga option to specify which hardware text video mode you want to use for your text consoles.  (There are other non-KMS-based X drivers too, such as the ATI MACH64 driver.)  As documented in Documentation/svga.txt, you can use vga=ask to determine which modes are available for the combination of video chipset, video BIOS, and monitor that you are using, particularly if you do a scan.  Just remember that each text video mode has its own font height requirement, which unfortunately is not listed on the screen!  The font height requirement (i.e. the character cell height) is listed in the above-mentioned documentation for some, but not all, of the available text modes.

If you cannot find documentation for the character cell height required for a given text video mode, or if the documentation proves to be incorrect, the grabmode utility from the svgatextmode package can be used to determine the font height associated with a true hardware text video mode.  The svgatextmode package was last seen in the Lenny (5.0) distribution: it was dropped from Squeeze (6.0) before Squeeze became the stable release due to lack of support in svgatextmode for many current video chipsets and the growing trend in Linux toward KMS and framebuffer environments, which makes the portion of the package which sets video modes useless.  However, the grabmode utility from this package examines only standard VGA registers and will work with any chipset, as long as the card has VGA-compatible registers.  You must run grabmode as root on a text console which is using a true hardware text video mode to obtain correct results.

For your convenience, here is a link to a copy of the grabmode binary for the i386 architecture.  (It has some bug fixes not present in the official version.)  I hope it does not violate any licensing terms to distribute the binary by itself.  If it does, someone please tell me and I will take whatever steps are necessary to correct the problem.  Be sure to download it in binary.  Then switch to root and issue:
 

     chgrp root grabmode
     chown root grabmode
     chmod +x grabmode

It must then be moved to a directory which is in the PATH environment variable or else be path qualified when you invoke it.  For example, if it is in the current directory and the current directory is not in the PATH environment variable, you can use
 

     ./grabmode

to invoke it.  No parameters are necessary, and output is to the console.  As with anything else on this website, use it at your own risk.  (However, since it only examines the VGA registers and does not attempt to alter them, it should be safe.)  Run this program while using the font set by the video BIOS.  (Comment out the font specifications in /etc/default/console-setup and reboot, so that Linux doesn't load its own font.)  The assumption is, of course, that the video BIOS will always load a font of the correct height.  Here is some sample output from grabmode:
 

     ./grabmode: WARNING: Please be patient. This may take a while (up to 1 minute)
     "80x34"   28.343   640 680 776 800   476 491 493 525   -Hsync -Vsync  font 9x14  # 31.491kHz/59.98Hz

Do not interrupt the program (with ^Z for example) or switch to another virtual terminal while the program is running.  For best results, the system should be otherwise idle while running this program.  Output is in a similar format to a "Modeline" statement for an X server configuration file.  The only thing of interest in this context is the character cell size, which is listed immediately after the "font" keyword.  In the above example, the character cell size is 9x14.  The first number is the character cell width.  It will always be either 8 or 9, and you don't need to be concerned about it.  (All the font specifications contain the information needed to construct both eight- and nine-pixel-wide fonts.)  The second number (after the x) is the character cell height.  If the keyword "font" and the subsequent character cell size are missing from the output of grabmode, then you probably aren't running it in the foreground from a virtual terminal which has control of the real terminal and is running in a true hardware text video mode.

As a sanity check on the output, the horizontal resolution in pixels divided by eight should equal the number of text columns; and the vertical resolution in scan lines divided by the character cell height should equal the number of text rows.  (Yes, you always divide the "horizontal resolution" by eight, even if the character cell width is nine.  That is due to the fact that the "horizontal resolution" is not really the horizontal resolution in pixels.  It is actually the number of horizontal character cells times eight.  For an eight-pixel-wide font, this is equivalent to the horizontal resolution in pixels; but for a nine-pixel-wide font it is not.)  In the above example, 640/8 = 80 and 476/14 = 34.  Thus, this should be a mode for 80 text columns and 34 text rows.  Obviously, specify a font with a font height of 14 for this video mode.
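The sanity check just described can be sketched in a few lines of Python.  This is only an illustration: the function name check_grabmode_line is hypothetical, and the parsing assumes the layout of the sample output shown above (a quoted mode name, a clock value, four horizontal timings, four vertical timings, and then the "font" keyword).

```python
import re

def check_grabmode_line(line):
    """Parse one grabmode mode line and apply the sanity check.

    Returns (columns, rows, cell_height, consistent).
    Assumes the field layout of the sample output shown in this document.
    """
    name = re.search(r'"(\d+)x(\d+)"', line)
    font = re.search(r'font\s+(\d+)x(\d+)', line)
    if name is None or font is None:
        raise ValueError("no mode name or font field: not a hardware text mode line")
    cols, rows = int(name.group(1)), int(name.group(2))
    cell_height = int(font.group(2))      # the second number after "font"
    # Everything after the closing quote: clock, 4 horizontal, 4 vertical values.
    fields = line.split('"')[2].split()
    h_display = int(fields[1])            # "horizontal resolution"
    v_display = int(fields[5])            # vertical resolution in scan lines
    consistent = (h_display % 8 == 0 and h_display // 8 == cols
                  and v_display % cell_height == 0
                  and v_display // cell_height == rows)
    return cols, rows, cell_height, consistent

sample = ('"80x34"   28.343   640 680 776 800   476 491 493 525   '
          '-Hsync -Vsync  font 9x14  # 31.491kHz/59.98Hz')
print(check_grabmode_line(sample))        # (80, 34, 14, True)
```

Note that the horizontal value is always divided by eight, never by the character cell width, for the reason given above.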

Perhaps a specific example will help bring all the pieces together.  Suppose you don't run an X server (or your X server has a non-KMS-based driver); and you want a non-standard text video mode; but you're not sure which ones are available, which one you want, or what font size to use.  Here's what to do.  Under Squeeze (6.0) or a later release, edit /etc/default/console-setup.  Set the FONTFACE and the FONTSIZE variables to the null string, and comment out or delete any FONT variable specification.  This will prevent Linux from loading its own font during the boot process and will leave you with the font loaded by the video BIOS when the video mode was originally selected.  (Of course, this change does not take effect until the next boot.)  For example,
 

     ...
     # FONTFACE=VGA
     # FONTSIZE=14
     FONTFACE=""
     FONTSIZE=""
     ...
     # FONT="lat1u-16.psf.gz"
     ...

Save the changes and exit the editor.  Now shut down and reboot.  When you see LILO written to the screen
 

     LILO 22.8

press the Shift key before the delay time expires.  This will result in a "boot:" prompt.
 

     LILO 22.8 boot:

(I am assuming here that you used install=text in /etc/lilo.conf.)  Remind yourself of the names of your kernel labels by pressing the Tab key.  This will result in LILO displaying the names of your kernel labels and then displaying another "boot:" prompt, as follows:
 

     LILO 22.8 boot:
     Linux               LinuxOld
     boot:

At the "boot:" prompt, type the name of your default kernel followed by vga=ask and press Enter.
 

     LILO 22.8 boot:
     Linux               LinuxOld
     boot: Linux vga=ask

This will boot the default kernel with vga=ask specified.  This will result in the following display:
 

     LILO 22.8 boot:
     Linux               LinuxOld
     boot: Linux vga=ask
     Loading Linux
     BIOS data check successful
     Probing EDD (edd=off to disable)...ok
     Press <ENTER> to see video modes available, <SPACE> to continue, or wait 30 sec

The last two lines of output are from the kernel itself: all previous output is from the LILO boot loader.  At this point, press the Enter key within 30 seconds.  This will result in output something like this:
 

     LILO 22.8 boot:
     Linux               LinuxOld
     boot: Linux vga=ask
     Loading Linux
     BIOS data check successful
     Probing EDD (edd=off to disable)...ok
     Press <ENTER> to see video modes available, <SPACE> to continue, or wait 30 sec
     Mode: Resolution:  Type: Mode: Resolution:  Type: Mode: Resolution:  Type:
     0 F00   80x25      VGA   1 F01   80x50      VGA   2 F02   80x43      VGA
     3 F03   80x28      VGA   4 F05   80x30      VGA   5 F06   80x34      VGA
     6 F07   80x60      VGA   7 300  640x400x8   VESA  8 301  640x480x8   VESA
     9 303  800x600x8   VESA  a 305 1024x768x8   VESA  b 307 1280x1024x8  VESA
     c 308   80x60      VESA  d 309  132x25      VESA  e 30A  132x43      VESA
     f 30B  132x50      VESA  g 30C  132x60      VESA  h 30E  320x200x16  VESA
     i 30F  320x200x32  VESA  j 311  640x480x16  VESA  k 312  640x480x32  VESA
     l 314  800x600x16  VESA  m 315  800x600x32  VESA  n 317 1024x768x16  VESA
     o 318 1024x768x32  VESA  p 31A 1280x1024x16 VESA  q 330  320x200x8   VESA
     r 331  320x400x8   VESA  s 332  320x400x16  VESA  t 333  320x400x32  VESA
     u 334  320x240x8   VESA  v 335  320x240x16  VESA  w 336  320x240x32  VESA
     x 33D  640x400x16  VESA  y 33E  640x400x32  VESA  z 345 1600x1200x8  VESA
       346 1600x1200x16 VESA
     Enter a video mode or "scan" to scan for additional modes:

The kernel detected a video BIOS which purports to be VESA-compliant, so the kernel has at this point listed only standard VGA and VESA video modes.  The first thing you notice is that both text and graphics video modes are listed here.  The text modes are described by two numbers with an x between the numbers.  An example is 80x25.  The first number is the number of text columns and the second number is the number of text rows.  The graphics modes are described by three numbers with an x between adjacent numbers.  An example is 640x400x8.  The first number is the horizontal resolution in pixels, the second number is the vertical resolution in pixels, and the third number is the color depth (the number of bits used to represent the color of each pixel).  Both text and graphics modes are listed for completeness' sake, but we are only interested in the text modes.  The graphics modes should be ignored.
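The rule for telling text modes from graphics modes (two numbers versus three) is simple enough to express as code.  The following Python sketch is purely illustrative; the function name classify_resolution is my own invention, and the small menu dictionary holds just a few of the entries from the listing above.

```python
def classify_resolution(resolution):
    """Classify a resolution string from the vga=ask menu.

    Two numbers (e.g. "80x25") mean a text mode: columns x rows.
    Three numbers (e.g. "640x400x8") mean a graphics mode:
    width x height x color depth.
    """
    parts = tuple(int(p) for p in resolution.lower().split("x"))
    if len(parts) == 2:
        return ("text",) + parts
    if len(parts) == 3:
        return ("graphics",) + parts
    raise ValueError("unexpected resolution format: " + resolution)

# A few video mode numbers and resolutions taken from the listing above.
menu = {"F00": "80x25", "F01": "80x50", "300": "640x400x8",
        "30B": "132x50", "317": "1024x768x16"}

# Keep only the text modes; the graphics modes are ignored.
text_modes = {mode: res for mode, res in menu.items()
              if classify_resolution(res)[0] == "text"}
print(text_modes)   # {'F00': '80x25', 'F01': '80x50', '30B': '132x50'}
```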

Also notice that the first 36 modes listed have a menu item number or letter: 0-9, then a-z.  Any modes after the first 36 do not have a menu item number.  You should ignore the menu item number and concentrate only on the video mode number, which is a three-digit hexadecimal number.  All modes listed, whether they have a menu item number or not, will have a video mode number.  At this point you can choose a text mode that you would like to try, but your video BIOS may be capable of additional text modes not listed on the menu.  To find out, type "scan" and press Enter.
 

     LILO 22.8 boot:
     Linux               LinuxOld
     boot: Linux vga=ask
     Loading Linux
     BIOS data check successful
     Probing EDD (edd=off to disable)...ok
     Press <ENTER> to see video modes available, <SPACE> to continue, or wait 30 sec
     Mode: Resolution:  Type: Mode: Resolution:  Type: Mode: Resolution:  Type:
     0 F00   80x25      VGA   1 F01   80x50      VGA   2 F02   80x43      VGA
     3 F03   80x28      VGA   4 F05   80x30      VGA   5 F06   80x34      VGA
     6 F07   80x60      VGA   7 300  640x400x8   VESA  8 301  640x480x8   VESA
     9 303  800x600x8   VESA  a 305 1024x768x8   VESA  b 307 1280x1024x8  VESA
     c 308   80x60      VESA  d 309  132x25      VESA  e 30A  132x43      VESA
     f 30B  132x50      VESA  g 30C  132x60      VESA  h 30E  320x200x16  VESA
     i 30F  320x200x32  VESA  j 311  640x480x16  VESA  k 312  640x480x32  VESA
     l 314  800x600x16  VESA  m 315  800x600x32  VESA  n 317 1024x768x16  VESA
     o 318 1024x768x32  VESA  p 31A 1280x1024x16 VESA  q 330  320x200x8   VESA
     r 331  320x400x8   VESA  s 332  320x400x16  VESA  t 333  320x400x32  VESA
     u 334  320x240x8   VESA  v 335  320x240x16  VESA  w 336  320x240x32  VESA
     x 33D  640x400x16  VESA  y 33E  640x400x32  VESA  z 345 1600x1200x8  VESA
       346 1600x1200x16 VESA
     Enter a video mode or "scan" to scan for additional modes: scan

At this point the screen will clear, and your monitor will flash wildly for a while.  The kernel is calling the video BIOS repeatedly, trying all possible video modes, and checking the "Carry" flag for a "supported" (0) or "unsupported" (1) decision.  When finished, the kernel will display a new list of modes, with possibly some additional ones that weren't listed before.  Here is some possible output:
 

     Mode: Resolution:  Type: Mode: Resolution:  Type: Mode: Resolution:  Type:
     0 F00   80x25      VGA   1 F01   80x50      VGA   2 F02   80x43      VGA
     3 F03   80x28      VGA   4 F05   80x30      VGA   5 F06   80x34      VGA
     6 F07   80x60      VGA   7 300  640x400x8   VESA  8 301  640x480x8   VESA
     9 303  800x600x8   VESA  a 305 1024x768x8   VESA  b 307 1280x1024x8  VESA
     c 308   80x60      VESA  d 309  132x25      VESA  e 30A  132x43      VESA
     f 30B  132x50      VESA  g 30C  132x60      VESA  h 30E  320x200x16  VESA
     i 30F  320x200x32  VESA  j 311  640x480x16  VESA  k 312  640x480x32  VESA
     l 314  800x600x16  VESA  m 315  800x600x32  VESA  n 317 1024x768x16  VESA
     o 318 1024x768x32  VESA  p 31A 1280x1024x16 VESA  q 330  320x200x8   VESA
     r 331  320x400x8   VESA  s 332  320x400x16  VESA  t 333  320x400x32  VESA
     u 334  320x240x8   VESA  v 335  320x240x16  VESA  w 336  320x240x32  VESA
     x 33D  640x400x16  VESA  y 33E  640x400x32  VESA  z 345 1600x1200x8  VESA
       346 1600x1200x16 VESA    154  132x43      BIOS    155  132x25      BIOS
       164  132x60      BIOS    165  132x50      BIOS    168   80x60      BIOS
     Enter a video mode or "scan" to scan for additional modes:

In the above example, you can see that five additional video modes were discovered as the result of the scan.  At this point you must select a video mode.  Let's suppose that you are used to a 128x48 pseudo text mode on another computer which uses frame buffer text consoles.  (That would correspond to a 1024x768 pixel resolution and a 16-point font.)  And let's suppose that you want to choose a text mode on this computer that is as close to 128x48 as possible.  Looking over the choices, you find that 132x50 is the closest you can get.  You see that there are two video mode numbers that fit the bill: 30B and 165.  The first is a standard VESA mode, whereas the second is a non-standard mode supported by this particular video BIOS.  You decide to go with the VESA standard mode, so you type "30B" and press Enter.
 

     Mode: Resolution:  Type: Mode: Resolution:  Type: Mode: Resolution:  Type:
     0 F00   80x25      VGA   1 F01   80x50      VGA   2 F02   80x43      VGA
     3 F03   80x28      VGA   4 F05   80x30      VGA   5 F06   80x34      VGA
     6 F07   80x60      VGA   7 300  640x400x8   VESA  8 301  640x480x8   VESA
     9 303  800x600x8   VESA  a 305 1024x768x8   VESA  b 307 1280x1024x8  VESA
     c 308   80x60      VESA  d 309  132x25      VESA  e 30A  132x43      VESA
     f 30B  132x50      VESA  g 30C  132x60      VESA  h 30E  320x200x16  VESA
     i 30F  320x200x32  VESA  j 311  640x480x16  VESA  k 312  640x480x32  VESA
     l 314  800x600x16  VESA  m 315  800x600x32  VESA  n 317 1024x768x16  VESA
     o 318 1024x768x32  VESA  p 31A 1280x1024x16 VESA  q 330  320x200x8   VESA
     r 331  320x400x8   VESA  s 332  320x400x16  VESA  t 333  320x400x32  VESA
     u 334  320x240x8   VESA  v 335  320x240x16  VESA  w 336  320x240x32  VESA
     x 33D  640x400x16  VESA  y 33E  640x400x32  VESA  z 345 1600x1200x8  VESA
       346 1600x1200x16 VESA    154  132x43      BIOS    155  132x25      BIOS
       164  132x60      BIOS    165  132x50      BIOS    168   80x60      BIOS
     Enter a video mode or "scan" to scan for additional modes: 30B

The kernel now sets this video mode and continues the boot process.  The video BIOS supplies a font of the appropriate height when it sets the video mode; and since you have prevented Linux from loading its own font, the font provided by the video BIOS remains in effect.  After booting is complete, log in as root to a text console, such as vt1.  Now we need to find out what the correct font height is for this video mode.  For that, we run the grabmode utility.  Here is the output:
 

     ./grabmode: WARNING: Please be patient. This may take a while (up to 1 minute)
     "132x50"   40.034   1056 1112 1272 1304   400 413 415 449   -Hsync +Vsync  font 8x8   # 30.700kHz/68.37Hz

Looking at the "font" output from grabmode (8x8), we see that the font is eight pixels wide (the first number is 8) and eight pixels high (the second number is 8).  It is this second number which matters: the required font height is 8.  (Note that this is half the standard height of 16; so we can't expect this font, or any eight-pixel-high font, to have the same smooth look as the standard 16-pixel-high font.)  Doing our sanity check, we see that the horizontal resolution, 1056, divided by 8 yields 132, the number of text columns; and the vertical resolution, 400, divided by the font height, 8, yields 50, the number of text rows; so the data is consistent.  The remainder of both division operations is zero, as it must be.

If you want to try a different video mode, reboot and go through the process all over again.  If you are satisfied with the video mode, but you want to try out a Linux font, edit /etc/default/console-setup (if you are running Squeeze (6.0) or a later release) and specify something with the correct height.  For example,
 

     ...
     FONTFACE="VGA"
     FONTSIZE="8"
     ...

Save the changes and exit the editor.  Then try it out with
 

     setupcon -f

If you decide you don't like the Linux standard eight-point VGA font, you can try another font.  For example,
 

     ...
     FONTFACE=""
     FONTSIZE=""
     ...
     FONT="lat1u-08.psf.gz"
     ...

(Make sure that the appropriate font is installed in /usr/share/consolefonts.)  Then run "setupcon -f" again.  You may use any font you like, as long as it has the proper height. 

To make the video mode permanent, edit /etc/lilo.conf and specify the video mode.  For example,
 

     ...
     # vga=0x30b   132x50, 8-point font
     vga=0x30b
     ...

Save the changes and exit the editor.  Then run the "lilo" command with no operands.  From now on, the kernel will set this video mode on every boot (unless overridden at the LILO "boot:" prompt); and Linux will load the font which you specified in /etc/default/console-setup for Squeeze (6.0) and later releases.

The video BIOS in some older graphics cards has a bug in which the vertical display end value is not set correctly.  Older S3 cards and Cirrus Logic cards are known for this BIOS bug.  For example, the 80x34 video mode, vga=0xf06, is supposed to have a vertical display end value of 476, which is the product of the number of text rows, 34, and the font height, 14.  But if your video BIOS has the bug I just mentioned, the vertical display end value may be set to 480 instead of 476.  This results in four extra scan lines at the bottom of the screen that produce garbage.  If your video BIOS has this bug, you can get around it by adding 0x8000 to the video mode number.  This is the "adjust the vertical display end" bit.  So, for example, if you specify the video mode as vga=0x8f06, the kernel will set the 80x34 video mode by using BIOS Int 10h; then it will recalculate the number of vertical scan lines as the number of text rows times the font height and manually alter the corresponding VGA register to set this value.
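The arithmetic behind this workaround can be shown with a short Python sketch.  The names used here (ADJUST_VERTICAL_END, with_vertical_fix, and expected_vertical_display_end) are hypothetical and chosen only for illustration; the sketch merely reproduces the numbers from the paragraph above and does not touch any hardware.

```python
ADJUST_VERTICAL_END = 0x8000   # the "adjust the vertical display end" bit

def with_vertical_fix(mode):
    """Return the video mode number with the adjustment bit set."""
    return mode | ADJUST_VERTICAL_END

def expected_vertical_display_end(rows, font_height):
    """Scan lines the mode should display: text rows times font height."""
    return rows * font_height

print(hex(with_vertical_fix(0xF06)))             # 0x8f06
print(expected_vertical_display_end(34, 14))     # 476
print(480 - expected_vertical_display_end(34, 14))   # 4 extra scan lines
```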

Disk Ids

How does LILO identify disks?  The lilo command which is invoked from a Linux shell prompt identifies disks internally via the kernel composite device number.  (See the description of the root configuration file record for more information about kernel composite device numbers.)  But at boot time, the LILO boot loader itself can only reference disks by their BIOS device numbers.  A BIOS device number is a one-byte hexadecimal value.  By convention, a hard disk has a BIOS device number of 0x80 or higher, and the hard disk from which the BIOS loads an MBR is always assigned a BIOS device number of 0x80.  Other hard disks in the system get BIOS device numbers of 0x81, 0x82, etc.

The problem, of course, is for the lilo command to let the LILO boot loader know what BIOS device numbers to use in order to do I/O to the right disks.  Historically, in the absence of any other information, lilo simply guessed.  For example, in the old days, lilo would assume that /dev/hda was BIOS device number 0x80, /dev/hdb was BIOS device number 0x81, etc.  If this guesswork did not produce correct results, there was a way you could tell lilo what the correspondence was.  For example, you could code stuff like this in the /etc/lilo.conf configuration file:
 

     disk=/dev/hda bios=0x80
     disk=/dev/hdc bios=0x81

(Perhaps /dev/hdb is an IDE CD-ROM drive.)  To improve lilo's guesswork when this kind of information is not provided in the configuration file, newer versions of LILO make diagnostic BIOS calls during boot to obtain relevant BIOS information and save this information for later use by the lilo command.  This is known as the "BIOS data check" feature of LILO.  (The lilo command itself cannot make BIOS calls, since Linux runs in protected mode (i386) or 64-bit mode (amd64).)  This is all well and good as long as the system was booted by LILO; but if the system was booted by some other means, such as a rescue CD, this information is not available to the lilo command.

However, this historic method is no longer reliable (if indeed it ever was).  A newer BIOS recognizes multiple hard disk controllers and will allow designation of any disk on any controller as the hard disk from which to boot.  Since this disk is the boot disk, it will receive BIOS device number 0x80; other hard disks in the system will receive BIOS device numbers in some order (known only to the BIOS).  If you boot from a different disk tomorrow, the BIOS device numbers will be assigned in yet another manner.  Hence, on today's systems, the BIOS device numbers can be quite variable, a major departure from the past.  As has been previously discussed, modern Linux kernels do not necessarily assign the same disk to the same kernel composite device number on every boot either, which further complicates the problem.  So how does lilo solve this problem?

lilo solves this problem by using the disk id to identify hard disks.  The disk id is a four-byte field in the MBR at offset 0x1b8.  I referred to it earlier as the "disk signature".  DR-DOS, Windows NT, Windows 2000, and subsequent Microsoft operating systems use this field to identify disks too.  Beginning with version 22.5, lilo insists that every hard disk in the system have a unique disk id in this field.  If a disk id is missing for one or more hard disks, lilo will create a disk id for each such disk which does not conflict with any disk ids already in use by other disks in the system and update the disk.  (lilo's definition of "missing" varies, depending on the release of lilo.  Version 22.5 only considers 0x00000000 as missing.  Versions 22.5.1 and above also consider 0xffffffff to be missing.  Versions 22.5.4 and above consider any string of four identical bytes to be missing.)  If two different hard disks in the same system have the same disk id, lilo will issue an error message and will not update the boot code.  You will need to solve this problem somehow before you can proceed.  I'll cover that subject later.
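To make the field location and the "missing" rules concrete, here is a minimal Python sketch.  It is not lilo's actual code: the function names read_disk_id and is_missing are hypothetical, and the version thresholds are simply the ones listed in the preceding paragraph.  The demonstration uses a synthetic 512-byte sector rather than a real disk.

```python
DISK_ID_OFFSET = 0x1B8    # offset of the four-byte disk id in the MBR

def read_disk_id(mbr):
    """Return the four raw disk id bytes from a 512-byte MBR image."""
    if len(mbr) < 512:
        raise ValueError("an MBR is 512 bytes long")
    return bytes(mbr[DISK_ID_OFFSET:DISK_ID_OFFSET + 4])

def is_missing(disk_id, lilo_version):
    """Apply the version-dependent definition of a "missing" disk id.

    lilo_version is a tuple such as (22, 5), (22, 5, 1), or (22, 8).
    """
    if lilo_version >= (22, 5, 4):
        return len(set(disk_id)) == 1          # four identical bytes
    if lilo_version >= (22, 5, 1):
        return disk_id in (b"\x00" * 4, b"\xff" * 4)
    return disk_id == b"\x00" * 4              # version 22.5

# Synthetic sector whose disk id is four identical non-zero bytes.
sector = bytearray(512)
sector[DISK_ID_OFFSET:DISK_ID_OFFSET + 4] = b"\xaa\xaa\xaa\xaa"
disk_id = read_disk_id(sector)
print(is_missing(disk_id, (22, 5)))       # False
print(is_missing(disk_id, (22, 5, 4)))    # True
```

To examine a real disk, you would read the first 512 bytes of the device node as root and pass them to read_disk_id.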

Assuming that the disk ids are unique, the map installer creates a table that contains a correspondence between the kernel composite device numbers, BIOS device numbers, and disk ids for each hard disk that LILO may need to access at boot time.  This information is stored in the boot code.  The kernel composite device numbers are as of the current boot; and the BIOS device numbers are lilo's best guess, based on the techniques previously discussed.  The disk ids are extracted from each hard disk's MBR.  At boot time, LILO interrogates the BIOS and figures out, by comparing disk ids, what the actual BIOS device number currently is for each disk id in the table and uses that BIOS device number to do I/O to that disk during the current boot.  This solves the problem.
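The boot-time lookup can be pictured with the following Python sketch.  It only models the idea; it is not how the LILO boot loader is actually written (the real thing is real-mode assembler code), and the function name resolve_bios_numbers and all of the values are illustrative.  The fallback to the install-time guess, described at the end of this section, is included.

```python
def resolve_bios_numbers(install_table, bios_scan):
    """Decide which BIOS device number to use for each disk id.

    install_table: list of (guessed_bios_number, disk_id, kernel_device)
                   entries recorded when the lilo command was run.
    bios_scan:     dict mapping BIOS device number -> disk id, as found
                   by interrogating the BIOS during the current boot.
    Returns a dict mapping disk id -> BIOS device number to use now.
    """
    found = {disk_id: bios for bios, disk_id in bios_scan.items()}
    resolved = {}
    for guessed_bios, disk_id, _kernel_device in install_table:
        # Use the number the BIOS assigned on this boot; if the disk id
        # is not found, fall back to the guess made at install time.
        resolved[disk_id] = found.get(disk_id, guessed_bios)
    return resolved

install_table = [(0x80, 0xB21AB21A, 0x0300),
                 (0x81, 0xEBF5EB74, 0x0340),
                 (0x82, 0xEBF5EB7B, 0x1600)]

# Suppose the BIOS swapped the first two disks and the third id is not seen.
bios_scan = {0x80: 0xEBF5EB74, 0x81: 0xB21AB21A}

for disk_id, bios in resolve_bios_numbers(install_table, bios_scan).items():
    print(f"{disk_id:08X} -> {bios:02X}")
```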

If you want to see what the device table looks like, issue the command
 

     lilo -t -v2|tail

Because of the test option (-t), lilo will not update the boot code, but it will tell you what it would have done if it had updated the boot code.  Sample output from this command might look something like this:
 

     Mapped 6 (4+1+1) sectors
     Added Windows

      BIOS  Volume S/N  Device
       80    B21AB21A    0300
       81    EBF5EB74    0340
       82    EBF5EB7B    1600
       83    34225390    2100
       84    78711C09    0800
     The boot sector and the map file have *NOT* been altered.

The above example assumes that a kernel which uses the traditional IDE drivers is running, which explains the kernel composite device numbers of 0300 and 0340.
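The values in the Device column can be decoded with a couple of lines of Python.  This sketch assumes the classic 16-bit device number encoding (major number in the high byte, minor number in the low byte); the function name split_device_number is hypothetical.

```python
def split_device_number(dev):
    """Split a 16-bit composite device number into (major, minor)."""
    return (dev >> 8) & 0xFF, dev & 0xFF

# Device values from the sample output above.
for dev in (0x0300, 0x0340, 0x1600, 0x2100, 0x0800):
    major, minor = split_device_number(dev)
    print(f"{dev:04X}: major {major}, minor {minor}")
```

Major number 3 belongs to the first traditional IDE channel, so 0300 is /dev/hda and 0340 (minor 64) is /dev/hdb; major number 8 belongs to the sd driver, so 0800 is /dev/sda.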

Of course, this only works for hard disks.  When booting from a non-partitioned device, such as a floppy diskette, there is no MBR; therefore, there is no disk id.  When doing I/O to a floppy diskette, lilo's best guess as to the BIOS device number, based on the historical techniques described above, is used.  Historically, the first floppy drive in the system, /dev/fd0, is assumed to have a BIOS device number of 0x00; and the second floppy drive in the system, /dev/fd1, if it exists, is assumed to have a BIOS device number of 0x01.  If this is not the case, you may have to help lilo out by including BIOS device number information in the configuration file.  This is another use for the lilo diagnostic diskette: it can help you find the BIOS device number information, if you really need to know it.  Fortunately, BIOS device number information for floppy disks doesn't usually move around much, unlike hard disks.  Also, if at boot time LILO cannot find a BIOS device which has the same disk id as one in the table, LILO will fall back to using the BIOS device number from the table created at install time.

Common Problems and Their Solutions

MBR Program Can't Chain Load LILO from a VBR or EBR

If you install LILO to a VBR or the first EBR, and if your Windows MBR boot-loader program is very old, you may have a problem.  I think this problem is best illustrated by example.  I recently acquired an IBM ThinkPad X31.  I booted a Windows 95 rescue diskette, used the Windows 95 FDISK program to delete the partition for the Windows XP pre-installed system, created a small "DOS" primary partition, marked it active, saved the changes, exited FDISK, and rebooted (from the rescue diskette again).  After rebooting, I used "FDISK /MBR" to install the Windows 95 MBR boot-loader program to the MBR of the hard disk, replacing the one originally provided by Windows XP.  I then formatted the C: drive and proceeded to manually install my rescue diskette to the C: drive.  After I had a bootable Windows 95 rescue system installed on my C: drive, I removed the rescue diskette from the floppy drive and booted the Windows 95 rescue system from the hard drive.  I then proceeded to install the IBM ThinkPad Configuration Utility for DOS on it (PS2.EXE).  I then used PS2.EXE to configure my laptop hardware as I desired.

Once I finished with that, I was ready to install Debian.  During Debian Squeeze (6.0) installation, I created a second primary partition, which gets mounted in Linux as "/boot", and installed LILO to the VBR of this new partition.  The new partition was marked active during installation.  (And the old partition was marked inactive.)  Installation of Squeeze (6.0) proceeded without a hitch; but after installation, when I attempted to boot the system, I got the message
 

     Missing operating system

and a system halt.  What went wrong?  Is there a bug in LILO?  Did LILO get installed properly?  No, there is not a bug in LILO; and yes, LILO was installed properly.  The problem is that the Windows 95 MBR boot-loader program did not read the correct sector when it attempted to chain load LILO.  And since it read the wrong sector, and that sector did not have a valid boot signature in it, the Windows 95 MBR boot-loader program issued the above message and halted the machine.  So why did it read the wrong sector?

When a disk partitioning program, such as FDISK, creates a partition on a hard disk partitioned in the MS-DOS format, it records both a CHS value and an LBA value in the PT entry for the first sector of the partition.  The Windows 95 FDISK program uses the disk geometry reported by BIOS Int 13h function 08h to do the translation between CHS values and LBA values.  On this machine, the geometry so reported is 63 sectors/track and 240 tracks/cylinder.  The DOS partition starts at the beginning of the second track, which is LBA value 63 and CHS value 0:1:1 using this geometry.  The Windows 95 FDISK program placed these values in the PT entry for partition number 1 when I created the partition.  Why does the Windows 95 FDISK program use the geometry reported by BIOS Int 13h function 08h?  It does so because all parts of Windows 95 use BIOS Int 13h function 02h to read from the disk and BIOS Int 13h function 03h to write to the disk.  Both functions take CHS values as arguments, and both functions rely on the geometry reported by BIOS Int 13h function 08h.
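For readers who want to see where these values live, here is a Python sketch that decodes one 16-byte PT entry in the MS-DOS format.  The function name decode_pt_entry is hypothetical.  The sample entry uses the starting CHS and LBA values given above for partition number 1; the partition type, ending CHS value, and sector count are made-up values for illustration only.

```python
import struct

def decode_pt_entry(entry):
    """Decode one 16-byte partition table entry (MS-DOS format)."""
    if len(entry) != 16:
        raise ValueError("a PT entry is 16 bytes long")
    status, head, sec_cyl, cyl_low, ptype = struct.unpack_from("<BBBBB", entry, 0)
    lba_start, num_sectors = struct.unpack_from("<II", entry, 8)
    sector = sec_cyl & 0x3F                        # low six bits
    cylinder = ((sec_cyl & 0xC0) << 2) | cyl_low   # ten bits in all
    return {"active": status == 0x80,
            "type": ptype,
            "chs_start": (cylinder, head, sector),
            "lba_start": lba_start,
            "sectors": num_sectors}

# Active partition starting at CHS 0:1:1 and LBA 63.
# Type 0x06, ending CHS 15:239:63, and the size are hypothetical.
entry = bytes([0x80, 0x01, 0x01, 0x00, 0x06, 0xEF, 0x3F, 0x0F]) \
        + struct.pack("<II", 63, 241857)
print(decode_pt_entry(entry))
```

The four PT entries begin at offset 0x1be in the MBR, so entry number n (counting from 1) occupies the 16 bytes starting at 0x1be + 16 * (n - 1).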

In Linux, it's a different story.  Linux disk partitioning programs, such as parted, fdisk, etc., mainly care about the LBAs.  The CHS values aren't that important to Linux disk partitioning programs.  They record CHS values in the PT entry for DOS compatibility, but I'm really not sure where they get the disk geometry to make those calculations.  (I suspect that the geometry is obtained from information provided by the hard disk controller.)  However, I do know that they do not get the disk geometry from BIOS Int 13h function 08h.  The disk geometry used by Linux disk partitioning programs is often different from the disk geometry returned by BIOS Int 13h function 08h.  The LBA value that parted wrote to the PT entry for partition number 2 during Linux installation is correct.  But the CHS value was computed using the disk geometry that parted used, which was 63 sectors/track and 255 tracks/cylinder.

If the Windows 95 MBR boot-loader program read a VBR or EBR by its LBA value, it would read the correct sector.  If it converted the LBA value to a CHS value according to the disk geometry returned by BIOS Int 13h function 08h, then used this CHS value in the call to BIOS Int 13h function 02h, it would also read the correct sector.  (This assumes that the first sector of the partition is within the limits of a CHS value using the disk geometry returned by BIOS Int 13h function 08h.)  But it doesn't do either of those things.  Instead, it takes the CHS value from the PT entry for partition number 2 and passes that value directly to BIOS Int 13h function 02h to read the sector.  That algorithm doesn't work, since the CHS value in the PT entry was computed using a different disk geometry than BIOS Int 13h function 08h reports.  So it read the wrong sector, did not find a valid boot signature on it, issued an error message, and halted the machine.
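The effect of mixing two geometries can be demonstrated numerically.  The following Python sketch uses the standard CHS/LBA conversion formulas with the two geometries from this example (255 tracks/cylinder as used by parted, 240 tracks/cylinder as reported by BIOS Int 13h function 08h).  The starting LBA value of partition number 2, 48195, is a hypothetical value chosen for illustration; I did not record the real one.

```python
def chs_to_lba(cylinder, head, sector, heads, spt):
    """Convert a CHS value to an LBA value for the given geometry."""
    return (cylinder * heads + head) * spt + (sector - 1)

def lba_to_chs(lba, heads, spt):
    """Convert an LBA value to a CHS value for the given geometry."""
    cylinder, rest = divmod(lba, heads * spt)
    head, sector0 = divmod(rest, spt)
    return cylinder, head, sector0 + 1     # sectors are numbered from 1

SPT = 63
lba_in_pt = 48195                          # hypothetical start of partition 2

# parted computes the CHS value using 255 tracks/cylinder ...
chs_in_pt = lba_to_chs(lba_in_pt, heads=255, spt=SPT)
print(chs_in_pt)                           # (3, 0, 1)

# ... but the BIOS interprets it using 240 tracks/cylinder.
sector_read = chs_to_lba(*chs_in_pt, heads=240, spt=SPT)
print(sector_read, lba_in_pt)              # 45360 48195: the wrong sector

# Partition 1 at CHS 0:1:1 is LBA 63 under either geometry.
print(chs_to_lba(0, 1, 1, 240, SPT), chs_to_lba(0, 1, 1, 255, SPT))
```

This also shows why the first partition was never affected: a CHS value in cylinder 0 translates to the same LBA value no matter how many tracks per cylinder are assumed, as long as the number of sectors per track agrees.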

It could have been worse.  If the CHS value stored in the PT entry for partition number 2 had been invalid according to the disk geometry reported by BIOS Int 13h function 08h, some type of I/O error could have been reported.  For example, the sector number could have been greater than the number of sectors per track; or the head number could have been greater than the number of heads minus one; or the cylinder number could have been greater than the number of cylinders minus one.  And of course, even if parted and BIOS Int 13h function 08h agree on the number of sectors per track and the number of tracks per cylinder, if the partition starts beyond the maximum cylinder number (which can be no higher than 1,023), then the CHS value stored in the partition table entry is bogus anyway.

The fix is easy once you understand the problem.  You need to replace the MBR boot-loader program with one that uses LBA values instead of CHS values to access the disk.  If I had left the original MBR boot-loader program from Windows XP intact, I might have been all right.  But I didn't.  Fortunately, lilo provides such a program.  After booting my system via a rescue CD, I issued
 

     lilo -M /dev/sdb mbr

(The hard disk showed up as /dev/sdb during this particular boot: /dev/sda was my CD-ROM drive.)  This installed lilo's standard generic MBR boot-loader program to the MBR, which reads sectors by their LBA values.  This fixed the problem.  Of course, this defeats the purpose of leaving the MBR boot-loader program from Windows alone.  But then again, if you're using a version of Windows which is that old, you probably aren't going to be applying any maintenance to it.  Microsoft removed all support patches for Windows 95 from their web site long ago.  Of course, an MBR boot-loader program from any release of DOS is going to have the same problem.

Note that installing one of lilo's generic MBR boot-loader programs to the MBR via "lilo -M" and installing LILO's first-stage boot-loader program to the MBR are two completely different things.  "lilo -M" creates a generic MBR boot-loader program in the MBR.  It is intended as a replacement for the DOS/Windows MBR boot-loader program for just such situations as this.  It simply chain loads the first sector of whichever partition is marked active.  It cannot load Linux.  But it can chain load the LILO first-stage boot-loader program installed in the first sector of a partition.  And that can load Linux.

LILO's Installer Fails Due to "Corrupted" Partition Table Entry

When LILO's first-stage boot-loader program is going to be installed somewhere other than the MBR, lilo may perform some sanity checks on the corresponding PT entry before rewriting the VBR or EBR.  In particular, it will check to make sure that the CHS and LBA values recorded in the PT entry translate back and forth to each other.  The question is, which disk geometry will be used for the validation?  The geometry assumed by lilo may not match the geometry used by the partitioning program which created the partition.  If they don't match, you can get some very scary error messages that make you think that your PT is corrupted.  For example, you may see something like this:
 

     Warning: Int 0x13 function 8 and function 0x48
      return different head/sector geometries for BIOS drive 0x80
             fn 08: 992 cylinders, 128 heads, 63 sectors
             fn 48: 7936 cylinders, 16 heads, 63 sectors
     Warning: Device 0x0800: Inconsistent partition table, 3rd entry
        CHS address in PT: 609:0:1  -->  LBA (613872)
        LBA address in PT: 4910976  -->  CHS (4872:0:1)
     Fatal: Either FIX-TABLE or IGNORE-TABLE must be specified
     If not sure, first try IGNORE-TABLE (-P ignore)

This looks scary, but the PT is not really corrupted: lilo just picked the "wrong" disk geometry to use in the validation.  In the above example, lilo assumed a geometry of 63 sectors/track and 16 tracks/cylinder to perform the validation; and that is not the disk geometry that the disk partitioning program used when it created the partition.  To fix this problem, run the program used to partition the disk and list the disk's geometry.  For example:
 

     fdisk /dev/sda

     Command (m for help): p

     Disk /dev/sda: 4095 MB, 4095737856 bytes
     128 heads, 63 sectors/track, 993 cylinders
     .
     .
     .
     Command (m for help): q

Now specify this geometry in the /etc/lilo.conf file in the global section.  For example:
 

     .
     .
     .
     disk=/dev/disk/by-id/ata-IBM-DBCA-203240_HP0HPL43952
             sectors=63
             heads=128
             cylinders=993
     .
     .
     .

(Note that we used /dev/sda for convenience purposes when running fdisk, but in /etc/lilo.conf we use a udev-created symbolic link for it to insulate ourselves from device name changes based on which set of drivers is being used or in what order the disks are discovered.)  This will force lilo to use this disk geometry when validating the PT entry, and it will check out OK.  Note that this geometry may not match either of the two geometries listed in the lilo error messages.  That's OK.  The point is to match the geometry used by the disk partitioning program which created the partition in order to get past lilo's sanity checks.  It's the LBA value that counts.
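The numbers in the lilo messages above can be checked by hand. The sketch below uses only values that appear in those messages and in the fdisk listing; the function and variable names are mine. It shows that the CHS and LBA values in the PT entry disagree under the 16-head geometry that lilo assumed, but agree under the 128-head geometry that fdisk reports.

```python
def chs_to_lba(cylinder, head, sector, heads, sectors):
    """Convert (cylinder, head, sector) to an LBA for the given geometry."""
    return (cylinder * heads + head) * sectors + (sector - 1)


def lba_to_chs(lba, heads, sectors):
    """Convert an LBA to (cylinder, head, sector) for the given geometry."""
    cylinder, rest = divmod(lba, heads * sectors)
    head, sector0 = divmod(rest, sectors)
    return cylinder, head, sector0 + 1  # sectors are numbered from 1


# Values from the third PT entry, as quoted in the lilo messages.
PT_CHS = (609, 0, 1)
PT_LBA = 4910976

# Geometries as (heads, sectors per track).
LILO_GEOMETRY = (16, 63)    # the "fn 48" geometry lilo used for validation
FDISK_GEOMETRY = (128, 63)  # the geometry listed by fdisk

# Under lilo's assumed geometry the two values do not translate to each other.
print(chs_to_lba(*PT_CHS, *LILO_GEOMETRY))   # 613872
print(lba_to_chs(PT_LBA, *LILO_GEOMETRY))    # (4872, 0, 1)

# Under the partitioning program's geometry they match.
print(chs_to_lba(*PT_CHS, *FDISK_GEOMETRY))  # 4910976
print(lba_to_chs(PT_LBA, *FDISK_GEOMETRY))   # (609, 0, 1)
```

The first two results are exactly the values lilo printed in its warning, which confirms that the PT entry is fine and only the assumed geometry was wrong.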

Note: fdisk will list the disk's geometry, as seen by Linux, in response to the "p" command.  parted does not list the disk's geometry in response to the "print" command.  But since both programs obtain the disk geometry from the same source, the Linux operating system, it is OK to use fdisk to obtain the disk geometry, even if the partition was created by parted.

In some cases you may not be able to satisfy lilo's sanity checks.  For example, if your BIOS supports EDD packet addressing, you are using an MBR boot-loader program that exploits EDD packet addressing, and lilo is using the lba32 option, you can install LILO's first-stage boot loader in a VBR or the first EBR which is beyond the addressable limit of a CHS value in the geometry used by the program which created the partition.  In that case, specify the ignore-table option in the global section of the /etc/lilo.conf file.  This will cause lilo to bypass the test for the CHS value and the LBA value translating back and forth to each other.

Unfortunately, using the ignore-table option also causes lilo to bypass the test for partition types; and it is that test which prevents lilo from installing the first-stage boot loader in the VBR of a partition where it must not be installed, such as a Linux swap partition; so be extra careful.  Such errors may be caught later when lilo examines the actual VBR, but they will not be caught earlier when looking at the partition type entry in the PT.

USB Keyboard Doesn't Work

LILO uses the BIOS interface to the keyboard; so if you're using a keyboard which plugs into a USB port, as opposed to a keyboard which plugs into a traditional PS/2 keyboard port, make sure that your BIOS supports access to USB keyboards via BIOS calls.  Computers which come with on-board USB ports usually have a BIOS which supports this, especially if the computer doesn't even have a traditional PS/2 keyboard port; but the option is not necessarily enabled by default.  For example, one user has reported that his computer, which has an AMI BIOS, version 02.54, dated 04/22/2005, does not support BIOS access to USB keyboards by default.  The keyboard worked fine in Linux, once the kernel was booted; but he could not boot an alternate kernel using LILO because LILO did not "see" any keystrokes coming from the keyboard.  He solved the problem by entering the BIOS setup program and enabling the following option:
 

     Features Setup
        USB function for DOS

Once this option was enabled, LILO was able to "see" the keyboard; and the user was able to boot an alternate kernel (and otherwise use the LILO command line interface or menu).  Other BIOS setup programs may have a different name for the option.  You will have to figure out exactly how to enable BIOS support for USB keyboards in your particular BIOS if it is not enabled by default.

Duplicate Disk Ids

As mentioned before, starting with version 22.5, lilo insists that every hard disk in the system have a unique disk id in its MBR.  If two different disks in the system have the same disk id, lilo will issue an error message and will refuse to update the boot code.  So if you run into this situation, what do you do?  Suppose, for example, that lilo complains that /dev/sda and /dev/sdb have duplicate volume ids (disk ids).  The fix is a two-step process.  Step 1 is to zero out the disk id in one of the disks.  Normally, you want to zero out the disk id in the higher-numbered disk.  You can do this by issuing the command
 

     lilo -z -M /dev/sdb mbr

This will write out a generic MBR boot-loader program to the MBR of /dev/sdb and will also zero out its disk id.  Note that this will destroy whatever boot code was previously in the MBR of /dev/sdb!  You have been warned!  If you used to have grub installed there, it's gone now.  The second step is to run lilo with no operands.
 

     lilo

When lilo runs this time, it will find that /dev/sdb has a missing disk id and will assign one to it.  lilo will then update the boot code as usual.  If you don't want to replace the MBR boot code of /dev/sdb, then you will need to find some way of editing the disk id in /dev/sdb.  lilo itself does not provide such a mechanism.  Warning!  Altering the disk id of a Windows boot disk may make Windows unbootable.  Maybe the first question to ask is, do you really need both of these disks in your system at the same time?

One way to zero out the disk id for a disk without replacing the boot code is to use fdisk.  Here is a sample fdisk session to illustrate how it's done:
 

     # fdisk /dev/sdb

     Command (m for help): m
     Command action
        a   toggle a bootable flag
        b   edit bsd disklabel
        c   toggle the dos compatibility flag
        d   delete a partition
        l   list known partition types
        m   print this menu
        n   add a new partition
        o   create a new empty DOS partition table
        p   print the partition table
        q   quit without saving changes
        s   create a new empty Sun disklabel
        t   change a partition's system id
        u   change display/entry units
        v   verify the partition table
        w   write table to disk and exit
        x   extra functionality (experts only)

     Command (m for help): x

     Expert command (m for help): m
     Command action
        b   move beginning of data in a partition
        c   change number of cylinders
        d   print the raw data in the partition table
        e   list extended partitions
        f   fix partition order
        g   create an IRIX (SGI) partition table
        h   change number of heads
        i   change the disk identifier
        m   print this menu
        p   print the partition table
        q   quit without saving changes
        r   return to main menu
        s   change number of sectors/track
        v   verify the partition table
        w   write table to disk and exit

     Expert command (m for help): i
     New disk identifier (current 0x0c170c16): 0
     Disk identifier: 0x00000000

     Expert command (m for help): w
     The partition table has been altered!

     Calling ioctl() to re-read partition table.

     WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
     The kernel still uses the old table. The new table will be used at
     the next reboot or after you run partprobe(8) or kpartx(8)
     Syncing disks.
     #

For this type of change, you can ignore the error message about the re-read of the partition table failing, and a reboot is not necessary.  Now, simply run the lilo command without operands.  lilo will detect that this disk has a missing disk id and will create a new one for it which does not conflict with any existing disk.
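If you want to see which disks share a disk id before changing anything, a small script can help. This is a sketch under a stated assumption: that the disk id which lilo and fdisk refer to is the four-byte little-endian value at offset 0x1b8 (440) of the MBR. The helper names are hypothetical, and the sample sectors are synthetic so that the example runs without touching a real disk. A disk id of zero is treated as "missing", as described above.

```python
import struct
from collections import defaultdict

DISK_ID_OFFSET = 0x1B8  # assumed location of the 32-bit disk id in the MBR
SECTOR_SIZE = 512


def read_disk_id(mbr):
    """Extract the 32-bit disk id from the bytes of an MBR sector."""
    if len(mbr) < SECTOR_SIZE:
        raise ValueError("an MBR sector is 512 bytes long")
    (disk_id,) = struct.unpack_from("<I", mbr, DISK_ID_OFFSET)
    return disk_id


def duplicate_disk_ids(mbrs):
    """Map each nonzero disk id shared by several disks to their names."""
    names_by_id = defaultdict(list)
    for name, sector in mbrs.items():
        disk_id = read_disk_id(sector)
        if disk_id != 0:  # zero means the disk id is missing
            names_by_id[disk_id].append(name)
    return {i: names for i, names in names_by_id.items() if len(names) > 1}


def make_mbr(disk_id):
    """Build a synthetic MBR sector carrying the given disk id."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into("<I", sector, DISK_ID_OFFSET, disk_id)
    sector[510:512] = b"\x55\xaa"  # boot signature
    return bytes(sector)


# Synthetic sample: two disks with the same id, one with a zeroed id.
sample = {
    "/dev/sda": make_mbr(0x0C170C16),
    "/dev/sdb": make_mbr(0x0C170C16),
    "/dev/sdc": make_mbr(0),
}
for disk_id, names in duplicate_disk_ids(sample).items():
    print(f"0x{disk_id:08x}: {', '.join(names)}")
```

To try it on real disks, one would read the first 512 bytes of each device as root and pass those in place of the synthetic sectors. The script only reads; it does not change any disk id.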

Only Part of the Word "LILO" Written During Boot (And Other Boot Errors)

LILO's greatest strength is also its biggest weakness: it does not understand the structure of any Linux file system.  How is that a strength?  LILO's boot-loader installer, also referred to as the map installer, creates a list of blocks that the boot-loader program needs to read.  During boot, the boot-loader program itself simply needs to read a predetermined list of blocks into memory.  This is a very simple algorithm and makes the boot-loader program itself small and fast.  That is a strength.  That is what allows the first-stage boot-loader program to fit in a single 512-byte sector and not use any unallocated sectors.  But it is also a weakness.  It is a weakness because that list of blocks will become obsolete if any block of the kernel image file or initial RAM file system image file changes its physical location on the hard disk.  Therefore, any time the kernel is updated or the corresponding initial RAM file system image gets rebuilt, the boot-loader installer (the lilo command) must be rerun to rebuild this list of blocks.

The system tries to rerun the boot-loader installer when needed, but that doesn't always happen.  Under Squeeze (6.0) and later releases, the needed hook scripts may not have been properly installed, or they may not have been marked executable with "chmod +x ...".  It is critical to have your kernel installation environment set up properly when using LILO.
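One quick way to look for hook scripts that are not marked executable is sketched below. The directory names in HOOK_DIRECTORIES are my assumption about where such hook scripts usually live on a Debian system; adjust them to match your own setup. The function simply lists regular files that have no execute permission bit set at all, and it skips directories that do not exist.

```python
import os
import stat

# Assumed hook script locations; change these to suit your system.
HOOK_DIRECTORIES = [
    "/etc/kernel/postinst.d",
    "/etc/kernel/postrm.d",
    "/etc/initramfs/post-update.d",
]


def non_executable_files(directory):
    """Return the regular files in a directory with no execute bit set."""
    if not os.path.isdir(directory):
        return []
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        mode = os.stat(path).st_mode
        any_execute = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
        if stat.S_ISREG(mode) and not mode & any_execute:
            found.append(name)
    return found


for directory in HOOK_DIRECTORIES:
    for name in non_executable_files(directory):
        print(f"not executable: {os.path.join(directory, name)}")
```

An empty result does not prove that the hooks are complete; it only rules out the missing execute bit as the cause.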

You can also get yourself into trouble if you monkey around with certain files that you shouldn't touch.  For example, if you copy the kernel image file or the initial RAM file system image file to another file name, erase the original, then rename the copy back to the original name, the kernel image file (or initial RAM file system image file) is back in the /boot directory where it belongs; but it now occupies different physical blocks than it did when the boot-loader installer was last run.  Your kernel may continue to boot successfully for a while, as long as the old blocks still contain their former contents.  But these blocks are now unallocated blocks that are part of the file system.  Eventually these blocks will be allocated for another file, and their contents will change.  And then you will have major problems, which by then may be far removed in time from the original cause.
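The delayed failure described above can be shown with a toy model. This is not how lilo is implemented; it is only a sketch in which a dictionary stands in for the disk, and the block numbers and contents are made up. The "boot loader" reads a saved list of block numbers and knows nothing about files.

```python
def build_map(file_blocks):
    """Record a file's physical block numbers, as a map installer would."""
    return list(file_blocks)


def load(disk, block_map):
    """Read blocks by number, with no knowledge of the file system."""
    return b"".join(disk[number] for number in block_map)


# Toy disk: block number -> contents.  The kernel occupies blocks 10 and 11.
disk = {10: b"KERN", 11: b"EL-1"}
kernel_blocks = [10, 11]
block_map = build_map(kernel_blocks)  # the installer runs here

# Copy the kernel, erase the original, rename the copy back.
# The file now lives in blocks 20 and 21; blocks 10 and 11 are free.
disk[20], disk[21] = disk[10], disk[11]
kernel_blocks = [20, 21]

# The stale map still works, because the freed blocks are unchanged.
still_boots = load(disk, block_map)

# Later, the file system reuses block 10 for some other file.
disk[10] = b"junk"
broken = load(disk, block_map)

# Rerunning the installer rebuilds the list and cures the problem.
block_map = build_map(kernel_blocks)
fixed = load(disk, block_map)

print(still_boots, broken, fixed)
```

The stale list keeps working until block 10 is reused, which is why the failure can show up long after the file was moved.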

Another file you should never touch is the map file (/boot/map).  This file contains the second-stage boot-loader program as well as the list of blocks discussed above.  LILO reads it the same way: as a list of physical block numbers created when the boot-loader installer was last run.  You must not do anything which will change the physical block numbers in which it is allocated.  If you do move or rewrite the kernel image file, the initial RAM file system image file, or the map file, be sure to rerun the boot-loader installer so that the list of physical blocks is updated.  Otherwise, you will have boot problems.  If you actually encounter the boot problems, it is too late for preventive medicine.  To cure the problem, boot your system from a rescue CD (or memory stick, or whatever) and rerun the boot-loader installer (the lilo command).  Then shut down and reboot.  If you are ever in doubt about whether the lilo command needs to be run or not, run it just to be safe.  It can't hurt anything to run it.  There's no such thing as running it too many times.  On the other hand, don't run it if you know you don't need to run it.

Conclusion

Happy booting!  If anyone has any comments, suggestions, complaints, corrections, or any other form of feedback, please drop me a line at zlinuxman@wowway.com.
