1. Introduction

1.1. Preamble

This document is dedicated to those crazy "Do It Yourself" Linux enthusiasts who would rather build their own GNU/Linux system from source code than run a precompiled binary distro.

DIY Linux is not for the faint of heart. There are many drawbacks, and not everyone will have the ability or desire to do it. You don't need to be an experienced programmer, but you must have above-average knowledge of software building principles. If all you want is a GNU/Linux system that "just works", then you would be far better off with a precompiled binary distro prepared by professionals. Nevertheless, if you have a legitimate reason to build your own GNU/Linux system completely from source, have sufficient technical skill, are a bit of a control freak, or are the kind of person who simply must be in the driver's seat when it comes to your Linux system, you'll find a high-quality build recipe here.

The intended audience is the more technically adept Linux user. Folks new to the world of DIY Linux should first head over to the Linux From Scratch (LFS) project to learn the basics.

1.2. Goals

This "reference" build is intended to be a solid foundation for building custom GNU/Linux systems. It's only a Reference Build so feel free to add or remove packages and customize it in any way you see fit. However, please keep in mind that the build recipe and package versions presented here have been tested and are known to work, so if you deviate you'll be on your own. Of course you're on your own anyway as this is, after all, a "DIY" project.

The goals can be summarized as follows:

  • Build a reasonably up-to-date and usable base GNU/Linux system from source.

  • The term "usable" implies use of Package Management (PM), because a system without PM can be frustratingly difficult to maintain. Additionally, if PM is used to create binary packages, it's possible to take advantage of the "build once, deploy anywhere" philosophy thus simplifying maintenance and administration of multiple systems. Therefore this build recipe strives to accommodate general Package Management principles. You'll find the overall build method and individual build commands aiming to be "PM friendly" towards a number of existing Package Managers (see below).

  • The build method used to bootstrap the system must be robust and work from a virgin default development install of every major Linux distro released within the last 5 years. Upgrading tools on the host defeats the purpose and therefore does not qualify. The method must also work when coming the other way, i.e. from a host running the latest cutting-edge distro.

  • The base system must be able to build a wide range of commonly used open source software.

  • The base system must be self-hosting, i.e. able to rebuild itself reproducibly, with verification provided by a binary comparison technique known as Iterative Comparison Analysis (ICA).

  • The build will use the latest stable, released, pristine upstream sources with as few patches as possible and with build commands that are as minimal and efficient as possible (within reason). So-called "beta" or "unstable" releases will be used only in exceptional circumstances (see below).

  • New technologies will not be adopted into the Reference Build before they are ready.

1.3. Build Methods

The basic theory of operation involves building the target system natively inside a chroot environment. To this end, two separate build phases are utilized. The goal of the first phase is to build a self-hosted temporary toolchain which is installed into a non-standard prefix. The second phase is then conducted inside a chroot environment whereby the result of the first phase is used as a platform for building the final system.

There are currently two variations of the build method presented here. The first is the so-called "Next Generation" method, which leverages cross compilation for the initial "Pass 1" toolchain. The second is the original method, which is now deprecated and no longer recommended. The new method has two main advantages: (1) there is much less reliance on the host system, and (2) it is much better equipped to handle modern "multi-arch" platforms like x86_64. For example, when targeting x86_64 it doesn't matter whether the host system is 32-bit or 64-bit or a combination of both (userland and/or kernel), or whether you want a pure 64-bit system or a bi-arch (multilib) system. The Next Generation build method caters for all these possibilities.

1.4. Architectures

The following architectures are supported by the Reference Build: i386 (x86), ppc (PowerPC) and x86_64 (AMD64). We don't have access to hardware for other architectures, so we are unable to test or support them (this includes 64-bit PowerPC). Please note that the PowerPC and AMD64 builds appear to function perfectly, but they are not as well tested as the x86 build.

Important

The Reference Build assumes a mostly native build environment. However, it should be noted that the Next Generation build method employs cross compilation for the initial pass 1 toolchain only. This has many advantages. For example, if you are running a 32-bit x86 system on AMD64 hardware and want to build a 64-bit environment (single or multi-arch), the cross toolchain can be used to cross compile a 64-bit kernel at the appropriate point in the procedure. The original build method does not support bi-arch and in the case of pure x86_64, it required the host machine to already be running a 64-bit kernel and have a 64-bit userland available.

1.5. Automation

It is assumed that readers will be scripting their own builds. In order to make the script writer's life easier, we have structured this document in such a way as to allow direct extraction of (a) each package's build commands ("scriptlets") and (b) overall package information ("packagedata"). You can find a tarball containing these files right here. These scriptlets (and packagedata) are extremely handy for integration into your own build scripts.

A sample implementation of this "Document Centric Automation" is the author's own build scripts which you can find here. It's worth noting that the author runs a full test build using these scripts before any changes are made to the published version of this document. This goes a long way towards ensuring that the build commands as printed here are of high quality and free of typographical errors.

1.6. Build Notes

In addition to the goals listed above, the aim is to produce an NPTL-enabled system based on:

  • Latest 2.6.X kernel.

  • Latest GCC releases. (Note: Multiple versions of all major Toolchain components are supported. Please see the important notes below for details.)

  • Latest Glibc releases.

  • Latest Binutils releases. (FSF or HJL contingent on prevailing toolchain climate - Please see Section B, “Which Binutils?” for some rationale discussion.)

Important - Latest not necessarily the greatest!

Glibc development moves quite rapidly these days. While this is very much welcomed, there is a downside in that the release schedules of other toolchain components can sometimes get out of sync. This is a problem because history has shown that you cannot easily mix and match versions of toolchain components. In particular, Glibc compilation is very sensitive to the GCC version. For example, you cannot expect to compile a recent Glibc with a GCC from the 2.95.X era. It simply doesn't work. In our experience the most stable and trouble-free combinations of toolchain components are those of the same or similar vintage. Another point to consider is that every new major GCC release is stricter than the last. This means that once you get beyond the base Reference Build you will almost certainly come across packages that fail to compile with the latest GCC. For these reasons we have decided to support multiple major versions of toolchain components. The DIY build recipe has been carefully crafted to remain "out of the box" compatible with multiple GCC, Glibc and Binutils versions as per the table below. Simply set the corresponding environment variable for each toolchain component (detailed in Section 2.1, “Environment”) to build with your chosen versions.

Another important factor in building toolchains is which set of kernel headers to use. The Linux-Libc-Headers package (LLH) worked well for a number of years, but it has now been superseded by a new method known as "make headers_install" (HDRS_INST), which was introduced in linux-2.6.18. HDRS_INST is now recommended because it provides a consistent and up-to-date set of headers that have the backing of kernel developers, system library developers and distributors alike.
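As a rough illustration of the HDRS_INST method, the sketch below wraps the kernel's `headers_install' target in a small shell function. The function name `install_kernel_headers', its two arguments and the `MAKE' override are assumptions made for this example only and are not taken from the Reference Build commands. `INSTALL_HDR_PATH' is the kernel Makefile variable that selects where the sanitized headers are placed.

```shell
#!/bin/sh
# Sketch only: install_kernel_headers is a hypothetical helper name.
# It asks an unpacked kernel tree (linux-2.6.18 or later) to export its
# sanitized userspace headers into a staging directory.
install_kernel_headers() {
    kernel_src="$1"   # path to the unpacked kernel source tree
    hdr_dest="$2"     # staging directory; headers land in $hdr_dest/include
    # MAKE may be overridden by the caller; it defaults to plain "make".
    "${MAKE:-make}" -C "$kernel_src" headers_install INSTALL_HDR_PATH="$hdr_dest"
}

# Example call (paths are placeholders):
#   install_kernel_headers /path/to/linux-source /path/to/staging
```

The headers are staged rather than written directly into a live /usr/include so that they can be inspected or packaged before being copied into place.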

The table below shows some recommended toolchain combinations. Other combinations might work but do not receive any testing, and in some cases require patches (not provided here). If you decide to mix and match versions, you need to be aware of the risks associated with doing so. For example, occasionally you'll see warnings from upstream developers like this and this.

Glibc   GCC     Binutils     Headers     Comments
2.3.6   3.4.6   2.16.1 FSF   LLH         A classic. Rock solid, but starting to show its age.
2.6.1   4.2.4   2.18 FSF     HDRS_INST   Very stable. Was the previous default.
2.7     4.2.4   2.18 FSF     HDRS_INST   An excellent match. Current Reference Build default.
2.7     4.3.3   2.18 FSF     HDRS_INST   Seems to be stable. No known major problems.
2.8     4.3.3   2.18 FSF     HDRS_INST   Appears OK. But still quite new so tread carefully.

This table shows toolchain combinations that were once recommended but are no longer tested. In all likelihood they still work fine in the current Reference Build context.

Glibc   GCC     Binutils     Headers
2.3.6   4.0.4   2.16.1 FSF   LLH
2.4     4.1.2   2.17 FSF     LLH
2.5.1   4.1.2   2.17 FSF     HDRS_INST

Udev is not yet included in the Reference Build. It'll go in once (a) there is unanimous Udev buy-in from all the major distros shipping 2.6 kernels, which apparently hasn't happened yet, and (b) Udev development settles down to a point where upgrading to the latest version doesn't break the system.

Bootscripts are not included in the Reference Build. Bootscripts tend to be a matter of personal taste so folks are encouraged to write their own or plug in an existing 3rd party package.

In the beginning we started out with something resembling the then current LFS. We then developed modifications with a view towards achieving the goals as listed above. Some other points worth noting:

  • The equivalents of LFS Chapters 5 & 6 are named the "Temptools phase" and the "Chroot phase". Better names might possibly be "Bootstrap phase" and "Sysroot phase", but confusion could arise as those terms are already used in GCC parlance.

  • The Temptools phase is essentially all about bootstrapping a new toolchain. Therefore it doesn't have to be perfect and it doesn't demand that test suites be run. As long as it can build a verifiably clean system (from a reproducibility point of view) inside the Chroot phase it has done its job.

  • The pass 1 builds of Binutils and GCC are no longer statically linked. Static linking is a diversion from the real goal of the Temptools phase and is a potential source of failure when what is needed is robustness.

  • We have implemented use of `config.site' to make the build commands more efficient. Every package that uses an Autoconf-produced configure script can take advantage of this, e.g. by specifying `--prefix=/usr' only once in config.site instead of repeating it for every package.

  • We have dropped `make install' for many of the Temptools phase packages to reduce cruft. A simple `cp -v <prog> ${TT_PFX}/bin' will often do the trick.

  • We have added the optional capability for parallel building on SMP.

  • We have added the optional capability to build without debugging information via some careful use of CFLAGS and LDFLAGS. Stripping the binaries will achieve a similar result but generating all that debugging info in the first place is likely to slow down the build (think disk I/O).

  • We have added `-v' to our invocations of the core file utilities to enhance the level of system feedback while building.

  • We are compiling Glibc against sanitized kernel headers, i.e. the `linux-libc-headers' package. Compiling Glibc against raw kernel headers is always an option, but it goes against the Glibc developers' general advice of recent times to use distro-supplied sanitized headers. As of kernel-2.6, sanitized kernel headers are the only realistic option for installation into /usr/include because the raw headers are pretty much unusable when installed there.

  • The host MUST be running a 2.6.x kernel for the build of an NPTL-enabled Glibc to succeed. If you're bootstrapping the build on an old distro, you can usually get away with a procedure that involves building a temporary non-modular 2.6.x kernel then booting it with a grub boot floppy. See here for details. FIXME - must document this procedure properly.
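The `config.site' and parallel build points in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Reference Build's actual setup: the helper names `write_config_site' and `job_count' and the file location are hypothetical. What is standard is that Autoconf-generated configure scripts read the file named by the `CONFIG_SITE' environment variable, and that `prefix' holds the value `NONE' when no `--prefix' option was given.

```shell
#!/bin/sh
# Sketch only: write_config_site and job_count are hypothetical helpers.

# Write a minimal config.site that supplies a default prefix.
write_config_site() {
    site_file="$1"     # where to write the site file
    site_prefix="$2"   # default prefix to apply, e.g. /usr
    cat > "$site_file" <<EOF
# Read by Autoconf-generated configure scripts via \$CONFIG_SITE.
# Only set the prefix when none was given on the command line.
test "\$prefix" = NONE && prefix=$site_prefix
EOF
}

# Pick a parallel job count; fall back to 1 if nproc is unavailable.
job_count() {
    nproc 2>/dev/null || echo 1
}

# Example use before running any configure scripts:
#   write_config_site "$HOME/config.site" /usr
#   export CONFIG_SITE="$HOME/config.site"
#   export MAKEFLAGS="-j$(job_count)"
```

Guarding on `NONE' means an explicit `--prefix' on an individual package's configure command line still overrides the site default.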

1.7. Prerequisites

FIXME mention stuff like gawk for recent glibc, bison for bash, texinfo for HJL binutils etc. Also point out how one can easily build "pass 0" versions of these tools if getting them installed on the host is impossible FIXME

1.8. Security

FIXME shoot down the hysteria surrounding security patches. It's horses for courses. You want security? Then FFS run a distro which provides security updates. Mention how utterly idiotic it is to build a system yourself then place it in a security vulnerable situation (ie: on the internet) without having the faintest clue about security then expecting someone else to be responsible for patches. FIXME

1.9. Kernel

FIXME explain the issues touched upon here. FIXME

1.10. Package Management

There are many and varied techniques for implementing Package Management (PM) strategies. This LFS page has a reasonable writeup of the various PM styles that exist today. We've geared this Reference Build towards those Package Managers that operate on the basis of diverting `make install' to a temporary location for the purpose of creating an archive of the installed files. These Package Managers typically use the commonly available Makefile variable `DESTDIR' to achieve the diversion. Some examples of Package Managers in this category are RPM (Red Hat/Fedora etc), Dpkg (Debian etc), Portage (Gentoo), Pkgtool (Slackware), BPM (Bent Linux), Pacman (Arch Linux) and Pkgutils (Crux). Please note, we don't insist that you employ PM when using this Reference Build. All the build commands will still work correctly if you choose to perform a "build in place" style installation à la LFS. If you do decide to employ PM, feel free to choose any Package Manager from the supported category that meets your needs, because the Chroot phase installation commands are generic and should work with all of them. In our experience Package Managers with simple spec file formats are the easiest to integrate into a PM based Reference Build. You could even write your own simple Package Manager if you felt that way inclined.
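To make the `DESTDIR' diversion concrete, here is a minimal sketch of what Package Managers in this category do behind the scenes. It is not any particular Package Manager's implementation. The function name `stage_and_archive', its arguments and the `MAKE' override are assumptions for illustration, and the package's Makefile is assumed to honour `DESTDIR'.

```shell
#!/bin/sh
# Sketch only: stage_and_archive is a hypothetical helper name.
# Run it from the top of an already-built source tree. Both arguments
# should be absolute paths.
stage_and_archive() {
    stage_dir="$1"   # empty staging directory that receives the files
    archive="$2"     # gzip-compressed tarball to create from the staged files
    mkdir -p "$stage_dir" &&
    # Divert the install: files land under $stage_dir instead of /.
    "${MAKE:-make}" DESTDIR="$stage_dir" install &&
    # Archive the staged tree with paths relative to its root.
    tar -C "$stage_dir" -czf "$archive" .
}

# Example call (paths are placeholders):
#   stage_and_archive /tmp/stage/foo /tmp/pkgs/foo.tar.gz
```

A real Package Manager adds metadata, file lists and dependency information on top of this, but the staging step is essentially the same.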

In typical "build in place" style installations the entire Chroot phase is performed as root. However, in most source building scenarios it's generally considered best practice to build packages as a non-privileged user, and only when it's time for `make install' should you need to become root. Therefore in an ideal world involving PM, you would build and create archives as a non-privileged user, then install the archives with a Package Manager as root. We've set up this Reference Build in such a way as to try and achieve that ideal world. Be aware that creating package archives as a non-privileged user can sometimes result in wrong file ownerships and permissions on files inside the built archives. A potential solution to this problem is to make use of the "Fakeroot" package which we've included as an optional extra. Please note, we don't insist that you build packages in the Chroot phase as a non-privileged user when using this Reference Build. Building as root still works correctly.
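The ownership problem described above can be approached along the following lines. This is a hedged sketch rather than the Reference Build's own procedure: the wrapper name `run_as_fake_root' is hypothetical, and it simply runs a command under `fakeroot' when that program is installed and runs it unwrapped otherwise.

```shell
#!/bin/sh
# Sketch only: run_as_fake_root is a hypothetical helper name.
# Under fakeroot, files created by the wrapped command appear to be
# owned by root, so archives built inside it record root ownership.
run_as_fake_root() {
    if command -v fakeroot >/dev/null 2>&1; then
        fakeroot -- "$@"
    else
        echo "fakeroot not found; running unwrapped" >&2
        "$@"
    fi
}

# Example: install and archive inside ONE fakeroot session so that the
# faked ownership is still in effect when tar reads the files.
#   run_as_fake_root sh -c \
#       'make DESTDIR="$1" install && tar -C "$1" -czf "$2" .' \
#       sh /tmp/stage/foo /tmp/pkgs/foo.tar.gz
```

Keeping the install and the archive step in the same session matters because the faked ownership is not stored on disk and is lost when the fakeroot process exits.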