Toward a fully reproducible Debian
The three developers of the title are part of Lamb's sharpened message, each being an example of the problem that reproducible builds aim to solve. Alice, a system administrator who contributes to a Linux distribution, builds her binaries on servers that, unknown to her, have been compromised; her binaries are Trojan horses, carrying malicious content into the systems that run them. Bob, a privacy-oriented developer, makes a privacy-preserving browser, but is being blackmailed into secretly including vulnerabilities in the binaries he provides. Carol, the third developer of the title, is a free-software user whose laptop is being attacked by an evil maid called Eve; each time Carol shares free software with her friends, it has been pre-compromised by Eve. All of these attacks hurt free-software users and the reputation of free software as a whole.
Worse, the mere existence of these classes of attack is a disincentive to share software. People like Alice may reason that, if their servers turn out to be compromised, they will be blamed for the malicious software they have unwittingly distributed, and that the potential opprobrium may not be justified by the fleeting gratitude they currently get for their unpaid work. Others who have servers, skills, and time they might once have volunteered to help build free software may similarly decline to paint a target upon themselves. In Lamb's words, they may say "I'm not going to do this free software lark. I'm going to go for a walk instead". Participation in the community is reduced, as is the trust we all place in the binaries we install.
Building everything from sources that one has hand-inspected is a solution to this, but it doesn't scale. Many of us aren't qualified to spot security weaknesses (Lamb's specific example was the one-line patch that crippled Debian's ability to generate unpredictable cryptographic keys, discovered in 2008), and in any case you still need to get that initial compiler from somewhere. Many users, even those who love free software and wish to use it in preference to proprietary software, will continue to install binaries. I will be one of them.
But if compilation from a given set of sources in a given environment always resulted in binaries that were bit-for-bit identical to each other, having confidence in the integrity of your binaries would be a much easier proposition, since you could compare your own binary copy with those of a suitable number of others. You could ring a friend and compare checksums, or you could perhaps participate in a distributed checksum validation scheme comparable to the old Perspectives system for distributed validation of SSL certificates. Many strategies for increasing confidence would be possible, but only if the build is reproducible. That is what Debian has been striving for, and why.
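As a trivial illustration of the kind of comparison Lamb has in mind, here is a minimal Python sketch that checksums a locally rebuilt package against a copy fetched from a mirror; the file names are hypothetical, and in practice you would want to compare against several independent rebuilders rather than a single source.

```python
import hashlib

def sha256sum(path):
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical file names: a package rebuilt locally versus the binary
# the distribution shipped.
mine = sha256sum("hello_2.10-2_amd64.deb")
theirs = sha256sum("hello_2.10-2_amd64.mirror.deb")
print("match" if mine == theirs else "MISMATCH: investigate before installing")
```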
There are other advantages to reproducible builds. For developers, they mean that successive generations of a binary should change only in proportion to the source changes made between them; the only changes you should see in your binaries are the ones you intended to be there. They also help cut down on unnecessary build dependencies: you can remove a dependency and rebuild, and if the binary hasn't changed, that dependency wasn't needed and can be dropped. In some cases, reproducibility can even help find bugs: Lamb referred to a build that had been made non-reproducible by a 15-digit random number that was generated during each build and baked into the resulting binary. That number turned out to be used as an OpenID secret, which meant that everyone running a given build of the software was using the same secret key.
Clearly, reproducible builds are a good thing, but it turns out they aren't trivial. As we reported earlier, many build systems put timestamps inside binaries, which is an obvious problem. But some go further and include user, group, and umask information, and sometimes environment variables, which are also a problem. Build paths are often rolled in, for example in C++ assertions. File ordering can be an issue, because Unix doesn't specify an order in which readdir() and listdir() should return the contents of a directory, so components can get built in an unpredictable order. If these components are packed into a binary in the order they're returned, each build will be different even if no other change has been made.
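The usual cure for the ordering problem is simply to impose an order yourself. Here is a minimal, hypothetical Python sketch of packing a directory deterministically; it is not anything Debian ships, but clamping timestamps and ownership in the same pass is the same idea behind the project's SOURCE_DATE_EPOCH convention.

```python
import os
import tarfile

def deterministic_tar(src_dir, out_path):
    """Pack src_dir so that repeated runs produce bit-identical archives."""
    def scrub(info):
        # Clamp metadata that would otherwise leak the build environment.
        info.mtime = 0                  # or honor SOURCE_DATE_EPOCH
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    with tarfile.open(out_path, "w") as tar:
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()                 # os.walk() order is filesystem-dependent...
            for name in sorted(files):  # ...so sort everything explicitly
                path = os.path.join(root, name)
                tar.add(path, arcname=os.path.relpath(path, src_dir),
                        filter=scrub)
```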
Similar problems exist with dictionary key ordering: for example, a build that iterates over the keys of a Perl hash will have problems, since these elements are also returned in a variable order. Parallelism in a build process can also make the build order non-deterministic, because different elements can build at different speeds at different times.
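Python has the same trap: when hash randomization is enabled, the iteration order of a set of strings can differ from one run to the next, much as Perl hash keys do. A small, made-up sketch of the problem and the usual defensive fix:

```python
import json

plugins = {"auth", "cache", "logging"}   # set iteration order can vary between
                                         # runs, much like a Perl hash

unstable = ",".join(plugins)             # non-reproducible: whatever order the
                                         # set happens to yield this run
stable = ",".join(sorted(plugins))       # reproducible: order imposed explicitly

# The same defensive habit applies when serializing mappings:
metadata = {"arch": "amd64", "compiler": "gcc-7"}
manifest = json.dumps(metadata, sort_keys=True)
```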
Debian's approach to this has been the development of the torture test. Everything is built twice, an A build and a B build, and between the two builds as much as possible is varied. The clock on the B build server is 18 months ahead of the clock on the A server; their hostnames and domain names are different. The reproducible build team developed a FUSE filesystem called disorderfs, which tries to be as non-deterministic as a working filesystem can be. They vary the time zone, locale, UID, GID, and kernel, all to try to determine to a high degree of accuracy whether a given build is reproducible.
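A much-simplified sketch of the A/B idea, assuming a hypothetical ./build.sh that writes output.bin; the real torture test also skews the clock, changes host and domain names, swaps kernels, and runs on disorderfs, which a handful of environment variables cannot capture:

```python
import hashlib
import os
import subprocess

def build_and_hash(workdir, env_overrides):
    """Run a (hypothetical) build script in workdir and hash its output."""
    env = dict(os.environ, **env_overrides)
    subprocess.run(["./build.sh"], cwd=workdir, env=env, check=True)
    with open(os.path.join(workdir, "output.bin"), "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# The A build runs in the unmodified environment; the B build varies
# whatever this sketch can reach.
a = build_and_hash("build-a", {})
b = build_and_hash("build-b", {"TZ": "Pacific/Kiritimati",
                               "LANG": "fr_CH.UTF-8",
                               "LC_ALL": "fr_CH.UTF-8"})
print("reproducible" if a == b else "not reproducible under these variations")
```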
When this work started in 2013, 24% of the software in Debian would build reproducibly. As we reported earlier, by 2015 it was up to about 75%, and as of March 2018, said Lamb, 93% of the packages in Debian 10 (buster) built reproducibly on amd64. But, while the proportion has been steadily increasing, the increase hasn't been monotonic. Lamb's graph of reproducibility vs. time since late 2014 showed a couple of big backslides. These, he said, tended to correlate with new variations introduced to the torture test. A sharp drop in reproducibility in late 2016, for example, marked the introduction of variable build paths. The issues that this exposed were dealt with over the next four months; this work included a patch to GCC.
Meanwhile, the idea is spreading; various distribution and build-system projects, including coreboot, Fedora, LEDE, OpenWRT, NetBSD, FreeBSD, Arch Linux, Qubes, F-Droid, NixOS, Guix, and Meson, have all joined the reproducible builds project. Three reproducible-builds summits have been held, and more are anticipated. Good tools other than disorderfs have come out of the project, such as the .buildinfo file we covered earlier. We have also covered diffoscope, which can now interpret about sixty different types of content, from Android APKs to xz-compressed files. As happens with any good tool, people are starting to find other uses for it. Lamb said that he found it particularly helpful for verifying that security patches didn't touch any more of a binary than he expected them to; he also noted its utility in comparing binary blobs, such as an old and a new router firmware image.
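For readers who want to experiment, diffoscope can be driven from the command line or from a script. A minimal sketch, assuming diffoscope is installed and using placeholder file names; it exits with zero when its inputs match and non-zero when it finds differences (or hits an error):

```python
import subprocess

# Ask diffoscope to compare two images and write a plain-text report.
result = subprocess.run(
    ["diffoscope", "--text", "report.txt",
     "firmware-old.bin", "firmware-new.bin"],
    check=False)
print("no differences found" if result.returncode == 0
      else "differences found (or error); see report.txt")
```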
An honest look at limitations is a good thing. In response to a later question about whether diffoscope could identify "effectively reproducible" builds that differ in only trivial ways, Lamb declined to come up with a definition of "sufficiently reproducible". For him, the reproducibility test is exact, bit-for-bit identity; diffoscope is a tool to help diagnose failures to meet that standard, not an opportunity to lower the bar. Reproducible builds themselves are no panacea. They do nothing to help find backdoors or other vulnerabilities in the source; if your Git repository has been compromised, reproducible builds won't save you. Similarly, they do nothing to find or fix programming errors, weak algorithm choices, or "testing" modes in the style of Volkswagen.
Further improvement is possible. User interfaces for handling the installation of software that cannot be built reproducibly could do with a lot of improvement over Debian's current offerings. Toolchains continue to need fixes: the GCC patch referred to earlier has not yet been accepted upstream, and a whole bunch of OCaml packages aren't reproducible. The advantage of fixing these at the toolchain level is that you fix a given issue across half a million packages at once instead of having to modify each individual package's build to work around it. Lamb noted that help from OCaml and R experts would be particularly valuable right now.
He hopes that the next release of Debian will be 100% reproducible, and noted that progress can be seen at isdebianreproducibleyet.com. He mentioned that at some point a policy change to Debian might be considered, such that software that wouldn't build reproducibly wouldn't be accepted for inclusion. In response to my question, he said that 93% is to his mind too early for such a change, but that once reproducible builds get to 98-99% he'd become much more supportive of it.
In the tricky middle ground of 95-96%, his position would depend on why builds were non-reproducible, as there are a few valid reasons for this to happen. In response to another question, he said that two good reasons for a non-reproducible build were packages, such as Emacs, that build inside their own virtual machine, and security-related packages, such as those for secure boot, that involve signing keys. The former, he thinks, can probably be solved with enough work; the latter can't be fixed, but there are only a couple of them, and an exception could be made.
I'm a big fan of tools that let me manage my own security. I prefer a YubiKey to an RSA token because I can generate and load my own secrets. This reproducible-builds work cheers me up because it allows me to take a hand in the security of my vendor-supplied binaries rather than simply trusting my vendor's build and signing infrastructure. Hopefully, Debian will reach a point where it mandates reproducible builds, and others will soon follow suit.
[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting my
travel to the event.]
| Index entries for this article | |
|---|---|
| Security | Deterministic builds |
| Security | Distribution security |
| GuestArticles | Yates, Tom |
| Conference | FLOSS UK/2018 |