Free software's not-so-eXZellent adventure
The technical details of the attack are fascinating in many ways. See the companion article for more about that aspect of things.
For those needing a short overview: the XZ package performs data compression; it is widely distributed and used in many different places. Somebody known as "Jia Tan" managed to obtain maintainer-level access to this project and used that access to insert a cleverly concealed backdoor that, when the XZ library was loaded into an OpenSSH server process, would provide an attacker with the ability to run arbitrary code on the affected system. This code had found its way into some testing distributions and the openSUSE Tumbleweed rolling distribution before being discovered by Andres Freund. (See this page for a detailed timeline.)
The hostile code was quickly removed and, while it is too soon to be sure, it appears that it was caught before it could be exploited. Had that discovery not happened, the malicious code could have found its way onto vast numbers of systems. The consequences of a success at that level are hard to imagine and hard to overstate.
Social engineering
Like so many important projects, XZ was for years the responsibility of a single maintainer (Lasse Collin) who was keeping the code going on his own time. That led to a slow development pace at times, and patches sent by others often languished. That is, unfortunately, not an unusual situation in our community.
In May 2022, Collin was subjected to extensive criticism in this email thread (and others) for failing to respond quickly enough to patches. That, too, again unfortunately, is not uncommon in our community. Looking back now, though, the conversation takes on an even more sinister light; the accounts used to bully the maintainer are widely thought to have been sock puppets, created for this purpose and abandoned thereafter. In retrospect, the clear intent was to pressure Collin into accepting another maintainer into the project.
During this conversation, Collin mentioned (more than once) that he had been receiving off-list help from Tan, and that Tan might be given a bigger role in the future. That, of course, did happen, with devastating effects. Tan obtained the ability to push code into the repository, and subsequently abused that power to add the backdoor over an extended period of time. As well as adding the backdoor, Tan modified the posted security policy in an attempt to contain the disclosure of any vulnerabilities found in the code, changed the build system to silently disable the Landlock security module, redirected reports from the OSS-Fuzz effort, and more.
Once the malicious code became part of an XZ release, Tan took the campaign to distributors in an attempt to get them to ship the compromised versions as quickly as possible. There was also a series of patches submitted to the kernel that named Tan as a maintainer of the in-kernel XZ code. The patches otherwise look innocuous on their face, but do seem intended to encourage users to update to a malicious version of XZ more quickly. These patches made it as far as linux-next, but never landed in the mainline.
Much has been made of the fact that, by having an overworked and uncompensated maintainer, XZ was especially vulnerable to this type of attack. That may be true, and support for maintainers is a huge problem in general, but it is not the whole story here. Even paid and unstressed maintainers are happy to welcome help from outsiders. The ability to take contributions from — and give responsibility to — people we have never met from a distant part of the world is one of the strengths of our development model, after all. An attacker who is willing to play a long game has a good chance of reaching a position of trust in many important projects.
This whole episode is likely to make it harder for maintainers to trust helpful outsiders, even those they have worked with for years. To an extent, that may be necessary, but it is also counterproductive (we want our maintainers to get more help) and sad. Our community is built on trust, and that trust has proved to be warranted almost all of the time. If we cannot trust our collaborators, we will be less productive and have less fun.
Closing the door
As might be expected, the Internet is full of ideas about how this attack could have been prevented or detected. Some are more helpful than others.
There have been numerous comments about excess complexity in our systems. They have a point, but few people add complexity for no reason at all; features go into software because somebody needs them. This is also true of patches applied by distributors, which were a part of the complex web this attack was built on. Distributors, as a general rule, would rather carry fewer patches than more, and don't patch the software they ship without a reason. So, while both complexity and downstream patching should be examined and avoided when possible, they are a part of our world that is not going away.
In the end, the specific software components that were targeted in this attack are only so relevant. Had that vector not been available, the attacker would have chosen another. The simple truth is that there are many vectors to choose from.
There is certainly no lack of technical and process questions that should be asked with regard to this attack. What are the dependencies pulled in by critical software, do they make sense, and can they be reduced? Why do projects ship tarballs with contents that are not found in their source repository, and why do distributors build from those tarballs? How can we better incorporate the ongoing reproducible-builds work to catch subtle changes? Should testing infrastructure be somehow separated from code that is built for deployment? What are the best practices around the management of binary objects belonging to a project? Why are we still using ancient build systems that almost nobody understands? How can we get better review of the sorts of code that makes eyes glaze over? What is the proper embargo policy for a problem like this one?
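One of those questions, the mismatch between release tarballs and source repositories, lends itself to an automated check. Here is a minimal, self-contained sketch of the idea; every name in it (the toy repository, the tag, the extra m4 file) is hypothetical, and real release processes legitimately generate some files at release time, so a discrepancy is a prompt for review rather than proof of wrongdoing:

```shell
#!/bin/sh
# Hypothetical sketch: compare a release tarball's contents against the
# project's git tag, so that any tarball-only files stand out for review.
# All names here are illustrative; none come from the XZ project itself.
set -e
work=$(mktemp -d)

# Stand-in "upstream" repository with one tagged release.
git init -q "$work/project"
cd "$work/project"
echo 'int main(void) { return 0; }' > main.c
git add main.c
git -c user.name=demo -c user.email=demo@example.com commit -qm 'release 1.0'
git tag v1.0

# The tarball as shipped -- here with an extra build file that is NOT in
# the repository, the kind of discrepancy that deserves a closer look.
mkdir -p "$work/dist/tarball-1.0"
cp main.c "$work/dist/tarball-1.0/"
echo 'AC_INIT(extra)' > "$work/dist/tarball-1.0/build-extra.m4"

# Export the tagged tree and compare; diff reports any tarball-only files.
git archive --prefix=tag-1.0/ v1.0 | tar -x -C "$work/dist"
diff -r "$work/dist/tag-1.0" "$work/dist/tarball-1.0" || echo 'tarball differs from tag'
```

A distributor building from the verified tag export, rather than from the tarball, would sidestep tarball-only content entirely; where generated files are genuinely needed, the diff at least makes them visible.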
These are all useful questions that need to be discussed in depth; hopefully, some useful conclusions will come from them. But it is important to not get hung up on the details of this specific attack; the next one is likely to look different. And that next attack may well already be underway.
On the unhelpful side, the suggestion from the OpenSSF that part of the problem was the lack of an "OpenSSF Best Practices badge" is unlikely to do much to prevent the next attack. (Note that the organization seems to have figured that out; see the Wayback Machine for the original version of the post). The action by GitHub to block access to the XZ repository — after the horse had long since left the barn, moved to a new state, and raised a family — served only to inhibit the analysis of the problem while protecting nobody.
The source
This attack was carried out in a careful and patient fashion over the course of at least two years; it is not a case of a script kiddie getting lucky. Somebody — or a group of somebodies — took the time to identify a way to compromise a large number of systems, develop a complex and stealthy exploit, and carry out an extensive social-engineering campaign to get that exploit into the XZ repository and, from there, into shipping distributions. All of this was done while making a minimum of mistakes until near the end.
It seems clear that considerable resources were dedicated to this effort. Speculating on where those resources came from is exactly that — speculation. But to speculate that this may have been a state-supported effort does not seem to be going out too far on any sort of limb. There are, undoubtedly, many agencies that would have liked to obtain this kind of access to Linux systems. We may never know whether one of them was behind this attempt.
Another thing to keep in mind is that an attack of this sophistication is unlikely to limit itself to a single compromise vector in a single package. The chances are high that other tentacles, using different identities and different approaches, exist. So, while looking at what could have been done to detect and prevent this attack at an earlier stage is a valuable exercise, it also risks distracting us from attacks, future or already underway, that do not follow the same playbook.
Over the years, there have not been many attempts to insert a backdoor of this nature — at least, that we have detected — into core free-software projects. One possible explanation for that is that our software has been sufficiently porous that a suitably skilled attacker could always find an existing vulnerability to take advantage of without risking the sort of exposure that is now happening around the XZ attack. An intriguing (and possibly entirely wishful) thought is that the long and ongoing efforts to harden our systems, move to better languages, and generally handle security issues in a better way have made that avenue harder, at least some of the time, pushing some attackers into risky, multi-year backdoor attempts.
Finally
In the numerous discussions sparked by this attack, one can readily find two seemingly opposing points of view:
- The XZ episode shows the strength of the free-software community. Through our diligence and testing, we detected a sophisticated attack (probably) before it was able to achieve its objectives, analyzed it, and disabled it. (Example).
- The world was saved from a massive security disaster only by dint of incredible luck. That such an attack could get so far is a demonstration of the weakness of our community; given our longstanding sustainability problems, an attack like this was inevitable at some point. (Example).
In the end, both of those points of view can be valid at the same time. Yes, there was a massive bit of luck involved in the detection of this attack, and yes, the support for maintainers (and many other contributors) is not what it needs to be. But, to paraphrase Louis Pasteur, chance favors the prepared development community. We have skilled and curious developers who will look into anomalous behavior, and we have a software environment that facilitates that sort of investigation. We never have to accept that a given software system just behaves strangely; we have, instead, gone to great lengths to ensure that it is possible to dig in and figure out why that behavior is happening — and to fix it. That is, indeed, a major strength; that, along with luck and heroic work, is what saved us.
We must hope that we can improve our game enough that our strengths will save us the next time as well. There is a good chance that our community, made up of people who just want to get something done and most of whom are not security experts, has just been attacked by a powerful adversary with extensive capabilities and resources. We are not equipped for that kind of fight. But, then, neither is the proprietary world. Our community has muddled through a lot of challenges to get this far; we may yet muddle through this one as well.
| Index entries for this article | |
|---|---|
| Security | Backdoors |
| Security | Vulnerabilities/Social engineering |