Assessing risk with the Core Infrastructure Initiative
The Linux Foundation's Core Infrastructure Initiative (CII) exists to "fortify" critical open-source software projects with funding, code reviews, and other kinds of support, with a particular eye toward shoring up those packages to prevent serious security crises. CII was formed in response to the memorable "Heartbleed" vulnerability discovered in OpenSSL, which was the first adopted project. Recently, CII unveiled its Census Project, a semi-automated ranking of open-source projects by security risk. The numbers make for some interesting reading—although the conclusions subsequently drawn by the CII can be puzzling.
The Census Project was announced on July 9, at which time the CII presented the results of its project-analysis work. There is a multi-page HTML table on the Census Project page, as well as a white paper [PDF] co-authored with the Institute for Defense Analyses (IDA) that goes into detail on the methods and metrics considered and used. The upshot is that each project examined in the census is assigned an integer score on a scale from 0 to 16, with higher numbers indicating the greatest risk that the project could be the source of an undiscovered security hole. The peculiar aspect to the story, however, is that the CII appears to have amassed a list of high-risk projects that has little to do with the results of the Census Project.
Scoring projects
The process used to determine the scores, though, did not involve any inspection of the code itself—just a look at project "metadata" of various flavors. As described on the web site, the Census Project counts eight factors when compiling its scores. How these factors are measured requires a more detailed examination (below), but the list itself is short:
- The number of CVEs filed (worth from 0 to 3 points)
- The project's contributor count over the past 12 months (2 to 5 points)
- The project's ranking in the Debian popularity list (point value unspecified)
- Whether or not the project has a known web site (0 or 1 point)
- Whether or not the package is exposed to the network (0 or 2 points)
- Whether or not the package processes network data (0 or 1 point)
- Whether or not the package could be used for local privilege escalation (0 or 1 point)
- Whether or not the project includes an executable or only provides data (0 or -3 points)
The number of points assigned for popularity in Debian is not specified. The other factors, however, only add up to a maximum score of 13, so perhaps popularity is a 0-to-3 score; it would also appear that a high popularity ranking corresponds to more "risk" points. In addition, CII's Emily Ratliff noted that only CVEs filed since 2010 were counted.
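Given the factor list and the point arithmetic above, the additive formula can be sketched in a few lines of Python. The census's actual scoring code is not published in this form, so the popularity range (assumed 0-to-3 here, per the reasoning above) and the direction of each yes/no factor are assumptions, not the project's real implementation:

```python
# Hypothetical sketch of the Census Project's additive risk score.
# Point ranges follow the published factor list; the Debian-popularity
# range (0-3) and the sign of each boolean factor are assumptions.

def risk_score(cve_points, contributor_points, popularity_points,
               has_website, network_exposed, processes_net_data,
               priv_escalation, data_only):
    """Sum per-factor points into a risk score (higher = riskier)."""
    assert 0 <= cve_points <= 3           # CVEs filed since 2010
    assert 2 <= contributor_points <= 5   # fewer contributors -> more points
    assert 0 <= popularity_points <= 3    # assumed range (unspecified)
    score = cve_points + contributor_points + popularity_points
    score += 0 if has_website else 1      # no known web site adds a point
    score += 2 if network_exposed else 0  # exposed to the network
    score += 1 if processes_net_data else 0
    score += 1 if priv_escalation else 0  # possible local privilege escalation
    score += -3 if data_only else 0       # data-only packages lose 3 points
    return score

# Worst case on every factor reaches the 16-point ceiling:
print(risk_score(3, 5, 3, False, True, True, True, False))  # 16
```

Under these assumptions, the maximum of 16 is only reachable if popularity does indeed contribute up to 3 points, which is consistent with the 0-to-16 scale described in the white paper.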
Individual pages for each project assessment provide a bit more detail (see, for example, the page for tcpd), noting which language the program is implemented in, so other factors may be part of the scoring formula. Ultimately, of course, the score is the product of a human assessment of the project, as the CII web site makes plain. While some of the input data is harvested from Debian and from Black Duck's OpenHub, other factors clearly involve some qualitative judgment—such as whether or not a package could be used for local privilege escalation—and the white paper mentions that the speed with which CVEs are fixed played a role in the rankings.
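To illustrate the kind of metadata harvesting involved, here is a small, hypothetical parser for the tabular "by_inst" report published by Debian's popularity-contest. The census's real scraping code lives in its GitHub repository; this stand-alone sketch is not derived from it, and the sample data below is made up:

```python
# Hypothetical parser for Debian popcon "by_inst" report lines, which
# rank packages by installation count. Illustration only; the Census
# Project's actual scraping code is in its GitHub repository.

def popcon_ranks(report_text):
    """Map package name -> popularity rank from a by_inst report."""
    ranks = {}
    for line in report_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        fields = line.split()
        rank, package = int(fields[0]), fields[1]
        ranks[package] = rank
    return ranks

# Fabricated sample in the by_inst column layout:
sample = """#rank name       inst   vote   old  recent no-files
1     dpkg       195000 180000 9000 6000   0
2     tar        194500 179000 9500 6000   0
"""
print(popcon_ranks(sample))  # {'dpkg': 1, 'tar': 2}
```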
Of the packages assessed so far, the first big cliff in the scoring occurs between the packages scoring 9 or above and those scoring 8 or below. This top-scoring class of packages includes the following:
| Package | Score |
|---|---|
| tcpd | 11 |
| whois | 11 |
| ftp | 11 |
| netcat-traditional | 11 |
| at | 10 |
| libwrap0 | 10 |
| traceroute | 10 |
| xauth | 10 |
| bzip2 | 9 |
| hostname | 9 |
| libacl1 | 9 |
| libaudit0 | 9 |
| libbz2-1.0 | 9 |
| libept1.4.12 | 9 |
| libreadline6 | 9 |
| libtasn1-3 | 9 |
| linux-base | 9 |
| telnet | 9 |
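Tallying the listed packages by score makes the shape of that cliff easy to see; a trivial sketch using the scores as given:

```python
# Count how many of the top-scoring census packages share each score.
from collections import Counter

scores = {
    "tcpd": 11, "whois": 11, "ftp": 11, "netcat-traditional": 11,
    "at": 10, "libwrap0": 10, "traceroute": 10, "xauth": 10,
    "bzip2": 9, "hostname": 9, "libacl1": 9, "libaudit0": 9,
    "libbz2-1.0": 9, "libept1.4.12": 9, "libreadline6": 9,
    "libtasn1-3": 9, "linux-base": 9, "telnet": 9,
}

by_score = Counter(scores.values())
for score in sorted(by_score, reverse=True):
    print(score, by_score[score])
# 11 4
# 10 4
# 9 10
```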
Regrettably, the raw numbers that make up each package's score do not appear to be available. It would have been interesting to see the exact point values assigned for number of contributors, for example. It is also not entirely clear how some of the factors are scored—does "could be used for local privilege escalation" mean simply "is installed setuid," for example? The project has a GitHub repository where some of the data-scraping code can be inspected, but the CII site and white paper both indicate that human assessment of the data plays a major role in the final process (starting with cleaning up the "noisy" raw data).
Beyond scores
In the end, though, the oddest thing about the scoring is that these raw scores do not indicate which projects CII will invest in. The white paper, after a lengthy (60-page) explanation of the methodologies employed, comes up with a different set of human-selected "riskiest" projects based on the authors' "knowledge of how the programs are used" and on which projects "appear to be relatively unmaintained". The human-identified project list includes: xauth, bzip2, libaudit0, libtasn1-3, bind9, exim4, isc-dhcp, gnutls26, gpgme, openldap, pam, openssl, net-tools, openssh, rsyslog, wget, apr-util, coolkey, ntp, gnupg, gzip, expat, freetype, libgcrypt11, keyutils, xz-utils, p11-kit, pcre3, cyrus-sasl2, libxml2, shadow, tar, zlib, apr, libjpeg8, libpng, libressl, unzip, giflib, mod-gnutls, postfix, and cryptsetup.
This list contains little that is surprising. The projects highlighted are those that must deal with untrusted network connections, those that are responsible for processing potentially malicious data file formats, and those that are responsible for enforcing security measures for the system as a whole or for application programs. This may seem a bit anticlimactic, since it varies little from the list that any security-conscious user might come up with on their own.
Nevertheless, it is good to see someone attempt systematic analysis to reach a conclusion about the riskiness of common programs. The troubling factor is that, so far, the analysis only underscores common sense. The larger question is what CII intends to do with this information. The first few CII-supported projects (ntpd, GnuPG, Frama-C, OpenSSL, OpenSSH, Debian's reproducible builds, and The Fuzzing Project) were selected before there was a formal process in place.
The Census Project is a first step toward assembling such a process. Still, the web page makes a point of saying that "the decision to fund a project in need is not automated by any means." The white paper concludes by saying only that CII participants "believe the next step is to further investigate these OSS projects for security and project healthiness."
Interestingly enough, outsiders are invited to participate in the CII's project-identification process by contributing patches or suggestions to the Census Project code on GitHub or by writing to one of the CII mailing lists. Thus far, two other projects have been suggested for consideration on the cii-census list (the archives of which are visible only to subscribers): the Trinity fuzz tester and the PaX patch set. Both suggestions were referred to the CII steering committee, which includes one representative each from the supporting companies: Amazon Web Services, Adobe, Bloomberg, Cisco, Dell, Facebook, Fujitsu, Google, Hitachi, HP, Huawei, IBM, Intel, Microsoft, NetApp, NEC, Qualcomm, RackSpace, Salesforce.com, and VMware.
The CII itself is still finding its footing. Apart from the Census Project, multiple pages on the site invite projects interested in funding to contact the CII with a grant request, but they point only to a generic contact page; no formal application process has yet been defined. Time will tell how CII goes about selecting which projects to support from among the high-risk prospects. Hopefully, much of that selection process will take place in the open. As this census shows, there is no shortage of important projects in need of additional support; transparency in determining which ones merit support is as important as the ability to study the resulting improvements to the source code.