Appropriate sources of entropy
A steady stream of random events allows the kernel to keep its entropy pool stocked up, which in turn allows processes to use the strongest random numbers that Linux can provide. Exactly which events qualify as random—and just how much randomness they provide—is sometimes difficult to decide. A recent move to eliminate a source of contributions to the entropy pool has worried some, especially in the embedded community.
The kernel samples unpredictable events for use in generating random numbers, storing that data in the entropy pool. Entropy is a measure of the unpredictability or randomness of a data set, so the kernel estimates the amount of entropy each of those events contributes to the pool. Many kernels run on hardware that lacks some of the traditional sources of entropy. In those cases, the timing of interrupts from network devices has been used as a source of entropy, but that practice has always been controversial, so it was recently proposed for removal.
Two of the best sources of random data for the entropy pool—user interaction via a keyboard or mouse and disk interrupts—are often not present in embedded devices. In addition, some disk interfaces, notably ATA, do not add entropy, which extends the problem to many "headless" servers. Network interrupts, though, are seen as a dubious source of entropy because they might be observed, or even manipulated, by an attacker. In addition, as network traffic rises, many network drivers turn off receive interrupts from the hardware, letting the kernel poll periodically for incoming packets. That would reduce entropy collection just when it might be needed for encrypting the traffic.
This is not the first time eliminating the IRQF_SAMPLE_RANDOM flag from network drivers has come up; we looked at the issue two years ago (though the flag was called SA_SAMPLE_RANDOM at that time). It has come up again, starting with a query on linux-kernel from Chris Peterson: "Should network devices be allowed to contribute entropy to /dev/random?" Jeff Garzik, kernel network device driver maintainer, answered: "I tend to push people to /not/ add IRQF_SAMPLE_RANDOM to new drivers, but I'm not interested in going on a pogrom with existing code."
For anyone who is interested in such a pogrom, Peterson proposed a patch to eliminate the flag from the twelve network drivers that still use it. This sparked a long discussion on how to provide entropy for those devices that have nothing else to use. While the actual contribution of entropy from network devices is questionable, mixing that data into the pool does not harm it, as long as no entropy credit—the current estimate of entropy in the pool—is awarded. Alan Cox proposed a new flag to track sources like that, so their samples could still be mixed in without any credit.
Some were in favor of an approach like this, but Adrian Bunk notes that:
If a customer wants to use /dev/random and demands to get dubious data there if nothing better is available fulfilling his wish only moves the security bug from his crappy application to the Linux kernel.
Part of the problem stems from a misconception about random numbers obtained from /dev/random versus those that are read from /dev/urandom, which we described in a Security page article last December. In general, applications should read from /dev/urandom. Only the most sensitive uses of random numbers—keys for GPG, for example—need the entropy guarantee that /dev/random provides. In a system that is getting regular entropy updates, the quality of the random numbers from both sources is the same.
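As a minimal user-space illustration of that advice (the buffer size and the hex dump are arbitrary choices), an application that just needs good random bytes can read them from /dev/urandom, which never blocks:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	unsigned char buf[16];
	FILE *f = fopen("/dev/urandom", "rb");

	if (!f || fread(buf, 1, sizeof(buf), f) != sizeof(buf)) {
		perror("/dev/urandom");
		return EXIT_FAILURE;
	}
	fclose(f);

	/* Print the bytes as hex, just to show they arrived. */
	for (unsigned int i = 0; i < sizeof(buf); i++)
		printf("%02x", buf[i]);
	putchar('\n');
	return EXIT_SUCCESS;
}
```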
There is still an initialization problem for some systems, though, as Ted Ts'o points out.
A potential entropy source, even for embedded systems, is to sample other kernel and system parameters that are not predictable externally. Garzik suggests:
And there are plenty of untapped entropy sources even so, such as reading temperature sensors, fan speed sensors on variable-speed fans, etc.
Heck, "smartctl -d ata -a /dev/FOO" produces output that could be hashed and added as entropy.
Another source is hardware random number generators. The kernel already has support for some, including the VIA PadLock, which seems to be well regarded. Not all processors have such support, however. The Trusted Platform Module (TPM) does provide random number generation and is becoming more widespread, especially in laptops, but there is no kernel hw_random driver for the TPM.
Garzik advocates adding a kernel driver for what he calls the "Treacherous Platform Module", but, as others pointed out, it can all be done in user space using the TrouSerS library. Even for the hardware random number generators that are supported in the kernel, there is no automatic entropy collection; it is left up to user space to decide whether to do that. This keeps policy decisions about the quality of the random data out of kernel code.
Systems that wish to sample that data should use rngd to feed the kernel entropy pool; rngd will apply FIPS 140-2 tests to verify the randomness of the data before passing it to the kernel. Andi Kleen is not in favor of that approach.
There is concern that some of the hardware random number generators are poorly implemented or could malfunction, so it would be dangerous to automatically add that data into the pool. Doing the FIPS testing in the kernel is not an option, leaving it up to user space applications to make the decision. There is nothing stopping any superuser process from adding bits to the entropy pool—no matter how weak—but the consensus is that the kernel itself must use sources it knows it can trust.
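For the curious, here is a hedged sketch of what those two paths look like from user space: a plain write() to /dev/random mixes data into the pool without crediting any entropy (harmless even for dubious data, as discussed above), while the root-only RNDADDENTROPY ioctl, which rngd uses after its FIPS tests pass, both mixes and credits. The sample bytes and the claimed entropy count are placeholders, not a recommendation:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/random.h>

int main(void)
{
	/* Placeholder bytes standing in for data from a real source,
	 * such as hashed smartctl output as Garzik suggested. */
	unsigned char sample[16] = { 0 };
	struct {
		struct rand_pool_info info;	/* from <linux/random.h> */
		unsigned char data[sizeof(sample)];
	} req;
	int fd = open("/dev/random", O_WRONLY);

	if (fd < 0) {
		perror("/dev/random");
		return 1;
	}

	/* Path 1: mix without credit; harmless even if sample is weak. */
	if (write(fd, sample, sizeof(sample)) < 0)
		perror("write");

	/* Path 2 (root only): mix *and* credit; for trusted sources.
	 * Claiming 8 bits per byte is far more aggressive than a real
	 * feeder like rngd would be. */
	req.info.entropy_count = 8 * sizeof(sample);	/* in bits */
	req.info.buf_size = sizeof(sample);
	memcpy(req.data, sample, sizeof(sample));
	if (ioctl(fd, RNDADDENTROPY, &req) < 0)
		perror("RNDADDENTROPY");

	close(fd);
	return 0;
}
```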
Another instance of this problem—in a different guise—appears in a discussion about random numbers for virtualized I/O, with Garzik asking: "Has anyone yet written a "hw" RNG module for virt, that reads the host's random number pool?" Rusty Russell responded with a patch for a virtio "hardware" random number generator as well as one that adds it into his lguest hypervisor. The lguest patch reads data from the host's /dev/urandom, which is not where H. Peter Anvin thinks it should come from.
The virtio code only provides the hw_random device, so it requires user-space help to get entropy data into the guest kernel's pool. Much like any process that can read /dev/random, lguest could exhaust the host entropy pool, so there was some discussion of limiting how much random data guests can request from the device. A guest implementation could then use a small pool of entropy read from the host to seed its own random number generator for the simulated hardware device.
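A guest-side sketch of that idea, assuming the virtio device appears as the usual hw_random character device (the /dev/hwrng node name and the use of srandom() here are illustrative, not part of Russell's patch): read a small, one-time dose of host-provided entropy and seed a local generator, rather than going back to the host for every random number.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	unsigned int seed;
	/* The hw_random core exposes the active generator (here,
	 * hypothetically, the virtio device) as a character device. */
	int fd = open("/dev/hwrng", O_RDONLY);

	if (fd < 0 || read(fd, &seed, sizeof(seed)) != sizeof(seed)) {
		perror("/dev/hwrng");
		return 1;
	}
	close(fd);

	/* Seed a local PRNG from a small amount of host entropy instead
	 * of draining the host pool on each request. */
	srandom(seed);
	printf("first value: %ld\n", random());
	return 0;
}
```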
Removing the last remaining uses of IRQF_SAMPLE_RANDOM in network drivers seems likely, though some way to mix that data into the entropy pool without giving it any credit is still a possibility. With luck, that will encourage more effort toward incorporating new sources of entropy using tools like EGD or, for systems that have it, random number hardware. For systems that lack the traditional entropy sources, this should lead to a better-initialized, fuller pool, while eliminating a potential attack by way of network packet manipulation.