Other approaches to random number scalability

By Jake Edge
October 21, 2015

Back in late September, we looked at a patch to improve the scalability of random number generation on Linux systems—large NUMA systems, in particular. While the proposed change solved the immediate scalability problem, there were some downsides to that approach, in terms of both complexity and security. Some more recent discussion has come up with other possibilities for solving the problem.

The original idea came from Andi Kleen; it changed the kernel's non-blocking random number pool into a set of pools, one per NUMA node. That would prevent a spinlock on a single pool from becoming a bottleneck. But it also made the kernel's random number subsystem more complex. In addition, it spread the available entropy over all of the pools, effectively dividing the amount available to users on any given node by the number of pools.

But, as George Spelvin noted, the entropy in a pool is "not located in any particular bit position", but is distributed throughout the pool—entropy is a "holographic property of the pool", as he put it. That means that multiple readers do not need to be serialized by a spinlock as long as each gets a unique salt value that ensures that the random numbers produced are different. Spelvin suggested using the CPU ID for the salt; each reader hashes the salt in with the pool to provide a unique random number even if the pool is in the same state for each read.

Spelvin provided a patch using that approach along with his comments. Random number subsystem maintainer Ted Ts'o agreed with Spelvin about how the entropy is distributed, but had some different ideas on how to handle mixing the random numbers generated back into the pool. He also provided a patch and asked Kleen to benchmark his approach. "I really hope it will be good enough, since besides using less memory, there are security advantages in not spreading the entropy across N pools."

Either approach would eliminate the lock contention (and cache-line bouncing of the lock), but there still may be performance penalties for sharing the pool among multiple cores due to cache coherency. The non-blocking pool changes frequently, either as data gets mixed in from the input pool (which is shared with the blocking pool) or as data that is read from the pool gets mixed back in to make it harder to predict its state. The cache lines of the pool will be bounced around between the cores, which may well be less than desirable.

As it turned out, when Kleen ran his micro-benchmark, both patch sets performed poorly in comparison to the multi-pool approach. In fact, for reasons unknown, Spelvin's was worse than the existing implementation.

Meanwhile, while the benchmarking was taking place, Ts'o pointed out that it may just make sense to recognize when a process is "abusing" getrandom() or /dev/urandom and to switch it to using its own cryptographic-strength random number generator (CSRNG or CRNG) seeded from the non-blocking pool. That way, uncommon—or, more likely, extremely rare—workloads won't force changes to the core of the Linux random number generator. Ts'o is hoping to not add any more complexity into the random subsystem:

At this point, I wonder if it might not be simpler to restrict the current nonblocking pool to kernel users, and for userspace users, the first time a process reads from /dev/urandom or calls getrandom(2), we create for them a ChaCha20 CRNG, which hangs off of the task structure. This would require about 72 bytes of state per process, but normally very few processes are reading from /dev/urandom or calling getrandom(2) from userspace.

The CRNG would be initialized from the non-blocking pool, and is reseeded after, say, 2**24 cranks or five minutes. It's essentially an OpenBSD-style arc4random in the kernel.

Spelvin was concerned that the CSRNG solution would make long-running servers susceptible to backtracking: using the current state of the generator to determine random numbers that have been produced earlier. If backtracking protection can be discarded, there can be even simpler solutions, he said, including: "just have *one* key for the kernel, reseeded more often, and a per-thread nonce and stream position." But Ts'o said that anti-backtracking was not being completely abandoned, just relaxed: "We are discarding backtracking protection between successive reads from a single process, and even there we would be reseeding every five minutes (and this could be tuned), so there is *some* anti-backtracking protection."

Furthermore, he suggested that perhaps real abusers could get their own CSRNG output, while non-abusers would still get output from the non-blocking pool:

On the flip side, the time when you might care about anti-backtracking protection is say, when you're generating a new session key for a new connection. So perhaps one approach is to use some kind of ratelimiter algorithm so that if you're using /dev/urandom "carefully" (say, no more than ten times a second), we'll use the non-blocking pool. But once a process exceeds that limit, it will switch over the the CRNG, and then the only performance that abuser process will hurt is its own (because it would be even faster if they were running something like arc4random in userspace).

Spelvin had suggested adding another random "device" (perhaps /dev/frandom) to provide the output of a CSRNG directly to user space, because he was concerned about changing the semantics of /dev/urandom and getrandom() by introducing the possibility of backtracking. But he agreed that changing the behavior for frequent heavy readers/callers would not change the semantics since the random(4) man page explicitly warns against that kind of usage:

[...] so if any program reads more than 256 bits (32 bytes) from the kernel random pool per invocation, or per reasonable reseed interval (not less than one minute), that should be taken as a sign that its cryptography is not skillfully implemented.

Spelvin posted another patch set that pursues his ideas on improving the scalability of generating random numbers. It focuses on the reducing the lock contention when the output of the pool is mixed back into the pool to thwart backtracking (known as a mixback operation). If there are multiple concurrent readers for the non-blocking pool, Spelvin's patch set ensures that one of them causes a mixback operation; others that come along while a mixback lock is held simply write their data into a global mixback buffer, which then gets incorporated into the mixback operation that is done by the lock holder when releasing the lock.

There has been no comment on those patches so far, but one gets the sense that Ts'o (or someone) will try to route around the whole scalability problem with a separate CSRNG for abusers. That would leave the current approach intact, while still providing a scalable solution for those who are, effectively, inappropriately using the non-blocking pool. Ts'o seemed strongly in favor of that approach, so it seems likely to prevail. Kleen has asked that his multi-pool approach be merged, since "it works and is actually scalable and does not require any new 'cryptographic research' or other risks". But it is not clear that the complexity and (slightly) reduced security of that approach will pass muster.

Index entries for this article
Kernel	Random numbers
Security	Random number generation

Don't mess with my random numbers

Posted Oct 22, 2015 13:49 UTC (Thu) by abatters (✭ supporter ✭, #6932) [Link] (3 responses)

I would strongly prefer that /dev/random and /dev/urandom always provide the highest-quality randomness possible, even if it is slow. I do not want to downgrade the security of my encryption keys to cater to the abuses of a few misbehaving programs. If those programs run too slowly, then either add another random device for them with different sematics (e.g. /dev/prandom), or convert them to use their own userspace CSRNG.

Don't mess with my random numbers

Posted Oct 22, 2015 19:05 UTC (Thu) by eternaleye (guest, #67051) [Link]

The concerns about "backtracking" (and the overall concerns about the "quality" of randomness from random/urandom/csprng) are very much overstated.

In particular, if ChaCha20 was vulnerable to backtracking, that would be exactly equivalent to a known-plaintext key-recovery attack, which would mean it was completely broken as a stream cipher. That's not a realistic concern.

(Equivalence: In a stream cipher, the keystream is XORed with the plaintext to generate the output. If the plaintext is known, it can be XORed again to recover the used portion of the keystream. However, using that to predict future output would be a complete break, and in any stream cipher is insufficient - you must instead use what keystream you recovered to figure out the key, which is exactly what a cipher is designed to defend against.)

The main concerns with making an RNG are:

Insufficient access to sources of entropy

This is distinct from insufficient access to entropy - only the kernel has access to certain data, such as disk timings. Thus, implementing an RNG in userspace is a real tradeoff, as you either sacrifice exactly the speed you sought to gain by reseeding with syscalls more frequently, or sacrifice security by not doing so.

Poor implementation

Crypto is hard. Secure crypto is harder. See also: Debian's changes to OpenSSL's random implementation, OpenSSL's SILENTLY falling back to a horrendously bad RNG if it can't use the system one, etc.

Primitive choice

Choosing the correct building blocks for an RNG is important as well - in fact, the current Linux RNG, based on a very ad-hoc construction using SHA-1, half-MD4, etc, has been criticized on these grounds with detectable biases as a result. ChaCha20 is a superior primitive for deriving a stream from the pool than Linux' random currently uses, as a matter of fact: Better statistical behavior, far better security analysis, and better speed.

Don't mess with my random numbers

Posted Oct 25, 2015 5:29 UTC (Sun) by malor (guest, #2973) [Link] (1 responses)

I'm in this camp; I would rather see the existing interfaces left alone, and then a new /dev/frandom device provided for people who need mass quantities of semi-random bits. (like for disk wiping).

I presently use urandom for that, which I have just learned today is something I'm not supposed to be doing. That's fine, but I'd rather see a facility for doing what I'm doing (using a flood of semi-random data as a drive wiper), rather than seeing urandom weakened in any way.

Yes, I'm a urandom abuser, and probably deserve a hearty finger shake and a tsk or two, but I think trying to defend against scurrilous people like me, by changing the quality of the bits provided, could be a really severe hidden issue. People truly depend on that stuff, and it's quite possible that predictable output from /dev/urandom could end up killing someone. No hyperbole there.... if a repressive government known for mass surveillance is able to break into privileged communications that depended on the strength of /dev/urandom, both parties could be executed.

Further, it's going to be real hard to test from outside, and as a user of bits, you'd want some kind of signal that you were no longer getting ones with the same guarantee anymore. I don't even know how you'd do that. And writing test cases to make sure that it was doing what it was supposed to do, under all circumstances, would be very hard, and testing the heuristic 'only punish abusers' code would be almost impossible.

/dev/urandom is understood to not be as good as /dev/random, and people aren't *supposed* to use it for really important stuff, but they might be doing it anyway. I'd say preserving existing guarantees is probably more important than scalability. Rather, switch jerks like me to the ChaCha20 cipher with /dev/frandom. I'd be perfectly happy to switch (and could probably wipe drives a fair bit faster, to boot.)

That just seems safest on every front. The random pools are NOT things to screw around with lightly.

Don't mess with my random numbers

Posted Oct 27, 2015 0:47 UTC (Tue) by eternaleye (guest, #67051) [Link]

> I presently use urandom for that, which I have just learned today is something I'm not supposed to be doing.

Not quite; "abuser" is being used very loosely on the ML.

> rather than seeing urandom weakened in any way.

Thinking that the proposed idea would weaken urandom is "correct" in the same way as thinking that because ZFS uses 128-bit sizing it's less likely to run out of space on your computer than ext4 which uses 64-bit sizing: the math is correct, but the actual expenditure needed to even approach it being _relevant_ is so unrealistic in terms of *physics* as to be laughable.

> People truly depend on that stuff, and it's quite possible that predictable output from /dev/urandom could end up killing someone.

Considering that ChaCha20 has exactly the same weight on its shoulders, this is again true but irrelevant. (If ChaCha20 is weak in a way that affects using it in urandom, then TLS is in a lot of trouble).

> Further, it's going to be real hard to test from outside, and as a user of bits, you'd want some kind of signal that you were no longer getting ones with the same guarantee anymore.

The *existing* Linux RNG is hard to test - incidentally, researchers did it anyway. Turns out it's an ad-hoc mess and makes *worse* guarantees than ChaCha20. Funny, that.

> /dev/urandom is understood to not be as good as /dev/random

Not only wrong, but *dead* wrong, and actively harmful, to the point of causing vulnerabilities to DOS and other attacks: see http://sockpuppet.org/blog/2014/02/25/safely-generate-ran...

Cryptographers have long-since come to the conclusion that entropy estimation _doesn't work_ worth a damn, and the only difference between /dev/urandom and /dev/random is that the latter blocks based on entropy estimation.

Other approaches to random number scalability

Posted Oct 22, 2015 15:16 UTC (Thu) by alonz (subscriber, #815) [Link] (1 responses)

Just this week, Google's Adam Langley described details of BoringSSL – and it uses /dev/urandom for all of its randomness needs, without any additional CSPRNG… So now it looks like all of Google's servers, as well as all Android devices, will be constantly “abusing” the entropy pool.

Other approaches to random number scalability

Posted Oct 26, 2015 4:35 UTC (Mon) by abartlet (subscriber, #3928) [Link]

Samba has now also chosen do take this approach. We had a RC4-based PRNG we held in userspace, but chose to remove it as it was slow, complex and required tracking across fork() etc. These risks are avoided by trusting the Kernel. I would far rather we had one, good, kernel-based PRNG that can handle fork() without special work than 100s of poorly implemented userspace PRNG implementations looking for another Debian-OpenSSL disaster.

That is, read() of /dev/urandom is really easy to audit, and a call to the new getentropy() call will be even easier, while userspace PRNG systems are really hard.

Other approaches to random number scalability

Posted Oct 22, 2015 19:50 UTC (Thu) by jlargentaye (subscriber, #75206) [Link]

> George Spelvin noted

Speaking of that person, they're an alluring mystery to me. "George Spelvin" is a common pseudonym used in american theater [1]. The developer has been around for a long time and is clearly of high skill. For example, this post [2] where they improve the assembly implementation of SHA1 on PPC... without access to the HW! They used to only go by their email address, linux@horizon.com.

Has anyone else been intrigued, and/or have more info about this anonymous hacker?

[1] https://en.wikipedia.org/wiki/George_Spelvin
[2] http://www.gelato.unsw.edu.au/archives/git/0504/1753.html

Other approaches to random number scalability

Posted Oct 23, 2015 4:26 UTC (Fri) by josh (subscriber, #17465) [Link] (1 responses)

Or, better yet, we could discard the entire notion of the "entropy pool" beyond ensuring that we have enough entropy at startup to seed one or more CSPRNGs. The CS in CSPRNG stands for "cryptographically secure".

Other approaches to random number scalability

Posted Oct 25, 2015 5:11 UTC (Sun) by malor (guest, #2973) [Link]

Yes, but what's not included in that acronym is *believed* secure. A breakthrough by a sufficiently clever mathematician could change that overnight.

There probably aren't going to be any math breakthroughs that change measurements of user interactions on a microsecond scale.

Other approaches to random number scalability

Posted Oct 23, 2015 16:58 UTC (Fri) by alankila (guest, #47141) [Link] (4 responses)

Just two days ago, I tried to deploy an app on a virtual server using tomcat7. It wouldn't work because not enough entropy was available. Apparently the miserable thing reads something from /dev/random, and despite the RDRAND instruction was available on the server, it still wouldn't work, I guess because it wouldn't use it. Looking at dmesg, I saw that the kernel reported collecting like 16 bits of entropy after 2 seconds, and that it took full 24 seconds to merely collect enough entropy to start urandom. The mind boggles. Here we have capability to generate literally gigabytes of perfectly random data using instruction dedicated for the purpose, and yet... we don't.

Other approaches to random number scalability

Posted Oct 23, 2015 17:07 UTC (Fri) by mjg59 (subscriber, #23239) [Link] (2 responses)

If you've got RDRAND available, you should probably have rngd running to pipe it into /dev/random.

Other approaches to random number scalability

Posted Oct 25, 2015 5:08 UTC (Sun) by malor (guest, #2973) [Link] (1 responses)

RDRAND is generated by a CSPRNG running on the chip, so it's probably not suitable for that purpose. Broadwell and newer processors add a new RDSEED instruction, which supposedly reads more or less directly from the hardware RNG.

It's probably no more trustworthy than RDRAND in terms of the NSA's influence, but if you believe that they do what Intel claims, then RDSEED would be a good source to pipe back into the kernel, where RDRAND is of much lower quality.

Other approaches to random number scalability

Posted Oct 25, 2015 5:34 UTC (Sun) by mjg59 (subscriber, #23239) [Link]

rdrand is frequently reseeded, so as long as you're passing in appropriate entropy estimates you're probably good.

Other approaches to random number scalability

Posted Oct 24, 2015 2:12 UTC (Sat) by smckay (guest, #103253) [Link]

Crypto community is scared of RDRAND. It's widely-deployed proprietary RNG hardware that NSA has a large incentive to backdoor. Anyone who knows for certain whether it has been isn't going to say anything.

Also, it's a nice bonus at best if a JVM uses anything platform specific.