Unpredictable sequence numbers

By Jake Edge
August 17, 2011

It has been known for 15 years or more that using predictable network sequence numbers is a security risk, so most implementations, including Linux, have randomized the initial sequence number (ISN) for TCP connections. Due to performance concerns, though, Linux created the ISN using the MD4 cryptographic hash and changed the random seed every five minutes. In addition, only a partial MD4 implementation was used, which effectively limited the ISNs to 24 bits of randomness. That's all changed with a recent patch that has been merged into the mainline as well as the stable and longterm kernels.

Sequence numbers are used by TCP to keep the bytes in the connection stream in order. An ISN is established at the time the connection is made, and incremented by the number of data bytes in each packet. That way, both sides of the connection can recognize when they have received out-of-order packets and ensure that the data that gets handed off to the application is properly sequenced.
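
As a rough illustration of that bookkeeping, the following Python sketch models sequence numbers as 32-bit counters that advance by the payload length and uses them to put out-of-order segments back in order. It is a deliberately simplified model, not how any real TCP stack is written; the names `next_seq` and `reassemble` are invented for this example.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def next_seq(seq: int, payload_len: int) -> int:
    """Sequence number of the first byte that follows a segment."""
    return (seq + payload_len) % SEQ_MOD


def reassemble(isn: int, segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the byte stream from (seq, payload) pairs given in any order.

    Simplifications: the first data byte is numbered `isn` (real TCP uses
    ISN + 1 because the SYN consumes one number), and overlapping or
    retransmitted segments are not handled.
    """
    pending = dict(segments)
    expected = isn
    stream = bytearray()
    # Keep consuming while the segment that starts at `expected` has arrived.
    while expected in pending:
        payload = pending.pop(expected)
        stream += payload
        expected = next_seq(expected, len(payload))
    return bytes(stream)
```

For instance, `reassemble(1000, [(1003, b"lo"), (1000, b"hel")])` starts from the segment numbered 1000 and then finds the one numbered 1003, regardless of arrival order. A segment whose number does not line up with the stream is simply never consumed, which is why an attacker has to guess the numbers correctly.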

Initially, TCP specified that ISNs would increment every four microseconds to avoid having multiple outstanding connections with the same sequence number. But, in the mid-90s, it was recognized that predictability in choosing ISNs could be used by attackers to potentially inject packets into the setup of a connection, or into an established session itself. That led to RFC 1948, which suggested establishing a separate sequence number space for each connection, and randomizing the ISNs based on the connection parameters.

Basically, the idea is that by using the source address/port and destination address/port as input to a cryptographic hash (the RFC suggests MD5), along with a random seed generated at boot time, an unpredictable ISN can be created. But Linux went its own way, using the partial MD4 and resetting the random seed frequently (which was meant to add some additional unpredictability).
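
The RFC 1948 scheme can be sketched in a few lines of Python. This is an illustration of the idea only, not the kernel's code: the function names, the 16-byte secret, and the exact way the inputs are packed are assumptions made for the example. The ISN is the sum of a four-microsecond clock and an MD5-derived offset that depends on the connection's addresses and ports plus the secret.

```python
import hashlib
import os
import socket
import struct
import time

# Assumption for this sketch: one secret, generated once (e.g. at boot).
BOOT_SECRET = os.urandom(16)


def clock_4us() -> int:
    """A 32-bit counter that ticks every four microseconds."""
    return (time.monotonic_ns() // 4000) & 0xFFFFFFFF


def rfc1948_isn(saddr, sport, daddr, dport, secret=BOOT_SECRET, clock=None) -> int:
    """ISN = clock + F(addresses, ports, secret), with F built from MD5."""
    if clock is None:
        clock = clock_4us()
    material = (
        socket.inet_aton(saddr)
        + socket.inet_aton(daddr)
        + struct.pack("!HH", sport, dport)
        + secret
    )
    digest = hashlib.md5(material).digest()
    # Use the first 32 bits of the digest as the per-connection offset.
    offset = struct.unpack("!I", digest[:4])[0]
    return (clock + offset) & 0xFFFFFFFF
```

Calling `rfc1948_isn("192.0.2.1", 40000, "198.51.100.7", 80)` yields a value that still advances with the clock for that one connection, while an outsider who does not know the secret cannot work out the offset used for any other connection.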

According to the description in David Miller's patch, Dan Kaminsky recently alerted the kernel security mailing list (i.e. security@kernel.org, which is a closed list for security discussions) that the Linux ISN generation was vulnerable to brute force attacks. Presumably, the increased speed of today's computers coupled with the higher bandwidth available means that a brute force attack against a 24-bit space is more plausible today. Also, as Miller points out, the increase in computer speed also means that the need for using MD4 for performance reasons has likely passed.
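
To get a feel for the numbers, here is a back-of-the-envelope estimate in Python. It assumes each guess costs the attacker one spoofed packet and that the target value is uniformly distributed; the packet rate is a made-up figure rather than a measurement, and the 32-bit row is included only for comparison with the full sequence number space.

```python
def expected_guess_time(random_bits: int, guesses_per_second: float) -> float:
    """Average seconds to hit one value chosen uniformly from 2**random_bits."""
    space = 2 ** random_bits
    # On average, roughly half of the space is searched before a hit.
    return (space / 2) / guesses_per_second


RATE = 100_000  # hypothetical spoofed packets per second, not a measured figure

for bits in (24, 32):
    seconds = expected_guess_time(bits, RATE)
    print(f"{bits} bits: {2 ** bits} values, about {seconds:.0f} seconds on average")
```

Each additional bit doubles the work, so at any given packet rate the 32-bit case takes 256 times as long as the 24-bit one.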

Over the years since RFC 1948, MD5 has been considerably weakened, so SHA-1 was also considered for the Linux fix. But, as Miller describes it, the performance cost was simply too high:

MD5 was selected as a compromise between performance loss and theoretical ability to be compromised. Willy Tarreau did extensive testing and SHA1 was found to harm performance too much to be considered seriously at this time.

Down the road, a sysctl knob may be added to select different modes, Miller said. That could include the "super secure" SHA-1 version, as well as a mode that turns off any hashing for networks that run in trusted environments.

While it may have made sense at the time, it is clear that using MD4 (and effectively limiting it to 24 bits of randomness) is just too risky today. Attacks against the earlier implementation may be hard to pull off, but the effects can be rather serious. The RFC describes an attack that would inject commands into a remote shell session. While rsh is not used very frequently—at all?—any more, there are other kinds of attacks that are possible too. It's good to see this particular hole get filled.

Unpredictable sequence numbers

Posted Aug 18, 2011 6:08 UTC (Thu) by dlang (guest, #313) [Link]

if you can get on a machine on the same subnet as either endpoint, ARP spoofing will give you full access to the session (in both directions) for a fairly trivial effort.

encryption goes a long way to defeating both problems; unless the attacker modifies the data flowing in both directions to become a man in the middle (including beating whatever crypto authentication mechanism is in place), the data injected into the session will just corrupt the dataflow, not do anything useful for the attacker

the only advantage this new attack has is that you can inject data without being in the middle of the connection, although you do somehow need to figure out the source port number of a connection.

Unpredictable sequence numbers

Posted Aug 18, 2011 16:37 UTC (Thu) by linuxjacques (subscriber, #45768) [Link] (1 responses)

I run embedded devices with relatively weak processors on trusted networks and need every bit of TCP performance.

How do I disable this?

Oh, in the future a "knob may be added" ?

Gee, thanks for ignoring the embedded case (again).

Unpredictable sequence numbers

Posted Aug 20, 2011 11:37 UTC (Sat) by mastro (guest, #72665) [Link]

I would suggest doing some benchmarking/profiling first. It seems very unlikely to me that the difference between MD4 and MD5 over only a few bytes, calculated only once at the beginning of a connection, would have a measurable impact, much less be a bottleneck.

But estimating performance is very hard, so first measure the difference between the two algorithms.
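
A very rough way to start measuring from user space is a sketch like the one below, which uses Python's `hashlib` and `timeit`. The caveats are large: interpreter overhead dominates at this input size, the 28-byte sample is just a stand-in for addresses, ports and a secret, and many Python builds no longer ship MD4 at all. It says little about the cost inside the kernel on an embedded CPU, where a proper profile would be needed. The helper name `time_hash` is made up for this example.

```python
import hashlib
import timeit

SAMPLE = bytes(range(28))  # stand-in for addresses, ports and a secret


def time_hash(name: str, data: bytes = SAMPLE, number: int = 100_000):
    """Seconds per digest for `name`, or None if this Python build lacks it."""
    try:
        hashlib.new(name, data)
    except ValueError:
        return None  # algorithm not provided by this build
    total = timeit.timeit(lambda: hashlib.new(name, data).digest(), number=number)
    return total / number


for name in ("md4", "md5", "sha1"):
    per_call = time_hash(name)
    if per_call is None:
        print(name, "unavailable in this build")
    else:
        print(name, f"{per_call * 1e9:.0f} ns per digest")
```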

Unpredictable sequence numbers

Posted Aug 18, 2011 17:55 UTC (Thu) by zooko (guest, #2589) [Link] (1 responses)

So does this mean vindication for the people who argued (in https://lwn.net/Articles/332602/ ) that the weak PRNG used in Address Space Layout Randomization would get re-used in other contexts where its weakness was a bigger problem?

Unpredictable sequence numbers

Posted Aug 18, 2011 18:09 UTC (Thu) by dlang (guest, #313) [Link]

very unlikely, the sequence number randomisation has been around a lot longer.

Unpredictable sequence numbers

Posted Aug 19, 2011 8:59 UTC (Fri) by Np237 (guest, #69585) [Link]

I recall this being a very serious issue for some Microsoft and Sun OSes in the late 90s.

Does anyone know what sequence number generators other OSes use nowadays? I bet they could be running into the same performance/security trade-offs as Linux.

Does it matter?

Posted Aug 19, 2011 10:53 UTC (Fri) by epa (subscriber, #39769) [Link] (3 responses)

Why does this matter? We all know that TCP/IP connections can be hijacked and snooped. That's why ssh and https exist. Is it just a case of avoiding denial of service attacks?

Does it matter?

Posted Aug 19, 2011 19:17 UTC (Fri) by njs (subscriber, #40338) [Link] (2 responses)

Normally, to hijack a TCP connection, you need to be "in the middle" in some sense -- have access to some router that the TCP connection is flowing over, or be on the same LAN to run ARP spoofing, etc. I can't just hijack your connection to LWN from my home router. Sequence numbers are the thing that stops me -- if you can guess the sequence numbers for other people's connections, then under the right circumstances you can insert stuff into any TCP connection anywhere from any internet-connected host.

("The right circumstances" are somewhat tricky to achieve -- I'll skip the details, they should be easy to google -- but there are practical attacks possible.)

Does it matter?

Posted Aug 19, 2011 23:49 UTC (Fri) by pflugstad (subscriber, #224) [Link] (1 responses)

I think you missed epa's point.

Even being able to predict TCP sequence numbers does not allow you to inject traffic into an existing SSH or SSL (https) connection. Both protocols encrypt the data and have integrity checks over the data, so if you injected data, it would fail to decrypt and/or fail the integrity checks.

So the worst that you can probably do if you can predict TCP sequence numbers is force the connection to be reset: packets with an invalid TCP sequence number would be discarded, and if the sequence number is valid, then SSL/SSH would flag the injected data and abort the connection.
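
The integrity-check part can be illustrated with a small Python sketch. It is not the SSH or SSL record format (there is no encryption, sequencing or replay protection here); it only shows the general idea that data arriving without a valid keyed tag gets rejected. The names `protect` and `accept` and the choice of HMAC-SHA256 are assumptions made for the example.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes
KEY = os.urandom(32)  # session key known only to the two endpoints


def protect(payload: bytes, key: bytes = KEY) -> bytes:
    """Sender side: append a keyed integrity tag to the payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def accept(record: bytes, key: bytes = KEY):
    """Receiver side: return the payload if its tag verifies, else None."""
    if len(record) < TAG_LEN:
        return None
    payload, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking where the tags differ via timing.
    return payload if hmac.compare_digest(tag, expected) else None
```

A record built by `protect` is accepted, while bytes injected by someone who guessed the TCP sequence number but does not hold the key come back as `None`, at which point a real protocol would tear the connection down.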

Does it matter?

Posted Aug 20, 2011 1:36 UTC (Sat) by njs (subscriber, #40338) [Link]

Yes, but I also use protocols like HTTP that don't have cryptographic integrity guarantees... and those protocols are more at risk if TCP sequence numbers are predictable than if they aren't, which is why TCP sequence numbers matter beyond DoS attacks. Which was epa's question...

Unpredictable sequence numbers

Posted Aug 26, 2011 15:21 UTC (Fri) by slashdot (guest, #22014) [Link]

Why not just use /dev/random or /dev/urandom at the user's choice?