The leap second of doom

By Jake Edge
August 1, 2012

Since the last leap second caused a certain amount of havoc on Linux systems, it was probably only a matter of time before someone came up with the idea of "testing" for vulnerable systems again. Leap seconds are only supposed to occur at the end of June and December, with six months' notice, so administrators might well have been waiting to update their servers for the problem until another was nigh. But "rogue" (or buggy) network time protocol (NTP) servers can effectively cause a leap second at the end of any month, which seems to be what happened on July 31.

It is not uncommon for "black hats" to keep exploiting vulnerabilities well after updates to fix them have been released. This situation is a bit different, though. While updating systems to avoid known vulnerabilities is clearly a "best practice", sometimes system administrators choose to delay updates, especially those that require a reboot, based on their sense of the likelihood of an attack. Given that no real leap seconds were scheduled, and the subversion of NTP servers (or traffic) may have seemed relatively unlikely, some (perhaps large) percentage of Linux systems have not been updated. But not all "attacks" are caused by black hats; the original problem was caused by a bug, and this one may also turn out that way.

Marco Marongiu appears to have been the first to notice the problem:

This is just to warn you that there are now some NTP servers around the globe spreading a leap second announcement for tomorrow 00:00:00 UTC (so, basically, in a few hours now).

If you didn't take action before the leapocalypse last month, you better hurry now.

Given that the notice (to the NTP questions mailing list) came less than four hours before the second "leapocalypse", it's hard to imagine that many administrators saw it in time to take action.

The most interesting question, of course, is how this could have happened. It is tempting to see it as some kind of worldwide denial of service attack, but that is not the most likely cause. Further discussion in the thread with Marongiu's warning points to another possible cause.

It seems that NTP has a "leap" flag (aka LI or leap indicator), a two-bit field that indicates whether a second should be inserted or deleted at the end of the current month. Adding a leap second at the end of any month does not correspond with current practice (June and December leap seconds only), but depending on which standard you look at, it is reasonable to do so. RFC 5905, which governs NTP, definitely allows leap seconds at the end of any month, so compliant implementations should allow that.
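To make the two-bit field concrete, here is a minimal sketch of pulling the leap indicator out of an NTP packet header. In RFC 5905 the first byte of the header packs LI (2 bits), the version number (3 bits), and the mode (3 bits), in that order. The function name and the lookup table are hypothetical, chosen only for this illustration; this is not code from any NTP implementation.

```python
# Meanings of the two-bit leap indicator (LI), per RFC 5905.
LEAP_MEANINGS = {
    0: "no warning",
    1: "last minute of the day has 61 seconds (insert a second)",
    2: "last minute of the day has 59 seconds (delete a second)",
    3: "unknown (clock unsynchronized)",
}


def parse_first_byte(packet: bytes) -> tuple[int, int, int]:
    """Split the first header byte into (leap indicator, version, mode).

    Hypothetical helper for illustration; it only looks at byte 0 and
    does no other validation of the packet.
    """
    if len(packet) < 1:
        raise ValueError("packet is empty")
    first = packet[0]
    li = (first >> 6) & 0x3       # top two bits
    version = (first >> 3) & 0x7  # next three bits
    mode = first & 0x7            # low three bits
    return li, version, mode


if __name__ == "__main__":
    # 0x64 is binary 01 100 100: LI=1, version 4, mode 4 (server).
    li, version, mode = parse_first_byte(bytes([0x64]))
    print(f"LI={li} ({LEAP_MEANINGS[li]}), version={version}, mode={mode}")
```

A server reply whose first byte decodes to LI=1 is the kind of "add a second" announcement discussed here; LI=2 would be the never-yet-used "delete a second" case.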

But that still leaves the question of why the LI flag was set to 1 (i.e. add a second at the end of the month). In the thread, "demonccc" noted a server with the flag set. Furthermore, Martin Burnicki described a problem his customers saw after June's leap second in which certain older NTP servers did not reset the leap flag after the event. That could cause leap seconds at the end of every month until it gets fixed.
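A stale flag of this kind is easy to spot from the outside, since it announces a leap second in a month where current practice does not schedule one. The following is a rough sketch of such a plausibility check, not something ntpd or any other client is known to do; the function name is hypothetical, and the rule is deliberately stricter than RFC 5905, which permits leap seconds at the end of any month.

```python
from datetime import datetime, timezone

# Months in which leap seconds are scheduled under current practice.
CUSTOMARY_LEAP_MONTHS = {6, 12}


def is_suspicious_leap_flag(li: int, now: datetime) -> bool:
    """Return True if a server announces a leap second (LI of 1 or 2)
    in a UTC month other than June or December.

    Hypothetical check for illustration only. `now` must be
    timezone-aware so the month can be evaluated in UTC.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    if li not in (1, 2):
        return False  # no leap second is being announced
    return now.astimezone(timezone.utc).month not in CUSTOMARY_LEAP_MONTHS


if __name__ == "__main__":
    july_31 = datetime(2012, 7, 31, 20, 0, tzinfo=timezone.utc)
    june_30 = datetime(2012, 6, 30, 20, 0, tzinfo=timezone.utc)
    print(is_suspicious_leap_flag(1, july_31))  # flag set in July
    print(is_suspicious_leap_flag(1, june_30))  # flag set in June
```

A check like this would only flag the announcement for a human to look at; whether a client should ignore the flag outright is a policy decision this sketch does not try to make.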

While there aren't widespread reports of Linux systems going into infinite loops and burning up excess power (unlike June), it does appear to have affected some systems out there. The MythTV users mailing list has a thread about the problem, for example. If it is an actual attack, it is a clever one, but there are enough signs pointing to NTP server bugs that it's pretty unlikely.

Even if it is "just" caused by a bug (or bugs), it is still a bit worrisome. NTP has not generally been seen as a vector for attacks, but this situation shows that it could be. Unpatched systems could be targeted by man-in-the-middle attacks toward the end of every month, for example. Both leap-second occurrences (real and fake) point to the problems that can lurk in code that only truly gets tested once in a great while. One wonders what might happen to systems (patched or not) that receive a "subtract a second" NTP message, since there has never been a real negative leap second.

Index entries for this article
Security	Network Time Protocol (NTP)


The leap second of doom

Posted Aug 2, 2012 20:51 UTC (Thu) by liljencrantz (guest, #28458)

Jake, who exactly says that "NTP has not generally been seen as a vector for attacks"? My employer handles massive amounts of traffic, and the only possible way to handle network hiccups without going down is to aggressively deadline old queries. As the same query moves from subsystem to subsystem on its path to completion, keeping the internal clocks of all servers synced is absolutely critical, a task that falls squarely on NTP. We have thought hard about what a rogue NTP time source could do to our service and how we can protect ourselves.

The leap second of doom

Posted Aug 2, 2012 22:32 UTC (Thu) by nix (subscriber, #2304)

Martin Burnicki described a problem his customers saw after June's leap second in which certain older NTP servers did not reset the leap flag after the event.

Later, he described a more interesting case, where multiple NTP servers had each other set as upstreams. If something (perhaps network latency) caused a query to be sent to one of these servers from one of the others just before the end of June, such that it arrived just after the start of July, that server would instantly flip the leap-second flag (that it had just flipped off) back on. After that, because it's the upstream for all the other NTP servers in this little 'liar group', the lie would spread to all the others, even if they had a higher-stratum machine or an accurate time source telling them otherwise.