Kernel building with GCC plugins

By Jonathan Corbet
June 14, 2016
It has long been understood that static-analysis tools can be useful in finding (and defending against) bugs and security problems in code. One of the best places to implement such tools is in the compiler itself, since much of the work required to analyze a program is already done in the compilation process. Despite the fact that GCC has had the ability to support security-oriented plugins for some years, the mainline kernel has never adopted any such plugins. That situation looks likely to change with the 4.8 kernel release, though.

For many years, GCC famously did not support plugins out of a fear that proprietary plugins would undermine the free compiler. That roadblock ended in 2009, when the GCC runtime library exemption was rewritten. This library, which is needed by almost every program built with GCC, can be linked with proprietary code — but only if no non-GPLv3 plugins were used in the compilation process. The addition of that rule gave the powers that be at the Free Software Foundation the confidence that they could safely add a plugin mechanism to GCC.

Relatively few plugins have materialized in any setting, perhaps because writing one requires a fairly deep understanding of how GCC works and the documentation available is not entirely helpful. (LWN ran an introduction to creating GCC plugins back in 2011). One group that did jump onto the plugin bandwagon, though, is grsecurity, where the ability to analyze — and transform — kernel code was quickly recognized as having a lot of potential. There were four plugins in the grsecurity patch set when LWN took a look in 2011. The current testing patch set from grsecurity shows twelve of them, performing a variety of functions:

  • Checker incorporates some address-space checks normally performed separately with the sparse tool.

  • Colorize simply adds color to some diagnostic output.

  • Constify makes structures containing only function pointers const.

  • Initify moves string constants that are only referenced in __init or __exit functions to the appropriate ELF sections.

  • Kallocstat generates information on sizes passed to kmalloc().

  • Kernexec is there "to make KERNEXEC/amd64 almost as good as it is on i386"; it ensures that, for example, user-space pages are not executable by the kernel.

  • Latent_entropy tries to generate entropy (randomness) from the kernel's execution; more on this one below.

  • Randomize_layout reorganizes structure layout randomly.

  • Rap implements grsecurity's "return address protection" mechanism, described in this presentation [PDF].

  • Size_overflow (described on this page) detects some integer overflows.

  • Stackleak tracks kernel-stack usage so that the stack can be cleared on return to user space.

  • Structleak forcibly clears structure fields if they might be copied to user space.
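
To make one of these transformations concrete, here is a minimal C sketch of the effect that Constify is described as having. The structure `demo_ops` and the functions around it are hypothetical names invented for illustration, and the plugin's actual selection logic is not shown. The two tables differ only in the `const` qualifier, which is what allows the toolchain to place the second one in read-only memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "ops" structure: every member is a function pointer,
 * which is the kind of type Constify is said to target. */
struct demo_ops {
    int (*open)(int id);
    int (*close)(int id);
};

static int demo_open(int id)  { return id + 1; }
static int demo_close(int id) { return id - 1; }

/* What the source typically says: a writable table. */
struct demo_ops writable_ops = {
    .open  = demo_open,
    .close = demo_close,
};

/* What the constified result is equivalent to: the same table, but
 * const, so an attacker cannot simply overwrite a pointer in it. */
const struct demo_ops constified_ops = {
    .open  = demo_open,
    .close = demo_close,
};

/* Callers only need read access, so both tables work here. */
int call_open(const struct demo_ops *ops, int id)
{
    assert(ops != NULL && ops->open != NULL);
    return ops->open(id);
}
```

Calling `call_open(&constified_ops, 41)` returns 42, exactly as it does with the writable table; the difference is that an assignment to `constified_ops.open` is rejected by the compiler.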

These plugins have clear value to developers wishing to harden the kernel, and they are all free software (though many of them are GPLv2-only, meaning that they cannot be used to compile code needing the GCC runtime library; fortunately, the kernel does not use that library). So far, though, they remain unavailable to kernel developers and distributors, living only in the grsecurity patch set. There are no serious technical or legal obstacles keeping them out of the mainline, but nobody has made the effort to move them over — until now.

Plugins go mainline

Recently, interest in hardening the mainline kernel has increased — or, perhaps more accurately, resistance to doing so has decreased. One obvious way of doing so is to try to bring some of the ideas found in grsecurity into the mainline kernel; that includes the plugin mechanism. To that end, the Linux Foundation's Core Infrastructure Initiative has funded Emese Révfy, the developer of some of the above-listed plugins, to bring this functionality into the mainline kernel. The resulting patch set has been through several rounds of review and is currently staged in linux-next for a probable 4.8 merge.

Emese's patch set does not include all of the plugins listed above; indeed, it includes none of them. Instead, there are two relatively simple plugins provided as a sort of demonstration of how things can be done. One of them, called "sancov," inserts a tracing call at the beginning of each basic block of code. This feature is useful for anything requiring coverage tracking; it is aimed at the syzkaller fuzz tester in particular.
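
As a rough illustration of what per-basic-block instrumentation amounts to, the sketch below shows a small function with the tracing calls written in by hand. Everything here is an assumption made for the example: `trace_block()` is a hypothetical stand-in for whatever hook the plugin actually calls, and the explicit block ids are a simplification (a real hook could just as well identify the block from the caller's address).

```c
#include <assert.h>

enum { NUM_BLOCKS = 3 };               /* basic blocks in clamp_positive() */
static unsigned long hits[NUM_BLOCKS]; /* per-block hit counters */

/* Hypothetical stand-in for the tracing call inserted by the plugin. */
static void trace_block(int id)
{
    assert(id >= 0 && id < NUM_BLOCKS);
    hits[id]++;
}

/* Read back how often a given block was executed. */
unsigned long block_hits(int id)
{
    assert(id >= 0 && id < NUM_BLOCKS);
    return hits[id];
}

/* Original source: if (x < 0) return 0; return x;
 * Below, one trace call sits at the start of each basic block. */
int clamp_positive(int x)
{
    trace_block(0);        /* function entry */
    if (x < 0) {
        trace_block(1);    /* negative-input branch */
        return 0;
    }
    trace_block(2);        /* fall-through path */
    return x;
}
```

After one call with a negative argument and one with a positive argument, block 0 has been hit twice and blocks 1 and 2 once each. A fuzzer can use that kind of information to tell which inputs reached code that earlier inputs did not.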

The other included plugin calculates the "cyclomatic complexity" of each function in the kernel. This metric is, roughly, a count of the independent paths through the function; a higher complexity count indicates more twisted code that, perhaps, is a more likely hiding place for bugs. Emese has suggested that it could be incorporated into the build-testing systems, where it could emit warnings when somebody adds a new function above a given complexity threshold.
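
For readers unfamiliar with the metric, the usual textbook (McCabe) definition works on a function's control-flow graph: M = E - N + 2P, where E is the number of edges, N the number of nodes (basic blocks), and P the number of connected components (1 for a single function). The sketch below just evaluates that formula; the `cfg_counts` type is a hypothetical helper, and whether the plugin computes exactly this quantity is an assumption.

```c
#include <assert.h>

/* Summary of a control-flow graph; a hypothetical helper type. */
struct cfg_counts {
    int nodes;       /* N: basic blocks */
    int edges;       /* E: control transfers between blocks */
    int components;  /* P: connected components, 1 for one function */
};

/* McCabe's cyclomatic complexity: M = E - N + 2P. */
int cyclomatic_complexity(struct cfg_counts g)
{
    assert(g.nodes > 0 && g.edges >= 0 && g.components > 0);
    return g.edges - g.nodes + 2 * g.components;
}
```

Straight-line code is a single block with no edges, giving 0 - 1 + 2 = 1. A single if/else forms a diamond of four blocks and four edges, giving 4 - 4 + 2 = 2; each additional decision point adds one to the score.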

Your editor built an allyesconfig kernel with this plugin enabled; the result was nearly 620,000 complexity values printed to the output. According to this metric, the most complex function in the kernel, with a score of 817, is cache_alloc() in drivers/md/bcache/super.c — a demonstration of just how much complexity can be hidden in macros. Perhaps a more convincing demonstration is rt2800_init_registers(), a 450-line function weighing in at 586. The most complex core-kernel function is alloc_large_system_hash(), with a score of 278.

The latent_entropy plugin from grsecurity has been posted as a separate patch set. This plugin tries to address the problem that systems often have very little entropy available immediately after boot. It adds code to initialization-time functions; each of those functions will generate a pseudo-random value when called and mix it into the entropy pool. That did not seem particularly random to a number of observers; the key, according to "PaX Team", is that the timing and sequencing of this mixing varies according to the interrupts raised during system boot. Ted Ts'o commented that this "entropy" might merely duplicate that obtained from the interrupt-timing measurements that are already done. He noted that mixing it in twice won't hurt, but it may not help much either.
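
The argument about sequencing can be illustrated with a toy model. This is not the plugin's instrumentation: the pool variable, the rotation amounts, and the constants below are all arbitrary assumptions chosen for the example. It only shows that when each function folds a fixed value into shared state, the final state still depends on the order in which the functions ran.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t pool;  /* hypothetical stand-in for the mixed-in state */

/* Rotate left; the shift must stay within 1..63 to be well defined. */
static uint64_t rol64(uint64_t v, unsigned int s)
{
    assert(s > 0 && s < 64);
    return (v << s) | (v >> (64 - s));
}

/* Two "init functions", each folding its own fixed constant into the
 * pool. The constants are arbitrary illustration values. */
static void init_a(void) { pool = rol64(pool, 7)  ^ 0x9e3779b97f4a7c15ULL; }
static void init_b(void) { pool = rol64(pool, 13) ^ 0xc2b2ae3d27d4eb4fULL; }

/* Run both functions in one of two orders and return the result. */
uint64_t boot_sequence(int a_first)
{
    pool = 0;
    if (a_first) {
        init_a();
        init_b();
    } else {
        init_b();
        init_a();
    }
    return pool;
}
```

Here `boot_sequence(1)` and `boot_sequence(0)` return different values even though neither function contains anything random. Any real unpredictability has to come from the ordering and timing, which is the point Ted Ts'o's objection is aimed at.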

See this 2012 message from PaX Team for more information on how the latent_entropy plugin works.

As noted above, the plugin infrastructure and two simple plugins are currently poised to be merged for 4.8. The latent_entropy plugin is not in linux-next as of this writing, so it is likely to arrive later, if at all. But there is a whole set of existing plugins waiting for somebody to make the effort to bring them over and, even better, the potential for many other plugins to be written in the future. A pluggable compiler can be a potent tool for the checking and hardening of kernel code; the kernel community may have a lot to gain from making use of it.

Kernel building with GCC plugins

Posted Jun 15, 2016 15:30 UTC (Wed) by SEJeff (guest, #51588) [Link]

I've always wondered why something akin to tracepoints couldn't be inserted by the compiler to help tools such as perf/sysdig/utrace.

Kernel building with GCC plugins

Posted Jun 16, 2016 11:19 UTC (Thu) by wildea01 (subscriber, #71011) [Link] (1 responses)

> These plugins have clear value to developers wishing to harden the kernel, and they are all free software (though many of them are GPLv2-only, meaning that they cannot be used to compile code needing the GCC runtime library; fortunately, the kernel does not use that library).

As far as I can tell, there are a handful of architectures that appear to link against libgcc. Are they going to find themselves in trouble here?

Kernel building with GCC plugins

Posted Jun 16, 2016 20:37 UTC (Thu) by PaXTeam (guest, #24616) [Link]

if an arch's vmlinux links to code from libgcc that is under the 'GCC Runtime Library Exception' (not all the code of libgcc is under that license) then it will not have met the 'Eligible Compilation' condition and the end result cannot be propagated.

Only for kernel?

Posted Jun 17, 2016 11:49 UTC (Fri) by NAR (subscriber, #1313) [Link]

I was wondering whether some of these extensions would make sense for user-space programs. For example, Constify or Initify seem to be useful.

Kernel building with GCC plugins

Posted Jun 17, 2016 15:36 UTC (Fri) by dskoll (subscriber, #1630) [Link] (4 responses)

Does Randomize_layout change only the padding of a structure? Or does it also change the ordering? Because the C standard does guarantee that structure members are laid out in memory in the order they are declared. I'm struggling to think of a use-case for something that changes the order of structure members.

Kernel building with GCC plugins

Posted Jun 21, 2016 12:29 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

I'm not sure, but I imagine it means that exploits would need to discover what the order of the struct members was before they could exfiltrate data or chain function calls together.

Kernel building with GCC plugins

Posted Jun 21, 2016 15:46 UTC (Tue) by kdave (subscriber, #44472) [Link]

The exploit code searches for a uid/gid pattern in creds and then overwrites it with zeros (http://lxr.free-electrons.com/source/include/linux/cred.h...). With random layout, the pattern won't be found. Examples:
https://github.com/offensive-security/exploit-database/bl...
https://github.com/offensive-security/exploit-database/bl...

To make such exploits "work" and not hang, a fake copy of the credentials can be added at the expected location; all updates to the true creds would then update both copies. This needs more changes to the code, so it's out of scope for the plugin.

Kernel building with GCC plugins

Posted Jan 4, 2017 6:12 UTC (Wed) by kaiwantech (subscriber, #108966) [Link] (1 responses)

Won't changing the order of members in a structure adversely affect cpu caching (thus violating the 'keep important/hotspot members together and at the top' rule)? AFAIK there are several instances of data structures in the kernel codebase which rely on a particular ordering, if nothing else then for cache optimization.

Kernel building with GCC plugins

Posted Jan 4, 2017 6:39 UTC (Wed) by kaiwantech (subscriber, #108966) [Link]

Okay, looks like I found the answer to my question (above) in a comment by 'joib' here [ https://lwn.net/Articles/705262/ ]:
"[...] It says in the article that it only affects structs which contain only function pointers (which all have the same size, AFAICS). Furthermore, the slides say that it can be configured to randomize only within a cache line."

Kernel building with GCC plugins

Posted Jun 17, 2016 16:54 UTC (Fri) by shemminger (subscriber, #5739) [Link] (11 responses)

It would be good if constify and initify could be used to produce warnings, which could then be used to modify the original code.
Not sure I like the compiler silently fixing and changing things.

Kernel building with GCC plugins

Posted Jun 17, 2016 17:16 UTC (Fri) by vonbrand (subscriber, #4458) [Link] (2 responses)

Compilers do silently fix and change things. E.g. get rid of dead code, propagate constants and use the knowledge so gained to generate different code, reorder computations, and so on. It is called "optimization", and without it your code would be as fast (or lack thereof) as what tcc generates...

Kernel building with GCC plugins

Posted Jun 17, 2016 22:15 UTC (Fri) by PaXTeam (guest, #24616) [Link] (1 responses)

didn't you vehemently argue against all of this back in 2011 already: https://lwn.net/Articles/463283/ ? or are you just going with the flow, turning the coat, a day as any other?

Kernel building with GCC plugins

Posted Jun 29, 2016 8:17 UTC (Wed) by oldtomas (guest, #72579) [Link]

> didn't you vehemently argue against all of this back in 2011 already [...]

I actually went back and I do read vonbrand's argument differently. In a nutshell, what I read there is that there's a cost in changing the semantic interpretation of C source from what the "standard" does. And that still makes sense.

> or are you just going with the flow, turning the coat, a day as any other?

See, and that is what spoils the interaction with you: you are extremely smart people, and do contribute significantly to our wellbeing. But this constant insinuation of malice on other people's parts is not really helpful.

Kernel building with GCC plugins

Posted Jun 17, 2016 22:37 UTC (Fri) by PaXTeam (guest, #24616) [Link] (7 responses)

the initify plugin has a verbose mode that will emit a message (not a 'warning' but a 'note') whenever it changes something and it's trivial to add something like that to the constify plugin as well. however i don't see anyone producing source code patches based on that information, it's just not feasible. we tried this with constify a few years ago where Emese submitted patches produced by a coccinelle script and the reception of that series is probably an object lesson of why this will never work. also some of the changes made by a plugin don't have an equivalent source code change, say initifying __func__.

Kernel building with GCC plugins

Posted Jun 18, 2016 1:08 UTC (Sat) by andresfreund (subscriber, #69562) [Link] (6 responses)

> we tried this with constify a few years ago where Emese submitted patches produced by a coccinelle script and the reception of that series is probably an object lesson of why this will never work.

The main outcome of that discussion appears to have been "please submit per-subsystem patches, instead of a huge patchseries doing everything". With a few exceptions, Acked-By's seem to have been easy to come by. I don't think the discussions support your conclusion.

Kernel building with GCC plugins

Posted Jun 18, 2016 12:42 UTC (Sat) by PaXTeam (guest, #24616) [Link] (5 responses)

there never was 'a huge patchseries doing everything'. it started with individual patches per ops type (only a small subset of the types that could have been constified). the first fork in the road occurred right there as some people ack'd some patches while others flat out refused to even accept the usefulness of ops constification. then some people requested a further breakup per subsystem *and* per ops type which as expected multiplied the number of patches. this is where the second fork occurred where some people objected to *that*. there was never any consensus reached as to how exactly these constification patches should be submitted and IIRC, even those that got ack'ed didn't always get applied. the icing on the cake was that checkpatch did get modified to find a few more of the writable ops structures but as you can verify it yourself, not many took it seriously and the kernel continues to have such ops variables left writable that by policy should be const.

Kernel building with GCC plugins

Posted Jun 19, 2016 18:05 UTC (Sun) by alonz (subscriber, #815) [Link] (1 responses)

I wonder if a half-way solution for constify would go better:
  1. Have the plugin add a "const_instances" attribute, which when applied to a struct causes all instances of this struct to automatically become const. (Hopefully, this means the size of the patch to actually add this attribute to all ops structures would be much smaller than what Emese had to try and push through previously…)
  2. Have another plugin (or an option to the same one) that produces warnings when a struct looks like it should be auto-const'able; make this shut up when the struct has a "non_const_instances" attribute (for ops structs that we do modify and cannot fix).
For initify—there is probably no other viable approach except the current one.

Another useful plugin would be something that causes errors on writes to ro_after_init variables from functions not marked __init… I wonder how feasible it is to implement.

Kernel building with GCC plugins

Posted Jun 19, 2016 19:26 UTC (Sun) by PaXTeam (guest, #24616) [Link]

the constify plugin already provides a do_const attribute for exactly that purpose (that we make use of for non-ops types already) and reporting on candidate types is also trivial to add in the constifiable() function. as for your last suggestion, it's doable but it will take a fair amount of static analysis that will effectively have to duplicate existing gcc logic in the C/C++ frontends. the easier way out is to mark these variables 'const' for a test compilation and examine the fallout.

Kernel building with GCC plugins

Posted Jun 21, 2016 19:40 UTC (Tue) by ebiederm (subscriber, #35028) [Link]

Changes that start with "write a coccinelle script and update everything in the kernel" always seem to be merged most easily if you have your own tree and send Linus the pull request. If the change is really as trivial as adding const where needed, that shouldn't be a problem.

Now if a handful of those structures are not const for some reason, I can imagine problems. Otherwise it should be one of the boring things, like the big kernel lock push-down, that just happens.

Kernel building with GCC plugins

Posted Jun 21, 2016 22:15 UTC (Tue) by tao (subscriber, #17563) [Link] (1 responses)

Having patches broken up per subsystem is *always* the norm for patch submission, especially for people who don't maintain any subsystems already. There are exceptions (people with a proven track record, such as Alexander Viro, for instance, can have their patches accepted even if they touch multiple subsystems at once), but unless a patch is of the "everything *has* to change in lockstep" type, then you should expect having to submit a patch series per subsystem.

Sure, you might get lucky and have it accepted anyway, but if you get upset that your patch series doesn't get accepted, without being willing to split it up by subsystem, then you might need to re-evaluate your notion that you're really doing what you can to get it merged.

Kernel building with GCC plugins

Posted Jun 21, 2016 23:37 UTC (Tue) by spender (guest, #23067) [Link]

Had you read any of the mailing list posts your exposition is based on, you would know that the "advice" you've provided is exactly what Emese ended up doing (as the PaX Team already explained in the comment you're replying to). I'll save you the hassle of using Google and provide you with direct links to the results of Emese breaking up the patches by subsystem and structure type. Feel free to read the rest of the thread.

Greg's request to have them split up by subsystem and structure type: http://marc.info/?l=linux-kernel&m=126020923129594&...
Responses to split up patches:
http://marc.info/?l=linux-kernel&m=126075717420046&...
http://marc.info/?l=linux-kernel&m=126075113315052&...
http://marc.info/?l=linux-kernel&m=126080632115907&...
http://marc.info/?l=linux-kernel&m=126009146214087&...

-Brad


Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds