A kernel debugger in Python: drgn

By Jake Edge
May 29, 2019

A kernel debugger that allows Python scripts to access data structures in a running kernel was the topic of Omar Sandoval's plenary session at the 2019 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM). In his day job at Facebook, Sandoval does a fair amount of kernel debugging and he found the existing tools to be lacking. That led him to build drgn, which is a debugger built into a Python library.

Sandoval began with a quick demo of drgn (which is pronounced "dragon"). He was logged into a virtual machine (VM) and invoked the debugger on the running kernel with drgn -k. With some simple Python code in the REPL (read-eval-print loop), he was able to examine the superblock of the root filesystem and loop through the inodes cached in that superblock—with their paths. Then he did "something a little fancier" by only listing the inodes for files that are larger than 1MB. It showed some larger kernel modules, libraries, systemd, and so on.
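The shape of that "fancier" query is easy to sketch in ordinary Python. The following is only an illustration under stated assumptions: `FakeInode`, `larger_than()`, and the sample paths are made up for this example and stand in for the kernel structures, it does not use drgn's API, and "1MB" is taken to mean 2**20 bytes.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

MIB = 1 << 20  # assumption: "1MB" in the talk is read as 2**20 bytes


@dataclass
class FakeInode:
    """Hypothetical stand-in for a cached inode; not a drgn object."""
    path: str
    i_size: int  # file size in bytes, named after the kernel field


def larger_than(inodes: Iterable[FakeInode], limit: int) -> Iterator[FakeInode]:
    """Yield only the inodes whose size exceeds `limit` bytes."""
    for inode in inodes:
        if inode.i_size > limit:
            yield inode


# Made-up sample data, for illustration only.
cached = [
    FakeInode("/etc/hostname", 7),
    FakeInode("/usr/lib/libexample.so", 3 * MIB),
    FakeInode("/usr/lib/systemd/systemd", 2 * MIB),
    FakeInode("/etc/fstab", 512),
]

big_paths = [inode.path for inode in larger_than(cached, MIB)]
print(big_paths)
```

The point is that the filter is a plain Python loop over objects, so any condition that can be written in Python can be applied to the data.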

He mostly works on Btrfs and the block layer, but he also tends to debug random kernel problems. Facebook has so many machines that there are "super rare, one-in-a-million bugs" showing up all the time. He often volunteers to take a look. In the process he got used to tools like GDB, crash, and eBPF, but found that he often wanted to be able to do arbitrarily complex analysis of kernel data structures, which is why he ended up building drgn.

GDB has some nice features, he said, including the ability to pretty-print types, variables, and expressions. But it is focused on a breakpoint style of debugging, which he cannot do on production systems. It has a scripting interface, but it is clunky and just wraps the existing GDB commands.

Crash is purpose built for kernel debugging; it knows about linked lists, structures, processes, and so on. But if you try to go beyond those things, you will hit a wall, Sandoval said. It is not particularly flexible; when he used it, he often had to dump a bunch of state and then post-process it.

BPF and BCC are awesome and he uses them all the time, but they are limited to times when you can reproduce the bug live. Many of the bugs he looks at happened hours ago and locked up the machine, or they arrive as a core dump that he needs to make sense of. BPF doesn't really cover this use case; it is more for tracing and is not really an interactive debugger.

Drgn makes it possible to write actual programs in a real programming language (depending on one's opinion of Python, anyway). It is much better than dumping things out to a text file and using shell scripts to process them, or using the Python bindings for GDB. He sometimes calls drgn a "debugger as a library" because it doesn't just provide a command prompt with a limited set of commands; instead, it magically wraps the types, variables, and such so that you can do anything you want with them. The User Guide and home page linked above are good places to start looking into all that it can do.

He launched into another demo that showed some of the power of drgn. It has both interactive and scripting modes. He started in an interactive session by looking at variables and noted that drgn returns an object that represents the variable; that object has additional information like the type (which is also an object), address, and, of course, value. But one can also implement list iteration, which he showed by following the struct task_struct chain from the init task down to its children.
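The idea of an object that bundles a type, an address, and a value can be tried out with Python's standard ctypes module, which does much the same thing for the memory of the current process. This is an analogy rather than drgn's API; `Point` and `counter` are hypothetical names invented for the sketch.

```python
import ctypes


class Point(ctypes.LittleEndianStructure):
    """Hypothetical structure, standing in for a kernel struct."""
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


# A "variable" living somewhere in this process's memory.
counter = ctypes.c_uint32(42)

# The object carries a type, an address, and a value.
counter_type = type(counter)
counter_addr = ctypes.addressof(counter)
counter_value = counter.value

# Given only the address and the type, the value can be read back, which
# is roughly what a debugger does with the target's memory.
same_counter = ctypes.c_uint32.from_address(counter_addr)

# Raw bytes can likewise be interpreted through a structure type.
raw = (7).to_bytes(4, "little", signed=True) + (-3).to_bytes(4, "little", signed=True)
point = Point.from_buffer_copy(raw)

print(counter_type.__name__, hex(counter_addr), counter_value)
print(point.x, point.y)
```

A debugger library has to do the same job against another address space, using type descriptions taken from debugging information instead of hand-written classes.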

While he had written the list iteration live in the demo, he pointed out that it would get tedious if you had to do so all of the time. Drgn provides a bunch of helper functions that can do those kinds of things. Currently, most of those are filesystem and block-layer helpers, but more could be added for networking and other subsystems.
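To make the list walk concrete, here is a toy model of the kernel's circular, doubly linked lists and of a helper that hides the iteration. It is a sketch, not drgn code: `ListHead`, `Task`, and `walk()` are invented for the example, the `owner` back-reference stands in for the kernel's container_of() arithmetic, and the process names are made up.

```python
class ListHead:
    """Toy version of the kernel's circular, doubly linked list node."""

    def __init__(self, owner=None):
        self.next = self
        self.prev = self
        self.owner = owner  # stands in for container_of()


def list_add_tail(new, head):
    """Link `new` in just before `head`, i.e. at the end of the list."""
    prev = head.prev
    prev.next = new
    new.prev = prev
    new.next = head
    head.prev = new


def list_for_each_entry(head):
    """Helper: yield the owner of every node until we are back at `head`."""
    node = head.next
    while node is not head:
        yield node.owner
        node = node.next


class Task:
    """Hypothetical, heavily reduced task_struct."""

    def __init__(self, pid, comm, parent=None):
        self.pid = pid
        self.comm = comm
        self.children = ListHead()     # anchors the list of this task's children
        self.sibling = ListHead(self)  # links this task into its parent's list
        if parent is not None:
            list_add_tail(self.sibling, parent.children)


def walk(task, depth=0):
    """Depth-first walk from a task down through its descendants."""
    yield depth, task
    for child in list_for_each_entry(task.children):
        yield from walk(child, depth + 1)


init_task = Task(0, "swapper")
systemd = Task(1, "systemd", init_task)
kthreadd = Task(2, "kthreadd", init_task)
sshd = Task(100, "sshd", systemd)
bash = Task(101, "bash", sshd)

tree = [(depth, task.pid, task.comm) for depth, task in walk(init_task)]
for depth, pid, comm in tree:
    print("  " * depth + f"{pid} {comm}")
```

Once the loop lives in a helper like `list_for_each_entry()`, each new query only needs to say which list to walk.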

Working in a VM where the bug had been reproduced, he replayed an actual investigation that he and a colleague had done on a production server. The production workload was a storage server for cold data; on it, disks that have not been used in a while are powered down to save energy. As a result, its disks turn on and off a lot, which exposes kernel bugs. The cold-storage service ran in a container and it was reported that stopping the container would sometimes take forever.

When he started looking at it, he realized that the container would eventually finish stopping, but that it took a long time. That suggested some kind of a leak. He showed the process of working his way down through the block control group data structures and used a Python set to track the number of unique request queues associated with the block control groups. He was also able to dig around in the radix tree associated with the ID allocator (IDA) used for identifying request queues to double-check some of his results. In the end, it was determined that the request queues were leaking due to a reference cycle.
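The set-based bookkeeping can be illustrated with a small, self-contained sketch. The article does not show the code from the talk, so everything here is an assumption made for illustration: `FakeBlkcg`, the addresses, and the notion of a known set of live queues are invented.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class FakeBlkcg:
    """Hypothetical block control group holding request-queue addresses."""
    name: str
    queue_addrs: List[int] = field(default_factory=list)


def unique_queues(blkcgs: Iterable[FakeBlkcg]) -> Set[int]:
    """Collect every distinct request-queue address the groups refer to."""
    seen: Set[int] = set()
    for blkcg in blkcgs:
        seen.update(blkcg.queue_addrs)  # duplicates collapse automatically
    return seen


# Made-up data: several groups point at the same queues.
blkcgs = [
    FakeBlkcg("root", [0x1000, 0x2000, 0x3000]),
    FakeBlkcg("cold-storage", [0x2000, 0x3000, 0x4000]),
    FakeBlkcg("other", [0x1000]),
]

# Assumption: queues that still belong to a disk known to the system.
live_queues = {0x1000, 0x2000}

referenced = unique_queues(blkcgs)
suspected_leaks = sorted(referenced - live_queues)

print(len(referenced), [hex(addr) for addr in suspected_leaks])
```

Set difference then points at the queues that are still referenced even though nothing should be using them, which is the kind of evidence that suggests a leak.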

He mentioned another case where he used drgn to debug a problem with Btrfs unexpectedly returning ENOSPC. It turned out that it was reserving extra metadata space for orphaned files. Once he determined that, it was straightforward to figure out which application was creating these orphaned files; it could be restarted periodically until a real fix could be made to Btrfs. In addition, when he encounters a new subsystem in the kernel, he will often go in with drgn to figure out how all of the pieces fit together.

The core of drgn is a C library called libdrgn. If you hate Python and like error handling, you can use it directly, he said. There are pluggable backends for reading memory of various sorts, including /proc/kcore for the running kernel, a crash dump, or /proc/PID/mem for a running program. It uses DWARF to get the types and symbols, which is not the most convenient format to work with. He spent a surprising amount of time optimizing the access to the DWARF data. That interface is also pluggable, but he has only implemented DWARF so far.
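One way to picture pluggable memory backends is a reader that dispatches each address range to a callback. The sketch below is a toy model and not libdrgn's interface; `MemoryReader`, `bytes_backend()`, and `file_backend()` are hypothetical names, and a temporary file stands in for something like a crash dump.

```python
import os
import struct
import tempfile
from typing import Callable, List, Tuple

# A backend is a callback: (offset within the segment, size) -> bytes.
ReadFn = Callable[[int, int], bytes]


class MemoryReader:
    """Toy address space assembled from pluggable segments."""

    def __init__(self) -> None:
        self._segments: List[Tuple[int, int, ReadFn]] = []

    def add_segment(self, start: int, size: int, read_fn: ReadFn) -> None:
        self._segments.append((start, size, read_fn))

    def read(self, address: int, size: int) -> bytes:
        for start, seg_size, read_fn in self._segments:
            if start <= address and address + size <= start + seg_size:
                return read_fn(address - start, size)
        raise LookupError(f"no segment covers {address:#x}+{size}")

    def read_u32(self, address: int) -> int:
        """Decode a little-endian 32-bit unsigned integer."""
        return struct.unpack("<I", self.read(address, 4))[0]


def bytes_backend(buf: bytes) -> ReadFn:
    """Backend that serves reads from an in-memory buffer."""
    def read(offset: int, size: int) -> bytes:
        return buf[offset:offset + size]
    return read


def file_backend(path: str) -> ReadFn:
    """Backend that serves reads from a file, as a dump reader might."""
    def read(offset: int, size: int) -> bytes:
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(size)
    return read


reader = MemoryReader()
reader.add_segment(0x1000, 8, bytes_backend(struct.pack("<II", 0xDEADBEEF, 7)))

# A temporary file plays the role of a dump on disk.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack("<I", 1234))
    dump_path = tmp.name
reader.add_segment(0x8000, 4, file_backend(dump_path))

value_from_bytes = reader.read_u32(0x1000)
second_value = reader.read_u32(0x1004)
value_from_file = reader.read_u32(0x8000)
os.unlink(dump_path)

print(hex(value_from_bytes), second_value, value_from_file)
```

Code built on top of `read()` does not need to know whether the bytes come from a live system or from a file, which is the benefit of making the backend pluggable.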

That optimization work allows drgn to come up in about half a second, while crash takes around 15 seconds. Because drgn comes up quickly, it will get used more; he still dreads having to start up crash.

There is a subset of a C interpreter embedded into drgn. That allows drgn to properly handle a bunch of corner cases, such as implicit conversions and integer promotion. It is prickly and took some effort, but it means that he has not run into any cases where the translated code does not work the way it does in the kernel.
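Why this matters is easy to show: Python integers never overflow, so naive arithmetic on values read from the kernel can give answers that C would not. The sketch below models integer promotion and the usual arithmetic conversions in simplified form. It is not drgn's implementation; it assumes a 32-bit int and a 64-bit long, treats width as conversion rank, and wraps signed overflow even though ISO C leaves that undefined. `CType` and the `c_*` helpers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CType:
    """Hypothetical description of a C integer type."""
    name: str
    bits: int
    signed: bool

    def wrap(self, value: int) -> int:
        """Reduce a Python int to the range this C type can hold."""
        value &= (1 << self.bits) - 1
        if self.signed and value >= 1 << (self.bits - 1):
            value -= 1 << self.bits
        return value


U8 = CType("unsigned char", 8, False)
I32 = CType("int", 32, True)
U32 = CType("unsigned int", 32, False)
I64 = CType("long", 64, True)


def promote(t: CType) -> CType:
    """Integer promotion: types narrower than int become int."""
    return I32 if t.bits < I32.bits else t


def common_type(a: CType, b: CType) -> CType:
    """Usual arithmetic conversions for two integer types (simplified)."""
    a, b = promote(a), promote(b)
    if a == b:
        return a
    if a.signed == b.signed:
        return a if a.bits > b.bits else b
    unsigned, signed = (a, b) if not a.signed else (b, a)
    if unsigned.bits >= signed.bits:
        return unsigned  # the signed operand is converted to unsigned
    return signed  # the wider signed type can hold every unsigned value


def c_add(a_val, a_type, b_val, b_type):
    t = common_type(a_type, b_type)
    return t.wrap(t.wrap(a_val) + t.wrap(b_val)), t


def c_sub(a_val, a_type, b_val, b_type):
    t = common_type(a_type, b_type)
    return t.wrap(t.wrap(a_val) - t.wrap(b_val)), t


def c_less(a_val, a_type, b_val, b_type):
    t = common_type(a_type, b_type)
    return t.wrap(a_val) < t.wrap(b_val)


promoted_sum = c_add(200, U8, 100, U8)   # both operands are promoted to int
wrapped_diff = c_sub(0, U32, 1, I32)     # the result is unsigned and wraps
surprising = c_less(-1, I32, 1, U32)     # -1 is converted to a huge unsigned value
print(promoted_sum[0], wrapped_diff[0], surprising)
```

In plain Python, `0 - 1` is simply -1 and `-1 < 1` is true; a debugger that evaluates kernel expressions needs the C answers instead.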

The biggest missing feature is backtrace support, he said. You can only access global variables at this point, which is not a huge limitation, but he does sometimes have to use crash to get addresses and other information to plug into drgn. It is something that is "totally possible to do in drgn", but he has not gotten there yet. He would like to use BPF Type Format (BTF) instead of DWARF because it is much smaller and simpler. But the main limitation is that BTF does not handle variables; if and when it does, he will use it. A repository of useful drgn scripts and tools is in the works as well.

Integration with BPF and BCC is something that has been nagging at him. The idea would be to use BPF for live debugging and drgn for after-the-fact debugging in some way. There is some overlap between the two, which he has not quite figured out how to unify. BPF is somewhat painful to work with due to its lack of loops, but drgn cannot really catch things as they happen. He has a "crazy insane idea" to have BPF breakpoints that call out to a user-space drgn program, but he is not at all sure it is possible.

That was the last session I was able to sit in on and this article completes LWN's LSFMM coverage. The talk on drgn made a nice segue for me, as I had to leave to catch a plane to (eventually) end up in Cleveland for PyCon.

Posted May 29, 2019 22:16 UTC (Wed) by Paf (subscriber, #91811) [Link] (2 responses)

I am curious if epython/pykdump was mentioned? It’s a freely available python extension for crash that works quite well and allows much of what he’s describing. I wonder what Sandoval feels the advantages are of drgn... it sounds quite similar.

https://sourceforge.net/p/pykdump/wiki/Home/

Posted May 30, 2019 20:27 UTC (Thu) by osandov (subscriber, #97963) [Link] (1 responses)

I wasn't aware of pykdump (and I couldn't get it to run despite my best efforts). It does look similar. As Jeff commented below, he's also been working on another similar tool, crash-python. Clearly, there's a need for this sort of thing and we haven't done a great job of publicizing our efforts, which is part of the reason I presented it at LSF/MM.

Posted May 31, 2019 1:26 UTC (Fri) by Paf (subscriber, #91811) [Link]

Yeah, I’m getting that in the comments section here - lots of tools.

Pykdump has one decent set of instructions and several bad ones. These work, I used them the other day:
https://sourceforge.net/p/pykdump/wiki/Building%20From%20...

Those are python2, but I used them just fine for python3 changing only version numbers.

Posted May 29, 2019 23:15 UTC (Wed) by kbingham (subscriber, #92041) [Link] (1 responses)

We already have python scripting support in scripts/gdb?

That includes a set of python interfaces such as lsmod, dmesg, and iterators and can all be extended as necessary?

Looking into drgn, there are certainly some nice looking helpers which would merit replicating into the in-kernel scripts.

Posted May 30, 2019 0:40 UTC (Thu) by Paf (subscriber, #91811) [Link]

You said for gdb, but not for crash? Are these available for crash as well? (I'm aware it's a super fancy gdb wrapper)

Posted May 30, 2019 3:34 UTC (Thu) by quotemstr (subscriber, #45331) [Link] (19 responses)

> It has a scripting interface, but it is clunky and just wraps the existing GDB commands.

That's not the case at all. You *can* execute GDB commands with Python, yes, but you can also access a much richer object-based API that provides access to GDB's actual data model. See https://sourceware.org/gdb/onlinedocs/gdb/Python-API.html

Since the author of drgn is at FB, he should look up "agdb".

Posted May 30, 2019 20:02 UTC (Thu) by jeffm (subscriber, #29341) [Link] (13 responses)

I've been leading development of a project at SUSE called crash-python that rides on top of GDB's Python interface (though I've extended it quite a bit).

It has classic crash-style commands, fully symbolic backtraces through GDB's thread interface (that are also accessible programmatically), and a fairly rich set of helpers that the commands (and custom analysis scripts, etc.) are built upon. Of course, the usual suite of GDB commands still works as expected. At its core is a kdumpfile debugging target that allows GDB to see the crash dump as a native core dump so, technically, the full semantic component of the debugger that loads modules and tasks isn't even required if the use case is quick information gathering.

https://github.com/jeffmahoney/crash-python

Our developers and support engineers have been adopting it more and more to diagnose crash dumps. As we use it, the one-off scripts get formalized into infrastructure and, ultimately, leveraged into commands. It isn't yet capable of debugging live systems, but there's interest in adding support to do it. I haven't done the testing yet, but I believe it should also be possible to cross-debug if the underlying gdb is capable of it.

Posted May 30, 2019 20:47 UTC (Thu) by quotemstr (subscriber, #45331) [Link] (1 responses)

Cool. That sounds similar to the work I did, except that instead of supporting kdumpfiles as native "core"-type files, I implemented support for breakpad-generated minidumps. I also added support for symbol server lookup (because GDB's filesystem-based approach for finding symbols is lacking). IMHO, something like your project is definitely the way to go.

And once you have something like this --- something that can load crash dumps and analyze them using the full power of the debugger --- it starts to make a lot of sense to implement all automated crash diagnosis and triaging as debugger scripts.

Posted May 30, 2019 21:40 UTC (Thu) by vbabka (subscriber, #91706) [Link]

Exactly. I have implemented very extensive SLAB integrity checking in crash-python (it reports many more problems than crash does), and I plan to do the same for SLUB soon. The number of one-off scripts is increasing and I should consolidate them into commands; recently it was a check for page allocator pcplists integrity, for example. Perhaps the most complex one was a script that walked the page tables of every process and calculated the mapcount for each page, to check against the one stored in the struct page. Imagine doing that with crash and the usual process of dumping stuff, postprocessing it, and perhaps generating more commands to dump more stuff, etc.

Posted May 31, 2019 1:28 UTC (Fri) by Paf (subscriber, #91811) [Link] (10 responses)

It sounds cool but also very similar to epython/pykdump for boring old crash. What’s different about it that led you to create it?

Posted May 31, 2019 1:29 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (1 responses)

Symbol server support and better integration with the Android process model. I was really hurting for a native debugging solution that would Just Work if I tried running it on a random build.

Posted May 31, 2019 1:30 UTC (Fri) by quotemstr (subscriber, #45331) [Link]

Err, wrong parent comment.

Posted May 31, 2019 2:41 UTC (Fri) by jeffm (subscriber, #29341) [Link] (1 responses)

I'm not familiar enough with epython/pykdump to be able to answer that effectively. The big thing that draws us to develop and use it is the ease of extensibility and that the process tasks are integrated into GDB's idea of threads, so you can do things like select a task, get a symbolic backtrace, examine arguments, locals, etc. That's only half of it, though. Since this is all plumbed within GDB to the Python interface, we can programmatically access the stacks and do things like the following script:

https://github.com/jeffmahoney/crash-python/blob/master/c...

This one had a very specific application and obviously would only work on the dump I was using. It cross references the contents of the XFS AIL with the inodes found in task stacks to see what was stuck in the log and why.

It also allows the debugger itself to be extended quickly as new kernel versions come out.

Posted Jun 11, 2019 16:56 UTC (Tue) by alexsid (guest, #98432) [Link]

Hi Jeff,

I am the main developer of PyKdump and would like to provide several comments:

1. While we can execute crash and gdb commands and get results as text, this is not how PyKdump is intended to be used. The API is built directly on crash/GDB internals, so we do not process text but rather call those internals directly. The performance is quite nice: typically, traversing a list of structures and dereferencing some fields of these structures can be done at a rate of about 100,000 structs per second.

2. Just like with your tool, the main reason for PyKdump is writing programs. At this moment we already have many programs (1st pass analysis, NFS analysis, hangs analysis etc.) - about 24,000 lines of Python code.

See e.g. https://sourceforge.net/p/pykdump/code/ci/master/tree/pro... for an example of what the code looks like.

3. The tools are written so that they work with different distributions (I work at HPE and we have to analyze vmcores from SLES, RHEL and Ubuntu). This needs significant effort as kernel structures change all the time, but our tools work for anything from 2.6.18 kernels to 5.0 kernels.

4. While programs/libraries can be located as separate files on your host, everything including the framework and programs can be packaged in a single file that is both .so and ZIP. It does not depend on anything except GLIBC so it is portable between different hosts/distributions. I build new binary versions regularly and upload them to https://sourceforge.net/projects/pykdump/files/mpykdump-x...

5. Current version of PyKdump is based on Python-3.7.3

Alex

Posted May 31, 2019 6:35 UTC (Fri) by vbabka (subscriber, #91706) [Link] (5 responses)

I looked briefly into the pykdump code and got the impression (maybe wrong? from the file wrapcrash.py) that it works by executing crash commands and parsing their output. And AFAIK crash does a similar thing with the embedded gdb. Building on top of gdb's Python API seems much cleaner and more powerful than that. You don't split the kernel-specific knowledge between a binary built from C, with commands producing plain text, and your Python scripting on top. Instead the whole kernel-specific knowledge is in Python, which your scripts can reuse and extend.

Posted Jun 1, 2019 7:10 UTC (Sat) by togga (subscriber, #53103) [Link] (4 responses)

I can agree that going to plain text and back is bad but putting the logic in Python is even worse as Python is an old slow scripting language which will quickly limit your use-cases and make the whole system complex and heavy (read bloated). If you put the logic in a clean library it'll be reusable from both scripting languages (C++ could be one of them nowadays) and tools with less overhead like live-monitoring.

Posted Jun 1, 2019 11:45 UTC (Sat) by jeffm (subscriber, #29341) [Link] (3 responses)

The idea is to make it quickly extensible for both infrastructure and one-off analysis tools. Doing it in a compiled language, especially one like C/C++, throws that out the window. Debugging a Python script raising exceptions and terminating is a whole lot nicer than debugging a seg fault before you can start the work that was the real purpose of using the tool.

As long as the fault isn’t in the core code, failing scripts can be run repeatedly without restarting the debugger.

Posted Jun 1, 2019 12:32 UTC (Sat) by jeffm (subscriber, #29341) [Link] (1 responses)

Also, the heavy lifting in both crash-python and drgn is C/C++ code already. GDB and libkdumpfile for the former and libdrgn for the latter.

Posted Jun 2, 2019 21:26 UTC (Sun) by togga (subscriber, #53103) [Link]

Yes. DRGN seems to be a healthy mix. I didn't question that, I just argued against your statement to put all logic in python.

Posted Jun 2, 2019 18:33 UTC (Sun) by togga (subscriber, #53103) [Link]

That might have been the case 5 years ago; today the "two language problem" is basically solved, and tomorrow the scene will shift even further. I wouldn't call the debug-logic we're discussing here a "one-off". There are lots of synergies to be had between "infrastructure" and "analysis"; these can easily end up in simulations, tests, live-monitoring or just reusing some code for different purposes in several places etc.

For real "one-off" things (fire and forget, no reuse) the discussion is pretty much pointless as you could use anything at hand to get from A to B which probably ends up in "it depends on multiple factors".

>> Doing it in a compiled language, especially one like C/C++, throws that out the window.

Language and runtime environments are different things. C/C++ itself throws nothing out the window:
[cling]$ #include <stdexcept>
[cling]$ throw std::runtime_error("no segfault here")
>>> Caught a std::exception!
>>> no segfault here
[cling]$ auto x = (int *)nullptr
(int *) nullptr
[cling]$ auto no_segfault_here_either = *x
input_line_7:2:34: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
auto no_segfault_here_either = *x
^

"Debugging a python script raising exceptions and terminating is a whole lot nicer than debugging a seg fault before you can start the work that was the real purpose of using the tool."

What you also get with Python is:
* one layer of complexity having to wrap/convert everything between "infrastructure world" and "script world"
* code that is very hard to reuse elsewhere, doesn't scale very well (the GIL is still there, for instance) and brings a lot of baggage, especially a "forced" runtime environment

When you get stuck in the above constraints or a Python encode/decode-hell you may have to chew and swallow that "lot nicer" statement and it won't be tasty (been there multiple times). Python has not moved away from the "scripting" corner, rather sunk in deeper, whereas other languages are moving in other directions. Everything is evolving (changing) rapidly though.

Posted May 30, 2019 21:50 UTC (Thu) by tpo (subscriber, #25713) [Link] (1 responses)

Is agdb something internal to FB? Because I wasn't able to find it via Duckduckgo?

Posted May 30, 2019 21:52 UTC (Thu) by quotemstr (subscriber, #45331) [Link]

Yeah. Basically what I described in the other comment.

Posted May 31, 2019 1:36 UTC (Fri) by osandov (subscriber, #97963) [Link] (2 responses)

Huh, I must've missed that part of the GDB API when I started building drgn. It certainly could've replaced a good chunk of the code I wrote. However, the more compelling part of drgn is all of the bells and whistles built on top of the core, which it seems took Jeff a similar amount of effort to build on top of GDB. I think the user experience overall benefits from the from-scratch implementation, especially the startup time and the higher-level API without any leaky GDB abstractions.

Posted May 31, 2019 1:44 UTC (Fri) by jeffm (subscriber, #29341) [Link] (1 responses)

Is drgn cross-platform?

Posted May 31, 2019 5:26 UTC (Fri) by osandov (subscriber, #97963) [Link]

In theory, although I haven't tested it on anything other than x86-64 yet.

Posted Jun 3, 2019 11:31 UTC (Mon) by mishuang2018 (subscriber, #129176) [Link]

I wrote a crash command in C to dump the flow table entries of the mlx5_core module. Using drgn, it is greatly simplified and it is very fast. This is what I have wanted for years.

Posted Jun 19, 2019 21:48 UTC (Wed) by nix (subscriber, #2304) [Link]

Side note: CTF has handled variables for years (though it implements them as a map from name -> type, so the caller has to read /proc/kallmodsyms -- an augmented /proc/kallsyms that shows module names even for built-in modules, and sizes -- and map from name -> address to complete the process.)

(Right now, the DWARF->CTF converter for kernel code is not terribly brilliant, though it does work. I'll be working on one that should be much better in the next few weeks.)