BTF, Rust, and the kernel toolchain
BPF Type Format (BTF), BPF's debugging information format, has evolved rapidly to keep pace with the needs of BPF programs. José Marchesi spoke at Kangrejos about some of that work and how it could affect Rust specifically. He discussed debug information, kernel-specific relocations, and the planned changes to kernel stack unwinding. Each of these will require some amount of work to fully support in Rust, but preliminary signs look promising.
BTF
Marchesi described BTF as a format to denote the compiled form of C types. He said that it was similar to DWARF, but "way, way simpler". BTF is designed for a particular use case: efficient, online operations on C types and functions as they exist in memory. DWARF information is concerned with mapping debugging information to the source-level constructs of a programming language; BTF is concerned with what is in the compiled object and "not much related to the source language". At run time, this information is used by BPF programs to access kernel structures correctly, among other uses.
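As a rough illustration of how simple the container is, here is a minimal sketch that decodes the fixed header at the start of a BTF blob. The field order and the 0xEB9F magic follow struct btf_header in the kernel's include/uapi/linux/btf.h; the Rust names (BtfHeader, parse_btf_header) are invented for this example, and the sketch assumes little-endian data and validates nothing beyond the length and the magic number.

```rust
/// Magic number that opens every BTF blob (BTF_MAGIC in the kernel's UAPI header).
const BTF_MAGIC: u16 = 0xEB9F;

/// Size in bytes of the fixed header decoded below.
const BTF_HEADER_LEN: usize = 24;

/// Hypothetical Rust mirror of the kernel's `struct btf_header`.
/// The type and string offsets are relative to the end of the header.
#[derive(Debug, PartialEq)]
struct BtfHeader {
    magic: u16,
    version: u8,
    flags: u8,
    hdr_len: u32,
    type_off: u32,
    type_len: u32,
    str_off: u32,
    str_len: u32,
}

/// Decode the header from little-endian bytes (an assumption of this sketch).
/// Returns None if the buffer is too short or the magic does not match.
fn parse_btf_header(data: &[u8]) -> Option<BtfHeader> {
    if data.len() < BTF_HEADER_LEN {
        return None;
    }
    let u32_at = |i: usize| u32::from_le_bytes([data[i], data[i + 1], data[i + 2], data[i + 3]]);
    let magic = u16::from_le_bytes([data[0], data[1]]);
    if magic != BTF_MAGIC {
        return None;
    }
    Some(BtfHeader {
        magic,
        version: data[2],
        flags: data[3],
        hdr_len: u32_at(4),
        type_off: u32_at(8),
        type_len: u32_at(12),
        str_off: u32_at(16),
        str_len: u32_at(20),
    })
}

fn main() {
    // A made-up header: 12 bytes of type data followed by a 5-byte string table.
    let blob: [u8; 24] = [
        0x9F, 0xEB, 1, 0, // magic, version, flags
        24, 0, 0, 0, // hdr_len
        0, 0, 0, 0, // type_off
        12, 0, 0, 0, // type_len
        12, 0, 0, 0, // str_off
        5, 0, 0, 0, // str_len
    ];
    println!("{:?}", parse_btf_header(&blob));
}
```

Everything after the header is a flat array of type records plus a string table, which is part of why BTF lends itself to in-kernel lookups at run time.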
The process of generating BTF for a given kernel is somewhat tortured. When the kernel is compiled with BTF support, it is built with DWARF information. Then pahole converts the DWARF to BTF. One consequence of this approach is that BTF can only include information that is also present in DWARF, which is a problem for some of the kernel's structure attributes that DWARF does not properly represent. Marchesi is therefore working toward being able to generate BTF directly. This is already mostly working in GCC, but the kernel is not yet built that way.
When the C compiler does start producing BTF directly, though, it will cause problems for the parts of the kernel written in Rust: the Rust compiler will also need to generate BTF. There are benefits to having Rust generate BTF too (it could be used for genksyms, the tool that generates lists of kernel symbols to check loadable module compatibility), but doing so will certainly require some work.
The Rust compiler will not have to start from scratch, Marchesi said. People do already write BPF programs in Rust, and LLVM emits "correct-enough BTF". "But that's not by design," he warned, just a result of supporting BTF for C. Properly supporting BTF for Rust will mean making sure it lines up with the BTF generated for the rest of the kernel, that it works even for obscure corner cases, and that it can fully capture the richness of Rust types.
Right now, pahole is sidestepping the issue by just ignoring DWARF generated for Rust code, not creating BTF from it. This has already caused problems for some users. Carlos Bilbao asked whether anyone had tried generating BTF from a program written in a mix of C and Rust, and seen what the problem is. Marchesi explained that Rust generates DWARF with some structures that pahole doesn't support. Miguel Ojeda expanded on that, saying that Rust uses some DWARF types that were originally introduced for C++ support, which pahole therefore does not yet handle.
Björn Roy Baron and Gary Guo listed some problems with Rust enums that might apply to BTF. In particular, Rust enums are more like tagged unions in C: they have a discriminant and then a set of fields. The Rust compiler doesn't guarantee any particular representation, however; it uses this freedom to optimize some types to take less space. For example, Option<T> is an enum whose value is either None or Some wrapping a value of type T. When values of type T can never be zero, the compiler can save the space needed by the enum tag by using zero to represent None.
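The space saving described above is easy to observe from ordinary Rust code. The following small sketch compares the sizes of a few types; Shape is a hypothetical enum added only for contrast, and no exact sizes are asserted for it because the compiler is free to choose them.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical data-carrying enum: both variants can hold any u32, so the
/// compiler has to store the discriminant somewhere outside the payload.
#[allow(dead_code)]
enum Shape {
    Circle(u32),
    Square(u32),
}

fn main() {
    // Every bit pattern of a u32 is a valid value, so Option<u32> needs
    // extra room to record whether it is None or Some.
    println!("u32:                {} bytes", size_of::<u32>());
    println!("Option<u32>:        {} bytes", size_of::<Option<u32>>());

    // NonZeroU32 can never be zero, so the all-zero pattern is free to mean None.
    println!("NonZeroU32:         {} bytes", size_of::<NonZeroU32>());
    println!("Option<NonZeroU32>: {} bytes", size_of::<Option<NonZeroU32>>());

    // References are never null, so the same trick applies to them.
    println!("&u8:                {} bytes", size_of::<&u8>());
    println!("Option<&u8>:        {} bytes", size_of::<Option<&u8>>());

    println!("Shape:              {} bytes", size_of::<Shape>());
}
```

A type description that assumes a separate tag field at a fixed offset would be wrong for Option<NonZeroU32>, where the tag and the payload share the same bytes.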
This means that unlike structures, which can be annotated with #[repr(C)] to instruct the compiler to lay them out exactly like C structures, native Rust enums can't be forced to have a stable layout. The Rust compiler can, each time it is run, choose a different layout for each enum. In practice, a given version of the compiler always uses the same layout, but that isn't guaranteed. If BTF needs to refer to enum types, that freedom could complicate the implementation.
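To make the contrast concrete, here is a short sketch of what #[repr(C)] pins down for a structure. Both struct names are hypothetical, and the offsets in the comments assume a target where u32 is four-byte aligned (true of the common kernel architectures). Nothing is asserted about the default-representation struct beyond its size being printed, since the compiler may lay it out however it likes.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical struct laid out by C's rules: fields stay in declaration
/// order, with padding inserted so that each field is properly aligned.
#[repr(C)]
#[allow(dead_code)]
struct CLayout {
    a: u8,  // offset 0, followed by 3 bytes of padding
    b: u32, // offset 4
    c: u16, // offset 8, followed by 2 bytes of tail padding
}

/// The same fields with Rust's default representation: the compiler is
/// free to reorder them, for example to reduce padding.
#[allow(dead_code)]
struct RustLayout {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    println!(
        "CLayout:    a@{} b@{} c@{} size {}",
        offset_of!(CLayout, a),
        offset_of!(CLayout, b),
        offset_of!(CLayout, c),
        size_of::<CLayout>()
    );
    // Whatever this prints is an implementation detail of the compiler in use.
    println!("RustLayout: size {}", size_of::<RustLayout>());
}
```

The first line of output is something a BTF generator could rely on across compiler versions; the second is not.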
Marchesi also highlighted the difficulty that link-time optimization (LTO) poses. DWARF distinguishes between different compilation units, whereas BTF does not. So name clashes across compilation units are potentially a problem for using BTF in an LTO build of the kernel. Alice Ryhl raised a different problem — LTO can inline Rust code into C compilation units, meaning that the DWARF info can be mixed. That causes a problem for LTO builds today, since pahole can't handle the mixed DWARF info.
CO-RE
After laying out his basic concerns, Marchesi raised the topic of compile once - run everywhere (CO-RE), the approach that lets the kernel load BPF programs without requiring an exact match between the kernel headers the program was compiled against and the running kernel. To make this work, the compiler for the BPF program needs to take some special steps. In C, an attribute called preserve_access_index causes the compiler to generate loads and stores in a way that can be patched, and to emit relocation entries that tell the loader how to patch the program if the layout of a structure in the running kernel differs from the one the program was compiled against. Both GCC and LLVM have support for CO-RE; Marchesi wanted to know if the same approach made sense for Rust, given that the compiler can reorder the fields of Rust structures (those that aren't marked as using the C layout).
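The patching idea can be sketched as a toy model. This is not the real relocation format or loader logic: actual CO-RE relocations refer to BTF type IDs and encoded access paths, whereas the hypothetical Load, Reloc, Layout, and relocate names below use plain strings and a hash map standing in for the running kernel's BTF.

```rust
use std::collections::HashMap;

/// Toy stand-in for a load instruction; only the field offset matters here.
#[derive(Debug, Clone, PartialEq)]
struct Load {
    offset: u32,
}

/// Toy relocation record: "instruction N reads this field of this struct".
struct Reloc {
    insn_index: usize,
    struct_name: &'static str,
    field_name: &'static str,
}

/// Toy layout table mapping (struct, field) to a byte offset. It stands in
/// for the type information of the kernel the program is being loaded into.
type Layout = HashMap<(&'static str, &'static str), u32>;

/// Rewrite each relocated instruction to use the target kernel's offset.
/// Fails if the field no longer exists or the record points past the program.
fn relocate(program: &mut [Load], relocs: &[Reloc], target: &Layout) -> Result<(), String> {
    for r in relocs {
        let offset = target
            .get(&(r.struct_name, r.field_name))
            .ok_or_else(|| format!("{}.{} not found in target", r.struct_name, r.field_name))?;
        let insn = program
            .get_mut(r.insn_index)
            .ok_or_else(|| format!("instruction {} out of range", r.insn_index))?;
        insn.offset = *offset;
    }
    Ok(())
}

fn main() {
    // Compiled against headers where the (made-up) field sat at offset 8.
    let mut program = vec![Load { offset: 8 }];
    let relocs = [Reloc { insn_index: 0, struct_name: "task", field_name: "pid" }];

    // In the running kernel, the field has moved to offset 16.
    let mut target = Layout::new();
    target.insert(("task", "pid"), 16);

    match relocate(&mut program, &relocs, &target) {
        Ok(()) => println!("patched program: {:?}", program),
        Err(e) => println!("load failed: {}", e),
    }
}
```

The model also hints at why field reordering matters for Rust: the scheme only works if the compiler records every access by field identity rather than by a baked-in number.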
Andreas Hindborg thought that support like that would be great to have in Rust, since it could potentially allow for linking object files from different compilers — something that currently requires explicitly using the C calling convention, since Rust lacks a stable ABI of its own. He did have some questions about how it could work in practice, however, including what happens if a BPF program is built against an incompatible version of the kernel headers.
"Nothing good", Marchesi answered. But in the case of BPF, the verifier would complain about any bad accesses. After some discussion, during which Ojeda and Guo clarified some details of Rust's layout semantics, Marchesi suggested that perhaps a good first step would be generating CO-RE relocations only for #[repr(C)] structures. Guo questioned how that would interact with the offset_of!() macro, which can be used to find the offset of a field within a structure. Marchesi explained that the value would have to change with the relocation, but that this meant that any math that depended on the offset would be broken. Baron suggested that this might require an opaque wrapper type to prevent things from breaking.
Unwinding
Marchesi had one last topic: the potential switch from ORC to SFrame for stack unwinding in the kernel. He wanted to check that the switch would not cause problems for the Rust parts of the kernel. Guo assured him that Rust does support unwinding, currently with the same DWARF-based methods that C programs largely use. The important part is that compiled functions have unwinding information that matches what the C code does, so any potential compiler change might work out of the box. Marchesi called that "very good news", and wrapped up the session on a positive note.
Overall, BTF is unlikely to pose insurmountable challenges to the inclusion of Rust in the Linux kernel, but there are some areas that will need additional work. At the least, there will need to be testing for LLVM's BTF support, for applying CO-RE to the Rust parts of the kernel, and for ensuring that Rust's unwinding support remains working. Some of those areas may also need additional attention to ensure that the kernel can continue working smoothly as a conglomerate of C, BPF, and Rust.
| Index entries for this article | |
|---|---|
| Conference | Kangrejos/2024 |