A new realtime tree
The realtime patch set has not gone away, though. If nothing else, the fact that a number of distributors are shipping this code is enough to ensure continued interest in its development. So your editor noted with interest the recent announcement of a new -rt tree with an updated set of realtime patches. This tree will be of interest to anybody wanting to look at the realtime work in the context of the 2.6.28 kernel or beyond.
One of the core technologies in the realtime tree is a change to how spinlocks work. Spinlocks in the mainline will busy-wait until the required lock becomes available; they thus occupy the processor to no useful end when acquiring a contended lock. Holding a spinlock will also prevent a thread from being preempted. This behavior is generally best for system throughput; it also makes it easier to write correct code. But anything which prevents a CPU from immediately servicing the highest-priority process runs counter to the chief design goal of a realtime operating system: providing deterministic response times in all situations. So, for the realtime patches, classic spinlocks had to go.
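To make the busy-wait behavior concrete, here is a minimal, self-contained sketch of a test-and-set spinlock in userspace C11 - not the kernel's implementation, but it shows how a waiting processor simply burns cycles until the lock is released:

#include &lt;stdatomic.h&gt;

/* A toy test-and-set spinlock; initialize with { ATOMIC_FLAG_INIT }. */
typedef struct {
    atomic_flag locked;
} toy_spinlock_t;

static void toy_spin_lock(toy_spinlock_t *lock)
{
    /* Loop until the flag is acquired; the waiting processor
     * accomplishes nothing useful while it spins here. */
    while (atomic_flag_test_and_set_explicit(&lock->locked,
                                             memory_order_acquire))
        ;
}

static void toy_spin_unlock(toy_spinlock_t *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}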
The solution was to turn most spinlocks into a form of mutex with priority inheritance. A process which attempts to acquire a contended "spinlock" no longer spins; instead, it goes to sleep and waits for the lock to become free, making the processor available to another thread. Code which holds one of these non-spinlocks is no longer immune to preemption; a higher-priority thread can always push it out of the way. By changing spinlocks in this way, the realtime hackers were able to eliminate one of the largest sources of latency in the mainline kernel. Much of that work found its way into the mainline some time ago in the form of the mutex API, but spinlocks themselves remain unchanged there.
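The same priority-inheritance idea exists outside the kernel; as a concrete, runnable illustration (an analogy, not the realtime tree's code), POSIX threads can create a mutex with the PTHREAD_PRIO_INHERIT protocol, so that a low-priority holder is boosted to the priority of the highest-priority waiter:

#include &lt;pthread.h&gt;

/* Initialize a mutex with the priority-inheritance protocol; a
 * low-priority thread holding it is temporarily boosted to the
 * priority of the highest-priority thread blocked on it. */
static int make_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);

    if (ret)
        return ret;
    ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (!ret)
        ret = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}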
To minimize the pain of maintaining the realtime patches, the developers simply redefined the spinlock_t type to be the new mutex type instead. Except that, as it turns out, some spinlocks in low-level parts of the kernel really do still need to be spinlocks. So those were switched to a new raw_spinlock_t type - but without changing the various spin_lock() calls. Instead, some truly frightening macro trickery was introduced to cause the spinlock API to do the right thing when passed either of two entirely different mutual exclusion primitives. This bit of macro magic was always going to be an impediment to mainline inclusion, so the realtime developers never really expected to merge the lock code in that form.
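As a rough illustration of the kind of trickery involved - a simplified sketch, not the actual -rt code - a macro can use GCC's __builtin_types_compatible_p() builtin to pick the right primitive at compile time:

/* Toy stand-ins for the two lock flavors in the -rt tree. */
typedef struct { int slock; } raw_spinlock_t;   /* a real, spinning lock */
typedef struct { int mutex; } spinlock_t;       /* a sleeping "spinlock" */

void __raw_spin_lock(raw_spinlock_t *lock);
void __mutex_spin_lock(spinlock_t *lock);

/* Dispatch on the argument's type; the compiler folds the test to a
 * constant and discards the dead branch.  The casts keep the dead
 * branch type-correct. */
#define spin_lock(lock)                                          \
    do {                                                         \
        if (__builtin_types_compatible_p(typeof(*(lock)),        \
                                         raw_spinlock_t))        \
            __raw_spin_lock((raw_spinlock_t *)(lock));           \
        else                                                     \
            __mutex_spin_lock((spinlock_t *)(lock));             \
    } while (0)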
The new realtime tree now shows how the realtime developers think this work might get into the mainline. It involves a more explicit separation of the two types of "spinlocks" - and a lot of code churn. In the realtime tree, most locks of type spinlock_t are changed to a new lock_t type. There is a new set of operations for this type:
#include &lt;linux/lock.h&gt;

lock_t lock;

acquire_lock(&lock);       /* replaces spin_lock(&lock) */
/* ... critical section ... */
release_lock(&lock);       /* replaces spin_unlock(&lock) */
For a normal, non-realtime kernel build, lock_t will be the same as spinlock_t, and things will work as they always have. On realtime kernels, by contrast, lock_t will be a sleeping mutex type. The other variants of the spinlock API will be represented in the new API (there is an acquire_lock_irqsave(), for example), but none of them will actually disable interrupts in a realtime kernel. Meanwhile, spinlock_t will remain a true spinlock type.
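Only acquire_lock_irqsave() is named here; a plausible usage sketch follows, with the release_lock_irqrestore() name and the flags convention being guesses patterned on the existing spin_lock_irqsave() API:

lock_t lock;
unsigned long flags;

/* On a non-realtime build, this should behave like
 * spin_lock_irqsave(); on a realtime kernel, it takes the sleeping
 * lock without actually disabling interrupts. */
acquire_lock_irqsave(&lock, flags);
/* ... critical section ... */
release_lock_irqrestore(&lock, flags);   /* hypothetical name */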
This change gets rid of the tricky macros, but at the cost of changing the declarations of, and operations on, almost all spinlocks in the kernel. That is a lot of code changes: a quick grep turns up over 20,000 spin_lock*() calls in the upcoming 2.6.28 kernel. Such a sweeping change will make for some pain if and when it is merged; in the meantime, it makes for a lot of pain for the people who have to maintain this patch out of tree. To make their lives a little easier, the realtime developers have created a couple of scripts to do the bulk of the work. First, all spinlocks in a pristine kernel are converted to lock_t; then, the few locks which truly must be spinlocks are switched back. This work is kept in a separate branch which is regenerated when needed; in this way, the realtime developers avoid the need to do nasty merges to keep up with current kernels.
Your editor has heard talk of another locking change which does not, yet, appear in this tree. One problem with the realtime patch set is that it requires distributors to create yet another kernel build - something they hate doing - if they want to support realtime operation. In an effort to make life easier for distributors, the realtime developers are working on a scheme whereby a kernel would determine at run time whether it should be running in a realtime mode. If so, spinlocks will be changed to sleeping locks by patching the kernel binary as it boots. Kernels built this way will be able to run efficiently in either mode.
The branches of the realtime tree provide a quick guide to the other parts of the realtime work which remain outside of the mainline. The threaded interrupt handler code is one example; that change could be proposed (again) for merging in the near future. The priority workqueue mechanism sits in another branch, as do patches aimed at Java support, filesystem changes, memory management changes, and more. Then, there's a branch for stuff which will never be merged; it includes, for example, a patch which gives Java programs direct access to physical memory - not something which strikes most kernel developers as a good idea. All told, there is a great deal of work sitting in the realtime patch set; this work is finally being organized into a proper git tree.
The "upstream first" policy says that vendors should merge their code
upstream before shipping it to customers. The 2.6.x development model is
built on the idea that no change is too fundamental to be accepted into a
regular, 3-month development cycle. The realtime patches would appear to be
an exception to both rules. It has taken over four years to get to a point
where some of the fundamental realtime technologies are close to ready for the mainline,
but distributors have been shipping it for at least three of those years.
It has, in other words, been one of the biggest forks of the Linux kernel,
ever. The plan has always been to join this fork back with the mainline,
though; perhaps, finally, that goal is getting closer. With luck, it will
happen within about a year.