Realtime Linux: academia v. reality

July 26, 2010

This article was contributed by Thomas Gleixner

The 22nd Euromicro Conference on Real-Time Systems (ECRTS2010) was held in Brussels, Belgium from July 6-9, along with a series of satellite workshops which took place on July 6. One of those satellite workshops was OSPERT 2010 - the Sixth International Workshop on Operating Systems Platforms for Embedded Real-Time Applications, which was co-chaired by kernel developer Peter Zijlstra and Stefan M. Petters from the Polytechnic Institute of Porto, Portugal. Peter and Stefan invited researchers and practitioners from both industry and the Linux kernel developer community. I participated for the second year and tried, with Peter, to nurse the discussion between the academic and real worlds which started last year at OSPERT in Dublin.

Much to my surprise, I was also invited to give the opening keynote at the main conference, which I titled "The realtime preemption patch: pragmatic ignorance or a chance to collaborate?". Much to the surprise of the audience I did my talk without slides, as I couldn't come up with useful ones as much as I twisted my brain around it. The organizers of ECRTS asked me whether they could publish my writeup, but all I had to offer were my scribbled notes which outlined what I wanted to talk about. So I agreed to do a transcript from my notes and memory, without any guarantee that it's a verbatim transcript. Peter at least confirmed that it matches roughly the real talk.

An introduction

First of all I want to thank Jim Anderson for the invitation to give this keynote at ECRTS and his adventurous offer to let me talk about whatever I want. Such offers can be dangerous, but I'll try my best not to disappoint him too much.

The Linux kernel community has a proven track record of being in disagreement with - and disconnected from - the academic operating system research community from the very beginning. The famous Torvalds/Tanenbaum debate about the obsolescence of monolithic kernels is just the starting point of a long series of debates about various aspects of Linux kernel design choices.

One of the most controversial topics is the question of how to add realtime extensions to the Linux kernel. In the late 1990s, various research realtime extensions emerged from universities. These include KURT (Kansas University), RTAI (University of Milano), RTLinux (NMT, Socorro, New Mexico), Linux/RK (Carnegie Mellon University), QLinux (University of Massachusetts), and DROPS (University of Dresden - based on L4), just to name a few. There have been more, but many of them have only left hard-to-track traces on the net.

The various projects can be divided into two categories:

Running Linux on top of a micro/nano kernel
Improving the realtime behavior of the kernel itself

I participated in and watched several discussions about these approaches over the years; the discussion which is burned into my memory forever happened in summer 2004. In the course of a heated debate one of the participants stated: "It's impossible to turn a General Purpose Operating System into a Real-Time Operating System. Period." I was smiling then as I had already proven, together with Doug Niehaus from Kansas University, that it can be done even if it violates all - or at least most - of the rules of the academic OS research universe.

But those discussions were not restricted to the academic world. The Linux kernel mailing list archives provide a huge choice of technical discussions (as well as flame wars) about preemptability, latency, priority inheritance and approaches to realtime support. It was fun to read back and watch how influential developers changed their minds over time. Especially Linus himself provides quite a few interesting quotes. In May 2002 he stated:

With RTLinux, you have to split the app up into the "hard realtime" part (which ends up being in kernel space) and the "rest".

Which is, in my opinion, the only sane way to handle hard realtime. No confusion about priority inversions, no crap. Clear borders between what is "has to happen _now_" and "this can do with the regular soft realtime".

Four years later he said in a discussion about merging the realtime preemption patch during the Kernel Summit 2006:

Controlling a laser with Linux is crazy, but everyone in this room is crazy in his own way. So if you want to use Linux to control an industrial welding laser, I have no problem with your using PREEMPT_RT.

Equally interesting is his statement about priority inheritance in a huge discussion about realtime approaches in December 2005:

Friends don't let friends use priority inheritance. Just don't do it. If you really need it, your system is broken anyway.

Linus's clear statement that he wouldn't merge any PI code ever was rendered ad absurdum when he merged the PI support for pthread_mutexes without a single comment only half a year later.

Both are pretty good examples of the pragmatic approach of the Linux kernel development community and its key figures. Linus especially has always silently followed the famous words of the former German chancellor Konrad Adenauer: "Why should I care about my chatter from yesterday? Nothing prevents me from becoming wiser."

Adding realtime response to the kernel

But back to the micro/nano-kernel versus in-kernel approaches which emerged in the late 1990s. Both camps produced commercial products and more or less active open source communities, but none of those efforts was commercially sustainable or ever got close to being merged into the official mainline kernel code base, for various reasons. Let me look at some of those reasons:

Intrusiveness and maintainability: Most of those approaches lacked - and still lack - proper abstractions and smooth integration into the Linux kernel code base. #ifdef's sprinkled all over the place are neither an incentive for kernel developers to delve into the code nor are they suitable for long-term maintenance.
Complexity of usage: Dual-kernel approaches tend to be hard to understand for application programmers, who often have a hard time coping with a single API. Add a second API and the often backwards-implemented IPC mechanisms between the domains and failure is predictable.
I'm not saying that it can't be done, it's just not suitable for the average programmer.
Incompleteness: Some of those research approaches solve only parts of the problem, as this was their particular area of interest. But that prevents them from becoming useful in practice.
Lack of interest: Some of the projects never made any attempt to approach the Linux kernel community, so the question of inclusion, or even partial merging of infrastructure, never came up.

In October 2004, the real time topic got new vigor on the Linux kernel mailing list. MontaVista had integrated the results of research at the University of the German Federal Armed Forces at Munich into the kernel, replacing spinlocks with priority-inheritance-enabled mutexes. This posting resulted in one of the lengthiest discussions about realtime on the Linux kernel mailing list as almost everyone involved in efforts to solve the realtime problem surfaced and praised the superiority of their own approach. Interestingly enough, nobody from the academic camp participated in this heated argument.

A few days after the flame fest started, the discussion was driven to a new level by kernel developer Ingo Molnar, who, instead of spending time with rhetoric, had implemented a different patch which, despite being clumsy and incomplete, built the starting point for the current realtime preemption patch. In no time quite a few developers interested in realtime joined Ingo's effort and brought the patch to a point which allowed real-world deployment within two years. During that time a huge number of interesting problems had to be solved: efficient priority inheritance, solving per cpu assumptions, preemptible RCU, high resolution timers, interrupt threading etc. and, as a further burden, the fallout from sloppily-implemented locking schemes in all areas across the kernel.

Help from academia?

Those two years were mostly spent with grunt work and twisting our brains around hard-to-understand and hard-to-solve locking and preemption problems. No time was left for theory and research. When the dust settled a bit and we started to feed parts of the realtime patch to the mainline, we actually spent some time reading papers and trying to leverage the academic research results.

Let me pick out priority inheritance and have a look at how the code evolved and why we ended up with the current implementation. The first version which was in Ingo's patchset was a rather simple approach with long-held locks, deep lock nesting and other ugliness. While it was correct and helped us to go forward it was clear that the code had to be replaced at some point.

A first starting point for getting a better implementation was of course reading through academic papers. First I was overwhelmed by the sheer amount of material and puzzled by the various interesting approaches to avoid priority inversion. But the more papers I read, the more frustrated I got. Lots of theory, proof-of-concept implementations written in Ada, micro improvements to previous papers, you all know the academic drill. I'm not at all saying that it was a waste of time as it gave me a pretty good impression of the pitfalls and limitations which are expected in a non-priority-based scheduling environment, but I have to admit that it didn't help me to solve my real world problem either.

The code was rewritten by Ingo Molnar, Esben Nielsen, Steven Rostedt and myself several times until we settled on the current version. The way led from the classic lock-chain walk with instant priority boosting through a scheduler-driven approach, then back to the lock-chain walk as it turned out to be the most robust, scalable and efficient way to solve the problem. My favorite implementation, though, would have been based on proxy execution, which already existed in Doug Niehaus's Kansas University Real Time project at that time, but unfortunately it lacked SMP support. Interestingly enough, we are looking into it again as non-priority-based scheduling algorithms are knocking at the kernel's door. But in hindsight I really regret that nobody—including myself—ever thought about documenting the various algorithms we tried, the up- and down-sides, the test results and related material.
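
The classic lock-chain walk with instant priority boosting mentioned above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's rt-mutex code: the `task` and `lock` structures, the `boost_chain()` function and its depth limit are hypothetical names made up for illustration, there is no locking of the data structures themselves, and a larger `prio` value is assumed to mean a more urgent task.

```c
#include <assert.h>
#include <stddef.h>

struct lock;

/* Toy model of a task; not a kernel structure. */
struct task {
    int prio;                /* effective priority, larger = more urgent (assumption) */
    struct lock *blocked_on; /* lock this task is waiting for, or NULL */
};

/* Toy model of a lock with a single owner. */
struct lock {
    struct task *owner;      /* current holder, or NULL if the lock is free */
};

/*
 * Walk the chain waiter -> lock -> owner -> lock -> owner ... and raise
 * every owner on the way to at least the waiter's priority.
 * Returns the number of tasks that were boosted, or -1 if the chain loops
 * back to the waiter or is longer than max_depth (a likely deadlock).
 */
int boost_chain(struct task *waiter, int max_depth)
{
    assert(waiter != NULL);

    int boosted = 0;
    int prio = waiter->prio;
    struct lock *l = waiter->blocked_on;

    for (int depth = 0; l && l->owner; depth++) {
        struct task *owner = l->owner;

        if (owner == waiter || depth >= max_depth)
            return -1;

        if (owner->prio < prio) {
            owner->prio = prio; /* instant boost */
            boosted++;
        }
        l = owner->blocked_on;  /* follow the chain to the next lock */
    }
    return boosted;
}
```

In this model, a high-priority task blocking on a lock whose owner is itself blocked on a second lock boosts both owners in a single walk. Undoing the boost when a lock is released is left out of the sketch.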

So it seems that there is the reverse problem on the real world developer side: we are solving problems, comparing and contrasting approaches and implementations, but we are either too lazy or too busy to sit down and write a proper paper about it. And of course we believe that it is all documented in the different patch versions and in the maze of the Linux kernel mailing list archives which are freely available for the interested reader.

Indeed it might be a worthwhile exercise to go back and extract the information and document it, but in my case this probably has to wait until I go into retirement, and even then I fear that I have more favorable items on my ever growing list of things which I want to investigate. On the other hand, it might be an interesting student project to do a proper analysis and documentation on which further research could be based.

On the value of academic research

I do not consider myself in any way to be representative of the kernel developer community, so I asked around to learn who was actually influenced by research results when working on the realtime preemption patch. Sorry for you folks, the bad news is that most developers consider reading research results not to be a helpful and worthwhile exercise in order to get real work done. The question arises why? Is academic OS research useless in general? Not at all. It's just incredibly hard to leverage. There are various reasons for this and I'm going to pick out some of them.

First of all, and I have complained about this before, it's often hard to get access to papers because they are hidden away behind IEEE's paywall. While dealing with IEEE is a fact of life for the academic world, I personally consider it a modern form of robber barony where tax payers have to pay for work which was funded by tax money in the first place. There is another problem I have with the IEEE monopoly. Universities' rankings are influenced by the number of papers written by their members and accepted at IEEE conferences, which I consider to be one of the most idiotic quality measurement rules on the planet. And it's not only my personal opinion; it's also provable.

I actually took the time to spend a day at a university where I could gain access to IEEE papers without wasting my private money. I picked out twenty recent realtime related papers and did a quick survey. Twelve of the papers were a rehash of well-known and well-researched topics, and at least half of them were badly written as well. From the remaining eight papers, six were micro improvements based on previous papers where I had a hard time figuring out why the papers had been written at all. One of those was merely describing the effects of converting a constant which influences resource partitioning into a runtime configurable variable. So that left two papers which seemed actually worthwhile to read in detail. Funny enough, I had already read one of those papers as it was publicly accessible in a slightly modified form.

That survey really convinced me to stay away from IEEE forever and to consider the university ranking system even more suspicious.

There are plenty of other sources where research papers can be accessed, but unfortunately the signal-to-noise ratio there is not significantly better. I have no idea how researchers filter that, but on the other hand most people wonder how kernel developers filter out the interesting stuff from the Linux kernel mailing list flood.

One interesting thing I noticed while skimming through paper titles and abstracts is that the Linux kernel seems to have become the most popular research vehicle. On one site I found roughly 600 Linux-based realtime and scheduling papers which were written in the last 18 months. About 10% of them utilized the realtime preemption patch as their baseline operating system. Unfortunately almost none of the results ever trickled through to the kernel development community, not to mention actually working code being submitted to the Linux kernel mailing list.

As a side note: one paper even mentioned a hard-to-trigger longstanding bug in the kernel which the authors fixed during their research. It took me some time to map the bug to the kernel code, but I found out that it got fixed in the mainline about three months after the paper was published—which is a full kernel release cycle. The fix was not related to this research work in any way, it just happened that some unrelated changes made the race window wider and therefore made the bug surface. I was a bit grumpy when I discovered this, but all I can ask for is: please send out at least a description of a bug you trip over in your research work to the kernel community.

Another reason why it's hard for us to leverage research results is that academic operating system research has, as probably any other academic research area, a few interesting properties:

Base concepts in research are often several decades old, but they don't show up in the real world even if they would be helpful to solve problems which have been worked around for at least the same number of decades more or less.
We discussed the sporadic server model yesterday at OSPERT, but it has been around for 27 years. I assume that hundreds of papers have been written about it, hundreds of researchers and students have improved the details, created variations, but there is almost no operating system providing support for it. As far as I know Apple's OSX is the only operating system which has a scheduling policy which is not based on priorities but, as I learned, it's well hidden away from the application programmer.
Research often happens on narrow aspects of an already narrow problem space. That's understandable as you often need to verify and contrast algorithms on their own merit without looking at other factors. But that leaves the interested reader like me with a large amount of puzzle pieces to chase and fit together, which often enough made me give up.
Research often happens on artificial application scenarios. While again understandable from the research point of view, it makes it extremely hard, most of the time, to expand the research results into generalized application scenarios without shooting yourself in the foot and without either spending endless time or giving up. I know that it's our fault that we do not provide real application scenarios to the researchers, but in our defense I have to say that in most of the cases we don't know what downstream users are actually doing. We only get a faint idea of it when they complain about the kernel not doing what they expect.
Research often tries to solve yesterday's problems over and over while the reality of hardware and requirements has already moved to the next levels of complexity. I can understand that there are still interesting problems to solve, but seeing the gazillionth paper about priority ceilings on uniprocessor systems is not really helpful when we are struggling with schedulability, lock scaling and other challenges on 64- (and more) core machines.
Comparing and contrasting research results is almost impossible. Even if a lot of research happens on Linux there is no way to compare and contrast the results as researchers, most of the time, base their work on completely different base kernel versions. We talked about this last year and I have to admit that neither Peter nor myself found enough spare time to come up with an approach to create a framework on which the various research groups could base their scheduler of the day. We haven't forgotten about this, but while researchers have to write papers, we get our time occupied by other duties.
Research and education seem to happen in different universes. It seems that operating system and realtime research have little influence on the education of Joe Average Programmer. I'm always dumbstruck when talking to application programmers who have not the faintest idea of resources and their limitations. It seems that the resource problems on their side are all solvable by visiting the hardware shop across the street and buying the next-generation machine. That approach also manifests itself pretty well in the "enterprise realtime" space where people send us test cases which refuse to even start on anything smaller than a machine equipped with 32GB of RAM and at least 16 cores.
If you have any chance to influence that, then please help to plant at least some clue on the folks who are going to use the systems you and we create.
A related observation is the inability of hardware and software engineers to talk to each other when a system is designed. While I observe that disconnect mainly on the industry side, I have the feeling that it is largely true in the universities as well. No idea how to address this issue, but it's going to be more important the more the complexity of systems increases.

I'll stop bashing on you folks now, but I think that there are valid questions and we need to figure out answers to them if we want to get out of the historically grown state of affairs someday.

In conclusion

We are happy that you use Linux and its extensions for your research, but we would be even more happy if we could deal with the outcome of your work in an easier way. In the last couple of years we started to close the gap between researchers and the Linux kernel community at OSPERT and at the Realtime Linux Workshop and I want to say thanks to Stefan Petters, Jim Anderson, Gerhard Fohler, Peter Zijlstra and everyone else involved. It's really worthwhile to discuss the problems we face with the research community and we hope that you get some insight into the problems we face and requirements which are behind our pragmatic approach to solve them.

And of course we appreciate that some code which comes straight out of the research laboratory (the EDF scheduler from ReTiS, Pisa) actually got cleaned up and published on the Linux kernel mailing list for public discussion and I really hope that we are going to see more like this in the foreseeable future. Problem complexity is increasing, unfortunately, and we need all the collective brain power to address next year's challenges. We already started the discussion and first interesting patches have shown up, so I really hope we can follow down that road and get the best out of it for all of us.

Thanks for your attention.

Feedback

I got quite a bit of feedback after the talk. Let me answer some of the questions.

Q: Is there any place outside LKML where discussion between academic folks and the kernel community can take place?

A: Björn Brandenburg suggested setting up a mailing list for research related questions, so that the academics are not forced to wade through the LKML noise. If a topic needs a broader audience we always can move it to LKML. I'm already working on that. It's going to be low traffic, so you should not be swamped in mail.

Q: Where can I get more information about the realtime preemption patch?

A: General information can be found on the realtime Linux wiki, this LWN article, and this Linux Symposium paper [PDF].

Q: Which technologies in the mainline Linux kernel emerged from the realtime preemption patch?

A: The list includes:

The generic interrupt handling framework. See: Linux/Documentation/DocBook/genericirq and this LWN article.
Threaded interrupt handlers, described in LWN and again in LWN.
The mutex infrastructure. See: Linux/Documentation/mutex-design.txt
High-resolution timers, including NOHZ idle support. See: Linux/Documentation/timers/highres.txt and these presentation slides.
Priority inheritance support for user space pthread_mutexes. See: Linux/Documentation/pi-futex.txt, Linux/Documentation/rt-mutex.txt, Linux/Documentation/rt-mutex-design.txt, this LWN article, and this Realtime Linux Workshop paper [PDF].
Robustness support for user-space pthread_mutexes. See: Linux/Documentation/robust-futexes.txt and this LWN article.
The lock dependency validator, described in LWN.
The kernel tracing infrastructure, as described in a series of LWN articles: 1, 2, 3, and 4.
Preemptible and hierarchical RCU, also documented in LWN: 1, 2, 3, and 4.
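
Two of the items above, priority inheritance and robustness for pthread_mutexes, are reachable from user space through standard POSIX mutex attributes. The following is a minimal sketch, assuming a glibc-based Linux system; the helper name `init_pi_robust_mutex()` is made up for illustration, while the `pthread_*` calls and constants are the standard POSIX ones.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Initialize *m as a mutex that uses priority inheritance and is robust
 * against its owner dying while holding it.
 * Returns 0 on success or a pthread error number.
 */
int init_pi_robust_mutex(pthread_mutex_t *m)
{
    assert(m != NULL);

    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);

    if (err)
        return err;

    /* Boost the owner's priority while a higher-priority thread waits. */
    err = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);

    /* Let the next locker find out if the previous owner died. */
    if (!err)
        err = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);

    if (!err)
        err = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return err;
}
```

With a robust mutex, pthread_mutex_lock() can return EOWNERDEAD when the previous owner terminated without unlocking; the new owner is then expected to repair the protected state and call pthread_mutex_consistent() before unlocking.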

Q: Where do I get information about the Realtime Linux Workshop?

A: The 2010 Realtime Linux Workshop (RTLWS) will be in Nairobi, Kenya, Oct. 25-27th. The 2011 RTLWS is planned to be at Kansas University (not confirmed yet). Further information can be found on the RTLWS web page. General information about the organisation behind RTLWS can be found on the OSADL page, and information about its academic members is on this page.

Conference impressions

I stayed for the main conference, so let me share my impressions. First off, the conference was well organized and, in general, the atmosphere was not really different from an open source conference. The realtime researchers seem to be a well-connected and open-minded community. While they take their research seriously, at least most of them admit freely that the ivory tower they are living in can be a completely different universe. This was pretty much observable in various talks where the number of assumptions and the perfectly working abstract hardware models made it hard for me to figure out how the results of this work could be applied to reality.

The really outstanding talks were the keynotes on days two and three.

On Thursday, Norbert Wehn from the Technical University Kaiserslautern gave an interesting talk titled Hardware modeling: A critical assessment with case studies [PDF]. Norbert is working on hardware modeling and low-level software for embedded devices, so he is not the typical speaker you would expect at a realtime-focused conference. But it seems that the program committee tried to bring some reality into the picture. Norbert gave an impressive overview of the evolution of hardware and the reasons why we have to deal with multi-core hardware and have to face the fact that today's hardware is not designed for predictability and reliability. So realtime folks need to rethink their abstract models and take more complex aspects of the overall system into account.

One of the interesting aspects was his view on energy efficient computing: A cloud of 1.7 million AMD Opteron cores consumes 179MW while a cloud of 10 million Xtensa cores provides the same computing power at 3MW. Another aspect of power-aware computing is the increasing role of heterogeneous systems. Dedicated hardware for video decoding is about 100 times more power efficient than a software-based solution on a general-purpose CPU. Even specialized DSPs consume about 10 times more power for the same task than the optimized hardware solution.
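
The cloud figures quoted above can be turned into per-core and relative numbers with a little arithmetic. The sketch below only restates the quoted figures; the function names are hypothetical, and it takes the stated "same computing power" for both clouds at face value.

```c
#include <assert.h>

/* Megawatts divided by millions of cores is numerically watts per core. */
double watts_per_core(double megawatts, double million_cores)
{
    assert(million_cores > 0.0);
    return megawatts / million_cores;
}

/* How many times more power configuration A draws than B for the same work. */
double power_ratio(double megawatts_a, double megawatts_b)
{
    assert(megawatts_b > 0.0);
    return megawatts_a / megawatts_b;
}
```

Calling watts_per_core(179.0, 1.7) gives roughly 105 W per Opteron core and watts_per_core(3.0, 10.0) gives 0.3 W per Xtensa core, while power_ratio(179.0, 3.0) shows that the Opteron cloud draws about 60 times the power for the same computing power.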

But power-optimized hardware has a tradeoff: the loss of flexibility which is provided by software. The mobile space has nevertheless already arrived in the heterogeneous world, and researchers need to become aware of the increased complexity to analyze such hybrid constructs and develop new models to allow the verification of these systems in the hardware design phase. Workarounds for hardware design failures in application specific systems are orders of magnitude more complex than on general purpose hardware. All in all, he gave his colleagues from the operating system and realtime research communities quite a list of homework assignments and connected them back to earth.

The Friday morning keynote was a surprising reality check as well. Sanjoy Baruah from the University of North Carolina at Chapel Hill titled his talk "Why realtime scheduling theory still matters". Given the title one would assume that the talk would be focused on justifying the existence of the ivory tower, but Sanjoy was very clear about the fact that the realtime and scheduling research has focused for too long on uniprocessor systems and is missing answers to the challenges of the already-arrived multi-core era. He gave pretty clear guidelines about which areas research should focus on to prove that it still matters.

In addition to the classic problem space of verifiable safety-critical systems, he was calling for research which is relevant to the problem space and built on proper abstractions with a clear focus on multi-core systems. Multi-core systems bring new—and mostly unresearched—challenges like mixed criticalities, which means that safety critical, mission critical and non critical applications run on the same system. All of them have different requirements with regard to meeting their deadlines, resource constraints, etc., and therefore bring a new dimension into the verification problem space. Other areas which need care, according to Sanjoy, are component-based designs and power awareness.

It was good to hear that despite our usual perception of the ivory tower those folks have a strong sense of reality, but it seems they need a more or less gentle reminder from time to time. ECRTS was a really worthwhile conference and I can only encourage developers to attend such research-focused events and keep the communication and discussion between our perceived reality and the not-so-disconnected other universe alive.


Realtime Linux: academia v. reality

Posted Jul 26, 2010 23:03 UTC (Mon) by fuhchee (guest, #40059) [Link] (27 responses)

"... Both are pretty good examples of the pragmatic approach of the Linux kernel development community and its key figures. Linus especially has always silently followed the famous words of the former German chancellor Konrad Adenauer: "Why should I care about my chatter from yesterday? Nothing prevents me from becoming wiser." ..."

This appears to necessitate the use of a special LKML dictionary when reading categorical superlatives. For example, it would map "never, ever, this is stupid" to "I currently don't see why, sell it to me better.". Perhaps an unLKML decoder script should ship in kernel/tools/, together with a coefficient-of-exaggeration database for main participants.

Realtime Linux: academia v. reality

Posted Jul 26, 2010 23:12 UTC (Mon) by SEJeff (guest, #51588) [Link] (19 responses)

don't feed the trolls people.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 3:53 UTC (Tue) by butlerm (subscriber, #13312) [Link]

Exaggeration aside, I think the GP has a point. This has happened more than once.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 12:26 UTC (Tue) by nye (subscriber, #51576) [Link]

Sense of humour malfunction?

Realtime Linux: academia v. reality

Posted Jul 27, 2010 13:00 UTC (Tue) by fuhchee (guest, #40059) [Link] (16 responses)

In plain English, it is a problem if leaders regularly make categorical statements about the future, and then flip-flop. It means that any particular statement / prediction has a credibility cloud over it. This makes planning more difficult, never mind the unnecessary social friction.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 15:28 UTC (Tue) by martinfick (subscriber, #4455) [Link]

It is a worse problem when leaders refuse to change their opinion simply for the sake of being consistent with the past when presumably they were less knowledgeable.

And, of course, one has to consider how often (percentage-wise) this happens before passing judgment. I would suspect that this is rather low for Linus, and that the whole point (which seems to have been missed) is that he doesn't sway easily, and that the few subjects where he has been swayed drastically show both his original good judgment on most things and the ability to be convinced of opinions better than his first (usually correct) opinions.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 17:49 UTC (Tue) by pflugstad (subscriber, #224) [Link] (8 responses)

In plain English...

Actually, this statement identifies another problem - many non-native-English speaking readers won't understand Linus' comments, or take them at face value.

Or in other cultures, it's literally the worst possible thing to change your mind as publicly as Linus occasionally does, so they assume he really means it.

So I certainly don't see the OP as a troll, but asking a legitimate question.

Realtime Linux: academia v. reality

Posted Jul 28, 2010 10:02 UTC (Wed) by nye (subscriber, #51576) [Link] (6 responses)

>Or in other cultures, it's literally the worst possible thing to change your mind as publicly as Linus occasionally does

The unwillingness for a public figure to acknowledge a past mistake and fix it is the cause of a great variety of societal ills. I don't believe anyone should ever pander to a culture that encourages this kind of poisonous behaviour.

Realtime Linux: academia v. reality

Posted Jul 28, 2010 22:39 UTC (Wed) by njs (subscriber, #40338) [Link] (5 responses)

> The unwillingness for a public figure to acknowledge a past mistake and fix it is the cause of a great variety of societal ills.

Okay, I'm with you so far...

> I don't believe anyone should ever pander to a culture that encourages this kind of poisonous behaviour.

...but you lose me here. We're talking about people who have certain expectations, and those expectations are, in fact, valid for their culture. These people may or may not like this aspect of their culture, but it doesn't matter -- so long as they have those expectations, then they are not worthy to participate in kernel development? Explaining to them that things work differently here is somehow "pandering" to their culture?

Realtime Linux: academia v. reality

Posted Jul 29, 2010 1:15 UTC (Thu) by dlang (guest, #313) [Link] (1 responses)

the problem is that if changing your opinion is the worst possible thing you can do, you are in one of two categories:

1. you make the perfect decision the first time, every time

2. you do things wrong, even when you know better.

Since there is no developer who ever qualifies for #1, avoiding changing decisions at all costs would lead to having a bad system, and knowing that it was bad.

so yes, the kernel development _is_ better off by being willing to change decisions, even if that excludes some cultures from participating.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 3:41 UTC (Thu) by njs (subscriber, #40338) [Link]

I've seen two suggestions made in this thread:

1) It's somewhat problematic if leaders regularly make very strong, emphatic statements, specifically saying that this is not an ordinary decision but rather one that will never be changed under any circumstances, and then change their minds.

2) If they're going to do that anyway, then maybe that should be explained to newcomers, since their default understandings will otherwise be wildly miscalibrated.

You seem to be arguing about something else, not entirely sure what, and I don't have much to say about it.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 12:25 UTC (Thu) by nye (subscriber, #51576) [Link] (2 responses)

>We're talking about people who have certain expectations, and those expectations are, in fact, valid for their culture. These people may or may not like this aspect of their culture, but it doesn't matter -- so long as they have those expectations, then they are not worthy to participate in kernel development? Explaining to them that things work differently here is somehow "pandering" to their culture?

It seems I had misinterpreted the previous post in haste. An explanation of the differences between cultures to the honestly ignorant (I use this word in a purely descriptive way with no unstated implications intended) would be a worthwhile exercise in education (especially when, as I believe, the one culture is clearly superior to the other in a particular way).

So, I retract that statement.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 12:27 UTC (Thu) by nye (subscriber, #51576) [Link] (1 responses)

>So, I retract that statement

Which, now that I think of it, is sort of ironic in the circumstances.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 13:05 UTC (Thu) by fuhchee (guest, #40059) [Link]

two words: epic win

Realtime Linux: academia v. reality

Posted Aug 2, 2010 12:52 UTC (Mon) by miku (guest, #35152) [Link]

Reality doesn't care about 'other' cultures. That is why the Linux kernel is so good: it tries to map reality instead of bowing to authority/ego.

It is the best possible thing for a person to change their opinion when it contradicts reality. Linus, thankfully, is among the sane people who practice this =)

Realtime Linux: academia v. reality

Posted Jul 27, 2010 23:37 UTC (Tue) by aliguori (subscriber, #30636) [Link]

Categorical statements are never accurate in the long term. You can either avoid them entirely, which makes life dull, embrace them with unwavering dedication to avoid the appearance of "flip-flopping", or simply admit when you're wrong and keep going.

The last characteristic is one that's extremely appreciated in a maintainer. It's much more fun to contribute to a project if you think that with a sufficiently compelling argument, you can convince the leadership to agree with you.

Categorical statements

Posted Jul 27, 2010 23:48 UTC (Tue) by jejb (subscriber, #6654) [Link] (4 responses)

> In plain English, it is a problem if leaders regularly make categorical
> statements about the future, and then flip-flop. It means that any
> particular statement / prediction has a credibility cloud over it. This
> makes planning more difficult, never mind the unnecessary social friction.

Actually, I'm afraid history really doesn't support this view.

Stephenson was told by all medical authorities that people would suffer seizures if they travelled at more than 30mph, so the Rocket was a stupid idea.

Max Planck despised Ludwig Boltzmann's statistical mechanics because of the challenge it gave to classical thermodynamics. He went as far as to attack Boltzmann both verbally and in print for the heresy. Planck was ultimately forced to use statistical mechanics to solve the ultraviolet catastrophe and lay the basis for quantum mechanics ...

Einstein famously and vehemently denied the conclusions of the EPR paradox with his "spooky action at a distance" comment. He recanted very reluctantly when the Bell inequalities proved it.

I could go on, but you get the idea.

Great discoveries are made by challenging the accepted and laid down "facts". The corollary to this is that if no-one lays down the "facts" to be challenged, the human instinct for contrariness doesn't get aroused as much as it should and some of our brilliance sinks into the mire of mediocratic reasonableness.

Being wrong is a recoverable error. Never daring to be wrong is an opportunity missed and a life never lived.

Categorical statements

Posted Jul 28, 2010 2:15 UTC (Wed) by fuhchee (guest, #40059) [Link] (3 responses)

I don't think your analogy works the way you intended. My comment is equivalent to cautioning those who told Stephenson about seizures, or Max Planck for his accusations of "heresy", or Einstein about his vehemence. One should be more humble.

"The corollary to this is that if no-one lays down the "facts" to be challenged, the human instinct for contrariness doesn't get aroused as much"

Now this "corollary" needs somewhat more evidence to convince that naysayers are a necessary (or necessarily positive) factor in innovation.

Categorical statements

Posted Jul 28, 2010 15:29 UTC (Wed) by jejb (subscriber, #6654) [Link] (2 responses)

> I don't think your analogy works the way you intended. My comment is equivalent to cautioning those who told Stephenson about seizures, or Max Planck for his accusations of "heresy", or Einstein about his vehemence. One should be more humble.

I'm failing to see your point. I gave Einstein and Planck as examples of people who made categorical negative but wrong statements and later admitted they were wrong (without, incidentally, incurring a "credibility cloud"). You seem to now be saying that it's OK for the likes of Einstein and Planck to do this, but everyone else should be humble?

> Now this "corollary" needs somewhat more evidence to convince that naysayers are a necessary (or necessarily positive) factor in innovation.

A corollary is a logical deduction from a proposition. If there's enough evidence to support the proposition then, ipso facto, there's enough to support the corollary.

If you think the proposition needs more evidence, there's enough in google to supply virtually any amount of it going back to the beginnings of recorded history.

Categorical statements

Posted Jul 28, 2010 15:46 UTC (Wed) by fuhchee (guest, #40059) [Link]

> You seem to now be saying that it's OK for the likes of Einstein and
> Planck to do this, but everyone else should be humble?

You misread. "cautioning" is not saying "it's OK".

> A corollary is a logical deduction from a proposition.

Well, thanks for the lesson, but it doesn't quite work here. This "corollary" is your main point, and it is supported by exactly three historical anecdotes. By the way, in none of those stories has there been any indication that the "should-have-been-humbler" people performed a useful service in naysaying. IOW, there has been no argument that without those "laid-down-wrong-facts", the discoveries would not have been made.

If you can't make that argument stick, perhaps your original argument/corollary is not quite as logically sound as you believe it is.

Categorical statements

Posted Jul 28, 2010 15:50 UTC (Wed) by egk (guest, #50799) [Link]

Although this does not have much bearing on the general discussion, it should be said that the story about Einstein seems very doubtful. He died in 1955 and Bell's paper only appeared in 1964. Moreover, Bell's work is theoretical: it could not "prove" anything about the real world, except the fact that something was testable, in principle. Einstein, if he had been alive, could very well have said that he expected experiments to go one way instead of another. And the actual experiments came quite a bit later.

Realtime Linux: academia v. reality

Posted Jul 26, 2010 23:29 UTC (Mon) by dlang (guest, #313) [Link]

it's less 'sell it to me better' than it is 'fix it so it won't hurt anyone else (including those maintaining the rest of the kernel)'

what was eventually merged was not the same thing that was discussed earlier.

Linus is very pragmatic about accepting oddball things if they really don't hurt anything else. Unfortunately many of the things presented this way really do hurt the kernel, but mostly less in performance than in maintainability.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 10:59 UTC (Tue) by nix (subscriber, #2304) [Link] (5 responses)

systemtap will never ever ever go in until one day it suddenly does.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 14:35 UTC (Tue) by mjthayer (guest, #39183) [Link] (4 responses)

> systemtap will never ever ever go in until one day it suddenly does.

SystemTap or uprobes? For SystemTap itself, it looks to me as though Frank and friends have worked around requiring anything they can avoid in the kernel itself. And done it pretty well given the stones laid in the path of anyone doing kernel modules outside the official kernel tree.

Realtime Linux: academia v. reality

Posted Jul 28, 2010 13:01 UTC (Wed) by nix (subscriber, #2304) [Link] (3 responses)

Yeah, but even now you have people like Christoph Hellwig turning up on e.g. the glibc list when probe point adding patches were proposed, saying 'no, uprobes will never exist in this form'. So uprobes may exist but some kernel hackers are trying as hard as possible to make sure they stay useless and out of *every* package's upstream.

Realtime Linux: academia v. reality

Posted Jul 28, 2010 16:08 UTC (Wed) by mjthayer (guest, #39183) [Link] (2 responses)

> Yeah, but even now you have people like Christoph Hellwig turning up on e.g. the glibc list when probe point adding patches were proposed, saying 'no, uprobes will never exist in this form'.

I thought I read about work to make SystemTap work with procfs. Any idea what is happening there (Frank?) Although something seems to be happening currently with uprobes, and Christoph Hellwig seemed to be actively (positively) involved in the discussion (see http://lkml.org/lkml/2010/7/27/121).

Realtime Linux: academia v. reality

Posted Jul 28, 2010 16:18 UTC (Wed) by fuhchee (guest, #40059) [Link] (1 responses)

> I thought I read about work to make SystemTap work with procfs.
> Any idea what is happening there (Frank?)

I'm not quite sure which effort you may be referring to.

> ... Christoph Hellwig seemed to be actively (positively) involved

Yes. It is unfortunate though that instructions of the form "Don't SYMBOL_EXPORT this facility [since you-know-who might use it]" are still the order of the day.

Realtime Linux: academia v. reality

Posted Jul 28, 2010 16:27 UTC (Wed) by mjthayer (guest, #39183) [Link]

>> I thought I read about work to make SystemTap work with procfs.
>> Any idea what is happening there (Frank?)
> I'm not quite sure which effort you may be referring to.
I can't seem to find any reference to it now. If you are not aware of it then I strongly assume I misunderstood something at some point.

Realtime Linux: academia v. reality

Posted Jul 26, 2010 23:24 UTC (Mon) by dlang (guest, #313) [Link] (3 responses)

this is why it's a good idea to record presentations.

If the organizers aren't able to set this up (and for many settings they cannot), it's still a good idea to set a voice recorder on the stand. It's not going to be nearly as good as a more professional setup, but it can be surprisingly good and incredibly useful for recreating a talk (and especially for recreating the Q&A session)

Realtime Linux: academia v. reality

Posted Jul 27, 2010 7:17 UTC (Tue) by tzafrir (subscriber, #11501) [Link] (2 responses)

Transcribing this recording is still quite some work. If the author is willing to help there, that's great.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 16:39 UTC (Tue) by dlang (guest, #313) [Link] (1 responses)

agreed, but it's _far_ easier even with the presenter than recreating the presentation without a recording.

Conference audio

Posted Aug 1, 2010 19:22 UTC (Sun) by dmarti (subscriber, #11625) [Link]

See the audio person in advance, and carry a 1/8in.-2RCA patch cable and two RCA-1/4 in. adapters, and you can plug your recorder into "aux out" (usually 1/4 in.) or "tape out" (usually RCA) connectors on the mixer. Ask nicely and don't dork with anything.

Realtime Linux: academia v. reality

Posted Jul 26, 2010 23:53 UTC (Mon) by niv (guest, #8656) [Link] (3 responses)

Thomas,

Great talk! Thanks for the history. Couldn't agree with you more on the need to get better and easier access to Research papers/journals. Getting academia and industry to work more closely together is a noble cause (and important).

I know that the extensive use of Linux in RT academia is not a unique happenstance - much of networking research, among others, also uses Linux. So it would be good to have that joint forum/list for research and kernel people to have a wider focus, not just real-time. Were you thinking this would be only a real-time forum? Or something that other areas of research / Linux could share?

Realtime Linux: academia v. reality

Posted Jul 27, 2010 16:46 UTC (Tue) by zander76 (guest, #6889) [Link] (2 responses)

Hey,

I would hazard a guess by saying that perhaps one of the problems is that they *don't* know what to research. They are required to come up with ideas so they do but based on what?

Perhaps an idea would be to create a list of things that *need* to be researched so when people are struggling to come up with research topics they can tackle a real problem instead of an imaginary problem.

Ben

Realtime Linux: academia v. reality

Posted Jul 30, 2010 2:42 UTC (Fri) by vonbrand (subscriber, #4458) [Link] (1 responses)

Problem is that "real" problems are usually messy and very hard to solve (and you have to convince "real" people that you solved them too!), while "imaginary" problems you can set up so they are solvable.

The old quip on the difference between theory and practice, in another guise ;-)

Realtime Linux: academia v. reality

Posted Jul 30, 2010 14:58 UTC (Fri) by sorpigal (subscriber, #36106) [Link]

One nice thing about academic research is that analyzing problems is just as useful as solving them.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 0:09 UTC (Tue) by squidgit (guest, #42190) [Link]

Excellent article, thanks.

Realtime Linux: academia v. reality

Posted Jul 27, 2010 5:32 UTC (Tue) by imcdnzl (guest, #28899) [Link]

I think that the "publish or perish" mentality holds people back from trying to get their kernel code accepted. It's only important for most researchers to publish - not to do anything with it.

Some people found it really strange when I wanted to take the DCCP code and help put it into the kernel. Thankfully I persevered.

Publications in computer sciences: use arXiv!

Posted Jul 27, 2010 8:05 UTC (Tue) by danielpf (guest, #4723) [Link] (13 responses)

Concerning scientific publications behind IEEE's paywall, computer scientists might consider using arXiv.org's computer science section more often to post their preprints. In other fields like physics and astronomy, arXiv is very popular and most preprints are available there, which is very convenient for all researchers. By now most (young) researchers in physics and astronomy read almost exclusively arXiv. Some studies have shown that publishing on arXiv is as efficient as publishing in prestigious journals in terms of visibility and citations, and this at no cost for authors as well as for readers.

Publications in computer sciences: use arXiv!

Posted Jul 27, 2010 8:36 UTC (Tue) by rsidd (subscriber, #2582) [Link] (2 responses)

I agree. Meanwhile, in biology the "open access" movement is gaining ground fast, and several funding agencies (NIH, Wellcome Trust and others) require work to be freely accessible on PubMed no later than 6 (WT) or 12 (NIH) months after publication. (This can be the final author version of the manuscript, not necessarily the published version.)

The pressure needs to come from the scientists. In the case of arXiv, high-energy physicists went ahead and set it up, and developed the culture of depositing preprints there before even submitting to a journal (let alone publication). The journals basically had to go along with it. In the case of biology, scientists actively campaigned for it, to the extent of setting up an entire new open-access publisher (PLoS) that is now very highly regarded. Computer scientists, it seems, are content to leave their preprints on their homepages and let Google do the indexing.

Publications in computer sciences: use arXiv!

Posted Jul 27, 2010 10:05 UTC (Tue) by stijn (subscriber, #570) [Link] (1 responses)

Same Here! I was going to write something very close to your response. I was previously in mathematics/computer science, which used to be almost entirely pay-walled. Likely/apparently this is still the case. I'm now in bioinformatics, and the change in attitude is very refreshing. Research should as a matter of principle be available to all - as well as the underlying data, acknowledging that there will always be context specific considerations.

Publications in computer sciences: use arXiv!

Posted Jul 27, 2010 13:51 UTC (Tue) by tialaramex (subscriber, #21167) [Link]

The biggest problem in Computer Science was the fact that people care about the technology. The physicists did not care whether the system they were using supported the concept of transclusions, or whether it used a self-describing metadata format, or even if it could be proven to scale across a distributed system. They were pragmatic about it, because they're physicists not computer scientists and so the computers were just a tool.

Whereas for a Computer Scientist the open access technology itself is a potential research topic. So you get crazy stuff like a project to figure out how to perform searches across potentially hundreds of OA repositories in a distributed system, all of them with separate policies and metadata formats - instead of one working repository.

On the other hand, once things started to take off, this turned into an advantage. Can't get administrative funds for the much needed performance optimisations in your archive containing 25 years of data structures papers? Make it a research project and get a grant. This seems to have worked out OK for e.g. Southampton.

Publications in computer sciences: use arXiv!

Posted Aug 1, 2010 2:19 UTC (Sun) by bokr (guest, #58369) [Link] (9 responses)

"Paywalls" such as at the IEEE exist, IMO, because someone
is looking at this kind of publishing as selling a product
to consumers, with ordinary business thinking about how to
pay costs of the product, as if a technical paper were only
an entertainment for people with special tastes.

I argue that a penniless student with the intelligence, interest,
and desire to understand -- i.e., who has this special taste --
actually has the most valuable coin with which to "pay" for his
copy: his attention-time (especially golden if passion-driven).

It is therefore sad to see copyright used to milk a little revenue
from an intellectual seeding process, for apparent lack of fund-raising
imagination or appreciation for the value to the country of
free and open information.

Such use of copyright goes directly against the stated goal
of the Constitution when it empowered Congress to make laws
"To promote the Progress of Science and useful Arts..." (which
gave us copyright and patent laws, per Article 1, Section 8).

I don't see information paywalls as promoting any science or art.
Tuition is another form of the same.

OTOH, I do understand that lwn is not yet funded by a peer-petitioned
grant through the Library of Congress (for recognition as a valued
independent implementer of many LOC goals), so I subscribe ;-)

don't ascribe to malice....

Posted Aug 1, 2010 2:51 UTC (Sun) by dlang (guest, #313) [Link] (8 responses)

the standard quote isn't directly relevant to this situation, but it's close.

I don't think the requirements to use the research paper publishers are due to malice, I think it's inertia.

In the days before the Internet, publishing was a very expensive thing to do. In such an environment an organization dedicated to separating the wheat from the chaff and publishing the wheat was an incredibly valuable service to provide.

Over the years, as the organization providing this service was able to make a profit from publishing things, and the cost of publishing has dropped, I think they have become less critical about what they publish, and so their value as a filter for 'the good stuff' has been dropping.

With the cost of publishing now almost zero, there would still be value in the service of evaluating papers to find the good ones, but I don't think any of the research publishers are really providing that service effectively anymore.

As such, I think that publishing the papers in a place where Google can find them (and apply the pagerank type algorithm to them) is at least as effective an indication of the probable quality of the papers.

It would be good if the various industry organizations would recognize this and make all the papers available, and provide a service for their members by reading everything they can and provide feedback to the author and quality scores for their members (along with indexing services to help their members find things)

I think that simply the process of having a lot of people reading disparate documents would be valuable, as the readers would be able to spot things across the different documents that the authors of the documents themselves are unaware of.

peer reviews

Posted Aug 1, 2010 18:10 UTC (Sun) by marcH (subscriber, #57642) [Link] (7 responses)

> With the cost of publishing now almost zero, there would still be value in the service of evaluating papers to find the good ones, but I don't think any of the research publishers are really providing that service effectively anymore.

Yet career progression still depends on this service. It is not clear to me how this could be replaced by PageRank.

peer reviews

Posted Aug 1, 2010 21:26 UTC (Sun) by dlang (guest, #313) [Link] (6 responses)

it could be changed to be based on the number of citations of your papers by other papers (which is arguably a better indication of your work's worth than simply the number of papers published)

but if you want to count the papers published, that's pretty simple to do, even without the current publishing companies, simply document what you've published.

this doesn't include information about how good the papers are, but I have my doubts about the existing publishers really doing that anyway.

peer reviews

Posted Aug 1, 2010 21:27 UTC (Sun) by dlang (guest, #313) [Link] (5 responses)

if you really need to have things reviewed, it would be better to have a system where the person submitting the paper pays to have it reviewed rather than the current system where readers have to pay for access to it.

peer reviews

Posted Aug 1, 2010 23:27 UTC (Sun) by corbet (editor, #1) [Link] (4 responses)

Actually, much peer-reviewed publishing has "page charges" to be paid by the author(s) (or their institution). The publishing industry does its best to collect from everybody involved.

peer reviews

Posted Aug 1, 2010 23:45 UTC (Sun) by dlang (guest, #313) [Link] (3 responses)

how much, if any, of this money gets to the people doing the reviews?

peer reviews

Posted Aug 1, 2010 23:47 UTC (Sun) by corbet (editor, #1) [Link] (2 responses)

Zero.

peer reviews

Posted Aug 2, 2010 0:24 UTC (Mon) by dlang (guest, #313) [Link] (1 responses)

what I expected, so it sounds like there should be room for someone to set up something new in this space.

one problem is figuring out how to minimize abuse, but the bigger problem is getting academia to accept it.

University libraries

Posted Aug 2, 2010 2:53 UTC (Mon) by dmarti (subscriber, #11625) [Link]

Professors who contribute to non-Open-Access journals are likely to get the stink eye from the librarian every time they walk in the university library. Library budgets are getting clobbered by increasing subscription prices, as the publishers sell the university's own work back to it.

Background: Open Access Overview

If you're in the USA, please support the Federal Research Public Access Act -- this would at least stop the abuses where federally-funded research is concerned.

real world?

Posted Jul 27, 2010 16:08 UTC (Tue) by deater (subscriber, #11746) [Link] (21 responses)

I'd hesitate to group kernel developers in with the "real world". Kernel developers live in an ivory tower all of their own.

Try explaining to the perf-events developers sometime that you can't just "simply install a devel kernel" on your production system. They won't believe you.

real world?

Posted Jul 27, 2010 16:16 UTC (Tue) by tzafrir (subscriber, #11501) [Link] (20 responses)

What's wrong with installing kernel-devel?

(Installing build tools may be a different issue. I suppose you also dislike having perl and python on that production system).

real world?

Posted Jul 27, 2010 16:29 UTC (Tue) by deater (subscriber, #11746) [Link] (1 responses)

You'd install a git pre-release kernel on your production machine?

real world?

Posted Jul 27, 2010 16:43 UTC (Tue) by nix (subscriber, #2304) [Link]

Sure!

... inside qemu, in off hours while load is low.

On the machine itself: aircraft carriers would sooner fly.

real world?

Posted Jul 27, 2010 16:47 UTC (Tue) by mingo (subscriber, #31122) [Link] (12 responses)

Well, i guess i should not feed the trolls, but FYI Vince (deater) is a perfmon2 contributor who has attacked perf events (a subsystem competing with perfmon2) numerous times in the past.

This particular attack has no merit either: the 'perf' package on Fedora is 561K only. If you want the latest devel kernel you can download the kernel SRPM which is ~50 MB. Or if you want the bleeding edge unstable source you can pick up the -tip kernel repo which is a ~61 MB Git repository. (and that's just the first download - subsequent updates go via the highly compressed Git delta changes protocol.)

So the 'several hundreds of megabytes' statement is just plain out wrong.

In any case, this matter has nothing to do with -rt :-)

Thanks, Ingo

real world?

Posted Jul 27, 2010 16:56 UTC (Tue) by tzafrir (subscriber, #11501) [Link]

Kernel source? kernel-devel?

real world?

Posted Jul 27, 2010 17:05 UTC (Tue) by deater (subscriber, #11746) [Link] (10 responses)

ummm... what "several hundred megabytes" statement?

In any case, in the "real world" that I am forced to live in, most of our production machines are stuck at 2.6.32 (or earlier) due to vendor and hardware support issues. Thus perfectly avoidable issues caused by including code in the kernel that could better be handled in userspace become more or less impossible to fix in the short term. When this was reported, we were told that we should just upgrade to 2.6.35-rc4 or whatever, which really doesn't fly.

I don't think this is really a "troll"; or if it is, then so is the whole article which makes many generalizations about researchers in academia.

real world?

Posted Jul 27, 2010 18:33 UTC (Tue) by mingo (subscriber, #31122) [Link] (9 responses)

> ummm... what "several hundred megabytes" statement?

Here's a link to your earlier statement about this topic from not so long ago. Quote:

> It would also be nice to be able to build perf without having to download the entire kernel tree, I often don't have the space or the bandwidth for hundreds of megabytes of kernel source when I just want to quick build perf on a new machine.

(Plus your claim that there is no "perf only" mailing list is wrong as well, there's one at linux-perf-users@vger.kernel.org.)

Thanks,

Ingo

real world?

Posted Jul 28, 2010 2:46 UTC (Wed) by deater (subscriber, #11746) [Link] (8 responses)

Ummm, who is being off topic now? Though I'm glad to hear that some of the issues I raised 4 months ago have finally been addressed. As you probably noticed I've moved on to different issues.

I wouldn't mind this article if it phrased things as academia being very different than kernel devel. I do object to the idea of kernel devel being the real-world common case. I'm pretty sure for most people the real world is being stuck in userspace, often without the ability to do things as root. As a user, building a custom library in my home dir and linking my tools against it is easy; getting the sysadmins to replace the kernel is hard.

I brought up the recent perf-events issue as there's a large overlap between the rt-linux developers and the perf developers, and the whole idea of what constitutes real-world to perf developers came up recently in this lkml thread.

real world?

Posted Jul 28, 2010 3:18 UTC (Wed) by foom (subscriber, #14868) [Link] (6 responses)

FWIW, in my real world, it's usually easier to push out a new kernel than to upgrade userspace. Got some Fedora 3 boxes running 2.6.26 kernels, for instance.

real world?

Posted Jul 28, 2010 15:52 UTC (Wed) by nix (subscriber, #2304) [Link] (5 responses)

This depends very much on the position of that bit of userspace in the dependency chain, for me.

Upgrade glibc? Much more hair-raising and rarely done than a kernel upgrade. Upgrade some heavily-used library like libpng to an incompatible release? Not common, and not often done. Upgrade a userspace performance counter library? This isn't going to be the most widely-used thing on earth, upgrading it should be easy. Plus, as deater points out, random users can do this and keep it out of the way of other users completely. Random users cannot upgrade kernels, no matter what they do. Only the machine's sysadmins can do that, and often refuse on production systems unless the need is utterly horrifically critical.

real world?

Posted Aug 1, 2010 7:03 UTC (Sun) by mingo (subscriber, #31122) [Link] (3 responses)

> This depends very much on the position of that bit of userspace in the dependency chain, for me.

It largely depends on how serious the effects of a bad upgrade are and how hard it is to go back to the old component.

The kernel is unique there: there can be multiple kernel packages installed at once, and switching between them is as easy as selecting a different kernel on bootup.

With glibc (or with any other user-space library) there is no such multi-version capability: if the glibc upgrade went wrong and even /bin/ls is segfaulting then it's game over and you are on to a difficult and non-standard system recovery job.

So yes, i agree with the grandparent and i too see it in the real world that the kernel is one of the easiest components to upgrade and is one of the easiest components to downgrade. It's also very often dependency-less. (there's a small halo of user-space tools like mkinitrd but nothing that affects many apps)

Try to upgrade/downgrade Xorg or glibc from a rescue image. I've yet to see a distro that allows that in an easy way.

(The only inhibitor to kernel upgrades are environments where rebooting is not allowed: large, shared systems. Those are generally difficult and constrained environments and you cannot do many forms of bleeding-edge development of infrastructure packages in such environments.)

real world?

Posted Aug 8, 2010 12:33 UTC (Sun) by nix (subscriber, #2304) [Link] (2 responses)

> With glibc (or with any other user-space library) there is no such multi-version capability: if the glibc upgrade went wrong and even /bin/ls is segfaulting then it's game over and you are on to a difficult and non-standard system recovery job.

Though a copy of sash helps immensely there.

xorg is pretty easy to upgrade and downgrade actually because its shared library versioning is so strictly maintained. If you downgrade a really long way you might get burnt by the Xi1 -> Xi2 transition or the loss of the Render extension, but that's about all.

The kernel is particularly easy to upgrade *if you run the system and can reboot it on demand* (which is a good thing given the number of security holes it has!), but if both of those conditions are not true it is nearly impossible to upgrade. (Let's leave out of this the huge number of people running RHEL systems who think they're forbidden from upgrading the kernel by their support contract, even though they aren't...)

real world?

Posted Aug 8, 2010 20:50 UTC (Sun) by mingo (subscriber, #31122) [Link] (1 responses)

I guess we are getting wildly off-topic, but my own personal experience is very different: on my main desktop i run bleeding edge everything (kernel, Xorg, glibc, etc.) and just this year i've been through 3 difficult Xorg breakages which required the identification of some prior version of Xorg and libdrm packages and the unrolling of other dependencies.

One of them was a 'black screen on login' kind of breakage. Xorg breakages typically take several hours to resolve because pre-breakage packages have to be identified, downloaded and the dependency chain figured out - all manually.

Current Linux distributions are utterly incapable of doing a clean 'go back in time on breakage, and do it automatically, and allow it even on a system which was rendered unusable by the faulty package'. This is a big bleeding-edge-testers handicap for any multi-package infrastructure component such as Xorg.

OTOH single-package, multiple-installed-versions packages (such as the kernel) are painless: i don't remember when i last had a kernel breakage that prevented me from using my desktop - and if i did, it took me no more than 5 minutes to resolve via: 'reboot, select previous kernel, there you go'.

glibc is _mostly_ painless for me, because breakages are rare - it's a very well-tested project. But if glibc breaks it's horrible to resolve due to not having multiple versions installed: everything needs glibc. My last such experience was last year, and it required several hours of rescue image surgery on the box to prevent a full reinstall - and all totally outside the regular package management system.

Plain users or even bleeding edge developers generally don't have the experience or time to resolve such problems, and as a result we have very very few bleeding edge testers for most big infrastructure packages but the kernel.

Thanks,

Ingo

real world?

Posted Aug 8, 2010 21:35 UTC (Sun) by nix (subscriber, #2304) [Link]

Oh, gods, yes, the libdrm/Mesa/driver-version combination nightmare is a tricky one I'd forgotten about. Of course that itself is sort of kernel-connected, because the only reason most of that stuff is revving so fast is because of changes on the kernel side :)

The go-back-in-time stuff is probably best supported by the Nix distro (no relation); the downside of that is that it requires an insane amount of recompilation whenever anything changes because it has no understanding of ABI-preservation conventions (assuming that all library users need to be rebuilt whenever a library is).
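To make the recompilation cost concrete, here is a minimal sketch of the "rebuild every user of a changed library" rule described above. The package graph and the names `depends_on` and `rebuild_set` are hypothetical, and this is not meant to reflect how the Nix distro is actually implemented; it only shows how quickly the rebuild set grows when there is no notion of ABI compatibility.

```python
from collections import deque


def rebuild_set(depends_on, changed):
    """Return every package that directly or indirectly depends on `changed`."""
    # Invert the graph: map each package to the packages that use it.
    users = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            users.setdefault(dep, set()).add(pkg)

    # Breadth-first walk over the users, collecting everything reachable.
    to_rebuild = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in users.get(current, ()):
            if user not in to_rebuild:
                to_rebuild.add(user)
                queue.append(user)
    return to_rebuild


# Hypothetical dependency graph, for illustration only.
depends_on = {
    "libpng": ["glibc"],
    "xorg": ["libpng", "glibc"],
    "browser": ["xorg"],
    "sash": [],
}

print(sorted(rebuild_set(depends_on, "libpng")))
print(sorted(rebuild_set(depends_on, "glibc")))
```

In this toy graph a change to glibc pulls in everything except the package with no dependencies, which is the effect described above.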

real world?

Posted Aug 8, 2010 12:38 UTC (Sun) by nix (subscriber, #2304) [Link]

Let me elaborate on 'utterly horrifically critical' here. 'fork() fails due to the machine being a 64-bit box with 64GB RAM yet running a 32-bit kernel and having run out of low memory' is not sufficiently critical: the database is still running, after a fashion, and that's what matters.
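A rough back-of-the-envelope calculation shows why such a box runs out of low memory so easily. The constants below (4 KiB pages, 32 bytes of per-page bookkeeping, 896 MiB of directly mapped low memory) are assumptions typical of 32-bit x86 kernels of that era, not figures taken from the machine in question.

```python
# Estimate how much low memory per-page bookkeeping consumes on a 32-bit kernel.
# All constants are assumptions for illustration, not measured values.

PAGE_SIZE = 4 * 1024            # assumed 4 KiB pages
PAGE_DESCRIPTOR_BYTES = 32      # assumed size of one per-page descriptor
LOWMEM_BYTES = 896 * 1024**2    # assumed directly mapped low memory


def page_descriptor_overhead(ram_bytes):
    """Return (number of pages, bytes of per-page descriptors) for ram_bytes of RAM."""
    pages = ram_bytes // PAGE_SIZE
    return pages, pages * PAGE_DESCRIPTOR_BYTES


ram = 64 * 1024**3
pages, overhead = page_descriptor_overhead(ram)
lowmem_share = overhead / LOWMEM_BYTES

print(f"{pages} pages, {overhead // 1024**2} MiB of descriptors, "
      f"{lowmem_share:.0%} of low memory")
```

Under these assumptions more than half of low memory is gone before any process has been started, which leaves little room for the kernel allocations that fork() needs.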

They're scared of going to 64-bit kernels no matter what the benefits because that's not what they currently have installed so 'stuff might break' (as if 'cannot fork()' is not breakage): 32->64, the kernels simply must be completely different, right? Have to retest everything.

This is not a rare attitude among people who run big production Oracle systems without really knowing what they're doing because they're Oracle DBAs at heart, who learnt to handle Solaris and have now been forced to Linux by the lower costs. Yes, you'd hope that everyone running big iron databases backing huge financial things with billions riding on them would have a sysadmin who understood the machine a bit as well as DBAs who understood the database: you'd be wrong.

real world?

Posted Jul 28, 2010 20:09 UTC (Wed) by tglx (subscriber, #31301) [Link]

> I brought up the recent perf-events issue as there's a large overlap between the rt-linux developers and the perf developers, and the whole idea of what constitutes real-world to perf developers came up recently in this lkml thread.

This LKML thread says in your own words:

"...This is why event selection needs to be in user-space... it could be fixed instantly there, but the way things are done now it will take months to years for this fix to filter down to those trying to use perf counters..."

That's complete and utter bullshit and you know that.

The interface allows you to specify raw event codes. So you can fix your problem entirely in user space even w/o writing a single line of code.
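As an illustration of what a raw event code looks like, here is a small sketch that builds the `rNNNN` form accepted by the perf tool's `-e` option. It assumes the common x86 layout (event select in bits 0-7, unit mask in bits 8-15); the helper names are hypothetical, and other CPUs encode their events differently.

```python
# Sketch: build a raw event specifier of the form used with `perf stat -e rNNNN`.
# Assumes the common x86 layout: event select in bits 0-7, unit mask in bits 8-15.


def raw_event_config(event_select, umask):
    """Combine an event select and unit mask into one raw config value."""
    if not 0 <= event_select <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("event_select and umask must each fit in one byte")
    return (umask << 8) | event_select


def perf_event_name(event_select, umask):
    """Format the raw config as a hexadecimal rNNNN event name."""
    return f"r{raw_event_config(event_select, umask):x}"


# Example values: event 0x2E with unit mask 0x41, which Intel documents as the
# architectural last-level cache miss event.
llc_misses = perf_event_name(0x2E, 0x41)
print(llc_misses)
```

The resulting string can be passed to the tool directly, so a wrong entry in the kernel's event table can be worked around without waiting for a new kernel.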

Stop spreading FUD.

Thanks,

tglx

real world?

Posted Jul 27, 2010 21:55 UTC (Tue) by roskegg (subscriber, #105) [Link] (3 responses)

Believe it or not, the Debian and Ubuntu kernel build systems are baroque. And if you just build your devel kernel in the standard Linux way, Debian and Ubuntu can break mysteriously. It isn't pretty.

real world?

Posted Jul 27, 2010 22:27 UTC (Tue) by dlang (guest, #313) [Link] (2 responses)

I've been running custom kernels on debian systems for about 7 years without running into any problems

it depends on how you define 'the standard linux way'

are you using make install to install the kernel? (works if you are installing it on the machine you build it on)

are you using the make system to create a .deb/.rpm file (I don't remember the make command for this, is it make kpkg?)

if you are doing neither, then you have the same problem that you can have on any distro: the modules directory is per kernel version, and if you don't compile all the modules you need you may end up using modules that were compiled from a prior version.
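A quick way to check for this is to see whether a modules directory exists for the exact release string of the kernel in question. The sketch below assumes the conventional /lib/modules/<release> layout; the helper names are made up for illustration.

```python
import os
import platform


def modules_dir(release, root="/lib/modules"):
    """Return the conventional modules directory for a kernel release string."""
    return f"{root}/{release}"


def has_modules(release, root="/lib/modules"):
    """Report whether a modules directory exists for that exact release."""
    return os.path.isdir(modules_dir(release, root))


# Check the kernel that is currently running.
running = platform.release()
print(modules_dir(running), has_modules(running))
```

If the directory is missing, or was left behind by an older build with the same release string, the modules that get loaded may not match the kernel.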

I avoid these problems by not using modules for my production servers. This lets me just worry about installing the kernel file itself on the systems.

real world?

Posted Jul 28, 2010 2:37 UTC (Wed) by deater (subscriber, #11746) [Link] (1 responses)

I guess we all have different definitions of "real world". Unfortunately in my world, I don't have root access on most machines I use day to day, and I also don't control what kernel gets booted on them. The world of being a plain user is very different from that of a kernel dev.

Sure, but what this has to do with anything?

Posted Jul 28, 2010 6:39 UTC (Wed) by khim (subscriber, #9252) [Link]

If you can't replace the kernel then that's your problem: escalate to someone who can replace the kernel. If no one can replace the kernel then kernel developers have nothing to do with it - it's your own mess and you must do something about it.

Try asking a motor mechanic to fix your car without opening the hood sometime and hear what he thinks about this idea.

real world?

Posted Jul 29, 2010 23:41 UTC (Thu) by chad.netzer (subscriber, #4257) [Link]

Not 'kernel-devel' (an rpm with header files), but a 'devel kernel' (ie. a development kernel, aka v2.6.35-rc6)

Realtime Linux: academia v. reality

Posted Jul 28, 2010 6:06 UTC (Wed) by jmspeex (guest, #51639) [Link] (1 responses)

> Research often tries to solve yesterday's problems over and over

I think at least part of the reason for that is the publication process. Between the time you actually do some work and the time it gets published, you can easily have 3 years, sometimes more. Most of that is due to the often painful and slow review process. On top of that, you have to consider that PhD students have to choose a topic early on and then stick to it. So if you spend 5 years working on a problem and then at the end you submit a paper that takes 3 years to get published, you end up with something that solves a problem as it was 8 years ago. Then of course you have the ones that *keep* working on problems that have become irrelevant.

Realtime Linux: academics kind of scarce!

Posted Jul 28, 2010 23:51 UTC (Wed) by alison (subscriber, #63752) [Link]

The group I work in at Stanford Linear Accelerator Center is quite interested in real-time Linux and we've had a (long) series of meetings to discuss our strategy as we migrate at least in part from VxWorks/VME and RTEMS/m68K. My boss agreed with me that having a PhD or Master's student work on a related problem (say multicore scheduling) would be useful. I started poking around the conference proceedings looking for profs at UC Berkeley, Stanford, San Jose State or Santa Clara University to collaborate with, and I don't think there are any. As far as I can tell, there are no faculty interested in any aspect of Linux at these distinguished universities. That's sad given their histories. Is it a worrisome sign for Linux? I don't know, but certainly having local CS departments discussing Linux couldn't hurt. Industrial interest in Linux (including RT embedded) continues to grow locally.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 2:09 UTC (Thu) by emnahum (guest, #5130) [Link]

A note on IEEE and the paywall: USENIX has all of its content online, for free. It frequently includes slides and video of the talks. OSDI, NSDI, ATC, etc.

I thought it odd that Thomas only considered IEEE as a source for OS research. My experience has been that ACM and USENIX are much better (higher quality) sources. Perhaps the real-time OS community is more associated with IEEE.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 2:16 UTC (Thu) by deater (subscriber, #11746) [Link] (4 responses)

You complain that academics are too lazy to post their findings on linux-kernel, the accepted way of interacting with kernel people.

Yet you also admit you are too lazy to write up an academic paper on what you've discovered from real-world Linux implementation, which would be the accepted way of interacting with academic people.

I see a failure on both sides here.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 12:17 UTC (Thu) by mpr22 (subscriber, #60784) [Link] (2 responses)

Writing a publication-quality academic paper is a rather more specialized skill than writing an e-mail fit to be sent to LKML. As such, it is the academics who can more reasonably be asked to make the extra effort here.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 13:17 UTC (Thu) by deater (subscriber, #11746) [Link] (1 responses)

Having done both, I think the kernel community and academia are surprisingly similar.

In both cases if you are only proposing an incremental idea/patch, it's fairly easy to get that looked at and considered.

However if you are planning something drastic, expect months to years of "peer review" (be it by entrenched professors or else just no-nonsense kernel devs).

The process can be long and arbitrary, and it turns off enough people that they don't bother trying. And yes, you have to fight politics, on both ends, and outsiders are treated with suspicion at first.

Realtime Linux: academia v. reality

Posted Jul 30, 2010 15:44 UTC (Fri) by sorpigal (subscriber, #36106) [Link]

The difference is that for the lkml you can fire-and-forget, if you so choose, and not worry about whether it gets accepted.

Realtime Linux: academia v. reality

Posted Jul 29, 2010 20:57 UTC (Thu) by tglx (subscriber, #31301) [Link]

If you had read what I wrote without your blinders on, then you would have noticed that I precisely pointed out that there is a problem on both sides of the fence.

So what's the point?

Criticizing Research

Posted Jul 30, 2010 18:02 UTC (Fri) by daglwn (guest, #65432) [Link]

> Comparing and contrasting research results is almost impossible.

This is absolutely true. Half of my Ph.D. thesis was about the great difficulty of doing this. The research practice in the computing fields is broken. The hard sciences require reproducible results. We don't even require all of the assumptions to be stated. In my experience, it is impossible to reproduce experimental results from any computing paper because the experimental setup is not described fully.

Free Software could be a big help. If researchers were required to submit their code along with the paper it would go a long way to allowing others to not only verify the results but build upon them. But there's a great cultural fear of getting "scooped" that needs to be addressed before this can happen. Academic research in computing is much more competitive than collaborative and that's really the fundamental problem.

Realtime Linux: academia v. reality

Posted Jul 31, 2010 23:44 UTC (Sat) by marcH (subscriber, #57642) [Link]

Thanks Thomas for a sharp description of the academic world from an outsider (among others).

As you and many others have noted, the incredible amount of noise and hair-splitting is simply due to the "publish or perish" pressure, and also to the large number of researchers compared to the limited number of innovations of general interest.

I am nevertheless convinced that a better and stronger collaboration between industry and academia would be extremely beneficial to both the industry and the public (= Academia's employer in theory). Researchers have analysis and prototyping skills often badly needed in the industry. Researchers often ignore interesting industry problems (especially interesting since they are real).

I think the only way to achieve such a collaboration would be to have a significant number of individuals "bridging" these two very different universes by moving back and forth between the two. This number is not zero but it will unfortunately never be significant since building a career in the academic world is very painful and slow. So you cannot enter the academic game late and, apart from a few exceptional individuals, most researchers cannot afford to leave it for a while. In brief, I think the overly inflexible academic universe is the main one to blame here.