Dispatches from the compiler front
The up-and-coming LLVM compiler has irritated some GCC developers for a while now; LLVM apparently comes off as an upstart trying to muscle into territory that GCC has long owned. So it's not surprising that the relationship between the two projects occasionally gets a little frosty.
Consider the case of DragonEgg, a GCC plugin which replaces the bulk of GCC's optimization and code-generation system with the LLVM implementation. DragonEgg is clearly a useful tool for LLVM developers, who can focus on improving the backend code while making use of GCC's well-developed front ends. Jack Howarth recently proposed the addition of DragonEgg as an official part of the GCC code base. Some developers welcomed the idea; Basile Starynkevitch, for example, thought it would make a good plugin example. But from others came complaints like this:
It's not clear that this is a majority opinion; some GCC developers see DragonEgg as an easy way to try out LLVM code and compare it against their own. If LLVM comes out on top, GCC developers can then figure out why or, possibly, just adopt the relevant LLVM code. Those developers see only benefit in some cooperative competition between the projects.
Others, though, see the situation as more of a zero-sum game; when viewed through that lens, cooperation with LLVM would appear to make little sense. But free software is not a zero-sum game; the more we can learn from each other, the better off we all are. GCC need not worry about being displaced by LLVM (or anything else) any time in the near future. Barring technical issues with the merging of DragonEgg (and none have been mentioned), accepting the code seems like it should be ultimately beneficial to the project.
In a side discussion, GCC developers wondered why LLVM seems to be more successful in attracting developers and mindshare in general. One suggestion was that LLVM has a clear leader who is able to set the direction of the project, while GCC is more scattered. Others have a different view; in this context, Ian Lance Taylor's notes are worth a look:
There is also the matter of the old code base, the lack of a clean separation between passes, and, most important, weak internal documentation.
Some of these issues are being fixed; others will take longer. It seems clear that attending to these problems is important for the long-term future of the project.
Lest things look too grim, though, it's worth perusing this posting from Taras Glek on his success with the GCC "profile-guided optimization" (PGO) feature. PGO works by building an instrumented binary, running it to collect profile data, and then rebuilding the program with optimization driven by that data. With Firefox, Taras was able to cut the startup time by one third and to reduce initial memory use considerably as well. Taras says:
There's no shortage of interesting, development-oriented tools being integrated into GCC, and the addition of the plugin architecture can only result in an acceleration of this process. Things have reached a point where more projects should probably be looking into the use of these tools to improve the experience for their users.
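For readers who have not tried PGO, the idea can be illustrated with a toy model. With GCC itself, the workflow is typically a build with -fprofile-generate, a training run of the resulting binary, and a rebuild with -fprofile-use. The Python sketch below is only an analogy, not GCC's mechanism: the handler table, the training data, and every function name are hypothetical, and the only "optimization" shown is testing the most frequently taken branch first.

```python
from collections import Counter

# Hypothetical dispatch table: (name, predicate) pairs checked in order.
HANDLERS = [
    ("is_negative", lambda x: x < 0),
    ("is_zero", lambda x: x == 0),
    ("is_positive", lambda x: x > 0),
]

def classify(value, handlers, profile=None):
    """Return (matching handler name, number of predicates evaluated)."""
    for checks, (name, predicate) in enumerate(handlers, start=1):
        if predicate(value):
            if profile is not None:
                profile[name] += 1  # instrumentation: count the taken branch
            return name, checks
    raise ValueError("no handler matched")

def total_checks(values, handlers):
    """Total number of predicate evaluations needed to classify all values."""
    return sum(classify(v, handlers)[1] for v in values)

# Step 1: the "instrumented" training run collects branch counts.
training = [5, 7, 0, 9, 3, -1, 8, 2, 6, 4]
profile = Counter()
for v in training:
    classify(v, HANDLERS, profile)

# Step 2: the "rebuild" orders the checks from hottest to coldest.
optimized = sorted(HANDLERS, key=lambda h: profile[h[0]], reverse=True)

baseline_cost = total_checks(training, HANDLERS)
optimized_cost = total_checks(training, optimized)
print(baseline_cost, optimized_cost)
```

On this (mostly positive) training input, the reordered table needs fewer predicate evaluations while returning the same classifications. A real compiler uses profile data for many more decisions, such as inlining and code layout, and the benefit depends on how well the training run represents actual use.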
Meanwhile, on the LLVM side, the developers have recently unveiled the LLVM MC project. "MC" stands for "machine code" in this context; in short, the LLVM developers are trying to integrate the assembler directly into the compiler. There are a number of reasons for doing this, including performance (formatting text for a separate assembler and running that assembler are expensive operations), portability (not all target systems have an assembler out of the box), and the ability to easily add support for new processor instructions. Much of this functionality is required anyway for LLVM's just-in-time compiler features, so it makes sense to just finish the job.
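The difference between the two approaches can be sketched in a few lines. The following Python fragment is an illustration only and does not use or resemble the LLVM MC API; the function names are hypothetical. It produces the same two 32-bit x86 instructions twice: once as AT&T-syntax text that a separate assembler would still have to parse, and once as the final machine-code bytes, assuming the standard encodings (opcode B8 followed by a little-endian 32-bit immediate for a move into EAX, and C3 for a near return).

```python
import struct

def emit_text(imm):
    """Textual path: format assembly that a separate assembler must parse."""
    return f"movl ${imm}, %eax\nret\n"

def emit_bytes(imm):
    """Direct path: encode the same two x86 instructions as machine code."""
    # Opcode B8 is "mov eax, imm32"; the immediate follows in little-endian order.
    mov_eax_imm32 = b"\xb8" + struct.pack("<I", imm)
    ret = b"\xc3"  # near return
    return mov_eax_imm32 + ret

print(emit_text(42), end="")
print(emit_bytes(42).hex())
```

The direct path skips both the text formatting and the later parsing step, which is where the performance argument comes from; it also means that the compiler must carry the encoding knowledge that would otherwise live in the assembler.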
This work appears to be fairly well advanced, with much of the basic functionality in place. Chris Lattner says:
In summary: there is currently a lot going on in the area of development toolchains. Given that all of us - including those who do no development - depend on those toolchains, this can only be a good thing. Computers can do a lot to make the task of programming them easier and more robust; despite the occasional glitch, developers for both GCC and LLVM appear to be working hard to realize that potential.