The last word on SCons performance

My previous look at SCons performance compared SCons and gmake on a variety of build scenarios — full, incremental, and clean. A few people suggested that I try the tips given on the SCons ‘GoFastButton’ wiki page, which are said to significantly improve SCons performance (at the cost of some accuracy, of course). Naturally, I felt that I had to do one last follow-up exploring this avenue. And since that meant I would already be running a bunch of builds, I figured I’d try out SCons’ parallel build features too. My findings follow.
Read the rest of this entry »

What’s new in GNU make 3.82

GNU make 3.82 hit the streets last week, the first new release of the workhorse build tool in over four years. Why so long between releases? To me the answer is obvious: the tool Just Works ™, so there’s no need to churn out new releases chasing the latest development fad. But as this release shows, there is still room to innovate, without compromising on the points that make the tool so great. The two improvements I find most interesting are .ONESHELL, and changes to pattern-search behavior:
Read the rest of this entry »

A second look at SCons performance

UPDATE: In response to comments here and elsewhere, I’ve done another series of SCons builds using the tips on the SCons ‘GoFastButton’ wiki page. You can view the results here.


A few months ago, I took a look at the scalability of SCons, a popular Python-based build tool. The results were disappointing, to say the least. That post stirred up a lot of comments, both here and in other forums. Several people pointed out that a comparison with other build tools would be helpful. Some suggested that SCons’ forte is really incremental builds, rather than the full builds I used for my test. I think those are valid points, so I decided to revisit this topic. This time around, I’ve got head-to-head comparisons between SCons and GNU make, the venerable old workhorse of build tools, as I use each tool to perform full, incremental, and clean builds. Read on for the gory details — and lots of graphs. Spoiler alert: SCons still looks pretty bad.
Read the rest of this entry »

Designing for high performance

Here’s the thing about high performance: you can’t just bolt it on at the end. It’s got to be baked in from day one. No doubt those of you who are experienced developers are now invoking the venerable Donald Knuth, who once said, “Premature optimization is the root of all evil.” But look at it this way: with very rare exceptions, no amount of performance tuning will turn an average system into a world class competitor.

Of course, high performance is the entire raison d’être for ElectricAccelerator. We knew from the start that parallelism would be the primary means of achieving our performance goals (although it’s not the only trick we used). Thanks to Amdahl’s law, we know that in order to accelerate a build by 100x, the serialized portion cannot be more than 1% of the baseline time. Thus it’s critical that absolutely everything that can be parallelized, is parallelized. And I mean everything, even the stuff that you don’t normally think about, because anything that doesn’t get parallelized disproportionately saps our performance. Anything that isn’t parallelized is a bottleneck.
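To make the Amdahl’s law arithmetic concrete, here is a minimal sketch in Python. The function names are mine, chosen for illustration; they are not part of ElectricAccelerator. It uses the standard form of the law, speedup = 1 / (s + (1 - s) / N), where s is the serialized fraction of the baseline time and N is the number of parallel workers.

```python
def amdahl_speedup(serial_fraction, workers):
    """Speedup predicted by Amdahl's law for a given serialized fraction."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def max_serial_fraction(target_speedup):
    """Largest serialized fraction that still permits the target speedup.

    As the worker count grows without bound, the speedup approaches
    1 / serial_fraction, so the fraction can be at most 1 / target.
    """
    return 1.0 / target_speedup


if __name__ == "__main__":
    print(max_serial_fraction(100))    # ceiling on the serialized portion
    print(amdahl_speedup(0.01, 100))   # 1% serialized, 100 workers
    print(amdahl_speedup(0.05, 100))   # 5% serialized, 100 workers
```

Note that the 1% figure is a ceiling that is only reached with unlimited workers: with 1% serialized and 100 workers, the formula predicts a speedup of only about 50x.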
Read the rest of this entry »

Using Markov Chains to Generate Test Input

One challenge that we’ve faced at Electric Cloud is how to verify that our makefile parser correctly emulates GNU Make. We started by generating test cases based on a close reading of the gmake manual. Then we turned to real-world examples: makefiles from dozens of open source projects and from our customers. After several years of this we’ve accumulated nearly two thousand individual tests of our gmake emulation, and yet we still sometimes find incompatibilities. We’re always looking for new ways to test our parser.

One idea is to generate random text and use that as a “makefile”. Unfortunately, truly random text is almost useless in this regard, because it doesn’t look anything like a real makefile. Instead, we can use Markov chains to generate random text that is very much like a real makefile. When we first introduced this technique, we uncovered 13 previously unknown incompatibilities — at the time that represented 10% of the total defects reported against the parser! Read on to learn more about Markov chains and how we applied them in practice.
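For readers who have not met the technique, here is a minimal sketch of the idea in Python. It is an illustration under my own assumptions, not the generator we use at Electric Cloud: it works on characters rather than tokens, the order of 3 is arbitrary, and SAMPLE is a tiny made-up makefile standing in for a real training corpus.

```python
import random
from collections import defaultdict

# Hypothetical training text; a real corpus would be many real makefiles.
SAMPLE = (
    "CC = gcc\n"
    "CFLAGS = -O2 -Wall\n"
    "all: prog\n"
    "prog: main.o util.o\n"
    "\t$(CC) $(CFLAGS) -o $@ $^\n"
    "%.o: %.c\n"
    "\t$(CC) $(CFLAGS) -c $< -o $@\n"
)


def build_chain(text, order=3):
    """Map each run of `order` characters to the characters seen after it."""
    chain = defaultdict(list)
    for i in range(len(text) - order):
        chain[text[i:i + order]].append(text[i + order])
    return chain


def generate(chain, length, seed=None):
    """Walk the chain to produce up to `length` characters of fake makefile."""
    rng = random.Random(seed)
    out = rng.choice(sorted(chain))  # start from a random observed state
    order = len(out)
    while len(out) < length:
        followers = chain.get(out[-order:])
        if not followers:  # state only seen at the end of the input
            break
        out += rng.choice(followers)
    return out


if __name__ == "__main__":
    print(generate(build_chain(SAMPLE), 200, seed=1))
```

Because every step picks a character that actually followed the current state in the training text, the output is locally plausible makefile syntax, even though the whole is nonsense. That combination is what makes it useful as parser test input.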
Read the rest of this entry »

Makefile performance: built-in rules

Like any system that has evolved over many years, GNU Make is rife with appendages of questionable utility. One area where this is especially noticeable is the collection of built-in rules in gmake. These rules make it possible to do things like compile a C source file to an executable without even having a makefile, or compile and link several source files with a makefile that simply names the executable and each of the objects that go into it.

But this convenience comes at a price. Although some of the built-in rules are still relevant in modern environments, many are obsolete or uncommonly used at best. When’s the last time you compiled Pascal code, or used SCCS or RCS as your version control system? And yet every time you run a build, gmake must check every source file against each of these rules, on the off chance that one of them might apply. A simple tweak to your GNU Make command-line is all it takes to get a performance improvement of up to 30% out of your makefiles. Don’t believe me? Read on.
Read the rest of this entry »

Friday Fun: Generating Fibonacci Numbers with GNU Make

Nobody would ever claim that GNU Make is a general purpose programming language, but with a little work, we can coerce it into generating Fibonacci numbers for us. Why bother? Because we can.
Read the rest of this entry »

Rules with Multiple Outputs in GNU Make

I recently wrote an article for CM Crossroads exploring various strategies for handling rules that generate multiple output files in GNU make. If you’ve ever struggled with this problem, you should check out the article. I don’t want to spoil the exciting conclusion, but it turns out that the only way to capture this relationship correctly in GNU make syntax is with pattern rules. That’s great if your input and output files share a common stem (e.g., “parser” in parser.i, parser.c and parser.h), but if your files don’t adhere to that convention, you’re stuck with one of the alternatives, each of which has some strange caveats and limitations.

Here’s a question for you: if ElectricAccelerator had an extension that allowed you to explicitly mark a non-pattern rule as having multiple outputs, would you use it? For example:

#pragma multi
something otherthing: input
	@echo Generating something and otherthing from input...

What do you think? Comments encouraged.

ElectricAccelerator vs. distcc: samba reloaded

In an earlier post I compared the performance of ElectricAccelerator and distcc by building samba using each tool in turn on the same cluster. In that test I found that Accelerator bested distcc at suitably high levels of parallelism, but that distcc narrowly beat Accelerator at lower levels of parallelism. At the time I chalked the difference up to slightly higher overhead associated with Accelerator. But you must have known I couldn’t just leave it at that. I had to know where the overhead was coming from, and eliminate it, if possible. The exciting conclusion is after the break.
Read the rest of this entry »

Makefile performance: pattern-specific variables

If you’ve been using GNU make for some time, you are probably familiar with both pattern rules and target-specific variables. You may even be familiar with the intersection of these features: pattern-specific variables. But you may not be aware of a subtle change in gmake 3.81 which affects the processing of pattern-specific variables with potentially disastrous performance consequences.

Read the rest of this entry »