Wednesday 5 May 2010

The trouble with Buildbot

The trouble with Buildbot is that it encourages you to put rules into a Buildbot-specific build configuration that is separate from the normal configuration files that you might use to build a project (configure scripts, makefiles, etc.).

This is not a big problem if your Buildbot configuration is simple and just consists of, say, "svn up", "./configure", "make", "make test", and never changes.

But it is a problem if your Buildbot configuration becomes non-trivial and ever has to be updated, because the Buildbot configuration cannot be tested outside of Buildbot.

The last time I had to maintain a Buildbot setup, it was necessary to try out configuration changes directly on the Buildbot master. This doesn't work out well if multiple people are responsible for maintaining the setup! Whoever makes a change has to remember to check it in to version control after they've got it working, which of course doesn't always happen. It's a bit ironic that Buildbot is supposed to support automated testing but doesn't follow best practices for testing itself.

There is a simple way around this though: Instead of putting those separate steps -- "./configure", "make", "make test" -- into the Buildbot config, put them into a script, check the script into version control, and have the Buildbot config run that script. Then the Buildbot config just consists of doing "svn up" and running the script. It is then possible to test changes to the script before checking it in. I've written scripts like this that go as far as debootstrapping a fresh Ubuntu chroot to run tests in, which ensures your package dependency list is up to date.

Unfortunately, Buildbot's logging facilities don't encourage having a minimal Buildbot config.

If you use a complicated Buildbot configuration with many Buildbot steps, Buildbot can display each step separately in its HTML-formatted logs. This means:

  • you can see progress;
  • you can see which steps have failed;
  • you'd be able to see how long the steps take if Buildbot actually displayed that.

Whereas if you have one big build script in a single Buildbot step, all the output goes into one big, flat, plain text log file.

I think the solution is to decouple the structured-logging functionality from the glorified-cron functionality that Buildbot provides. My implementation of this is build_log.py (used in Plash's build scripts), which I'll write more about later.

Saturday 1 May 2010

Breakpoints in gdb using int3

Here is a useful trick I discovered recently while debugging some changes to the seccomp sandbox. To trigger a breakpoint on x86, just do:
__asm__("int3");

Then it is possible to inspect registers, memory, the call stack, etc. in gdb, without having to get gdb to set a breakpoint. The instruction triggers a SIGTRAP, so if the process is not running under gdb and has no signal handler for SIGTRAP, the process will die.

This technique seems to be fairly well known, although it's not mentioned in the gdb documentation. int3 is the instruction that gdb uses internally for setting breakpoints.

Sometimes it's easier to insert an int3 and rebuild than get gdb to set a breakpoint. For example, setting a gdb breakpoint on a line won't work in the middle of a chunk of inline assembly.

My expectations of gdb are pretty low these days. When I try to use it to debug something low level, it often doesn't work, which is why I have been motivated to hack together my own debugging tools in the past. For example, if I run gdb on the glibc dynamic linker (ld.so) on Ubuntu Hardy or Karmic, it gives:

$ gdb /lib/ld-linux-x86-64.so.2 
...
(gdb) run
Starting program: /lib/ld-linux-x86-64.so.2 
Cannot access memory at address 0x21ec88
(gdb) 

So it's nice to find a case where I can get some useful information out of gdb.