<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-3888382143307542639</id><updated>2012-01-19T06:32:32.363Z</updated><category term='packaging'/><category term='automated-testing'/><category term='python'/><category term='native-client'/><title type='text'>Lacking Rhoticity</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>40</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-7725825717843602559</id><published>2011-11-19T20:24:00.007Z</published><updated>2011-12-02T00:02:33.569Z</updated><title type='text'>Stack unwinding risks on 64-bit Windows</title><content type='html'>&lt;p&gt;
Recently, I've been looking at how x86-64 Windows does stack unwinding in 64-bit processes, and I've found some odd behaviour.  If the stack unwinder finds a return address on the stack that does not have associated unwind info, it applies a fallback unwind rule that does not make much sense.

&lt;p&gt;
I've been wondering if this could make some x86-64 programs more easily exploitable if they corrupt the stack or if they use x86-64 assembly code that does not have unwind info.

&lt;p&gt;
In pseudocode, the current unwind logic looks something like this:

&lt;pre&gt;
void unwind_stack(struct register_state regs) {
  while (???) {
    unwind_info = get_unwind_info(regs.rip);
    if (unwind_info) {
      if (has_exception_handler(unwind_info) {
        // Run exception handler...
      }
      regs = unwind_stack_frame(unwind_info, regs);
    }
    else {
      &lt;strong&gt;// Fallback case for leaf functions:&lt;/strong&gt;
      &lt;strong&gt;regs.rip = *(uint64_t *) regs.rsp;&lt;/strong&gt;
      &lt;strong&gt;regs.rsp += 8;&lt;/strong&gt;
    }
  }
}
&lt;/pre&gt;

&lt;p&gt;
The issue is that the fallback case only makes sense for the first iteration of the loop.  The fact that it is applied to later iterations is probably just sloppy programming.

&lt;p&gt;
The fallback case is &lt;a href="http://msdn.microsoft.com/en-us/library/8ydc79k6(v=VS.100).aspx"&gt;intended to handle "leaf functions"&lt;/a&gt;.  This means functions that:
&lt;ol&gt;
 &lt;li&gt; do not adjust %rsp, and
 &lt;li&gt; do not call other functions.
&lt;/ol&gt;
&lt;p&gt;
These two properties are related: if a function calls other functions, it must adjust %rsp first, otherwise it does not conform to the x86-64 ABI.

&lt;p&gt;
Since the fallback case is applied repeatedly, the unwinder will happily interpret the stack as a series of return addresses with no gaps between them:

&lt;pre&gt;
     ...
     8 bytes   return address
     8 bytes   return address
     8 bytes   return address
     ...
&lt;/pre&gt;

&lt;p&gt;
However, those are not valid stack frames in the &lt;a href="http://msdn.microsoft.com/en-us/library/ew5tede7.aspx"&gt;x86-64 Windows ABI&lt;/a&gt;.  A valid stack frame for a non-leaf function Foo() (i.e. a function that calls other functions) looks like this:

&lt;pre&gt;
  ------------ 16-byte aligned
    32 bytes   "shadow space" (scratch space for Foo())
     8 bytes   return address (points into Foo()'s caller)
     8 bytes   scratch space for Foo()
  16*n bytes   scratch space for Foo() (for some n &gt;= 0)
    32 bytes   "shadow space" (scratch space for Foo()'s callees)
  ------------ 16-byte aligned
&lt;/pre&gt;

&lt;p&gt;
This layout comes from two requirements in the Windows x86-64 ABI:

&lt;ol type="a"&gt;
 &lt;li&gt; %rsp must be 8mod16 on entry to a function.  (This means %rsp should be 16-byte aligned before a CALL instruction.)  This requirement is &lt;a href="http://en.wikipedia.org/wiki/X86_calling_conventions#x86-64_calling_conventions"&gt;common to Windows and Linux&lt;/a&gt;.
 &lt;li&gt; The caller must also reserve 32 bytes of "&lt;a href="http://msdn.microsoft.com/en-us/library/zthk2dkh.aspx"&gt;shadow space&lt;/a&gt;" on the stack above the return address, which the callee can use as scratch space.  (The x86-64 ABI on Linux does not have this requirement.)
&lt;/ol&gt;

&lt;p&gt;
This means that a function that does this:
&lt;pre&gt;
  bar:
    call qux
    ret
&lt;/pre&gt;
cannot be valid, because:
&lt;ul&gt;
 &lt;li&gt; it does not align the stack correctly on entry to qux();
 &lt;li&gt; more seriously, it does not allocate shadow space, so the return address that bar()'s RET instruction jumps to could have been overwritten by qux().
&lt;/ul&gt;

&lt;p&gt;
Yet the Windows unwinder treats the resulting stack frame as valid.

&lt;p&gt;
The upshot of the the fallback rule is that when the unwinder reaches a return address that does not have unwind info, it will scan up the stack looking for a value that does, looking at each 64-bit value in turn.  The unwinder seems to lack basic sanity checking, so it does not stop even if it hits a zero value (which clearly cannot be a valid return address).

&lt;p&gt;
This has a tendency to mask mistakes.  If you have an assembly function with incorrect or missing unwind info, the unwinder will tend to scan past uninitialised values or scratch data on the stack and recover at a higher-up stack frame.

&lt;p&gt;
The risky part is that the unwinder will be interpreting values as return addresses when they aren't return addresses at all.  If the unwinder hits an address whose unwind info has an associated exception handler, the unwinder will jump to the handler.  This is extra risky because language-level exceptions (e.g. for C++) and hardware exceptions are handled by the same mechanism on Windows, known as &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms680657(v=vs.85).aspx"&gt;Structured Exception Handling (SEH)&lt;/a&gt;.  Both result in stack unwinding on x86-64.  This means that null pointer dereferences trigger stack unwinding.  This is unlike Linux, where hardware exceptions don't normally trigger stack unwinding.

&lt;p&gt;
This means that an attacker might be able to do the following:
&lt;ul&gt;
 &lt;li&gt; find a function F containing an exception handler that does something interesting;
 &lt;li&gt; guess the address of the function F (more specifically, the address inside F that's inside F's __try block);
 &lt;li&gt; find another function G with incorrect or missing unwind info;
 &lt;li&gt; arrange for the address of F to be in G's stack frame;
 &lt;li&gt; cause a hardware exception (e.g. a null pointer dereference) to occur while G is on the stack;
 &lt;li&gt; and therefore cause the exception handler in F to get run.
&lt;/ul&gt;

&lt;p&gt;
The x86-64 version of &lt;a href="http://en.wikipedia.org/wiki/MinGW"&gt;MinGW&lt;/a&gt; (a GNU toolchain for Windows) is susceptible to this, because its version of GCC doesn't generate unwind info.  Here's an example of how a null pointer dereference can cause an exception handler to be run:

&lt;p&gt;
prog.c (compile with x86-64 MinGW's gcc):

&lt;pre&gt;
#include &amp;lt;stdio.h&gt;
#include &amp;lt;stdlib.h&gt;

long long get_addr();

void exc_handler() {
  printf("in exception handler!\n");
  exit(0);
}

void my_func(long long x) {
  // This might be necessary to force x to be written to the stack:
  // printf("stack=%p\n", &amp;x);

  // This should normally kill the process safely, but it doesn't here:
  *(int *) 0 = 0;
}

int main() {
  my_func(get_addr());
  return 0;
}
&lt;/pre&gt;

&lt;p&gt;
get_unwind_ret.asm (assemble with Microsoft's x86-64 assembler):

&lt;pre&gt;
PUBLIC get_addr
EXTRN exc_handler:PROC

_TEXT SEGMENT

; Helper function for getting the address inside the function below.
get_addr PROC
        lea rax, ret_addr
        ret
get_addr ENDP

; Innocent function that uses an exception handler.
blah PROC FRAME :exc_handler
.endprolog
ret_addr LABEL PTR
        hlt
blah ENDP

_TEXT ENDS

END
&lt;/pre&gt;

&lt;p&gt;
Here's what I get on Cygwin:

&lt;pre&gt;
$ ml64 /c get_unwind_ret.asm
$ x86_64-w64-mingw32-gcc get_unwind_ret.obj prog.c -o prog
$ ./prog
in exception handler!
&lt;/pre&gt;

&lt;p&gt;
This works because GCC generates the following code without unwind info:

&lt;pre&gt;
$ x86_64-w64-mingw32-gcc prog.c -S -o -
...
my_func:
        pushq %rbp
        movq %rsp, %rbp
        movq %rcx, 16(%rbp)
        movl $0, %eax
        movl $0, (%rax)
        leave
        ret
...
&lt;/pre&gt;

&lt;p&gt;
my_func()'s argument is written to the shadow space above the return address that points to main(), but since main() doesn't have unwind info either, the spilled argument gets interpreted as a return address by the unwinder.

&lt;p&gt;
Some conclusions:

&lt;ul&gt;
 &lt;li&gt; Be extra careful if you're writing x86-64 assembly code on Windows.
 &lt;li&gt; Be careful if you're using GCC to generate x86-64 code on Windows and check whether it is producing unwind info.  (There is no problem on x86-32 because Windows does not use stack unwinding for exception handling on x86-32.)
 &lt;li&gt; Windows' x86-64 stack unwinder should be changed to be stricter.  It should stop if it encounters a return address without unwind info.
&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-7725825717843602559?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/7725825717843602559/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=7725825717843602559' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7725825717843602559'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7725825717843602559'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2011/11/stack-unwinding-risks-on-64-bit-windows.html' title='Stack unwinding risks on 64-bit Windows'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-9082624547851455154</id><published>2011-11-17T08:40:00.002Z</published><updated>2011-11-17T08:44:43.547Z</updated><title type='text'>ARM cache flushing &amp; doubly-mapped pages</title><content type='html'>&lt;p&gt;
If you're familiar with the ARM architecture you'll probably know that self-modifying code has to be careful to flush the instruction cache on ARM.  (Back in the 1990s, the introduction of the StrongARM with its split instruction and data caches broke a lot of programs on RISC OS.)
&lt;/p&gt;
&lt;p&gt;
On ARM Linux there's a syscall, &lt;tt&gt;cacheflush&lt;/tt&gt;, for flushing the instruction cache for a range of virtual addresses.
&lt;/p&gt;
&lt;p&gt;
This syscall works fine if you map some code as RWX (read+write+execute) and execute it from the same virtual address that you use to modify it.  This is how JIT compilers usually work.
&lt;/p&gt;
&lt;p&gt;
In the Native Client sandbox, though, for dynamic code loading support, we have code pages that are mapped twice, once as RX (read+execute) and once as RW (read+write).  Naively you'd expect that after you write instructions to the RW address, you just have to call &lt;tt&gt;cacheflush&lt;/tt&gt; on the RX address.  However, that's not enough.  &lt;tt&gt;cacheflush&lt;/tt&gt; doesn't just clear the i-cache.  It also flushes the data cache to ensure that writes are committed to memory so that they can be read back by the i-cache.  The two parts are clear if you look at the kernel implementation of &lt;tt&gt;cacheflush&lt;/tt&gt;, which does two different MCRs on the address range.
&lt;/p&gt;
&lt;p&gt;
I guess the syscall interface was not designed with double-mapped pages in mind, since it doesn't allow the i-cache and d-cache to be flushed separately.  For the time being, Native Client will have to call &lt;tt&gt;cacheflush&lt;/tt&gt; on both the RW and RX mappings.
&lt;/p&gt;
&lt;p&gt;
See &lt;a href="http://code.google.com/p/nativeclient/issues/detail?id=2443"&gt;NaCl issue 2443&lt;/a&gt; for where this came up.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-9082624547851455154?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/9082624547851455154/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=9082624547851455154' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/9082624547851455154'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/9082624547851455154'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2011/11/arm-cache-flushing-doubly-mapped-pages.html' title='ARM cache flushing &amp; doubly-mapped pages'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-7246518557751554487</id><published>2011-08-23T16:59:00.003+01:00</published><updated>2011-08-23T17:29:32.338+01:00</updated><title type='text'>Fixing the trouble with Buildbot</title><content type='html'>Last year I wrote a blog post, &lt;a href="http://lackingrhoticity.blogspot.com/2010/05/trouble-with-buildbot.html"&gt;"The trouble with Buildbot"&lt;/a&gt;, about how Buildbot creates a dilemma for complex projects because it forces you to choose between two ways of describing a project's build steps:

&lt;ul&gt;
&lt;li&gt; You can describe build steps in the Buildbot config.   Buildbot configs are awkward to update -- someone has to restart the Buildbot master -- and hard to test, but you get the benefit that the build steps appear as separate steps in Buildbot's display.
&lt;li&gt; You can write a script which runs the build steps directly, and check it into the same repository as your project.  This is easier to maintain and test, but traditionally all the output from the script would appear as a single Buildbot build step, making the output hard to read.
&lt;/ul&gt;

&lt;p&gt;
Fortunately, Brad Nelson has addressed this problem with an extension to Buildbot known as &lt;a href="https://sites.google.com/a/chromium.org/dev/developers/testing/chromium-build-infrastructure/buildbot-annotations"&gt;"Buildbot Annotations"&lt;/a&gt;.  The Python code for this currently lives in &lt;a href="http://src.chromium.org/viewvc/chrome/trunk/tools/build/scripts/master/chromium_step.py?revision=97010&amp;view=markup"&gt;chromium_step.py&lt;/a&gt; (see AnnotatedCommand).

&lt;p&gt;
The idea is that your checked-in script will run multiple steps sequentially but output tags between them (e.g. "@@@BUILD_STEP tests@@@") so that the output can be parsed into chunks by the Buildbot master, and displayed as separate chunks.

&lt;p&gt;
For example, an early version of Native Client's Annotations-based buildbot script looked something like this:

&lt;pre&gt;
...
echo @@@BUILD_STEP gyp_compile@@@
make -C .. -k -j12 V=1 BUILDTYPE=${GYPMODE}

echo @@@BUILD_STEP scons_compile${BITS}@@@
./scons -j 8 -k --verbose ${GLIBCOPTS} --mode=${MODE}-host,nacl \
    platform=x86-${BITS}

echo @@@BUILD_STEP small_tests${BITS}@@@
./scons -k --verbose ${GLIBCOPTS} --mode=${MODE}-host,nacl small_tests \
    platform=x86-${BITS} ||
    { RETCODE=$? &amp;amp;&amp;amp; echo @@@STEP_FAILURE@@@;}
...
&lt;/pre&gt;

&lt;p&gt;
(More recently, this shell script has been replaced with a Python script.)

&lt;p&gt;
You can see this in use on the &lt;a href="http://build.chromium.org/p/client.nacl/console"&gt;Native Client Buildbot&lt;/a&gt; page (and also on the &lt;a href="http://build.chromium.org/p/tryserver.nacl/waterfall"&gt;trybot&lt;/a&gt; page, though that's less readable).

&lt;p&gt;
The logic for running NaCl's many build steps -- including a clobber step, a Scons build, a Gyp build, small_tests, medium_tests, large_tests, chrome_browser_tests etc. -- used to live in the Buildbot config, and we'd usually have to get Brad Nelson to change it on our behalf.  Brad would have to restart the buildbot masters manually, and this would halt any builds that were in progress, including trybot jobs.

&lt;p&gt;
Now the knowledge of these build steps has moved into scripts that are checked into the native_client repo, which can easily be updated.  We can change the scripts at the same time as changing other code, with an atomic SVN commit.  Changes can be tested via the trybots.

&lt;p&gt;
Chromium is not using Buildbot Annotations yet for its buildbot, but it would be good to switch it over.  One obstacle is timeout handling.  Buildbot's build steps can have separate timeouts, and the Buildbot build slave is responsible for terminating a build step's subprocess(es) if they take too long.  With Buildbot Annotations, the responsibility for doing per-step timeouts would move to the checked-in build script.

&lt;p&gt;
The current Annotations output format has some down sides:

&lt;ul&gt;
&lt;li&gt; The syntax is simple but kind of ugly.
&lt;li&gt; It's not possible to nest build steps.
&lt;li&gt; It's not possible to interleave output from two concurrent build steps.
&lt;/ul&gt;

&lt;p&gt;
Overall, Annotations reduces our dependence on Buildbot.  If there were a simpler, more scalable alternative to Buildbot that also supported the Annotations format, we could easily switch to it because we our Buildbot config is not as complex as it used to be.
&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-7246518557751554487?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/7246518557751554487/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=7246518557751554487' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7246518557751554487'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7246518557751554487'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2011/08/fixing-trouble-with-buildbot.html' title='Fixing the trouble with Buildbot'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-5896682843241525872</id><published>2011-02-10T01:34:00.005Z</published><updated>2011-02-11T03:49:19.131Z</updated><title type='text'>Cookies versus the Chrome sandbox</title><content type='html'>&lt;p&gt;
Although Chrome's sandbox does not protect one web site from another in general, it can provide such protection in some cases.  Those cases are ones in which HTTP cookies are either reduced in scope or not used at all.  One lesson we could draw from this is that cookies reduce the usefulness of Chrome's sandbox.
&lt;/p&gt;

&lt;p&gt;
The scenario we are exploring supposes that there is a vulnerability in Chrome's renderer process, and that the vulnerability lets a malicious site take control of the renderer process.  This means that all the restrictions that are normally enforced on the malicious site by the renderer process are stripped away, and all we are left with are the restrictions enforced on the renderer process by the Chrome browser process and the Chrome sandbox.
&lt;/p&gt;

&lt;p&gt;
In my &lt;a href="http://lackingrhoticity.blogspot.com/2010/12/chrome-sandbox-common-misconception.html"&gt;previous blog post&lt;/a&gt;, I explained how an attacker site, evil.com, that manages to exploit the renderer process could steal the login cookies from another site, mail.com, and so gain access to the user's e-mail.
&lt;/p&gt;

&lt;p&gt;
The attack is made possible by the combination of two features:
&lt;/p&gt;

&lt;ol type="a"&gt;
&lt;li&gt; cookies
&lt;li&gt; frames
&lt;/ol&gt;

&lt;p&gt;
Chrome currently runs a framed page in the same renderer process as the parent page.  HTML standards allow framed pages to access cookies, so the browser process has to give the renderer process access to the cookies for both pages.
&lt;/p&gt;

&lt;p&gt;
Because this problem arises from the interaction of these features, one site is not &lt;em&gt;always&lt;/em&gt; vulnerable to other sites.  There should be a couple of ways that users and sites can mitigate the problem, without changing Chrome.  Firstly, the user can change how cookies are scoped within the browser by setting up multiple profiles.  Secondly, a site can skirt around the problem by not using cookies at all.  We discuss these possibilities below.
&lt;/p&gt;

&lt;ul&gt;

&lt;li&gt;&lt;strong&gt;Use multiple profiles:&lt;/strong&gt;  As a user, you can create multiple browser profiles, and access mail.com and evil.com in separate profiles.

&lt;p&gt;
Chrome does not make this very easy at the moment.  It provides a command line option (--user-data-dir) for creating more profiles, but this feature is not available through the GUI.  Chrome's GUI provides just two profiles:  one profile (persistent) for normal windows, plus an extra profile (non-persistent) for windows in "incognito" mode.
&lt;/p&gt;

&lt;p&gt;
So, you could log in to mail.com in a normal window and view evil.com in an incognito window, or vice versa.  This is safer because cookies registered by a web site in one profile are not accessible to sites visited in another profile.  Each profile has a separate pool of cookies.  This feature of browser profiles means you can log into one mail.com account in incognito mode and a different mail.com account in normal mode.
&lt;/p&gt;

&lt;p&gt;
It would be interesting to see if this profile separation could be automated.
&lt;/p&gt;

&lt;li&gt;&lt;strong&gt;Use web-keys instead of cookies:&lt;/strong&gt;  The developers of mail.com could defend against evil.com by not using cookies to track user logins.  Instead mail.com could use web-keys.  Logging in to mail.com would take you to a URL containing an unguessable token.  Such a URL is known as a "&lt;a href="http://waterken.sourceforge.net/web-key/"&gt;web-key&lt;/a&gt;".  (For &lt;a href="http://waterken.sourceforge.net/web-key/#where"&gt;technical reasons&lt;/a&gt;, the unguessable token should be in the "#" fragment part of the URL.)
&lt;/p&gt;

&lt;p&gt;
This is safer because even if evil.com compromises the renderer process, the Chrome browser process generally does not give the renderer process a way to enumerate other tabs, discover other tabs' URLs, or enumerate the user's browsing history and bookmarks.
&lt;/p&gt;

&lt;p&gt;
Using web-keys has been proposed before to address a different (but related) web security problem, clickjacking.  (See Tyler Close's essay, &lt;a href="http://waterken.sourceforge.net/clickjacking/"&gt;The Confused Deputy Rides Again&lt;/a&gt;.)
&lt;/p&gt;

&lt;p&gt;
Using web-keys would change the user interface for mail.com a little.  Cookies are the mechanism by which entering "mail.com" in the URL bar can take you directly to the inbox of the e-mail account you are logged in to.  Whenever the browser sees "mail.com" as the domain name it automatically adds the cookies for "mail.com" to the HTTP request.  (This automatic attachment of credentials makes this a type of &lt;a href="http://en.wikipedia.org/wiki/Ambient_authority"&gt;ambient authority&lt;/a&gt;.)  This mechanism adds some convenience for the user, but it is also a means by which evil.com can attack mail.com, because whenever evil.com links to mail.com, the cookies get added to the request too.  The browser does not distinguish between a URL entered by the user in the address bar and a URL provided by another site like evil.com.
&lt;/p&gt;

&lt;p&gt;
So if mail.com were changed to remove its use of cookies, you would lose the small convenience of being able to get to your inbox directly by typing "mail.com" without re-logging-in.  Is there a way to get this convenience back?  An alternative to using cookies to record logins is to use bookmarks.  If mail.com uses web-key URLs, you could bookmark the address of the inbox page.  To get to your inbox without re-logging-in, you would select the bookmark.
&lt;/p&gt;

&lt;p&gt;
These days the Chrome address bar accepts more than just URLs (which is why the address bar is actually called the "omnibar"), so you could imagine that entering "mail.com" would search your bookmarks and jump to your bookmarked inbox page rather than being treated as the URL "http://mail.com".
&lt;/p&gt;

&lt;/ul&gt;

&lt;h2&gt;Conclusions&lt;/h2&gt;

&lt;p&gt;
There are a couple of possible conclusions we could draw from this:
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; Cookies weaken the Chrome sandbox.
&lt;li&gt; Frames weaken the Chrome sandbox.
&lt;/ul&gt;

&lt;p&gt;
If we blame cookies, we should look critically at other browser features that behave similarly to cookies by providing ambient authority.  For example, the new &lt;a href="http://www.w3.org/TR/file-system-api/"&gt;LocalFileSystem&lt;/a&gt; feature (an extension of the &lt;a href="http://www.w3.org/TR/FileAPI/"&gt;File API&lt;/a&gt;) provides local storage.  The file store it provides is per-origin and so is based on ambient authority.  If mail.com uses this to cache e-mail, and evil.com exploits the renderer process, then evil.com will be able to read and modify the cached e-mail.  There are other local storage APIs (&lt;a href="http://dev.w3.org/html5/webstorage/"&gt;IndexedDB&lt;/a&gt; and &lt;a href="http://dev.w3.org/html5/webstorage/"&gt;Web Storage&lt;/a&gt;), but they are based on ambient authority too.  From this perspective, the situation is getting worse.
&lt;/p&gt;

&lt;p&gt;
If we blame frames, this suggests that browsers should fix the problem by implementing site isolation.  Site isolation means that the browser would put a framed page in a different renderer process from a parent page.  Microsoft's experimental &lt;a href="http://en.wikipedia.org/wiki/Gazelle_(web_browser)"&gt;Gazelle&lt;/a&gt; browser implements site isolation but breaks compatibility with the web.  It remains to be seen whether a web browser can implement site isolation while retaining compatibility and good performance.
&lt;/p&gt;

&lt;p&gt;
Either way, concerned users and web app authors need to know how browser features are implemented if they are to judge how much security the browser can provide.  That's not easy, because the web platform is so complicated!
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-5896682843241525872?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/5896682843241525872/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=5896682843241525872' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5896682843241525872'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5896682843241525872'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2011/02/cookies-versus-chrome-sandbox.html' title='Cookies versus the Chrome sandbox'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-3057611303502750435</id><published>2010-12-21T00:34:00.005Z</published><updated>2010-12-21T01:27:16.065Z</updated><title type='text'>A common misconception about the Chrome sandbox</title><content type='html'>A common misconception about the Chrome web browser is that its sandbox protects one web site from another.

&lt;p&gt;
For example, suppose you are logged into your e-mail account on mail.com in one tab, and have evil.com open in another tab.  Suppose evil.com finds an exploit in the renderer process, such as a memory safety bug, that lets it run arbitrary code there.  Can evil.com get hold of your HTTP cookies for mail.com, and thereby access your e-mail account?

&lt;p&gt;
Unfortunately, the answer is yes.

&lt;p&gt;
The reason is that mail.com and evil.com can be assigned to the same renderer process.  The browser does not only do this to save memory.  evil.com can cause this to happen by opening an iframe on mail.com.  With mail.com's code running in the same exploited renderer process, evil.com can take it over and read the cookies for your mail.com account and use them for its own ends.

&lt;p&gt;
There are a couple of reasons why the browser puts a framed site in the same renderer process as the parent site.  Firstly, if the sites were handled by separate processes, the browser would have to do costly compositing across renderer processes to make the child frame appear inside the parent frame.  Secondly, in some cases the DOM allows Javascript objects in one frame to obtain references to DOM objects in other frames, even across origins, and it is easier for this to be managed within one renderer process.

&lt;p&gt;
I don't say this to pick on Chrome, of course.  It is better to have the sandbox than not to have it.

&lt;p&gt;
Chrome has never claimed that the sandbox protects one site against another.  In the tech report &lt;a href="http://seclab.stanford.edu/websec/chromium/"&gt;"The Security Architecture of the Chromium Browser"&lt;/a&gt; (Barth, Jackson, Reis and the Chrome Team; 2008), "Origin isolation" is specifically listed under "Out-of-scope goals".  They state that "an attacker who compromises the rendering engine can act on behalf of any web site".

&lt;p&gt;
There are a couple of ways that web sites and users can mitigate this problem, which I'll discuss in another post.  However, in the absence of those defences, what Chrome's multi-process architecture actually gives you is the following:

&lt;ul&gt;

&lt;li&gt; &lt;em&gt;Robustness if a renderer crashes.&lt;/em&gt;  Having multiple renderer processes means that a crash of one takes down only a limited number of tabs, and the browser and the other renderers will survive.  It also helps memory management.

&lt;p&gt;
  But we can get this without sandboxing the renderers.

&lt;li&gt; &lt;em&gt;Protection of the rest of the user's system from vulnerabilities in the renderer process.&lt;/em&gt;  For example, the sandboxed renderer cannot read any of the user's files, except for those the user has granted through a "File Upload" file chooser.

&lt;p&gt;
  But we can get this by sandboxing the whole browser (including any subprocesses), without needing to have the browser separated from the renderer.

&lt;p&gt;
  For example, since 2007 I have been running Firefox under &lt;a href="http://plash.beasts.org"&gt;Plash&lt;/a&gt; (a sandbox), on Linux.

&lt;p&gt;
  In principle, such a sandbox should be more effective at protecting applications and files outside the browser than the Chrome sandbox, because the sandbox covers all of the browser, including its network stack and the so-called browser "chrome" (this means the parts of the GUI outside of the DOM).

&lt;p&gt;
  In practice, Plash is not complete as a sandbox for GUI apps because it does not limit access to the X Window System, so apps can do things that X allows such as screen scraping other apps and sending them input.
&lt;/ul&gt;

&lt;p&gt;
The main reason Chrome was developed to sandbox its renderer processes but not the whole browser is that this is easier to implement with sandboxing technologies that are easily deployable today.  Ideally, though, the whole browser would be sandboxed.  One of the only components that would stay unsandboxed, and have access to all the user's files, would be the "File Open" dialog box for choosing files to upload.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-3057611303502750435?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/3057611303502750435/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=3057611303502750435' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3057611303502750435'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3057611303502750435'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/12/chrome-sandbox-common-misconception.html' title='A common misconception about the Chrome sandbox'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-6476327201052017521</id><published>2010-12-18T19:06:00.002Z</published><updated>2010-12-18T19:19:33.728Z</updated><title type='text'>When printf debugging is a luxury</title><content type='html'>Inserting printf() calls is often considered to be a primitive fallback when other debugging tools are not available, such as stack backtraces with source line numbers.

&lt;p&gt;
But there are some situations in low-level programming where most libc calls don't work and so even printf() and assert() are unavailable luxuries.  This can happen:
&lt;ul&gt;
 &lt;li&gt; when libc is not properly initialised yet;
 &lt;li&gt; when we writing code that is called by libc and cannot re-enter libc code;
 &lt;li&gt; when we are in a signal handler;
 &lt;li&gt; when only limited stack space is available;
 &lt;li&gt; when we cannot allocate memory for some reason; or
 &lt;li&gt; when we are not even linked to libc.
&lt;/ul&gt;

&lt;p&gt;
Here's a fragment of code that has come in handy in these situations.  It provides a simple assert() implementation:

&lt;pre&gt;
#include &amp;lt;string.h&gt;
#include &amp;lt;unistd.h&gt;

static void debug(const char *msg) {
  write(2, msg, strlen(msg));
}

static void die(const char *msg) {
  debug(msg);
  _exit(1);
}

#define TO_STRING_1(x) #x
#define TO_STRING(x) TO_STRING_1(x)

#define assert(expr) {                                                        \
  if (!(expr)) die("assertion failed at " __FILE__ ":" TO_STRING(__LINE__)    \
                   ": " #expr "\n"); }
&lt;/pre&gt;

&lt;p&gt;
By using preprocessor trickery to construct the assertion failure string at compile time, it avoids having to format the string at runtime.  So it does not need to allocate memory, and it doesn't need to do multiple write() calls (which can become interleaved with other output in the multi-threaded case).

&lt;p&gt;
Sometimes even libc's write() is a luxury.  In some builds of GNU libc on Linux, glibc's syscall wrappers use the TLS register (%gs on i386) to fetch the address of a routine for making syscalls.

&lt;p&gt;
However, if %gs is not set up properly for some reason, this will fail.  For example, for &lt;a href="http://code.google.com/p/nativeclient"&gt;Native Client&lt;/a&gt;'s i386 sandbox, %gs is set to a different value whenever sandboxed code is running, and %gs stays in this state if sandboxed code faults and triggers a signal handler.  In Chromium's &lt;a href="http://code.google.com/p/seccompsandbox"&gt;seccomp-sandbox&lt;/a&gt;, %gs is set to zero in the trusted thread.

&lt;p&gt;
In those situations we have to bypass libc and do the system calls ourselves.  The following snippet comes from &lt;a href="http://code.google.com/p/seccompsandbox/source/browse/trunk/reference_trusted_thread.cc?r=153"&gt;reference_trusted_thread.cc&lt;/a&gt;.  The sys_*() functions are defined by &lt;a href="http://code.google.com/p/seccompsandbox/source/browse/trunk/linux_syscall_support.h?r=153"&gt;linux_syscall_support.h&lt;/a&gt;, which provides wrappers for many Linux syscalls:

&lt;pre&gt;
#include "linux_syscall_support.h"

void die(const char *msg) {
  sys_write(2, msg, strlen(msg));
  sys_exit_group(1);
}
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-6476327201052017521?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/6476327201052017521/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=6476327201052017521' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6476327201052017521'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6476327201052017521'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/12/when-printf-debugging-is-luxury.html' title='When printf debugging is a luxury'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-3545453971462747842</id><published>2010-11-04T12:48:00.004Z</published><updated>2010-11-04T13:31:45.238Z</updated><title type='text'>An introduction to FreeBSD-Capsicum</title><content type='html'>In my &lt;a href="http://lackingrhoticity.blogspot.com/2010/10/process-descriptors-in-freebsd-capsicum.html"&gt;last blog post&lt;/a&gt;, I described one of the features in &lt;a href="http://www.cl.cam.ac.uk/research/security/capsicum/"&gt;FreeBSD-Capsicum&lt;/a&gt;:  process descriptors.  Now it's time for an overview of Capsicum.

&lt;p&gt;
Capsicum is a set of new features for FreeBSD that adds better support for sandboxing, using a capability model in which the capabilities are Unix file descriptors (FDs).

&lt;p&gt;
Capsicum takes a fairly conservative approach, in that it does not make operations on file descriptors virtualisable.  This approach has some limitations -- we do not get the advantages of having purely &lt;a href="http://lackingrhoticity.blogspot.com/2010/01/why-system-calls-should-be-message.html"&gt;message-passing syscalls&lt;/a&gt;.  However, it does mean that the new features are orthogonal.

&lt;p&gt;
The main new features are:

&lt;ul&gt;
&lt;li&gt;
   &lt;em&gt;A per-process "capability mode"&lt;/em&gt;, which is turned on via a new cap_enter() syscall.

&lt;p&gt;
   This mode disables any system call that provides ambient authority.  So it disables system calls that use global namespaces, including the file namespace (e.g. open()), the PID namespace (e.g. kill()) and the network address namespace (e.g. connect()).

&lt;p&gt;
   This is not just a syscall filter, though.  Some system calls optionally use a global namespace.  For example, sendmsg() and sendto() optionally take a socket address.  For openat(), the directory FD can be omitted.  Capability mode disables those cases.

&lt;p&gt;
   Furthermore, capability mode disallows the use of ".." (parent directory) in filenames for openat() and the other *at() calls.  This changes directory FDs to be limited-authority objects that convey access to a specific directory and not the whole filesystem.  (It is interesting that this appears to be a property of the process, via capability mode, rather than of the directory FD itself.)

&lt;p&gt;
   Capability mode is inherited across fork and exec.

&lt;li&gt;
   &lt;em&gt;Finer-grained permissions for file descriptors.&lt;/em&gt;  Each FD gets a large set of permission bits.  A less-permissive copy of an FD can be created with cap_new().  For example, you can have read-only directory FDs, or non-seekable FDs for files.

&lt;li&gt;
   &lt;em&gt;&lt;a href="http://lackingrhoticity.blogspot.com/2010/10/process-descriptors-in-freebsd-capsicum.html"&gt;Process descriptors.&lt;/a&gt;&lt;/em&gt;  Capsicum doesn't allow kill() inside the sandbox because kill() uses a global namespace (the PID namespace).  So Capsicum introduces process descriptors (a new FD type) as a replacement for process IDs, and adds pdfork(), pdwait() and pdkill() as replacements for fork(), wait() and kill().
&lt;/ul&gt;

&lt;p&gt;
Plus there are a couple of smaller features:

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Message-based sockets.&lt;/em&gt;  The Capsicum guys implemented Linux's SOCK_SEQPACKET interface for FreeBSD.

&lt;li&gt;&lt;em&gt;An fexecve() system call&lt;/em&gt; which takes a file descriptor for an executable.  This replaces execve(), which is disabled in capability mode because execve() takes a filename.

&lt;p&gt;
Capsicum's fexecve() ignores the implicit filename that is embedded in the executable's PT_INTERP field, so it is only good for loading the dynamic linker directly or for loading other statically linked executables.
&lt;/ul&gt;

&lt;p&gt;
Currently, the only programs that run under Capsicum are those that have been ported specially:

&lt;ul&gt;
&lt;li&gt;The Capsicum guys ported Chromium, and it works much the same way as on Linux.  On both systems, Chromium's renderer process runs sandboxed, but the browser process does not.  On both systems, Chromium needs to be able to turn on sandboxing after the process has started up, because it relies on legacy libraries that use open() during startup.

&lt;li&gt;Some Unix utilities, including gzip and dhclient, have been extended to use sandboxing internally (privilege separation).  Like Chromium, gzip can open files and then switch to capability mode.
&lt;/ul&gt;

&lt;p&gt;
However, it should be possible to run legacy Unix programs under Capsicum by porting &lt;a href="http://plash.beasts.org"&gt;Plash&lt;/a&gt;.

&lt;p&gt;
At first glance, it looks like Plash would have to do the same tricks under FreeBSD-Capsicum as it does under Linux to run legacy programs.  Under Linux, Plash uses a &lt;a href="http://plash.beasts.org/wiki/PlashGlibc"&gt;modified version of glibc&lt;/a&gt; in order to intercept its system calls and convert them to system calls that work in the sandbox.  That's because the Linux kernel doesn't provide any help with intercepting the system calls.  The situation is similar under FreeBSD -- Capsicum does not add any extensions for bouncing syscalls back to a user space handler.

&lt;p&gt;
However, there are two aspects of FreeBSD that should make Plash easier to implement there than on Linux:

&lt;ul&gt;
&lt;li&gt;FreeBSD's libc is friendlier towards overriding its functions.  On both systems, it is possible to override (for example) open() via an LD_PRELOAD library that defines its own "open" symbol.  But with glibc on Linux, this doesn't work for libc's internal calls to open(), such as from fopen().  For a small gain in efficiency, these calls don't go through PLT entries and so cannot be intercepted.

&lt;p&gt;FreeBSD's libc doesn't use this optimisation and so it allows the internal calls to be intercepted too.

&lt;li&gt;FreeBSD's dynamic linker and libc are not tightly coupled, so it is possible to change the dynamic linker to open its libraries via IPC calls without having to rebuild libc in lockstep.

&lt;p&gt;In contrast, Linux glibc's ld.so and libc.so are built together, share some data structures (such as TLS), and cannot be replaced independently.
&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-3545453971462747842?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/3545453971462747842/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=3545453971462747842' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3545453971462747842'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3545453971462747842'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/11/introduction-to-freebsd-capsicum.html' title='An introduction to FreeBSD-Capsicum'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-6710537790772900281</id><published>2010-10-23T16:10:00.003+01:00</published><updated>2010-10-25T15:43:33.847+01:00</updated><title type='text'>Process descriptors in FreeBSD-Capsicum</title><content type='html'>&lt;a href="http://www.cl.cam.ac.uk/research/security/capsicum/"&gt;Capsicum&lt;/a&gt; is a set of new features for FreeBSD that adds better support for sandboxing, adding a capability mode in which the capabilities are Unix file descriptors (FDs).  The features Capsicum adds are orthogonal, which is nice.  One of the new features is &lt;a href="http://www.cl.cam.ac.uk/research/security/capsicum/man/pdfork.2.pdf"&gt;process descriptors&lt;/a&gt;.

&lt;p&gt;
Capsicum adds a replacement for fork() called pdfork(), which returns a process descriptor (a new type of FD) rather than a PID.  Similarly, there are replacements for wait() and kill() -- pdwait() and pdkill() -- which take FDs as arguments instead of PIDs.

&lt;p&gt;
The reason for the new interface is that kill() is not safe to allow in Capsicum's sandbox, because it provides ambient authority: it looks up its PID argument in a global namespace.

&lt;p&gt;
But even if you ignore sandboxing issues, this new interface is a significant improvement on POSIX process management:

&lt;ol&gt;

&lt;li&gt; It allows the ability to wait on a process to be delegated to another process.  In contrast, with wait()/waitpid(), a process's exit status can only be read by the process's parent.

&lt;li&gt; Process descriptors can be used with poll().  This avoids the awkwardness of having to use SIGCHLD, which doesn't work well if multiple libraries within the same process want to wait() for child processes.

&lt;li&gt; It gets rid of the race condition associated with kill().  Sending a signal to a PID is dodgy because the original process with this PID could have exited, and the kernel could have recycled the PID for an unrelated process, especially on a system where processes are spawned and exit frequently.

&lt;p&gt;
kill() is only really safe when used by a parent process on its child, and only when the parent makes sure to use it before wait() has returned the child's exit status.  pdkill() gets rid of this problem.

&lt;li&gt; In future, process descriptors can be extended to provide access to the process's internal state for debugging purposes, e.g. for reading registers and memory, or modifying memory mappings or the FD table.  This would be an improvement on Linux's ptrace() interface.

&lt;/ol&gt;

&lt;p&gt;
However, there is one aspect of Capsicum's process descriptors that I think was a mistake:  Dropping the process descriptor for a process causes the kernel to kill it.  (By this I mean that if there are no more references to the process descriptor, because they have been close()'d or because the processes holding them have exited, the kernel will terminate the process.)

&lt;p&gt;
The usual principle of garbage collection in programming languages is that GC should not affect the observable behaviour of the program (except for resource usage).  Capsicum's kill-on-close() behaviour violates that principle.

&lt;p&gt;
kill-on-close() is based on the assumption that the launcher of a subprocess always wants to check the subprocess's exit status.  But this is not always the case.  Exit status is just one communications channel, but not one we always care about.  It's common to launch a process to communicate with it via IPC without wanting to have to wait() for it.  (Double fork()ing is a way to do this in POSIX without leaving a zombie process.)  Many processes are fire-and-forget.

&lt;p&gt;
The situation is analogous for threads.  pthreads provides pthread_detach() and PTHREAD_CREATE_DETACHED as a way to launch a thread without having to pthread_join() it.

&lt;p&gt;
In Python, when you create a thread, you get a Thread object back, but if the Thread object is GC'd the thread won't be killed.  In fact, finalisation of a thread object is implemented using pthread_detach() on Unix.

&lt;p&gt;
I know of one language implementation that (as I recall) will GC threads, &lt;a href="http://en.wikipedia.org/wiki/Oz_(programming_language)"&gt;Mozart/Oz&lt;/a&gt;, but this only happens when a thread can have no further observable effects.  In Oz, if a thread blocks waiting for the resolution of a logic variable that can never be resolved (because no other thread holds a reference to it), then the thread can be GC'd.  So deadlocked threads can be GC'd just fine.  Similarly, if a thread holds no communications channels to the outside world, it can be GC'd safely.

&lt;p&gt;
Admittedly, Unix already violates this GC principle by the finalisation of pipe FDs and socket FDs, because dropping one endpoint of the socket/pipe pair is visible as an EOF condition on the other endpoint.  However, this is usually used to free up resources or to unblock a process that would otherwise fail to make progress.  Dropping a process descriptor would halt progress -- the opposite.  Socket/pipe EOF can be used to do this too, but it is less common, and I'm not sure we should encourage this.

&lt;p&gt;
However, aside from this complaint, process descriptors are a good idea.  It would be good to see them implemented in Linux.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-6710537790772900281?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/6710537790772900281/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=6710537790772900281' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6710537790772900281'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6710537790772900281'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/10/process-descriptors-in-freebsd-capsicum.html' title='Process descriptors in FreeBSD-Capsicum'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-9202139100767967131</id><published>2010-08-11T12:49:00.005+01:00</published><updated>2010-08-11T13:01:32.633+01:00</updated><title type='text'>My workflow with git-cl + Rietveld</title><content type='html'>Git's model of changes (which is shared by Mercurial, Bazaar and Monotone) makes it awkward to revise earlier patches.  This can make things difficult when you are sending out multiple, dependent changes for code review.

&lt;p&gt;
Suppose I create changes A and B.  B depends functionally on A, i.e. tests will not pass for B without A also being applied.  There might or might not be a textual dependency (B might or might not modify lines of code modified by A).

&lt;p&gt;
Because code review is slow (high latency), I need to be able to send out changes A and B for review and still be able to continue working on further changes.  But I also need to be able to revisit A to make changes to it based on review feedback, and then make sure B works with the revised A.

&lt;p&gt;
What I do is create separate branches for A and B, where B branches off of A.  To revise change A, I "git checkout" its branch and add further commits.  Later I can update B by checking it out and rebasing it onto the current tip of A.  Uploading A or B to the review system or committing A or B upstream (to SVN) involves squashing their branch's commits into one commit.  (This squashing means the branches contain micro-history that reviewers don't see and which is not kept after changes are pushed upstream.)

&lt;p&gt;
The review system in question is Rietveld, the code review web app used for Chromium and Native Client development.  Rietveld does not have any special support for patch series -- it is only designed to handle one patch at a time, so it does not know about dependencies between changes.  The tool for uploading changes from Git to Rietveld and later committing them to SVN is "git-cl" (part of depot_tools).

&lt;p&gt;
git-cl is intended to be used with one branch per change-under-review.  However, it does not have much support for handling changes which depend on each other.

&lt;p&gt;
&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_IE-Ip3EU3_Y/TGKQj-6mg5I/AAAAAAAAAC4/65EaMyGSFt0/s1600/gitk-screenshot.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 186px;" src="http://2.bp.blogspot.com/_IE-Ip3EU3_Y/TGKQj-6mg5I/AAAAAAAAAC4/65EaMyGSFt0/s400/gitk-screenshot.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5504120642458780562" /&gt;&lt;/a&gt;

&lt;p&gt;
This workflow has a lot of problems:

&lt;ul&gt;

&lt;li&gt; When using git-cl on its own, I have to manually keep track that B is to be rebased on to A.  When uploading B to Rietveld, I must do "git cl upload A".  When updating B, I must first do "git rebase A".  When diffing B, I have to do "git diff A".  (I have written a tool to do this.  It's not very good, but it's better than doing it manually.)

&lt;li&gt; Rebasing B often produces conflicts if A has been squash-committed to SVN.  That's because if branch A contained multiple patches, Git doesn't know how to skip over patches from A that are in branch B.

&lt;li&gt; Rebasing loses history.  Undoing a rebase is not easy.

&lt;li&gt; In the case where B doesn't depend on A, rebasing branch B so that it doesn't include the contents of branch A is a pain.  (Sometimes I will stack B on top of A even when it doesn't depend on A, so that I can test the changes together.  An alternative is to create a temporary branch and "git merge" A and B into it, but creating further branches adds to the complexity.)

&lt;li&gt; If there is a conflict, I don't find out about it until I check out and update the affected branch.

&lt;li&gt; This gets even more painful if I want to maintain changes that are not yet ready for committing or posting for review, and apply them alongside changes that are ready for review.

&lt;/ul&gt;

&lt;p&gt;
There are all reasons why I would not recommend this workflow to someone who is not already very familiar with Git.

&lt;p&gt;
The social solution to this problem would be for code reviews to happen faster, which would reduce the need to stack up changes.  If all code reviews reached a conclusion within 24 hours, that would be an improvement.  But I don't think that is going to happen.

&lt;p&gt;
The technical solution would be better patch management tools.  I am increasingly thinking that Darcs' set-of-patches model would work better for this than Git's DAG-of-commits model.  If I could set individual patches to be temporarily applied or unapplied to the working copy, and reorder and group patches, I think it would be easier to revisit changes that I have posted for review.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-9202139100767967131?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/9202139100767967131/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=9202139100767967131' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/9202139100767967131'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/9202139100767967131'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/08/my-workflow-with-git-cl-rietveld.html' title='My workflow with git-cl + Rietveld'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_IE-Ip3EU3_Y/TGKQj-6mg5I/AAAAAAAAAC4/65EaMyGSFt0/s72-c/gitk-screenshot.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-4558945185323701077</id><published>2010-08-06T21:12:00.002+01:00</published><updated>2010-08-06T21:41:55.989+01:00</updated><title type='text'>CVS's problems resurface in Git</title><content type='html'>Although modern version control systems have improved a lot on CVS, I get the feeling that there is a fundamental version control problem that the modern VCSes (Git, Mercurial, Bazaar, and I'll include Subversion too!) haven't solved.  The curious thing is that CVS had sort of made some steps towards addressing it.

&lt;p&gt;
In CVS, history is stored per file.  If you commit a change that crosses multiple files, CVS updates each file's history separately.  This causes a bunch of problems:
&lt;ul&gt;
&lt;li&gt; CVS does not represent changesets or snapshots as first class objects.  As a result, many operations involve visiting every file's history.  

&lt;p&gt; Reconstructing a changeset involves searching all files' histories to match up the individual file changes.  (This was just about possible, though I hear there are tricky corner cases.  Later CVS added a commit ID field that presumably helped with this.)

&lt;p&gt;
   Creating a tag at the latest revision involves adding a tag to every file's history.  Reconstructing a tag, or a time-based snapshot, involves visiting every file's history again.

&lt;li&gt; CVS does not represent file renamings, so the standard history tools like "cvs log" and "cvs annotate" are not able to follow a file's history from before it was renamed.
&lt;/ul&gt;


&lt;p&gt;
In the DAG-based decentralised VCSes (Git, Mercurial, Monotone, Bazaar), history is stored per repository.  The fundamental data structure for history is a Directed Acyclic Graph of commit objects.  Each commit points to a snapshot of the entire file tree plus zero or more parent commits.  This addresses CVS's problems:
&lt;ul&gt;
&lt;li&gt; Extracting changesets is easy because they are the same thing as commit objects.
&lt;li&gt; Creating a tag is cheap and easy.  Recording any change creates a commit object (a snapshot-with-history), so creating a tag is as simple as pointing to an already-existing commit object.
&lt;/ul&gt;


&lt;p&gt;
However, often it is not practical to put all the code that you're interested in into a single Git repository!  (I pick on Git here because, of the DAG-based systems, it is the one I am most familar with.)  While it &lt;em&gt;can&lt;/em&gt; be practical to do this with Subversion or CVS, it is less practical with the DAG-based decentralised VCSes:

&lt;ul&gt;
&lt;li&gt; In the DAG-based systems, branching is done at the level of a repository.  You cannot branch and merge subdirectories of a repository independently:  you cannot create a commit that only partially merges two parent commits.

&lt;li&gt; Checking out a Git repository involves downloading not only the entire current revision, but the entire history.  So this creates pressure against putting two partially-related projects together in the same repository, especially if one of the projects is huge.

&lt;li&gt; Existing projects might already use separate repositories.  It is usually not practical to combine those repositories into a single repository, because that would create a repo that is incompatible with the original repos.  That would make it difficult to merge upstream changes.  Patch sharing would become awkward because the filenames in patches would need fixing.
&lt;/ul&gt;


&lt;p&gt;
This all means that when you start projects, you have to decide how to split your code among repositories.  Changing these decisions later is not at all straightforward.


&lt;p&gt;
The result of this is that CVS's problems have not &lt;em&gt;really&lt;/em&gt; been solved: they have just been pushed up a level.  The problems that occurred at the level of individual files now occur at the level of repositories:

&lt;ul&gt;
&lt;li&gt; The DAG-based systems don't represent changesets that cross repositories.  They don't have a type of object for representing a snapshot across repositories.
&lt;li&gt; Creating a tag across repositories would involve visiting every repository to add a tag to it.
&lt;li&gt; There is no support for moving files between repositories while tracking the history of the file.
&lt;/ul&gt;

&lt;p&gt;
The funny thing is that since CVS hit this problem all the time, the CVS tools were better at dealing with multiple histories than Git.

&lt;p&gt;
To compare the two, imagine that instead of putting your project in a single Git repository, you put each one of the project's files in a separate Git repository.  This would result in a history representation that is roughly equivalent to CVS's history representation.  i.e. Every file has its own separate history graph.

&lt;ul&gt;
&lt;li&gt; To check in changes to multiple files, you have to "cd" to each file's repository directory, and "git commit" and "git push" the file change.
&lt;li&gt; To update to a new upstream version, or to switch branch, you have to "cd" to each file's repository directory again to do "git pull/fetch/rebase/checkout" or whatever.
&lt;li&gt; Correlating history across files must be done manually.  You could run "git log" or "gitk" on two repositories and match up the timelines or commit messages by hand.  I don't know of any tools for doing this.
&lt;/ul&gt;

&lt;p&gt;
In contrast, for CVS, "cvs commit" works across multiple files and (if I remember rightly) even across multiple working directories.  "cvs update" works across multiple files.

&lt;p&gt;
While "cvs log" doesn't work across multiple files, there is a tool called "CVS Monitor" which reconstructs history and changesets across files.

&lt;p&gt;
Experience with CVS suggests that Git could be changed to handle the multiple-repository case better.  "git commit", "git checkout" etc. could be changed to operate across multiple Git working copies.  Maybe "git log" and "gitk" could gain options to interleave histories by timestamp.

&lt;p&gt;
Of course, that would lead to cross-repo support that is &lt;em&gt;only as good&lt;/em&gt; as CVS's cross-file support.  We might be able to apply a textual tag name across multiple Git repos with a single command just as a tag name can be applied across files with "cvs tag".  But that doesn't give us an immutable tag &lt;em&gt;object&lt;/em&gt; than spans repos.

&lt;p&gt;
My point is that the fundamental data structure used in the DAG-based systems doesn't solve CVS's problem, it just postpones it to a larger level of granularity.  Some possible solutions to the problem are DEPS files (as used by Chromium), Git submodules, or Darcs-style set-of-patches repos.  These all introduce new data structures.  Do any of these solve the original problem?  I am undecided -- this question will have to wait for another post. :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-4558945185323701077?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/4558945185323701077/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=4558945185323701077' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/4558945185323701077'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/4558945185323701077'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/08/cvss-problems-resurface-in-git.html' title='CVS&apos;s problems resurface in Git'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-6475401455074061265</id><published>2010-05-05T22:42:00.002+01:00</published><updated>2010-05-05T23:55:27.829+01:00</updated><title type='text'>The trouble with Buildbot</title><content type='html'>The trouble with &lt;a href="http://buildbot.net"&gt;Buildbot&lt;/a&gt; is that it encourages you to put rules into a Buildbot-specific build configuration that is separate from the normal configuration files that you might use to build a project (configure scripts, makefiles, etc.).

&lt;p&gt;
This is not a big problem if your Buildbot configuration is simple and just consists of, say, "svn up", "./configure", "make", "make test", and never changes.

&lt;p&gt;
But it is a problem if your Buildbot configuration becomes non-trivial and ever has to be updated, because the Buildbot configuration cannot be tested outside of Buildbot.

&lt;p&gt;
The last time I had to maintain a Buildbot setup, it was necessary to try out configuration changes directly on the Buildbot master.  This doesn't work out well if multiple people are responsible for maintaining the setup!  Whoever makes a change has to remember to check it in to version control after they've got it working, which of course doesn't always happen.  It's a bit ironic that Buildbot is supposed to support automated testing but doesn't follow best practices for testing itself.

&lt;p&gt;
There is a simple way around this though:  Instead of putting those separate steps -- "./configure", "make", "make test" -- into the Buildbot config, put them into a script, check the script into version control, and have the Buildbot config run that script.  Then the Buildbot config just consists of doing "svn up" and running the script.  It is then possible to test changes to the script before checking it in.  I've written scripts like this that go as far as debootstrapping a fresh Ubuntu chroot to run tests in, which ensures your package dependency list is up to date.

&lt;p&gt;
Unfortunately, Buildbot's logging facilities don't encourage having a minimal Buildbot config.

&lt;p&gt;
If you use a complicated Buildbot configuration with many Buildbot steps, Buildbot can display each step separately in its HTML-formatted logs.  This means:
&lt;ul&gt;
&lt;li&gt; you can see progress;
&lt;li&gt; you can see which steps have failed;
&lt;li&gt; you'd be able to see how long the steps take if Buildbot actually displayed that.
&lt;/ul&gt;

&lt;p&gt;
Whereas if you have one big build script in a single Buildbot step, all the output goes into one big, flat, plain text log file.

&lt;p&gt;
I think the solution is to decouple the structured-logging functionality from the glorified-cron functionality that Buildbot provides.  My implementation of this is &lt;a href="http://svn.gna.org/viewcvs/plash/trunk/build-tools/build_log.py?rev=874&amp;view=auto"&gt;build_log.py&lt;/a&gt; (used in Plash's build scripts), which I'll write more about later.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-6475401455074061265?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/6475401455074061265/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=6475401455074061265' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6475401455074061265'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6475401455074061265'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/05/trouble-with-buildbot.html' title='The trouble with Buildbot'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-649405965458119143</id><published>2010-05-01T12:59:00.002+01:00</published><updated>2010-05-01T13:03:21.132+01:00</updated><title type='text'>Breakpoints in gdb using int3</title><content type='html'>Here is a useful trick I discovered recently while debugging some changes to the &lt;a href="http://code.google.com/p/seccompsandbox/"&gt;seccomp sandbox&lt;/a&gt;.  To trigger a breakpoint on x86, just do:

&lt;pre&gt;
__asm__("int3");
&lt;/pre&gt;

&lt;p&gt;
Then it is possible to inspect registers, memory, the call stack, etc. in gdb, without having to get gdb to set a breakpoint.  The instruction triggers a SIGTRAP, so if the process is not running under gdb and has no signal handler for SIGTRAP, the process will die.

&lt;p&gt;
This technique seems to be &lt;a href="http://sourceware.org/ml/gdb/2009-02/msg00001.html"&gt;fairly well known&lt;/a&gt;, although it's not mentioned in the gdb documentation.  int3 is the instruction that gdb uses internally for setting breakpoints.

&lt;p&gt;
Sometimes it's easier to insert an int3 and rebuild than get gdb to set a breakpoint.  For example, setting a gdb breakpoint on a line won't work in the middle of a chunk of inline assembly.

&lt;p&gt;
My expectations of gdb are pretty low these days.  When I try to use it to debug something low level, it often doesn't work, which is why I have been motivated to hack together my own &lt;a href="http://lackingrhoticity.blogspot.com/2009/05/really-simple-tracing-debugger-part-2.html"&gt;debugging tools&lt;/a&gt; in the past.  For example, if I run gdb on the glibc dynamic linker (ld.so) on Ubuntu Hardy or Karmic, it gives:

&lt;pre&gt;
$ gdb /lib/ld-linux-x86-64.so.2 
...
(gdb) run
Starting program: /lib/ld-linux-x86-64.so.2 
Cannot access memory at address 0x21ec88
(gdb) 
&lt;/pre&gt;

&lt;p&gt;
So it's nice to find a case where I can get some useful information out of gdb.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-649405965458119143?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/649405965458119143/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=649405965458119143' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/649405965458119143'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/649405965458119143'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/05/breakpoints-in-gdb-using-int3.html' title='Breakpoints in gdb using int3'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-101099774012859351</id><published>2010-02-01T14:48:00.003Z</published><updated>2010-02-01T15:32:14.633Z</updated><title type='text'>How to build adb, the Android debugger</title><content type='html'>&lt;a href="http://developer.android.com/guide/developing/tools/adb.html"&gt;adb&lt;/a&gt; is the Android debugger (officially the "Android debug bridge" I think).  It is a tool for getting shell access to an Android phone across a USB connection.  It can also be used to copy files to and from the Android device and do port-forwarding.  In short, it is similar to ssh, but is &lt;em&gt;not&lt;/em&gt; ssh.  (Why couldn't they have just used ssh?)

&lt;p&gt;
I have not been able to find any Debian/Ubuntu packages for adb.  The reason why it has not been packaged becomes apparent if you try to build the thing.  Android has a &lt;a href="http://lackingrhoticity.blogspot.com/2009/12/modular-vs-monolithic-build-systems.html"&gt;monolithic build system&lt;/a&gt; which wants to download a &lt;a href="http://android.git.kernel.org/"&gt;long list of Git repositories&lt;/a&gt; and build everything.  If you &lt;a href="http://source.android.com/download"&gt;follow the instructions&lt;/a&gt;, it will download 1.1Gb from Git and leave you with a source+build directory of 6Gb.  It isn't really designed for building subsets of components, unlike, say, &lt;a href="http://live.gnome.org/Jhbuild"&gt;JHBuild&lt;/a&gt;.  It's basically a huge makefile.  It doesn't know about dependencies between components.  However, it does have some idea about dependencies between output files.

&lt;p&gt;
Based on a build-of-all-Android, I figured out how to build a much smaller subset containing adb.  This downloads a more manageable 11Mb and finishes with a source+build directory containing 40Mb.  This is also preferable to downloading the pre-built Android SDK, which has &lt;a href="http://developer.android.com/sdk/terms.html"&gt;a non-free licence&lt;/a&gt;.

&lt;p&gt;
Instructions:

&lt;pre&gt;
$ sudo apt-get install build-essential libncurses5-dev
$ git clone git://android.git.kernel.org/platform/system/core.git system/core
$ git clone git://android.git.kernel.org/platform/build.git build
$ git clone git://android.git.kernel.org/platform/external/zlib.git external/zlib
$ git clone git://android.git.kernel.org/platform/bionic.git bionic
$ echo "include build/core/main.mk" &gt;Makefile
&lt;/pre&gt;

&lt;p&gt;
Now edit build/core/main.mk and comment out the parts labelled
&lt;pre&gt;
 # Check for the correct version of java
&lt;/pre&gt;
and
&lt;pre&gt;
 # Check for the correct version of javac
&lt;/pre&gt;
Since adb doesn't need Java, these checks are unnecessary.

&lt;p&gt;
Also edit build/target/product/sdk.mk and comment out the "include" lines after
&lt;pre&gt;
 # include available languages for TTS in the system image
&lt;/pre&gt;
I don't know exactly what this is about but it avoids having to download language files that aren't needed for adb.

Then building the adb target should work:

&lt;pre&gt;
make out/host/linux-x86/bin/adb
&lt;/pre&gt;

If you try running "adb shell" you might get this:

&lt;pre&gt;
ubuntu$ ./out/host/linux-x86/bin/adb shell
* daemon not running. starting it now *
* daemon started successfully *
error: insufficient permissions for device
&lt;/pre&gt;

So you probably need to do "adb start-server" as root first:

&lt;pre&gt;
ubuntu$ sudo ./out/host/linux-x86/bin/adb kill-server
ubuntu$ sudo ./out/host/linux-x86/bin/adb start-server
* daemon not running. starting it now *
* daemon started successfully *
ubuntu$ ./out/host/linux-x86/bin/adb shell
$
&lt;/pre&gt;

For the record, here are the errors I got that motivated each step:

&lt;ul&gt;

&lt;li&gt;
&lt;pre&gt;
make: *** No rule to make target `external/svox/pico/lang/PicoLangItItInSystem.mk'.  Stop.
&lt;/pre&gt;
- hence commenting out the picolang includes.

&lt;li&gt;
&lt;pre&gt;
system/core/libzipfile/zipfile.c:6:18: error: zlib.h: No such file or directory
&lt;/pre&gt;
- I'm guessing adb needs libzipfile which needs zlib.

&lt;li&gt;
&lt;pre&gt;
system/core/libcutils/mspace.c:59:50: error: ../../../bionic/libc/bionic/dlmalloc.c: No such file or directory
&lt;/pre&gt;
- This is why we need to download bionic (the C library used on Android), even though we aren't building any code to run on an Android device.  This is the ugliest part and it illustrates why this is not a modular build system.  The code does
&lt;pre&gt;
#include "../../../bionic/libc/bionic/dlmalloc.c"
&lt;/pre&gt;
to #include a file from another module.  It seems any part of the build can refer to any other part, via relative pathnames, so the modules cannot be build separately.  I don't know whether this is an isolated case, but it makes it difficult to put adb into a Debian package.

&lt;li&gt;
&lt;pre&gt;
host Executable: adb (out/host/linux-x86/obj/EXECUTABLES/adb_intermediates/adb)
/usr/bin/ld: cannot find -lncurses
&lt;/pre&gt;
- hence the ncurses-dev dependency above.  However, this error is a little odd because if adb really depended on ncurses, it would fail when it tried to #include a header file.  Linking with "-lncurses" is probably superfluous.

&lt;/ul&gt;


&lt;p&gt;
The instructions above will probably stop working as new versions are pushed to the public Git branches.  (However, this happens infrequently because Android development is not done in the open.)  For reproducibility, here are the Git commit IDs:

&lt;pre&gt;
$ find -name "*.git" -exec sh -c 'echo "`git --git-dir={} rev-parse HEAD` {}"' ';'
91a54c11cbfbe3adc1df2f523c75ad76affb0ae9 ./system/core/.git
95604529ec25fe7923ba88312c590f38aa5e3d9e ./bionic/.git
890bf930c34d855a6fbb4d09463c1541e90381d0 ./external/zlib/.git
b7c844e7cf05b4cea629178bfa793321391d21de ./build/.git
&lt;/pre&gt;

It looks like the current version is Android 1.6 (Donut):

&lt;pre&gt;
$ find -name "*.git" -exec sh -c 'echo "`git --git-dir={} describe` {}"' ';'
android-1.6_r1-80-g91a54c1 ./system/core/.git
android-1.6_r1-43-g9560452 ./bionic/.git
android-1.6_r1-7-g890bf93 ./external/zlib/.git
android-sdk-1.6-docs_r1-65-gb7c844e ./build/.git
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-101099774012859351?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/101099774012859351/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=101099774012859351' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/101099774012859351'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/101099774012859351'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/02/how-to-build-adb-android-debugger.html' title='How to build adb, the Android debugger'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-3587286408912446512</id><published>2010-01-25T13:06:00.004Z</published><updated>2010-01-26T00:03:16.155Z</updated><title type='text'>Why system calls should be message-passing</title><content type='html'>Message-passing syscalls on Linux include read(), write(), sendmsg() and recvmsg().  These are message-passing because:
&lt;ol&gt;
 &lt;li&gt; They take a file descriptor as an explicit argument.  This specifies the object to send a message to or receive a message from.
 &lt;li&gt; The message to send (or receive) consists of an array of bytes, and maybe an array of file descriptors too (via SCM_RIGHTS).  The syscall interacts with the process's address space (or file descriptor table) in a well-defined, uniform way.  The caller specifies which locations are read or written.  The syscall acts &lt;em&gt;as if&lt;/em&gt; it takes a copy of the message.
&lt;/ol&gt;

&lt;p&gt;Linux has a lot of syscalls that are not message-passing because the object they operate on is not specified explicitly through a reference that authorises use of the object (such as a file descriptor).  Instead they operate using the process's ambient authority.  Examples:

&lt;ul&gt;
 &lt;li&gt; open(), stat(), etc.: These operate on the file namespace (a combination of the process's root directory, current directory, and mount table; and for /proc the contents of the file namespace is also influenced by the process's identity).
 &lt;li&gt; kill(), ptrace():  These operate on the process ID namespace.  (Unlike file descriptors, process IDs are not strong references.  The mapping from process IDs to processes is ambiently available.)
 &lt;li&gt; mmap(), mprotect():  These operate on the process's address space, which is not a first class object.
&lt;/ul&gt;

Here are some advantages of implementing syscalls on top of a message-passing construct:

&lt;ol&gt;
&lt;li&gt;&lt;em&gt;
It allows syscalls to be intercepted.
&lt;/em&gt;

&lt;p&gt;
  Suppose that open() were just a library call implemented using sendmsg()/recvmsg() (as in &lt;a href="http://plash.beasts.org"&gt;Plash&lt;/a&gt;).  It would send a message to a file namespace object (named via a file descriptor).  This object can be replaced in order to tame the huge amount of authority that open() usually provides.

&lt;li&gt;&lt;em&gt;
It allows syscalls to be disabled.
&lt;/em&gt;

&lt;p&gt;
  open() could be disabled by providing a file namespace object that doesn't implement an open() method, or by not providing a file namespace object.

&lt;li&gt;&lt;em&gt;
It can avoid race conditions in filtering syscalls.
&lt;/em&gt;

&lt;p&gt;
  In the past people have attempted to use ptrace() to sandbox processes and give them limited access to the filesystem, by checking syscalls such as open() and allowing them through selectively (&lt;a href="http://subterfugue.org/"&gt;Subterfugue&lt;/a&gt; is an example).  This is difficult or impossible to do securely because of a &lt;a href="http://en.wikipedia.org/wiki/Time-of-check-to-time-of-use"&gt;TOCTTOU&lt;/a&gt; race condition.  open() doesn't take a filename; it takes &lt;em&gt;an address&lt;/em&gt;, in the current process's address space, of a filename.  It is not enough to catch the start of the open() syscall, check the filename, and allow the syscall through.  Another thread might change the filename in the mean time.  (This is aside from the race conditions involved in interpreting symlinks.)

&lt;p&gt;
  &lt;a href="http://www.systrace.org/"&gt;Systrace&lt;/a&gt; went to some trouble to copy filenames in the kernel to allow a tracing process to see and provide a consistent snapshot.  This would have been less ad-hoc if the kernel had a uniform message-passing system.

&lt;p&gt;
  See &lt;a href="http://www.watson.org/~robert/2007woot/"&gt;"Exploiting Concurrency Vulnerabilities in System Call Wrappers"&lt;/a&gt;.

&lt;li&gt;&lt;em&gt;
It aids logging of syscalls.
&lt;/em&gt;

&lt;p&gt;
  On Linux, strace needs to have logic for interpreting every single syscall, because each syscall passes arguments in different ways, including how it reads and writes memory and the file descriptor table.

&lt;p&gt;
  If all syscalls went through a common message-passing interface, strace would only need one piece of logic for recording what was read or written.  Furthermore, logging could be separated from decoding and formatting (such as turning syscall numbers into names).

&lt;li&gt;&lt;em&gt;
It allows consistency of code paths in the kernel, avoiding bugs.
&lt;/em&gt;

&lt;p&gt;
  Mac OS X had a &lt;a href="http://butnotyet.tumblr.com/post/175132533/the-story-of-a-simple-and-dangerous-kernel-bug"&gt;vulnerability in the TIOCGWINSZ ioctl()&lt;/a&gt;, which reads the width and height of a terminal window.  The bug was that it would write directly to the address provided by the process, without checking whether the address was valid.  This allowed any process to take over the kernel by writing to kernel memory.

&lt;p&gt;
  This wouldn't happen if ioctl() were message-passing, because all writing to the process's address space would be done in one place, in the syscall's return path.  Forgetting the check would be much less likely.

&lt;p&gt;
  This bug demonstrates why ioctl() is dangerous.  ioctl() should really be considered as a (huge) family of syscalls, not a single syscall, because each ioctl number (such as TIOCGWINSZ) can read or write address space, and sometimes the file descriptor table, in a different way.

&lt;li&gt;&lt;em&gt;
It enables implementations of interfaces to be moved between the kernel and userland.
&lt;/em&gt;

&lt;p&gt;
  If the mechanism used to talk to the kernel is the same as the mechanism used to talk to other userland processes, processes should be agnostic as to whether the interfaces they use are implemented by the kernel.

&lt;p&gt;
  For example, &lt;a href="http://en.wikipedia.org/wiki/NLTSS"&gt;NLTSS&lt;/a&gt; allowed the filesystem to be in-kernel (faster) or in a userland process (more robust and secure).  So it was possible to flip a switch to trade off speed and robustness.

&lt;li&gt;&lt;em&gt;
It allows implementations of interfaces to be in-process too.
&lt;/em&gt;

&lt;p&gt;
  This allows further performace tradeoffs.  The pathname lookup logic of open() can be moved between the calling process and a separate process.  For speed, pathname lookup can be placed in the process that implements the filesystem (as in Plash currently) in order to avoid doing a cross-process call for each pathname element.  Alternatively, pathname lookup can be done in libc (as in the Hurd).

&lt;li&gt;&lt;em&gt;
It can help with versioning of system interfaces.
&lt;/em&gt;

&lt;p&gt;
  Stable interfaces are nice, but the ability to evolve interfaces is nice too.

&lt;p&gt;
  Using object-based message-passing interfaces instead of raw syscalls can help with that.  You can introduce new objects, or add new methods to existing objects.  Old, obsolete interfaces can be defined in terms of new interfaces, and transparently implemented outside the kernel.  New interfaces can be exposed selectively rather than system-wide.

&lt;li&gt;&lt;em&gt;
It does not have to hurt performance.
&lt;/em&gt;

&lt;p&gt;
  Objects can still be implemented in the kernel.  For example, in &lt;a href="http://www.eros-os.org/"&gt;EROS&lt;/a&gt; (and &lt;a href="http://en.wikipedia.org/wiki/KeyKOS"&gt;KeyKOS&lt;/a&gt;/&lt;a href="http://www.capros.org/"&gt;CapROS&lt;/a&gt;/&lt;a href="http://www.coyotos.org/"&gt;Coyotos&lt;/a&gt;), various object types are implemented by the kernel, but are invoked through the same capability invocation mechanism as userland-implemented objects.

&lt;p&gt;
  Object invocations can be synchronous (call-return).  They do not have to go via an asynchronous message queue.  The kernel can provide a send-message-and-wait-for-reply syscall that is equivalent to a sendmsg()+recvmsg() combo but faster.  &lt;a href="http://en.wikipedia.org/wiki/L4_microkernel_family"&gt;L4&lt;/a&gt; and EROS provide syscalls like this.

&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-3587286408912446512?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/3587286408912446512/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=3587286408912446512' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3587286408912446512'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3587286408912446512'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2010/01/why-system-calls-should-be-message.html' title='Why system calls should be message-passing'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-5138322273230482528</id><published>2009-12-07T14:19:00.000Z</published><updated>2009-12-07T14:20:01.445Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='packaging'/><title type='text'>Modular vs. monolithic build systems</title><content type='html'>I have spent some time packaging software as Debian packages.  While the Debian packaging system has its faults, it has the nice property that it is modular.  This post is an attempt to articulate what aspects of Debian packaging -- and other modular build systems -- are worth replicating, and why it is worth co-operating with these systems rather than ignoring them or working around them.

&lt;p&gt;
What is a modular build?

&lt;ol&gt;
&lt;li&gt; A modular build consists of a set of modules,
&lt;li&gt; each of which can be built separately.
&lt;li&gt; Each module's build produces some output (a directory tree).
&lt;li&gt; A module may depend on the outputs of other modules, but it can't reach inside the others' build trees.
&lt;li&gt; There is a common interface that each module provides for building itself.
&lt;li&gt; The build tool can be replaced with another.  The description of the module set is separate from the modules themselves.
&lt;/ol&gt;

&lt;p&gt;
What is a non-modular, monolithic build?
&lt;ol&gt;
&lt;li&gt; A monolithic build consists of one big build tree.
&lt;li&gt; Any part of the build can reference any other part via relative filenames.
&lt;li&gt; It might consist of multiple checkouts from version control, but they have to be checked out to specific directory tree locations (as in the Chromium build).
&lt;/ol&gt;

&lt;p&gt;
Some examples of modular builds:

&lt;ul&gt;
&lt;li&gt; Build systems/tools:
  &lt;ul&gt;
    &lt;li&gt; Debian packages (and presumably RPMs too)
    &lt;li&gt; JHBuild
    &lt;li&gt; Zero-Install
    &lt;li&gt; Nix
  &lt;/ul&gt;

&lt;li&gt;Module interfaces:
  &lt;ul&gt;
    &lt;li&gt; GNU autotools (./configure &amp;&amp; make &amp;&amp; make install)
    &lt;li&gt; Python distutils (setup.py)
  &lt;/ul&gt;

&lt;li&gt;Software collections:
  &lt;ul&gt;
    &lt;li&gt; GNOME
    &lt;li&gt; Xorg (7.0 onwards)
    &lt;li&gt; Sugar
  &lt;/ul&gt;
&lt;/ul&gt;

Examples of monolithic builds:
&lt;ul&gt;
&lt;li&gt; XFree86 (and Xorg 6.9):  Before Xorg was modularised, there was a big makefile that built everything, from Xlib to the X server to example X clients.
&lt;li&gt; Chromium web browser:  This uses a tool called "gyp" to generate a big makefile which compiles individual source files from several libraries, including WebKit, V8 and the Native Client IPC library.  It ignores WebKit's own build system.
&lt;li&gt; Native Client:  One SCons build builds the core code as well as the NPAPI browser plugin and example code; it needs to know how to cross-compile NaCl code as well as compile host system code.  Another makefile builds the compiler toolchain from tarballs and patch files that are checked into SVN.
&lt;li&gt; CPython:  The standard library builds many Python C extensions.
&lt;/ul&gt;

&lt;p&gt;
Modular build systems offer a number of advantages:
&lt;ul&gt;
&lt;li&gt; You can download and build only the parts you need.  This can be a big help if some modules are huge but seldom change while the modules you work on are small and fast to build.
&lt;li&gt; Some systems (such as Debian packages) give you binary packages so you don't need to build the dependencies of the modules that you want to work on.  JHBuild doesn't provide this but it could be achieved with a little work.
&lt;li&gt; Dependencies are clearer.
&lt;li&gt; External interfaces are clearer too.
&lt;li&gt; It is possible to change one module's version independently of other modules (to the extent that differing versions are compatible).
&lt;li&gt; They are relatively easy to use in a decentralised way.  It is easy to create a new version of a module set which adds or removes modules.
&lt;li&gt; You don't have to check huge dependencies into your version control system.  Some projects check in monster tarballs or source trees, which dwarf the project's own code.  If you avoid this practice you will make it easier for distributions to package your software.
&lt;/ul&gt;

&lt;p&gt;
The two categories can coexist:  Each module may internally be a monolithic build which can be arbitrarily complex.  Autotools is an example of that.  This is not too bad because at least we have contained the complexity within the module.  The layer on top, which connects modules together, can be relatively simple.

&lt;p&gt;
Despite its faults, autotools is very amenable to being part of a modular build:
&lt;ul&gt;
&lt;li&gt; The build tree does not need to be kept around after doing "make install".
&lt;li&gt; Output can be directed using "--prefix=foo" and "make install DESTDIR=foo".
&lt;li&gt; Inputs can be specified via --prefix and PATH and other environment variables.
&lt;li&gt; The build tree can be separate from the source tree.  It's easy to have multiple build trees with different build options.
&lt;/ul&gt;


&lt;p&gt;
The systems I listed as modular all have their own problems.  The main problem with Debian packages is that they are installed system-wide, which requires root access and makes it difficult to install multiple versions of a package.  It is possible to work around this problem using chroots.  JHBuild, Zero-Install and Nix avoid this problem.  JHBuild and Zero-Install are not so good at capturing immutable snapshots of package sets.  Nix is good at capturing snapshots, but Nix makes it difficult to change a library without rebuilding everything that uses it.

&lt;p&gt;
Despite these problems, these systems have a nice property: they are layered.  It is possible to mix and match modules and replace the build layer.  Hence it is possible to build Xorg and GNOME either with JHBuild or as Debian packages.  In turn, there is a choice of tools for building Debian source packages.  There is even a tool for making sets of Debian packages from JHBuild module descriptions.

&lt;p&gt;
These systems do not interoperate perfectly, but they do work and scale.

&lt;p&gt;
There are some arguments for having a monolithic system.  In some situations it is difficult to split pieces of software into separately-built modules.  For example, Plash-glibc is currently built by symlinking the source for the Plash IPC library into the glibc source tree, so that glibc builds it with the correct compiler flags and with the glibc internal header files.  Ideally the IPC library would be built as a separate module, but for now it is better not to.

&lt;p&gt;
Still, if you can find good module boundaries, it is a good idea to take advantage of them.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-5138322273230482528?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/5138322273230482528/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=5138322273230482528' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5138322273230482528'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5138322273230482528'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/12/modular-vs-monolithic-build-systems.html' title='Modular vs. monolithic build systems'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-742557273792207321</id><published>2009-12-01T20:06:00.004Z</published><updated>2009-12-01T23:52:24.503Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>read_file() and write_file() in Python</title><content type='html'>Here are two functions I would like to see in the Python standard library:

&lt;pre&gt;
def read_file(filename):
  fh = open(filename, "r")
  try:
      return fh.read()
  finally:
      fh.close()

def write_file(filename, data):
  fh = open(filename, "w")
  try:
      fh.write(data)
  finally:
      fh.close()
&lt;/pre&gt;

&lt;p&gt;
In the first case, it is &lt;em&gt;possible&lt;/em&gt; to do &lt;tt&gt;open(filename).read()&lt;/tt&gt; instead of &lt;tt&gt;read_file(filename)&lt;/tt&gt;.  However, this is not good practice because if you are using a Python implementation that uses a garbage collection implementation other than reference counting, there will be a delay before the Python file object, and the file descriptor it wraps, are freed.  So you will temporarily leak a scarce resource.

&lt;/p&gt;&lt;p&gt;
I have sometimes argued that GC should really know about file descriptors:  If the open() syscall fails with EMFILE or ENFILE, the Python standard library should run GC and retry.  (However this is probably not going to work for reliably receiving FDs in messages using recvmsg().)  But in the case of read_file() it is really easy to inform the system of the lifetime of the FD using close(), so there is no excuse not to do so!

&lt;/p&gt;&lt;p&gt;
In the second case, again it is &lt;em&gt;possible&lt;/em&gt; to do &lt;tt&gt;open(filename, "w").write(data)&lt;/tt&gt; instead of &lt;tt&gt;write_file(filename, data)&lt;/tt&gt;, but in the presence of non-refcounting GC it will not only temporarily leak an FD, it will also do the wrong thing!  The danger is that the file's contents will be only partially written, because Python file objects are buffered, and the buffer is not flushed until you do close() or flush().  If you use open() this way and test only on CPython, you won't find out that your program breaks on Jython, IronPython or PyPy (all of which normally use non-refcounting GC, as far as I know).

&lt;/p&gt;&lt;p&gt;
Maybe CPython should be changed so that the file object's destructor does not flush the buffer.  That would break some programs but expose the problem.  Maybe it could be optional.

&lt;/p&gt;&lt;p&gt;
read_file() and write_file() calls also look nicer.  I don't want the ugliness of open-coding these functions.

&lt;/p&gt;&lt;p&gt;
Don't write code like this:

&lt;/p&gt;&lt;pre&gt;
fh = open(os.path.join(temp_dir, "foo"), "w")
fh.write(blah)
fh.close()
fh = open(os.path.join(temp_dir, "bar"), "w")
fh.write(stuff)
fh.close()
&lt;/pre&gt;

&lt;p&gt;
Instead, write it like this:

&lt;/p&gt;&lt;pre&gt;
write_file(os.path.join(temp_dir, "foo"), blah)
write_file(os.path.join(temp_dir, "bar"), stuff)
&lt;/pre&gt;

&lt;p&gt;
Until Python includes these functions in the standard library, I will copy and paste them into every Python codebase I work on in protest!  They are too trivial, in my opinion, to be worth the trouble of depending on an external library.

&lt;/p&gt;&lt;p&gt;
I'd like to see these functions as builtins, because they should be as readily available as open(), for which they are alternatives.  However if they have to be imported from a module I won't complain!
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-742557273792207321?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/742557273792207321/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=742557273792207321' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/742557273792207321'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/742557273792207321'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/12/readfile-and-writefile-in-python.html' title='read_file() and write_file() in Python'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-4510566911954263374</id><published>2009-11-18T06:13:00.002Z</published><updated>2009-12-01T20:03:22.388Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>CapPython revisited</title><content type='html'>[I've had a draft of this post floating around since June, and it's about time I posted it.]

&lt;p&gt;
Guido van Rossum's &lt;a href="http://mail.python.org/pipermail/python-ideas/2009-March/003821.html"&gt;criticisms of CapPython&lt;/a&gt; from earlier in the year are largely right, though I would put the emphasis differently.  A lot of existing Python code will not pass CapPython's static verifier.  But I was not expecting large amounts of code to pass the verifier.

&lt;p&gt;
CapPython was an experiment to see if Python could be tamed using only a static verifier - similar to &lt;a href="http://code.google.com/p/joe-e/"&gt;Joe-E&lt;/a&gt;, a subset of Java - without resorting to automated rewriting of code.  I had a hunch that it was possible.  I also wanted a concrete system to show why I preferred to avoid some of the Python constructs that CapPython prohibits.  Some of Joe-E's &lt;a href="http://www.cs.berkeley.edu/~daw/joe-e/spec-20080203.pdf"&gt;restrictions&lt;/a&gt; are also quite severe, such as prohibiting &lt;tt&gt;try...finally&lt;/tt&gt; in order to &lt;a href="http://www.cs.berkeley.edu/~daw/papers/pure-ccs08.pdf"&gt;prevent non-deterministic execution&lt;/a&gt; (something, incidentally, that CapPython does not try to do).  It turned out that it was better to abandon the goal of being static-only when it came to &lt;a href="http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builtins-in.html"&gt;loading Python modules&lt;/a&gt;.

&lt;p&gt;
My goal was that CapPython code should look like fairly idiomatic Python code (and be able to run directly under Python), &lt;em&gt;not&lt;/em&gt; that all or most idiomatic Python code would be valid CapPython code.  As a consequence, I didn't want CapPython to require special declarations, such as decorators on classes, for making objects encapsulated.  Instead I wanted objects to be encapsulated by default.  Maybe that was an arbitrary goal.

&lt;p&gt;
CapPython managed to meet that goal, but only partially, and only by &lt;a href="http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-python-30.html"&gt;depending on Python 2.x's unbound methods feature&lt;/a&gt;, which was removed in Python 3.0.

&lt;p&gt;
The reason my goal was met only partially is that in some circumstances it is necessary to wrap class objects (using a decorator such as &lt;a href="http://lackingrhoticity.blogspot.com/2009/06/encapsulation-in-python-two-approaches.html"&gt;@sealed in my earlier post&lt;/a&gt;) so that the class's constructor is exposed but inheritance is blocked.  Whether this is necessary depends on whether the class object is authorityless.  A class object is authorityless if it does not reference any authority-carrying objects (directly, or indirectly through objects captured by method function closures).  If a class object is authorityless, it should be safe to grant other, untrusted modules the ability to create derived classes.  Otherwise, the class object ought to be wrapped.  The problem is that it is not trivial to make this determination.  Furthermore, permitting inheritance is the default, and this is not a safe default.

&lt;p&gt;
If we have to go around putting @sealed decorators on all class definitions because the non-@sealed default is not-quite-encapsulated, we may as well consider alternatives where the default is not-at-all-encapsulated.

&lt;p&gt;
The main alternative is to resurrect CPython's restricted mode and add a wrapper type for making encapsulated objects (i.e. a non-broken version of Bastion along the lines of my &lt;a href="http://lackingrhoticity.blogspot.com/2009/06/encapsulation-in-python-two-approaches.html"&gt;earlier post&lt;/a&gt;).

&lt;p&gt;
If modifying CPython is considered acceptable, and one's goal is to add object-capabilities to Python in the most practical way (rather than to investigate a specific approach), resurrecting restricted mode is probably a better way to do it than CapPython.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-4510566911954263374?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/4510566911954263374/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=4510566911954263374' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/4510566911954263374'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/4510566911954263374'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/11/cappython-revisited.html' title='CapPython revisited'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-2031520730555073529</id><published>2009-09-28T22:34:00.003+01:00</published><updated>2009-09-28T23:26:09.232+01:00</updated><title type='text'>A combined shell and terminal</title><content type='html'>Last year I wrote about a couple of &lt;a href="http://lackingrhoticity.blogspot.com/2008/11/shell-features.html"&gt;features that I'd like to see&lt;/a&gt; in a hypothetical integrated shell and terminal.  

&lt;p&gt;
Since then I have implemented my idea!  The result is &lt;a href="http://plash.beasts.org/wiki/CoconutShell"&gt;Coconut Shell&lt;/a&gt;.  It is a combined shell and terminal.  It is intended to look and behave much the same as Bash + GNOME Terminal.  I am now using it as my main shell/terminal.

&lt;p&gt;
I find the most useful feature is finish notifications: When a long-running command finishes, it flashes the window's task bar icon.  A trivial example:

&lt;p&gt;
&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_IE-Ip3EU3_Y/SsEsQhfz2_I/AAAAAAAAACc/JRVpd3tIguw/s1600-h/highlight-screenshot.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 149px;" src="http://1.bp.blogspot.com/_IE-Ip3EU3_Y/SsEsQhfz2_I/AAAAAAAAACc/JRVpd3tIguw/s320/highlight-screenshot.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5386635291693145074" /&gt;&lt;/a&gt;

&lt;p&gt;
Now I no longer have to keep checking minimised windows to see whether a build has finished or whether packages have finished downloading and installing.

&lt;p&gt;
It works with tabs too:

&lt;p&gt;
&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_IE-Ip3EU3_Y/SsEshlzszdI/AAAAAAAAACk/aqkCbJfeteQ/s1600-h/highlight-screenshot-tabs.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 156px;" src="http://2.bp.blogspot.com/_IE-Ip3EU3_Y/SsEshlzszdI/AAAAAAAAACk/aqkCbJfeteQ/s320/highlight-screenshot-tabs.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5386635584908086738" /&gt;&lt;/a&gt;

&lt;p&gt;
This had an unforeseen benefit.  It helps when I'm testing GUI applications.  When I close down the GUI application, the tab that I ran the application from gets highlighted, which makes it easier to find the tab to re-run the app from.  I don't search through windows as much as I used to.  (I tend to have a lot of terminal windows and tabs open at a time!)

&lt;p&gt;
How it works:

&lt;p&gt;
Coconut Shell is implemented in Python.  The terminal and shell run in the same process.  It uses libvte to provide the terminal emulator widget -- the same as GNOME Terminal.  It uses &lt;a href="http://codespeak.net/pyrepl/"&gt;Pyrepl&lt;/a&gt; as a readline-alike to read commands from the terminal.

&lt;p&gt;
Normal shells rely on Unix process groups to stop backgrounded jobs from reading from the terminal.  Coconut Shell doesn't rely on process groups in the same way.  Instead, it creates a new TTY FD pair for every command it launches, and it forwards input and output to and from the terminal.  This is partly out of necessity (process groups get in the way of reading from multiple TTY FDs at the same time), and partly because it is more robust.  It stops jobs from interfering with each other and making themselves unbackgroundable.  It also demonstrates that Unix never needed the whole sorry mess of process groups and session IDs and the weirdness they inflict on TTY file descriptors, because job control can be done by forwarding IO instead.

&lt;p&gt;
There's more information on the &lt;a href="http://plash.beasts.org/wiki/CoconutShell"&gt;home page&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-2031520730555073529?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/2031520730555073529/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=2031520730555073529' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/2031520730555073529'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/2031520730555073529'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/09/combined-shell-and-terminal.html' title='A combined shell and terminal'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_IE-Ip3EU3_Y/SsEsQhfz2_I/AAAAAAAAACc/JRVpd3tIguw/s72-c/highlight-screenshot.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-7620886937073692406</id><published>2009-09-18T14:50:00.006+01:00</published><updated>2009-12-01T20:03:22.388Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Logging with stack traces</title><content type='html'>I find that postmortem debugging is useful (e.g. examining process state in gdb or pdb after a crash or traceback), but trying to debug using breakpoints is an exercise in tedium.  I prefer debugging using logging.  However, logs obtained just by adding printf statements tend to lack context information, and if you add lots of printfs the logs require more effort to read.

&lt;p&gt;
You can get context information in Python by doing traceback.print_stack() after every message, but the result tends to be unreadable.  Below is a tool which prints stack context in a more readable tree format.  I found it very useful a couple of months ago when investigating a bug in which a recursive invocation of a callback function caused failures later on.  Had I not seen the stack context, I might not have realised that the callback was being called recursively.

&lt;p&gt;
Here is some example output.  It comes from running the &lt;a href="http://allmydata.org"&gt;Tahoe&lt;/a&gt; client, while monkey-patching some socket functions to show which sockets the process is binding and connecting to.

&lt;pre style="font-size: xx-small"&gt;
0:/usr/bin/twistd:21:&amp;lt;module&gt;
 1:/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py:27:run
  2:/usr/lib/python2.5/site-packages/twisted/application/app.py:379:run
   3:/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py:23:runApp
    4:/usr/lib/python2.5/site-packages/twisted/application/app.py:158:run
     5:/usr/lib/python2.5/site-packages/twisted/scripts/_twistd_unix.py:213:postApplication
      6:/usr/lib/python2.5/site-packages/twisted/scripts/_twistd_unix.py:174:startApplication
       7:/usr/lib/python2.5/site-packages/twisted/application/service.py:228:privilegedStartService
        8:/usr/lib/python2.5/site-packages/twisted/application/service.py:228:privilegedStartService
         9:/usr/lib/python2.5/site-packages/twisted/application/service.py:228:privilegedStartService
          10:/usr/lib/python2.5/site-packages/twisted/application/internet.py:68:privilegedStartService
           11:/usr/lib/python2.5/site-packages/twisted/application/internet.py:86:_getPort
            12:/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py:467:listenTCP
             13:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:730:startListening
              14:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:718:createInternetSocket
               15:/usr/lib/python2.5/site-packages/twisted/internet/base.py:724:createInternetSocket
                16:tahoe-client.tac:55:__init__
                 create socket &amp;lt;Socket object at 0x8eb580c&gt; (2, 1)
              14:tahoe-client.tac:64:bind
               bind &amp;lt;Socket object at 0x8eb580c&gt; (('', 56530),)
foolscap.pb.Listener starting on 56530
         9:/usr/lib/python2.5/site-packages/twisted/application/service.py:228:privilegedStartService
          10:/usr/lib/python2.5/site-packages/twisted/application/internet.py:68:privilegedStartService
           11:/usr/lib/python2.5/site-packages/twisted/application/internet.py:86:_getPort
            12:/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py:467:listenTCP
             13:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:730:startListening
              14:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:718:createInternetSocket
               15:/usr/lib/python2.5/site-packages/twisted/internet/base.py:724:createInternetSocket
                16:tahoe-client.tac:55:__init__
                 create socket &amp;lt;Socket object at 0x8eb5844&gt; (2, 1)
              14:tahoe-client.tac:64:bind
               bind &amp;lt;Socket object at 0x8eb5844&gt; (('0.0.0.0', 3456),)
nevow.appserver.NevowSite starting on 3456
Starting factory &amp;lt;nevow.appserver.NevowSite instance at 0x8eb07ac&gt;
My pid: 27115
      6:/usr/lib/python2.5/site-packages/twisted/application/app.py:113:runReactorWithLogging
       7:/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py:220:run
        8:/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py:228:mainLoop
         9:/usr/lib/python2.5/site-packages/twisted/internet/base.py:561:runUntilCurrent
          10:/usr/lib/python2.5/site-packages/foolscap/eventual.py:26:_turn
           11:/usr/lib/python2.5/site-packages/allmydata/node.py:259:_startService
            12:/usr/lib/python2.5/site-packages/twisted/internet/defer.py:191:addCallback
             13:/usr/lib/python2.5/site-packages/twisted/internet/defer.py:182:addCallbacks
              14:/usr/lib/python2.5/site-packages/twisted/internet/defer.py:317:_runCallbacks
               15:/usr/lib/python2.5/site-packages/allmydata/node.py:259:&amp;lt;lambda&gt;
                16:/usr/lib/python2.5/site-packages/allmydata/util/iputil.py:93:get_local_addresses_async
                 17:/usr/lib/python2.5/site-packages/allmydata/util/iputil.py:137:get_local_ip_for
                  18:/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py:388:listenUDP
                   19:/usr/lib/python2.5/site-packages/twisted/internet/udp.py:84:startListening
                    20:/usr/lib/python2.5/site-packages/twisted/internet/udp.py:89:_bindSocket
                     21:/usr/lib/python2.5/site-packages/twisted/internet/base.py:724:createInternetSocket
                      22:tahoe-client.tac:55:__init__
                       create socket &amp;lt;Socket object at 0x8eb58ec&gt; (2, 2)
                     21:tahoe-client.tac:64:bind
                      bind &amp;lt;Socket object at 0x8eb58ec&gt; (('', 0),)
twisted.internet.protocol.DatagramProtocol starting on 53017
Starting protocol &amp;lt;twisted.internet.protocol.DatagramProtocol instance at 0x8eb0dcc&gt;
                  18:/usr/lib/python2.5/site-packages/twisted/internet/udp.py:181:connect
                   19:tahoe-client.tac:58:connect
                    connect &amp;lt;Socket object at 0x8eb58ec&gt; (('198.41.0.4', 7),)
(Port 53017 Closed)
Stopping protocol &amp;lt;twisted.internet.protocol.DatagramProtocol instance at 0x8eb0dcc&gt;
         9:/usr/lib/python2.5/site-packages/twisted/internet/base.py:561:runUntilCurrent
          10:/usr/lib/python2.5/site-packages/foolscap/eventual.py:26:_turn
           11:/usr/lib/python2.5/site-packages/twisted/internet/defer.py:239:callback
            12:/usr/lib/python2.5/site-packages/twisted/internet/defer.py:304:_startRunCallbacks
             13:/usr/lib/python2.5/site-packages/twisted/internet/defer.py:317:_runCallbacks
              14:/usr/lib/python2.5/site-packages/allmydata/client.py:136:_start_introducer_client
               15:/usr/lib/python2.5/site-packages/twisted/application/service.py:148:setServiceParent
                16:/usr/lib/python2.5/site-packages/twisted/application/service.py:260:addService
                 17:/usr/lib/python2.5/site-packages/allmydata/introducer/client.py:58:startService
                  18:/usr/lib/python2.5/site-packages/foolscap/pb.py:901:connectTo
                   19:/usr/lib/python2.5/site-packages/foolscap/reconnector.py:60:startConnecting
                    20:/usr/lib/python2.5/site-packages/foolscap/reconnector.py:87:_connect
                     21:/usr/lib/python2.5/site-packages/foolscap/pb.py:830:getReference
                      22:/usr/lib/python2.5/site-packages/twisted/internet/defer.py:107:maybeDeferred
                       23:/usr/lib/python2.5/site-packages/foolscap/pb.py:856:_getReference
                        24:/usr/lib/python2.5/site-packages/foolscap/pb.py:947:getBrokerForTubRef
                         25:/usr/lib/python2.5/site-packages/foolscap/negotiate.py:1332:connect
                          26:/usr/lib/python2.5/site-packages/foolscap/negotiate.py:1359:connectToAll
                           27:/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py:474:connectTCP
                            28:/usr/lib/python2.5/site-packages/twisted/internet/base.py:665:connect
                             29:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:885:_makeTransport
                              30:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:593:__init__
                               31:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:497:createInternetSocket
                                32:tahoe-client.tac:55:__init__
                                 create socket &amp;lt;Socket object at 0x8eb5a74&gt; (2, 1)
                           27:/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py:474:connectTCP
                            28:/usr/lib/python2.5/site-packages/twisted/internet/base.py:665:connect
                             29:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:885:_makeTransport
                              30:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:593:__init__
                               31:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:497:createInternetSocket
                                32:tahoe-client.tac:55:__init__
                                 create socket &amp;lt;Socket object at 0x8c59e64&gt; (2, 1)
         9:/usr/lib/python2.5/site-packages/twisted/internet/base.py:561:runUntilCurrent
          10:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:506:resolveAddress
           11:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:513:_setRealAddress
            12:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:539:doConnect
             13:tahoe-client.tac:61:connect_ex
              connect_ex &amp;lt;Socket object at 0x8eb5a74&gt; (('127.0.0.1', 45651),)
          10:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:506:resolveAddress
           11:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:513:_setRealAddress
            12:/usr/lib/python2.5/site-packages/twisted/internet/tcp.py:539:doConnect
             13:tahoe-client.tac:61:connect_ex
              connect_ex &amp;lt;Socket object at 0x8c59e64&gt; (('192.168.20.244', 45651),)
&lt;/pre&gt;

Here is the code for the logging tool:

&lt;pre&gt;
import sys

laststack = []

def tlog(*msgs):
    """Outputs a log message with stack trace context."""
    global laststack
    try:
        raise Exception()
    except Exception:
        frame = sys.exc_info()[2].tb_frame.f_back
    stack = extract(frame)
    for i, fr in enumerate(stack):
        if not (i &amp;lt; len(laststack) and laststack[i] is fr):
            lineno = fr.f_lineno
            co = fr.f_code
            filename = co.co_filename
            name = co.co_name
            print " " * i + ("%i:%s:%s:%s" % (i, filename, lineno, name))
    print " " * len(stack) + " ".join(map(str, msgs))
    laststack = stack

def extract(frame):
    got = []
    while frame is not None:
        got.append(frame)
        frame = frame.f_back
    got.reverse()
    return got
&lt;/pre&gt;

&lt;p&gt;
This makes use of the fact that Python's stack frames are heap-allocated, so they have identity, and it is possible to hold on to a stack frame object after the frame has exited.  This means it is possible to distinguish between two different invocations of the same function, even when the tracebacks are textually identical.  This would not be possible in C -- gdb would have to approximate.

&lt;p&gt;
For completeness, here is the monkey-patching code that I used:

&lt;pre&gt;
import socket

class Socket(socket.socket):
    def __init__(self, *args, **kwargs):
        tlog("create socket", self, args)
        super(Socket, self).__init__(*args, **kwargs)
    def connect(self, *args, **kwargs):
        tlog("connect", self, args)
        super(Socket, self).connect(*args, **kwargs)
    def connect_ex(self, *args, **kwargs):
        tlog("connect_ex", self, args)
        super(Socket, self).connect_ex(*args, **kwargs)
    def bind(self, *args, **kwargs):
        tlog("bind", self, args)
        super(Socket, self).bind(*args, **kwargs)

socket.socket = Socket
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-7620886937073692406?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/7620886937073692406/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=7620886937073692406' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7620886937073692406'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7620886937073692406'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/09/logging-with-stack-traces.html' title='Logging with stack traces'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-5208579857160639646</id><published>2009-08-31T19:00:00.002+01:00</published><updated>2009-12-01T20:03:22.388Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='automated-testing'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>How to do model checking of Python code</title><content type='html'>I read about &lt;a href="http://www.stanford.edu/~engler/explode-osdi06.pdf"&gt;eXplode&lt;/a&gt; in early 2007.  The paper describes a neat technique, inspired by model checking, for exhaustively exploring all execution paths of a program.  The difference from model checking is that it doesn't require a model of the program to be written in a special language; it can operate on a program written in a normal language.  The authors of the paper use the technique to check the correctness of Linux filesystems, and version control systems layered on top of them, in the face of crashes.

&lt;p&gt;
The basic idea is to have a special function, choose(), that performs a non-deterministic choice, e.g.

&lt;pre&gt;
   choose([1, 2, 3])
&lt;/pre&gt;

&lt;p&gt;
Conceptually, this forks execution into three branches, returning 1, 2 or 3 in each branch.

&lt;p&gt;
The program can call choose() multiple times, so you end up with a choice tree.

&lt;p&gt;
The job of the eXplode check runner is to explore that choice tree, and to try to find sequences of choices that make the program fail.

&lt;p&gt;
eXplode does not try to snapshot the state of the program when choose() is called.  (That's what other model checkers have tried to do, and it's what would happen if choose() were implemented using the Unix fork() call.)

&lt;p&gt;
Instead, eXplode restores the state of the program by running it from the start and replaying a previously-recorded sequence of choices by returning them one-by-one from choose().  A node in the choice tree is represented by the sequence of choices required to get to it, and so to explore that node's subtree the check runner will repeatedly run the program from the start, replaying the choices for that subtree.  The check runner does a breadth-first traversal of the choice tree, so it has a queue of nodes to visit, and each visit will add zero or more subnodes to the queue.  (See the code below for a concrete implementation.)

&lt;p&gt;
This replay technique means that the program must be deterministic, otherwise the choice tree won't get explored properly.  All non-determinism must occur through choose().

&lt;p&gt;
There seem to be two situations in which you might call choose():

&lt;ol&gt;

&lt;li&gt;You can call choose() from your test code to generate input.

&lt;p&gt;
In eXplode, the test program writes files and tries to read them back, and it uses choose() to pick what kind of writes it does.

&lt;p&gt;
So far I have used choose() in tests to generate input in two different ways:

&lt;p&gt;
In one case, I started with valid input and mangled it in different ways, to check the error handling of the software-under-test.  The test code used choose() to pick a file to delete or line to remove.  The idea was that the software-under-test should give a nicely-formatted error message, specifying the faulty file, rather than a Python traceback.  There was only a bounded number of faults to introduce so the choice tree was finite.

&lt;p&gt;
In another case, I built up XML trees to check that the software under test could process them.  In this case the choice tree is infinite.  I decided to prune it arbitrarily at a certain depth so that the test could run as part of a fairly quickly-running test suite.

&lt;p&gt;
In both these cases, using the eXplode technique wasn't strictly necessary.  It would have been possible to write the tests without choose(), perhaps by copying the inputs before modifying them.  But I think choose() makes the test more concise, and not all objects have interfaces for making copies.

&lt;li&gt;You can also call choose() from the software-under-test to introduce deliberate faults, or from a fake object that the software-under-test is instantiated with.

&lt;p&gt;
In eXplode, the filesystem is tested with a fake block device which tries to simulate all the possible failures modes of a real hard disc (note that &lt;a href="http://www.landley.net/notes.html#27-08-2009"&gt;Flash drives have different failure modes&lt;/a&gt;!).  The fake device uses choose() to decide when to simulate a crash and restart, and which blocks should be successfully written to the virtual device when the crash happens.

&lt;p&gt;
You can also wrap malloc() so that it non-deterministically returns NULL in order to test that software handles out-of-memory conditions gracefully.  Testing error paths is difficult to do and this looks like a good way of doing it.

&lt;p&gt;
In these cases, the calls to choose() can be buried deeply in the call stack, and the eXplode check runner is performing an inversion of control that would be hard to do in a conventional test case.

&lt;/ol&gt;

&lt;p&gt;
Here's an example of how to implement the eXplode technique in Python:

&lt;pre&gt;

import unittest


# If anyone catches and handles this, it will break the checking model.
class ModelCheckEscape(Exception):

    pass


class Chooser(object):

    def __init__(self, chosen, queue):
        self._so_far = chosen
        self._index = 0
        self._queue = queue

    def choose(self, choices):
        if self._index &amp;lt; len(self._so_far):
            choice = self._so_far[self._index]
            if choice not in choices:
                raise Exception("Program is not deterministic")
            self._index += 1
            return choice
        else:
            for choice in choices:
                self._queue.append(self._so_far + [choice])
            raise ModelCheckEscape()


def check(func):
    queue = [[]]
    while len(queue) &gt; 0:
        chosen = queue.pop(0)
        try:
            func(Chooser(chosen, queue))
        except ModelCheckEscape:
            pass
        # Can catch other exceptions here and report the failure
        # - along with the choices that caused it - and then carry on.


class ModelCheckTest(unittest.TestCase):

    def test(self):
        got = []

        def func(chooser):
            # Example of how to generate a Cartesian product using choose():
            v1 = chooser.choose(["a", "b", "c"])
            v2 = chooser.choose(["x", "y", "z"])
            # We don't normally expect the function being tested to
            # capture state - in this case the "got" list - but we
            # do it here for testing purposes.
            got.append(v1 + v2)

        check(func)
        self.assertEquals(sorted(got),
                          sorted(["ax", "ay", "az",
                                  "bx", "by", "bz",
                                  "cx", "cy", "cz"]))


if __name__ == "__main__":
    unittest.main()
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-5208579857160639646?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/5208579857160639646/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=5208579857160639646' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5208579857160639646'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5208579857160639646'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/08/how-to-do-model-checking-of-python-code.html' title='How to do model checking of Python code'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-822244683657603633</id><published>2009-06-28T20:32:00.001+01:00</published><updated>2009-12-01T20:03:22.388Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Encapsulation in Python: two approaches</title><content type='html'>Time for another post on how object-capability security might be done in Python.

&lt;p&gt;
Suppose function closures in Python were encapsulated (i.e. the attributes func_closure, func_globals etc. were removed) and were the only route for getting encapsulation for non-builtin objects, as is the case in Python's old restricted mode (a.k.a. &lt;a href="http://docs.python.org/library/rexec.html"&gt;rexec&lt;/a&gt;).  There would be two ways of defining new encapsulated objects.  Let's call them the Cajita approach and the Bastion approach.

&lt;p&gt;
The Cajita-style approach (named after the Cajita subset of Javascript) is to define an object using a record of function closures, one closure per method.  Here's an example, from a &lt;a href=""&gt;post from Ka-Ping Yee from 2003&lt;/a&gt;:

&lt;pre&gt;
def Counter():
    self = Namespace()
    self.i = 0

    def next():
        self.i += 1
        return self.i

    return ImmutableNamespace(next)
&lt;/pre&gt;

(There are some more examples - a read-only FileReader object, and Mint and Purse objects based on the example from &lt;a href="http://www.erights.org"&gt;E&lt;/a&gt; - in &lt;a href="http://tav.espians.com/update-on-securing-the-python-interpreter.html"&gt;Tav's post on object-capabilities in Python&lt;/a&gt;.)

&lt;p&gt;
This is instead of the far more idiomatic, but unencapsulated class definition:

&lt;pre&gt;
class Counter(object):

    def __init__(self):
        self._i = 0

    def next(self):
        self._i += 1
        return self._i
&lt;/pre&gt;

(Ignoring that the more idiomatic way to do this is to use a generator or itertools.count().)

&lt;p&gt;
I am calling this the Cajita approach because it is analagous to how objects are defined in the Cajita subset of Javascript (part of the &lt;a href="http://code.google.com/p/google-caja/"&gt;Caja&lt;/a&gt; project), where this code would be written as:

&lt;pre&gt;
function Counter() {
    var i = 0;
    return Obj.freeze({
        next: function () {
            i += 1;
            return i;
        },
    });
}
&lt;/pre&gt;

The Python version of the Cajita style is a lot more awkward because Python is syntactically stricter than Javascript:
&lt;ul&gt;
&lt;li&gt;
Expressions cannot contain statements, so function definitions cannot be embedded in an object creation expression.  As a result, method names have to be specified twice: once in the method definitions and again when passed to ImmutableNamespace.  (It would be three times if ImmutableNamespace did not use __name__.)

&lt;li&gt;
A function can only assign to a variable in an outer scope by using the nonlocal declaration (recently introduced), and that is awkward.  The example works around this by doing "self = Namespace()" and assigning to the attribute self.i.
&lt;/ul&gt;

The Cajita style also has a memory usage cost.  The size of each object will be O(n) in the number of methods, because each method is created as a separate function closure with its own pointer to the object's state.  This is not trivial to optimise away because doing so would change the object's garbage collection behaviour.

&lt;p&gt;
For these reasons it would be better not to use the Cajita style in Python.

&lt;p&gt;
The alternative is the approach taken by the &lt;a href="http://docs.python.org/library/bastion.html"&gt;Bastion module&lt;/a&gt;, which the Python standard library used to provide for use with Python's restricted mode.  (Restricted mode and Bastion.py are actually still present in Python 2.6, but restricted mode has holes and is unsupported, and Bastion.py is disabled.)

&lt;p&gt;
Bastion provided wrapper objects that exposed only the public attributes of the object being wrapped (where non-public attribute names are those that begin with an underscore).  This approach means we can define objects using class definitions as usual, and wrap the resulting objects.  To make a class encapsulated and uninheritable, we just have to add a decorator:

&lt;pre&gt;
@sealed
class Counter(object):

    def __init__(self):
        self._i = 0

    def next(self):
        self._i += 1
        return self._i
&lt;/pre&gt;

where "sealed" can be defined as follows:

&lt;pre&gt;
# Minimal version of Bastion.BastionClass.
# Converts a function closure to an object.
class BastionClass:

    def __init__(self, get):
        self._get = get

    def __getattr__(self, attr):
        return self._get(attr)

# Minimal version of Bastion.Bastion.
def Bastion(object):
    # Create a function closure wrapping the object.
    def get(attr):
        if type(attr) is str and not attr.startswith("_"):
            return getattr(object, attr)
        raise AttributeError(attr)
    return BastionClass(get)

def sealed(klass):
    def constructor(*args, **kwargs):
        return Bastion(klass(*args, **kwargs))
    return constructor
&lt;/pre&gt;

&lt;pre&gt;
&gt;&gt;&gt; c = Counter()
&gt;&gt;&gt; c.next()
1
&gt;&gt;&gt; c.next()
2
&gt;&gt;&gt; c._i
AttributeError: _i
&lt;/pre&gt;


&lt;h2&gt;Mutability problem&lt;/h2&gt;

One problem with the Bastion approach is that Bastion's wrapper objects are mutable.  Although you can't use the Bastion object to change the wrapped object's attributes, you can change the Bastion object's attributes.  This means that these wrapper objects cannot be safely shared between mutually distrusting components.  Multiple holders of the object could use it as a communications channel or violate each other's expectations.

&lt;p&gt;
There isn't an obvious way to fix this in pure Python.  Overriding __setattr__ isn't enough.  I expect the simplest way to deal with this is to implement Bastion in a C extension module instead of in Python.  The same goes for ImmutableNamespace (referred to in the first example) - the Cajita-style approach faces the same issue.


&lt;h2&gt;Wrapping hazard&lt;/h2&gt;

There is a potential hazard in defining encapsulated objects via wrappers that is not present in &lt;a href="http://lackingrhoticity.blogspot.com/2008/08/introducing-cappython.html"&gt;CapPython&lt;/a&gt;.  In methods, "self" will be bound to the private view of the object.  There is a risk of accidentally writing code that passes the unwrapped self object to untrusted code.

&lt;p&gt;
An example:

&lt;pre&gt;
class LoggerMixin(object):

    def sub_log(self, prefix):
        return PrefixLog(self, prefix)

@sealed
class PrefixLog(LoggerMixin):

    def __init__(self, log, prefix):
        self._log = log
        self._prefix = log

    def log(self, message):
        self._log.log("%s: %s" % (prefix, message))
&lt;/pre&gt;

(This isn't a great example because of the circular relationship between the two classes.)

&lt;p&gt;
This hazard is something we could lint for.  It would be easy to warn about cases where self is used outside of an attribute access.

&lt;p&gt;
The method could be changed to:

&lt;pre&gt;
    def sub_log(self, prefix):
        return PrefixLog(seal_object(self), prefix)
&lt;/pre&gt;

This creates a new wrapper object each time which is not always desirable.  To avoid this we could store the wrapper object in an attribute of the wrapped object.  Note that that would create a reference cycle, and while CPython's cycle collector is usually fine for collecting cycles, creating a cycle for every object might not be a good idea.  An alternative would be to memoize seal_object() using a weak dictionary.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-822244683657603633?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/822244683657603633/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=822244683657603633' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/822244683657603633'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/822244683657603633'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/06/encapsulation-in-python-two-approaches.html' title='Encapsulation in Python: two approaches'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-1817291894934344363</id><published>2009-06-14T15:37:00.004+01:00</published><updated>2009-12-01T20:05:27.798Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='native-client'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Python standard library in Native Client</title><content type='html'>The Python standard library now works under &lt;a href="http://plash.beasts.org/wiki/NativeClient"&gt;Native Client&lt;/a&gt; in the web browser, including the &lt;a href="http://www.sqlite.org/"&gt;Sqlite&lt;/a&gt; &lt;a href="http://docs.python.org/library/sqlite3.html"&gt;extension module&lt;/a&gt;.

&lt;p&gt;
By that I mean that it is possible to import modules from the standard library, but a lot of system calls won't be available.  Sqlite works for in-memory use.

&lt;p&gt;
Here's a screenshot of the browser-based Python REPL:

&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_IE-Ip3EU3_Y/SjUOWbwQ1wI/AAAAAAAAABM/kRSA3RIS4OA/s1600-h/nacl-python-repl-3.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_IE-Ip3EU3_Y/SjUOWbwQ1wI/AAAAAAAAABM/kRSA3RIS4OA/s320/nacl-python-repl-3.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5347195911142430466" /&gt;&lt;/a&gt;

&lt;p&gt;
The changes needed to make this work were quite minor:
&lt;ul&gt;
&lt;li&gt; Implementing stat() in glibc so that it does an RPC to the Javascript running on the web page.  Python uses stat() when searching for modules in sys.path.
&lt;li&gt; Changing fstat() in NaCl so that it doesn't return a fixed inode number.  Python uses st_ino to determine whether an extension module is the same as a previously-loaded module, which doesn't work if st_ino is the same for different files.
&lt;li&gt; Some tweaks to nacl-glibc-gcc, a wrapper script for nacl-gcc.
&lt;li&gt; Ensuring that NaCl-glibc's libpthread.so works.
&lt;/ul&gt;

I didn't have to modify Python or Sqlite at all.  I didn't even have to work out what arguments to pass to Sqlite's configure script - the &lt;a href="http://packages.ubuntu.com/source/jaunty/sqlite3"&gt;Debian package for Sqlite&lt;/a&gt; builds without modification.  It's just a matter of setting PATH to override gcc to run nacl-glibc-gcc.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-1817291894934344363?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/1817291894934344363/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=1817291894934344363' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/1817291894934344363'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/1817291894934344363'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/06/python-standard-library-in-native.html' title='Python standard library in Native Client'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_IE-Ip3EU3_Y/SjUOWbwQ1wI/AAAAAAAAABM/kRSA3RIS4OA/s72-c/nacl-python-repl-3.png' height='72' width='72'/><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-35601132814859662</id><published>2009-05-28T19:47:00.002+01:00</published><updated>2009-05-28T19:54:46.558+01:00</updated><title type='text'>A really simple tracing debugger: part 2</title><content type='html'>In my &lt;a href="http://lackingrhoticity.blogspot.com/2009/05/really-simple-tracing-debugger.html"&gt;last post&lt;/a&gt; I showed how to get a readable execution trace of a process using a simple ptrace()-based tracer (itrace) and binutils' &lt;tt&gt;addr2line&lt;/tt&gt;.  By doing a bit more work we can extract the function call tree from the trace.  In order to distinguish callers and callees, we can use the stack pointer (%esp), which itrace outputs.

&lt;p&gt;
Here's the call tree for a "hello world" program using glibc 2.9 on i386, statically linked:

&lt;pre&gt;
$ ./itrace ./hellow-static | python treeify.py ./hellow-static
_start ??:0 8048130
  __libc_start_main glibc/csu/libc-start.c:93 8048220
    _dl_aux_init glibc/elf/dl-support.c:165 80518e0
    _dl_discover_osversion glibc/elf/../sysdeps/unix/sysv/linux/dl-sysdep.c:67 80521e0
      __uname ??:0 806e160
    __pthread_initialize_minimal glibc/csu/libc-tls.c:247 8048750
      __libc_setup_tls glibc/csu/libc-tls.c:111 8048550
        __sbrk glibc/misc/sbrk.c:34 80505d0
          __brk glibc/misc/../sysdeps/unix/sysv/linux/i386/brk.c:36 806f9c0
          __brk glibc/misc/../sysdeps/unix/sysv/linux/i386/brk.c:36 806f9c0
        memcpy glibc/string/memcpy.c:33 804fc10
      init_slotinfo glibc/csu/libc-tls.c:83 80486cb
        init_static_tls glibc/csu/libc-tls.c:100 8048708
  _dl_setup_stack_chk_guard glibc/csu/../sysdeps/unix/sysv/linux/dl-osinfo.h:76 8048283
    __libc_init_first glibc/csu/../sysdeps/unix/sysv/linux/init-first.c:44 80522f0
      __setfpucw glibc/math/../sysdeps/i386/setfpucw.c:29 8056270
      __libc_init_secure glibc/elf/enbl-secure.c:34 8052190
      _dl_non_dynamic_init glibc/elf/dl-support.c:235 8051b50
        getenv glibc/stdlib/getenv.c:36 8056ea0
          strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
        getenv glibc/stdlib/getenv.c:36 8056ea0
          strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
        _dl_init_paths glibc/elf/dl-load.c:630 8072ae0
          _dl_important_hwcaps glibc/elf/dl-support.c:306 8051b20
          __libc_malloc glibc/malloc/malloc.c:3540 804eb60
            malloc_hook_ini glibc/malloc/hooks.c:35 804f360
              ptmalloc_init glibc/malloc/arena.c:426 804b370
                ptmalloc_init_minimal glibc/malloc/arena.c:374 804b39f
                  __getpagesize glibc/misc/../sysdeps/unix/sysv/linux/getpagesize.c:29 8050650
                  __linkin_atfork glibc/nptl/../nptl/sysdeps/unix/sysv/linux/register-atfork.c:115 80513a0
                next_env_entry glibc/malloc/arena.c:344 804b45d
            __libc_malloc glibc/malloc/malloc.c:3540 804eb60
              _int_malloc glibc/malloc/malloc.c:4130 804cae0
                malloc_consolidate glibc/malloc/malloc.c:4835 804afd0
                malloc_init_state glibc/malloc/malloc.c:2412 804a8e0
              sYSMALLOc glibc/malloc/malloc.c:2929 804d200
                __default_morecore glibc/malloc/morecore.c:48 804f9c0
                  __sbrk glibc/misc/sbrk.c:34 80505d0
                    __brk glibc/misc/../sysdeps/unix/sysv/linux/i386/brk.c:36 806f9c0
                __default_morecore glibc/malloc/morecore.c:48 804f9c0
                  __sbrk glibc/misc/sbrk.c:34 80505d0
                    __brk glibc/misc/../sysdeps/unix/sysv/linux/i386/brk.c:36 806f9c0
          __libc_malloc glibc/malloc/malloc.c:3540 804eb60
            _int_malloc glibc/malloc/malloc.c:4130 804cae0
        getenv glibc/stdlib/getenv.c:36 8056ea0
          strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
        getenv glibc/stdlib/getenv.c:36 8056ea0
          strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
        getenv glibc/stdlib/getenv.c:36 8056ea0
          strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
        getenv glibc/stdlib/getenv.c:36 8056ea0
          strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
      dl_platform_init glibc/elf/../sysdeps/i386/dl-machine.h:273 8051c38
        getenv glibc/stdlib/getenv.c:36 8056ea0
          strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
    __init_misc glibc/misc/init-misc.c:31 8051120
      rindex glibc/string/../sysdeps/i386/strrchr.S:37 8067a30
    __cxa_atexit glibc/stdlib/cxa_atexit.c:34 8048ab0
      __new_exitfn glibc/stdlib/cxa_atexit.c:63 8048960
    __libc_csu_init glibc/csu/elf-init.c:65 80487b0
      _init /build/buildd/glibc-2.7/build-tree/amd64-i386/csu/crti.S:15 80480f4
        _init ??:0 8048116
          frame_dummy crtstuff.c:0 80481b0
            __register_frame_info ??:0 80a2b50
              __register_frame_info_bases ??:0 80a2ab0
                __i686.get_pc_thunk.bx ??:0 80a29f9
          __do_global_ctors_aux crtstuff.c:0 80a4b10
        _init /build/buildd/glibc-2.7/build-tree/amd64-i386/csu/crtn.S:10 8048120
    _setjmp glibc/setjmp/../sysdeps/i386/bsd-_setjmp.S:36 8048830
    main ??:0 80481f0
      _IO_puts glibc/libio/ioputs.c:34 8048b10
        strlen glibc/string/../sysdeps/i386/i486/strlen.S:34 804f9f0
        _IO_new_file_xsputn glibc/libio/fileops.c:1288 8065870
          _IO_new_file_overflow glibc/libio/fileops.c:829 80661a0
            _IO_doallocbuf glibc/libio/genops.c:419 8049780
              _IO_file_doallocate glibc/libio/filedoalloc.c:88 808bd50
                _IO_file_stat glibc/libio/fileops.c:1225 8065b90
                  ___fxstat64 glibc/io/../sysdeps/unix/sysv/linux/fxstat64.c:46 804fe20
                mmap glibc/misc/../sysdeps/unix/sysv/linux/i386/mmap.S:81 8051080
                _IO_setb glibc/libio/genops.c:404 8049670
            _IO_new_do_write glibc/libio/fileops.c:493 8065a50
          _IO_default_xsputn glibc/libio/genops.c:452 8049850
      _IO_acquire_lock_fct glibc/libio/libioP.h:968 8048b8d
    exit glibc/stdlib/exit.c:34 8048870
      __libc_csu_fini glibc/csu/elf-init.c:91 8048770
      _fini /build/buildd/glibc-2.7/build-tree/amd64-i386/csu/crti.S:41 80a5574
        _fini ??:0 80a5587
          __do_global_dtors_aux crtstuff.c:0 8048160
            fini glibc/dlfcn/dlerror.c:207 80975e0
              check_free glibc/dlfcn/dlerror.c:188 8097560
            __deregister_frame_info ??:0 80a3c30
            __deregister_frame_info_bases ??:0 80a3b20
              __i686.get_pc_thunk.bx ??:0 80a29f9
        _fini /build/buildd/glibc-2.7/build-tree/amd64-i386/csu/crtn.S:21 80a558c
      _IO_cleanup glibc/libio/genops.c:1007 8049fd0
        _IO_flush_all_lockp glibc/libio/genops.c:823 8049db0
          _IO_new_file_overflow glibc/libio/fileops.c:829 80661a0
            _IO_new_do_write glibc/libio/fileops.c:493 8065a50
              new_do_write glibc/libio/fileops.c:505 8065760
                _IO_new_file_write glibc/libio/fileops.c:1261 8065a80
                  ?? ??:0 8
                __libc_write ??:0 804ffa0
                  __write_nocancel ??:0 804ffaa
      _IO_unbuffer_write glibc/libio/genops.c:951 8049fe8
        _IO_new_file_setbuf glibc/libio/fileops.c:445 8066380
          _IO_default_setbuf glibc/libio/genops.c:562 80496d0
            _IO_new_file_sync glibc/libio/fileops.c:891 80660d0
            _IO_setb glibc/libio/genops.c:404 8049670
      _exit glibc/posix/../sysdeps/unix/sysv/linux/i386/_exit.S:25 804fdc0
&lt;/pre&gt;


Believe it or not, this trace is quite illuminating.  Working out that these calls will happen by reading the glibc source could take a long time!

&lt;p&gt;
The Python script below (treeify.py) does two things:

&lt;ul&gt;
&lt;li&gt;
Firstly, it filters the trace to include only function entry points.  We can just about infer function entry points from &lt;tt&gt;addr2line&lt;/tt&gt;'s output.  We assume that when a function name (plus filename and line number) appears for the first time in the trace, the current address is the function's entry point.  This doesn't work fully when inline functions are instantiated multiple times though.  We could use &lt;tt&gt;nm&lt;/tt&gt; to find symbol addresses, but &lt;tt&gt;addr2line&lt;/tt&gt; gives us source filenames.

&lt;li&gt;
Secondly, it works out the nesting of the call tree by looking at the stack pointer.
&lt;/ul&gt;

&lt;pre&gt;
import subprocess
import sys

def read():
    for line in sys.stdin:
        try:
            regs = [int(x, 16) for x in line.split(" ")]
            yield {"eip": regs[0], "esp": regs[1]}
        # Ignore lines interspersed with other output!
        except (ValueError, IndexError):
            pass

def addr2line(iterable):
    proc = subprocess.Popen(["addr2line", "-e", sys.argv[1], "-f"],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    for regs in iterable:
        proc.stdin.write("%x\n" % regs["eip"])
        a = proc.stdout.readline().rstrip("\n")
        b = proc.stdout.readline().rstrip("\n")
        regs["func"] = "%s %s" % (a, b)
        yield regs

def entry_points(iterable):
    funcs = {}
    # We treat the first address we see for the function as its entry
    # point, and only report those entries from this point on.
    for regs in iterable:
        func = regs["func"].split(":")[0]
        if funcs.setdefault(func, regs["eip"]) == regs["eip"]:
            yield regs

def add_nesting(iterable):
    stack = [2 ** 64]
    for regs in iterable:
        stack_pos = regs["esp"]
        if stack_pos &amp;lt; stack[-1]:
            stack.append(stack_pos)
        while stack_pos &gt; stack[-1]:
            stack.pop()
        regs["indent"] = "  " * len(stack)
        yield regs

for x in add_nesting(entry_points(addr2line(read()))):
    print x["indent"], x["func"], "%x" % x["eip"]
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-35601132814859662?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/35601132814859662/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=35601132814859662' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/35601132814859662'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/35601132814859662'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/05/really-simple-tracing-debugger-part-2.html' title='A really simple tracing debugger: part 2'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-3884754451942536283</id><published>2009-05-16T16:07:00.002+01:00</published><updated>2009-05-16T17:16:38.405+01:00</updated><title type='text'>A really simple tracing debugger</title><content type='html'>gdb can be really useful for debugging when it prints the information you want, but sometimes it will fail to give a useful backtrace, and it can be hard to find out what happened just before your process crashed.

&lt;p&gt;
I faced that problem when &lt;a href="http://plash.beasts.org/wiki/NativeClient"&gt;porting glibc to Native Client&lt;/a&gt;.  gdb (or &lt;tt&gt;strace -i&lt;/tt&gt;) could tell me the value of the instruction pointer when the process crashed, but it couldn't produce a backtrace because it didn't know how to read memory with NaCl's x86 segmentation setup.  (I should point out that gdb has since &lt;a href="http://nativeclient.googlecode.com/svn/data/docs_tarball/nacl/googleclient/native_client/documentation/nacl-gdb.html"&gt;been ported&lt;/a&gt;.)  Often I could look up the address with &lt;tt&gt;addr2line&lt;/tt&gt; or &lt;tt&gt;objdump -d&lt;/tt&gt; to find the cause of the problem.  However, that didn't help when the process died by jumping to address 0, or when it jumped to &lt;tt&gt;_start&lt;/tt&gt; when running &lt;tt&gt;atexit&lt;/tt&gt; handlers.

&lt;p&gt;
I found another way to do debugging, using a trace of the program's execution.

&lt;p&gt;
So, here's how to create a debugger with a single shell pipeline (plus a small C program) that works on statically-linked executables.

&lt;p&gt;
The first part is a tool called itrace, which starts a subprocess and single-steps its execution using Linux's &lt;a href="http://linux.die.net/man/2/ptrace"&gt;ptrace()&lt;/a&gt; system call and prints the process's eip and esp registers (instruction pointer and stack pointer) at every step.  (The code is below.)

&lt;p&gt;
Here's what we get if we run itrace on a simple statically-linked executable:

&lt;pre&gt;
$ ./itrace ./hellow-static | less
8048130 ffce4a50
8048132 ffce4a50
8048133 ffce4a54
8048135 ffce4a54
8048138 ffce4a50
...
&lt;/pre&gt;

To make this intelligible, all we have to do is pipe the output of itrace through &lt;a href="http://sourceware.org/binutils/docs/binutils/addr2line.html"&gt;addr2line&lt;/a&gt;.  This tool is part of &lt;a href="http://www.gnu.org/software/binutils/"&gt;binutils&lt;/a&gt; and translates addresses to source code filenames and line numbers using the debugging information in the executable.  Also, pipe through &lt;tt&gt;uniq&lt;/tt&gt; to remove duplicate lines:

&lt;pre&gt;
$ ./itrace ./hellow-static | addr2line -e ./hellow-static | uniq | less
??:0
/work/glibc/csu/libc-start.c:93
/work/glibc/csu/libc-start.c:103
/work/glibc/csu/libc-start.c:93
/work/glibc/csu/libc-start.c:103
/work/glibc/csu/libc-start.c:93
/work/glibc/csu/libc-start.c:103
/work/glibc/csu/libc-start.c:106
/work/glibc/csu/libc-start.c:103
/work/glibc/csu/libc-start.c:106
...
/work/glibc/stdlib/exit.c:88
/work/glibc/stdlib/exit.c:90
/work/glibc/posix/../sysdeps/unix/sysv/linux/i386/_exit.S:25
/work/glibc/posix/../sysdeps/unix/sysv/linux/i386/_exit.S:29
/work/glibc/posix/../sysdeps/unix/sysv/linux/i386/_exit.S:30
&lt;/pre&gt;

addr2line prints "??:0" for addresses that it can't find in debugging information.  If your executable hasn't been built with debugging information - the static libc.a in Debian/Ubuntu isn't built with debugging enabled - you can pass the &lt;tt&gt;-f&lt;/tt&gt; option to addr2line to print function names as well, and then filter out the useless "??:0" lines:

&lt;pre&gt;
$ cat hellow.c &amp;lt;&amp;lt;END
#include &amp;lt;stdio.h&gt;
int main() { printf("Hello, world!\n"); return 0; }
END
$ gcc -static hellow.c -o hellow-static
$ ./itrace ./hellow-static | addr2line -e ./hellow-static -f | grep -v "??:0" | uniq | less
_start
__libc_start_main
_dl_aux_init
__libc_start_main
_dl_discover_osversion
__uname
_dl_discover_osversion
__libc_start_main
__pthread_initialize_minimal
__libc_setup_tls
...
_IO_default_setbuf
_IO_new_file_setbuf
_IO_unbuffer_write
_IO_cleanup
exit
_exit
&lt;/pre&gt;

This tells us what happened before the program exited.  In this case, it was executing exit() and flushing stdio buffers.

&lt;p&gt;
Tracing debugging can often be more helpful than postmortem debugging.  Debugging with printf is still popular, after all!  However, single-stepping using ptrace() has quite a high overhead so this is not really suitable for longer-running processes.  Potentially Valgrind could give us an execution trace more efficiently using its dynamic translation approach, but that would still be quite slow.  Modern x86 CPUs have a "Branch Trace Store" feature for recording address of basic blocks as they are executed which has the potential to be a faster method, and there is a proposal to &lt;a href="http://lwn.net/Articles/280420/"&gt;make this available in Linux&lt;/a&gt; through the ptrace() interface.


&lt;h2&gt;itrace.c&lt;/h2&gt;

This assumes i386, so if you're on a x86-64 system you can build it with:
&lt;pre&gt;
gcc -m32 itrace.c -o itrace
&lt;/pre&gt;

&lt;pre&gt;
#include &amp;lt;stdio.h&gt;
#include &amp;lt;sys/wait.h&gt;
#include &amp;lt;unistd.h&gt;

#include &amp;lt;linux/user.h&gt;
#include &amp;lt;sys/ptrace.h&gt;

int main(int argc, char **argv)
{
  int pid = fork();
  if(pid == 0) {
    if(ptrace(PTRACE_TRACEME) &amp;lt; 0) {
      perror("ptrace");
      _exit(1);
    }
    execvp(argv[1], argv + 1);
    perror("exec");
    _exit(1);
  }
  while(1) {
    int status;
    struct user_regs_struct regs;
    if(waitpid(pid, &amp;status, 0) &amp;lt; 0)
      perror("waitpid");
    if(!WIFSTOPPED(status))
      break;
    if(ptrace(PTRACE_GETREGS, pid, 0, &amp;regs) &amp;lt; 0)
      perror("ptrace/GETREGS");
    printf("%lx %lx\n", regs.eip, regs.esp);
    if(ptrace(PTRACE_SINGLESTEP, pid, 0, 0) &amp;lt; 0)
      perror("ptrace/SINGLESTEP");
  }
  return 0;
}
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-3884754451942536283?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/3884754451942536283/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=3884754451942536283' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3884754451942536283'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/3884754451942536283'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/05/really-simple-tracing-debugger.html' title='A really simple tracing debugger'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-5643427562802407075</id><published>2009-05-04T15:02:00.003+01:00</published><updated>2009-12-01T20:05:27.799Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='native-client'/><title type='text'>Progress on Native Client</title><content type='html'>&lt;a href="http://lackingrhoticity.blogspot.com/2009/01/what-does-nacl-mean-for-plash.html"&gt;Back in January&lt;/a&gt; I wrote that I was porting &lt;a href="http://www.gnu.org/software/libc/"&gt;glibc&lt;/a&gt; to &lt;a href="http://code.google.com/p/nativeclient/"&gt;Native Client&lt;/a&gt;.  I have made some good progress since then.

&lt;p&gt;
The port is at the stage where it can run simple statically-linked and dynamically-linked executables, from the command line and from the web browser.

&lt;p&gt;
In particular, Python works.  I have put together a simple interactive top-level that demonstrates running Python from the web browser:

&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_IE-Ip3EU3_Y/Sf72wRliYzI/AAAAAAAAAAM/ZuzZ1AjZX2o/s1600-h/nacl-python-repl-1.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400; height: 300;" src="http://4.bp.blogspot.com/_IE-Ip3EU3_Y/Sf72wRliYzI/AAAAAAAAAAM/ZuzZ1AjZX2o/s320/nacl-python-repl-1.png" alt="" id="BLOGGER_PHOTO_ID_5331970318068245298" border="0" /&gt;&lt;/a&gt;

&lt;p&gt;
Upstream NaCl doesn't support any &lt;a href="http://code.google.com/p/nativeclient/wiki/Porting"&gt;filename-based calls&lt;/a&gt; such as open(), but we do.  In this setup, open() does not of course access the real filesystem on the host system.  open() in NaCl-glibc sends a message to the Javascript code on the web page.  The Javascript code can fetch the requested file, create a file descriptor for the file using the plugin's API, and send the file descriptor to the NaCl subprocess in reply.

&lt;p&gt;
This has not involved modifying Python at all, although I have added an extension module to wrap a couple of NaCl-specific system calls (imc_sendmsg() and imc_recvmsg()).  The Python build runs to completion, including the parts where it runs the executable it builds.  A simple &lt;a href="http://plash.beasts.org/wiki/NativeClient/Ncrewrite"&gt;instruction-rewriting trick&lt;/a&gt; means that dynamically-linked NaCl executables can run outside of NaCl, directly under Linux, linked against the regular Linux glibc.  This means we can avoid some of the problems associated with cross-compilation when building for NaCl.

&lt;p&gt;
This work has involved extending Native Client:

&lt;ul&gt;
&lt;li&gt;Adding support for dynamic loading of code.  Initially I have focused on just making it work.  Now I'll focus on ensuring this is secure.
&lt;li&gt;Writing a &lt;a href="http://plash.beasts.org/wiki/NativeClient/Plugin"&gt;custom browser plugin&lt;/a&gt; for NaCl which allows Javascript code to handle asynchronous callbacks from NaCl subprocesses.
&lt;li&gt;Making &lt;a href="http://plash.beasts.org/wiki/NativeClient/Changes"&gt;various changes&lt;/a&gt; to the NaCl versions of gcc and binutils to support dynamically linked code.
&lt;/ul&gt;

Hopefully these changes can get merged upstream eventually.  Some of the toolchain changes have already gone in.

&lt;p&gt;
See the &lt;a href="http://plash.beasts.org/wiki/NativeClient"&gt;web page&lt;/a&gt; for instructions on how to get the code and try it out.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-5643427562802407075?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/5643427562802407075/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=5643427562802407075' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5643427562802407075'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5643427562802407075'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/05/progress-on-native-client.html' title='Progress on Native Client'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_IE-Ip3EU3_Y/Sf72wRliYzI/AAAAAAAAAAM/ZuzZ1AjZX2o/s72-c/nacl-python-repl-1.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-8024453153605922481</id><published>2009-04-18T13:33:00.003+01:00</published><updated>2009-12-01T20:03:22.389Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Python variable binding semantics, part 2</title><content type='html'>Time for me to follow up my &lt;a href="http://lackingrhoticity.blogspot.com/2009/03/python-variable-binding-semantics.html"&gt;previous blog post&lt;/a&gt; and explain the code snippets.

&lt;h2&gt;Snippet 1&lt;/h2&gt;

&lt;pre&gt;
funcs = []
for x in (1, 2, 3):
    funcs.append(lambda: x)
for f in funcs:
    print f()
&lt;/pre&gt;

Although you might expect this Python code snippet to print "1 2 3", it actually prints "3 3 3".  The reason for this is that the same mutable variable binding for &lt;tt&gt;x&lt;/tt&gt; is used across all iterations of the loop.  The &lt;tt&gt;for&lt;/tt&gt; loop does not create new variable bindings.

&lt;p&gt;
The function closures that are created inside the loop do not capture the value of &lt;tt&gt;x&lt;/tt&gt;; they capture the cell object that &lt;tt&gt;x&lt;/tt&gt; is bound to under the hood (or the globals dictionary if this code snippet is run at the top level), and each iteration mutates the same cell object.


&lt;p&gt;
Interestingly, Python does not restrict the target of the &lt;tt&gt;for&lt;/tt&gt; loop to be a variable or tuple of variables.  It can be any lvalue expression that can appear on the left hand side of an assignment, including indexing expressions.  For example:

&lt;pre&gt;
x = [10, 20]
for x[0] in (1,2,3):
    pass
print x # Prints [3, 20]
&lt;/pre&gt;


&lt;h2&gt;Snippet 2&lt;/h2&gt;

This problem is Python's variable binding semantics is not specific to lambdas and also occurs if you use &lt;tt&gt;def&lt;/tt&gt;.

&lt;pre&gt;
funcs = []
for x in (1, 2, 3):
    def myfunc():
        return x
    funcs.append(myfunc)
for f in funcs:
    print f()
&lt;/pre&gt;

This also prints "3 3 3".


&lt;h2&gt;Snippet 3&lt;/h2&gt;

The remaining snippets are examples of ways to work around this problem.  They print the desired "1 2 3".

&lt;p&gt;
One way is to use default arguments:

&lt;pre&gt;
funcs = []
for x in (1, 2, 3):
    funcs.append(lambda x=x: x)
for f in funcs:
    print f()
&lt;/pre&gt;

This is actually the trick that was used to get the effect of lexical scoping before lexical scoping was added to Python.

&lt;p&gt;
The default argument captures the value of &lt;tt&gt;x&lt;/tt&gt; at the point where the function closure is created, and this value gets rebound to &lt;tt&gt;x&lt;/tt&gt; when the function is called.

&lt;p&gt;
Although it is concise, I'm not too keen on this workaround.  It is an abuse of default arguments.  There is a risk that you accidentally pass too many arguments to the function and thereby override the captured value of &lt;tt&gt;x&lt;/tt&gt;.


&lt;h2&gt;Snippet 4&lt;/h2&gt;

&lt;pre&gt;
funcs = []
for x in (1, 2, 3):
    def g(x):
        funcs.append(lambda: x)
    g(x)
for f in funcs:
    print f()
&lt;/pre&gt;

This is quite an ugly workaround, but it's perhaps the most general one.  The loop body is wrapped in a function.  The &lt;tt&gt;x&lt;/tt&gt; inside function &lt;tt&gt;g&lt;/tt&gt; and the &lt;tt&gt;x&lt;/tt&gt; outside are statically different variable bindings.  A new mutable binding of &lt;tt&gt;x&lt;/tt&gt; is created on every call to &lt;tt&gt;g&lt;/tt&gt;, but this binding is never modified.


&lt;h2&gt;Snippets 5 and 6&lt;/h2&gt;

The remaining two snippets are attempts to make the code less ugly, by giving names to the intermediate functions that are introduced to rebind &lt;tt&gt;x&lt;/tt&gt; and thereby give the desired behaviour.

&lt;pre&gt;
funcs = []
def append_a_func(x):
    funcs.append(lambda: x)
for x in (1, 2, 3):
    append_a_func(x)
for f in funcs:
    print f()

def make_func(x):
    return lambda: x
funcs = []
for x in (1, 2, 3):
    funcs.append(make_func(x))
for f in funcs:
    print f()
&lt;/pre&gt;

I suppose there is a downside to de-uglifying the code.  If the code looks too normal, there's a risk that someone will refactor it to the incorrect version without testing that it works properly when looping over lists containing multiple items.


&lt;h2&gt;Can the language be fixed?&lt;/h2&gt;

The root cause is that Python does away with explicit variable binding declarations.  If a variable is assigned within a function -- including being assigned as the target of a &lt;tt&gt;for&lt;/tt&gt; loop -- it becomes local to the function, but variables are never local to smaller statement blocks such as a loop body.  Python doesn't have block scoping.

There are a couple of naive ways in which you might try to fix the language so that the code does what you would expect, both of which have problems.

We could change the semantics of function closure creation so that closures take a snapshot of variables' values.  But this would break mutually recursive function definitions, and cases where functions are referred to before they are defined, such as:

&lt;pre&gt;
def f(): # This "def" would raise an UnboundLocalError or a NameError
    g()
def g():
    pass
&lt;/pre&gt;

We could change the semantics of &lt;tt&gt;for&lt;/tt&gt; loops so that the target variable is given a variable binding that is local to the loop body.  But this wouldn't change the status of variable bindings created by assignment, so this code would still have the unwanted behaviour:

&lt;pre&gt;
for x in (1, 2, 3):
    y = x
    funcs.append(lambda: y)
&lt;/pre&gt;


&lt;h2&gt;Linting&lt;/h2&gt;

It should be possible to create a lint tool to detect this hazard.  It could look for closures that capture variables that are assigned by or in loops.  I wonder how many false positives it would give on a typical codebase.


&lt;h2&gt;Javascript&lt;/h2&gt;

Javascript suffers from the same problem:

&lt;pre&gt;
funcs = [];
for(var x = 0; x &amp;lt; 3; x++) {
    funcs.push(function() { return x; })
}
for(var i = 0; i &amp;lt; funcs.length; i++) {
    print(funcs[i]());
}
&lt;/pre&gt;

This code prints "3 3 3" when run using rhino.

&lt;p&gt;
In Javascript's case the affliction is entirely gratuitous.  It happens because &lt;tt&gt;var&lt;/tt&gt; declarations are implicitly hoisted up to the top of the function, for no apparent reason.  It is gratuitous because this is not the result of a trade-off.

&lt;p&gt;
The good news, though, is that the problem is recognised and a fix to Javascript is planned.  I think the plan is for Ecmascript 3.1 to introduce a &lt;tt&gt;let&lt;/tt&gt; declaration as an alternative to &lt;tt&gt;var&lt;/tt&gt; without the hoisting behaviour.


&lt;h2&gt;List comprehensions&lt;/h2&gt;

Back to Python.  The same problem also applies to list comprehensions and generator expressions.

&lt;pre&gt;
funcs = [(lambda: x) for x in (1, 2, 3)]
for f in funcs:
    print f()

funcs = list((lambda: x) for x in (1, 2, 3))
for f in funcs:
    print f()
&lt;/pre&gt;

&lt;p&gt;
These both print "3 3 3".

&lt;p&gt;
(Note that I added brackets around the lambdas for clarity but the syntax does not require them.)

&lt;p&gt;
This is forgivable for list comprehensions, at least in Python 2.x, because the assignment to &lt;tt&gt;x&lt;/tt&gt; escapes into the surrounding function.  (See my &lt;a href="http://lackingrhoticity.blogspot.com/2008/08/4-python-variable-binding-oddities.html"&gt;earlier post&lt;/a&gt;.)

&lt;p&gt;
But for generators (and for list comprehensions in Python 3.0), the scope of &lt;tt&gt;x&lt;/tt&gt; is limited to the comprehension.  Semantically, it would be easy to limit the scope of &lt;tt&gt;x&lt;/tt&gt; to within a loop iteration, so that each iteration introduces a new variable binding.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-8024453153605922481?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/8024453153605922481/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=8024453153605922481' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8024453153605922481'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8024453153605922481'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/04/python-variable-binding-semantics-part.html' title='Python variable binding semantics, part 2'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-8312916391893328938</id><published>2009-03-31T13:29:00.003+01:00</published><updated>2009-12-01T20:03:22.389Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Python variable binding semantics</title><content type='html'>What do the six following chunks of code do, and what would you like them to do?  Which do you prefer?

&lt;pre&gt;
funcs = []
for x in (1, 2, 3):
    funcs.append(lambda: x)
for f in funcs:
    print f()

funcs = []
for x in (1, 2, 3):
    def myfunc():
        return x
    funcs.append(myfunc)
for f in funcs:
    print f()

funcs = []
for x in (1, 2, 3):
    funcs.append(lambda x=x: x)
for f in funcs:
    print f()

funcs = []
for x in (1, 2, 3):
    def g(x):
        funcs.append(lambda: x)
    g(x)
for f in funcs:
    print f()

funcs = []
def append_a_func(x):
    funcs.append(lambda: x)
for x in (1, 2, 3):
    append_a_func(x)
for f in funcs:
    print f()

def make_func(x):
    return lambda: x
funcs = []
for x in (1, 2, 3):
    funcs.append(make_func(x))
for f in funcs:
    print f()
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-8312916391893328938?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/8312916391893328938/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=8312916391893328938' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8312916391893328938'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8312916391893328938'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/03/python-variable-binding-semantics.html' title='Python variable binding semantics'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-5294336070766206032</id><published>2009-02-22T23:07:00.004Z</published><updated>2009-02-22T23:13:03.372Z</updated><title type='text'>OpenStreetMap</title><content type='html'>I tried out &lt;a href="http://openstreetmap.org"&gt;OpenStreetMap&lt;/a&gt; the other day after reading the &lt;a href="http://lwn.net/Articles/318801/"&gt;recent article about it on LWN&lt;/a&gt;.  Amazingly it looks pretty complete for the parts of the UK and US that I looked at.  I don't know how much comes from people gathering data by travelling around or by entering data from satellite photos or out-of-copyright maps, but whichever of these it is, it's very impressive.

&lt;p&gt;
It is also better than &lt;a href="http://maps.google.co.uk"&gt;Google Maps&lt;/a&gt; in some ways:

&lt;ul&gt;

&lt;li&gt; It has more details: it shows footpaths, rivers and streams, wooded areas, and paths across parks.  It has outlines of interesting buildings where people have entered data for them.

&lt;li&gt; I think the default renderer (Mapnik) looks better than the Google Maps equivalent, especially when zoomed out to a level where streets are one pixel thick but still distinguishable.  Google Maps gives too much prominence to the major roads -- it renders them thickly, in bright colours, and with large labels, which tends to drown out the details.  Mapnik is more subtle.  The map just looks more &lt;em&gt;interesting&lt;/em&gt;.

&lt;li&gt; It uses more of the browser window and doesn't waste as much space on sidebars.  It's almost a trivial point, but it makes a difference.

&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-5294336070766206032?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/5294336070766206032/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=5294336070766206032' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5294336070766206032'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5294336070766206032'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/02/openstreetmap.html' title='OpenStreetMap'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-5362852545327948462</id><published>2009-01-16T17:27:00.005Z</published><updated>2009-12-01T20:03:22.389Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='automated-testing'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Testing using golden files in Python</title><content type='html'>This is the third post in a series about automated testing in Python (earlier posts:
&lt;a href="http://lackingrhoticity.blogspot.com/2008/11/tempdirtestcase-python-unittest-helper.html"&gt;1&lt;/a&gt;,
&lt;a href="http://lackingrhoticity.blogspot.com/2008/12/helper-for-monkey-patching-in-tests.html"&gt;2&lt;/a&gt;).
This post is about testing using golden files.

&lt;p&gt;
A golden test is a fancy way of doing assertEquals() on a string or a directory tree, where the expected output is kept in a separate file or files -- the &lt;em&gt;golden files&lt;/em&gt;.  If the actual output does not match the expected output, the test runner can optionally run an interactive file comparison tool such as &lt;a href="http://meld.sourceforge.net"&gt;Meld&lt;/a&gt; to display the differences and allow you to selectively merge the differences into the golden file.

&lt;p&gt;
This is useful when
&lt;ul&gt;
&lt;li&gt; the data being checked is large - too large to embed into the Python source; or
&lt;li&gt; the data contains relatively inconsequential details, such as boilerplate text or formatting, which might be changed frequently.
&lt;/ul&gt;

As an example, suppose you have code to format some data as HTML.  You can have a test which creates some example data, formats this as HTML, and compares the result to a golden file.  Something like this:

&lt;pre&gt;
class ExampleTest(golden_test.GoldenTestCase):

    def test_formatting_html(self):
        obj = make_some_example_object()
        temp_dir = self.make_temp_dir()
        format_as_html(obj, temp_dir)
        self.assert_golden(temp_dir, os.path.join(os.path.dirname(__file__),
                                                  "golden-files"))

if __name__ == "__main__":
    golden_test.main()
&lt;/pre&gt;

By default, the test runs non-interactively, which is what you want on a continuous integration machine, and it will print a diff if it fails.  To switch on the semi-interactive mode which runs Meld, you run the test with the option &lt;tt&gt;--meld&lt;/tt&gt;.

&lt;p&gt;
Here is a simple version of the test helper (taken from &lt;a href="http://svn.gna.org/viewcvs/plash/scratch/x11-proxy/xjack/golden_test.py?rev=833&amp;view=auto"&gt;here&lt;/a&gt;):

&lt;pre&gt;
import os
import subprocess
import sys
import unittest

class GoldenTestCase(unittest.TestCase):

    run_meld = False

    def assert_golden(self, dir_got, dir_expect):
        assert os.path.exists(dir_expect), dir_expect
        proc = subprocess.Popen(["diff", "--recursive", "-u", "-N",
                                 "--exclude=.*", dir_expect, dir_got],
                                stdout=subprocess.PIPE)
        stdout, stderr = proc.communicate()
        if len(stdout) &gt; 0:
            if self.run_meld:
                # Put expected output on the right because that is the
                # side we usually edit.
                subprocess.call(["meld", dir_got, dir_expect])
            raise AssertionError(
                "Differences from golden files found.\n"
                "Try running with --meld to update golden files.\n"
                "%s" % stdout)
        self.assertEquals(proc.wait(), 0)

def main():
    if sys.argv[1:2] == ["--meld"]:
        GoldenTestCase.run_meld = True
        sys.argv.pop(1)
    unittest.main()
&lt;/pre&gt;

(It's a bit cheesy to modify global state on startup to enable melding, but because &lt;tt&gt;unittest&lt;/tt&gt; doesn't make it easy to pass parameters into tests this is the simplest way of doing it.)

&lt;p&gt;
Golden tests have the same sort of advantages that are associated with test-driven development in general.
&lt;ul&gt;
&lt;li&gt; Golden files are checked into version control and help to make changesets self-documenting.  A changeset that affects the program's output will include patches that demonstrate how the output is affected.  You can see the history of the program's output in version control.  (This assumes that everyone runs the tests before committing!)
&lt;li&gt; Sometimes you can point people at the golden files if they want to see example output.  For HTML, sometimes you can contrive the CSS links to work so that the HTML looks right when viewed in a browser.
&lt;li&gt; And of course, this can catch cases where you didn't intend to change the program's output.
&lt;/ul&gt;

My typical workflow is to add a test with some example input that I want the program to handle, and run the test with &lt;tt&gt;--meld&lt;/tt&gt; until the output I want comes up on the left-hand side in Meld.  I mark the output as OK by copying it over to the right-hand side.  This is not quite test-first, because I am letting the test suggest what its expected output should be.  But it is possible to do this in an even more test-first manner by typing the output you expect into the right-hand side.

&lt;p&gt;
Other times, one change will affect many locations in the golden files, and adding a new test is not necessary.  It's usually not too difficult to quickly eyeball the differences with Meld.

&lt;p&gt;
Here are some of the things that I have used golden files to test:

&lt;ul&gt;
&lt;li&gt; formatting of automatically-generated e-mails
&lt;li&gt; automatically generated configuration files
&lt;li&gt; HTML formatting logs (&lt;a href="http://svn.gna.org/viewcvs/plash/trunk/build-tools/build_log_test.py?rev=874&amp;view=auto"&gt;build_log_test.py&lt;/a&gt; and its &lt;a href="http://svn.gna.org/viewcvs/plash/trunk/build-tools/golden-files/?rev=874"&gt;golden files&lt;/a&gt;)
&lt;li&gt; pretty-printed output of X Windows messages in xjack-xcb (&lt;a href="http://svn.gna.org/viewcvs/plash/scratch/x11-proxy/xjack/golden_test.py?rev=833&amp;view=auto"&gt;golden_test.py&lt;/a&gt; and &lt;a href="http://svn.gna.org/viewcvs/plash/scratch/x11-proxy/xjack/golden_test_data/?rev=874"&gt;golden_test_data&lt;/a&gt;).  This ended up testing several components in one go:
&lt;ul&gt;
  &lt;li&gt; the XCB protocol definitions for X11
  &lt;li&gt; the encoders and decoders that work off of the XCB protocol definitions
  &lt;li&gt; the pretty printer for the decoder
&lt;/ul&gt;
&lt;/ul&gt;

On the other hand, sometimes the overhead of having a separate file isn't worth it, and if the expected output is small (maybe 5 lines or so) I might instead paste it into the Python source as a multiline string.

&lt;p&gt;
It can be tempting to overuse golden tests.  As with any test suite, avoid creating one big example that tries to cover all cases.  (This is particularly tempting if the test is slow to run.)  Try to create smaller, separate examples.  The test helper above is not so good at this, because if you are not careful it can end up running Meld several times.  In the past I have put several examples into (from &lt;tt&gt;unittest&lt;/tt&gt;'s point of view) one test case, so that it runs Meld only once.

&lt;p&gt;
Golden tests might not work so well if components can be changed independently.  For example, if your XML library changes its whitespace pretty-printing, the tests' output could change.  This is less of a problem if your code is deployed with tightly-controlled versions of libraries, because you can just update the golden files when you upgrade libraries.

&lt;p&gt;
A note on terminology:  I think I got the term "golden file" from my previous workplace, where other people were using them, and the term seems to enjoy some limited use judging from Google.  "Golden test", however, may have been a term that I have made up and that no-one else outside my workplace is using for this meaning.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-5362852545327948462?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/5362852545327948462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=5362852545327948462' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5362852545327948462'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5362852545327948462'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/01/testing-using-golden-files-in-python.html' title='Testing using golden files in Python'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-421129930262351906</id><published>2009-01-11T21:22:00.004Z</published><updated>2009-12-01T20:05:27.799Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='native-client'/><title type='text'>On ABI and API compatibility</title><content type='html'>If you are going to create a new execution environment, such as a new OS, it can make things simpler if it presents the same interfaces as an existing system.  It makes porting easier.  So,
&lt;ul&gt;
&lt;li&gt; If you can, keep the ABI the same.
&lt;li&gt; If you can't keep the ABI the same, at least keep the API the same.
&lt;/ul&gt;

&lt;p&gt;
Don't be tempted to say "we're changing X; we may as well take this opportunity to change Y, which has always bugged me".  Only change things if there is a good reason.

&lt;p&gt;
For an example, let's look at the case of GNU Hurd.

&lt;ul&gt;
&lt;li&gt; In principle, the Hurd's glibc could present the same ABI as Linux's glibc (they share the same codebase, after all), but partly because of a different in threading libraries, they were made incompatible.  Unifying the ABIs was &lt;a href="http://lists.debian.org/debian-hurd/1998/12/msg00141.html"&gt;planned&lt;/a&gt;, but it appears that 10 years later it has not happened (Hurd has a &lt;a href="http://packages.debian.org/sid/libc0.3"&gt;libc0.3 package&lt;/a&gt; instead of libc6).

  &lt;p&gt;Using the same ABI would have meant that the same executables would work on Linux and the Hurd.  Debian would not have needed to rebuild all its packages for a separate "hurd-i386" architecture.  It would have saved a lot of effort.

  &lt;p&gt;I suspect that if glibc were ported to the Hurd today, it would not be hard to make the ABIs the same.  The threading code has changed a lot in the intervening time.  I think it is cleaner now.

&lt;li&gt; The Hurd's glibc also changed the API: they decided not to define PATH_MAX.  The idea was that if there was a program that used fixed-length buffers for storing filenames, you'd be forced to fix it.  Well, that wasn't a good idea.  It just created unnecessary work.  Hurd developers and users had enough on their plates without having to fix unrelated implementation quality issues in programs they wanted to use.
&lt;/ul&gt;


Similarly, it would help if glibc for &lt;a href="http://code.google.com/p/nativeclient/"&gt;NaCl (Native Client)&lt;/a&gt; could have the same ABI as glibc for Linux.  Programs have to be recompiled for NaCl with nacl-gcc to meet the requirements of NaCl's validator, but could the resulting code still run directly on Linux, linked with the regular i386 Linux glibc and other libraries?

The problem here is that code compiled for NaCl will expect all code addresses to be 32-byte-aligned.  If the NaCl code is given a function address or a return address which is not 32-byte-aligned, things will go badly wrong when it tries to jump to it.

Some background:  In normal x86 code, to return from a function you write this:

&lt;pre&gt;
        ret
&lt;/pre&gt;

This pops an address off the stack and jumps to it.  In NaCl, this becomes:

&lt;pre&gt;
        popl %ecx
        and $0xffffffe0, %ecx
        jmp *%ecx
&lt;/pre&gt;

This pops an address off the stack, rounds it down to the nearest 32 byte boundary and jumps to it.  If the calling function's &lt;tt&gt;call&lt;/tt&gt; instruction was not placed at the end of a 32 byte block (which NaCl's assembler will arrange), the return address will not be aligned and this code will jump to the wrong location.

&lt;p&gt;
However, there is a way around this.  We can get the NaCl assembler and linker to keep a list of all the places where a forcible alignment instruction (the &lt;tt&gt;and $0xffffffe0, %ecx&lt;/tt&gt; above) was inserted, and put this list into the executable or library in a special section or segment.  Then when we want to run the executable or library directly on Linux, we can rewrite all these locations so that the sequence above becomes

&lt;pre&gt;
        popl %ecx
        nop
        nop
        jmp *%ecx
&lt;/pre&gt;

or maybe even just

&lt;pre&gt;
        ret
        nop
        nop
        nop
        nop
        nop
&lt;/pre&gt;

We can reuse the relocations mechanism to store these rewrites.  The crafty old linker already does something similar for thread-local variable accesses.  When it knows that a thread-local variable is being accessed from the library where it is defined, it can rewrite the general-purpose-but-slow instruction sequence for TLS variable access into a faster instruction sequence.  The general purpose instruction sequence even contains &lt;tt&gt;nop&lt;/tt&gt;s to allow for rewriting to the slightly-longer fast sequence.

&lt;p&gt;
This arrangement for running NaCl-compiled code could significantly simplify the process of building and testing code when porting it to NaCl.  It can help us avoid the difficulties associated with cross-compiling.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-421129930262351906?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/421129930262351906/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=421129930262351906' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/421129930262351906'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/421129930262351906'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/01/on-abi-and-api-compatibility.html' title='On ABI and API compatibility'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-2353650276868239348</id><published>2009-01-04T20:30:00.003Z</published><updated>2009-12-01T20:05:27.799Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='native-client'/><title type='text'>What does NaCl mean for Plash?</title><content type='html'>Google's &lt;a href="http://code.google.com/p/nativeclient/"&gt;Native Client (NaCl)&lt;/a&gt;, announced &lt;a href="http://google-code-updates.blogspot.com/2008/12/native-client-technology-for-running.html"&gt;last month&lt;/a&gt;, is an ingenious hack to get around the problem that existing OSes don't provide adequate security mechanisms for sandboxing native code.

&lt;p&gt;
You can look at NaCl as an interesting combination of OS-based and language-based security mechanisms:

&lt;ul&gt;
&lt;li&gt; NaCl uses a code verifier to prevent use of unsafe instructions such as those that perform system calls.  This is not a million miles away from programming language subsets like &lt;a href="http://code.google.com/p/google-caja/"&gt;Cajita&lt;/a&gt; and &lt;a href="http://code.google.com/p/joe-e/"&gt;Joe-E&lt;/a&gt;, except that it operates at the level of x86 instructions rather than source code.

&lt;p&gt;
Since x86 instructions are variable-length and unaligned, NaCl has to stop you from jumping into an unsafe instruction hidden in the middle of a safe instruction.  It does that by requiring that all indirect jumps are jumps to the start of 32-byte-aligned blocks; instructions are not allowed to straddle these blocks.

&lt;li&gt; It uses the x86 architecture's little-used segmentation feature to limit memory accesses to a range of address space.  So the processor is doing bounds checking for free.

&lt;p&gt;
   Actually, segmentation has been used before - in L4 and EROS's "&lt;a href="http://www.eros-os.org/design-notes/SmallSpaces386.html"&gt;small spaces&lt;/a&gt;" facility for switching between processes with small address spaces without flushing the TLB.  NaCl gets the same benefit: switching between trusted and untrusted code should be fast; faster than trapping system calls with ptrace(), for example.
&lt;/ul&gt;

&lt;a href="http://plash.beasts.org"&gt;Plash&lt;/a&gt; is also a hack to get sandboxing, specifically on Linux, but it has some limitations:
&lt;ol&gt;
&lt;li&gt; it doesn't block network access;
&lt;li&gt; it doesn't limit CPU and memory resource usage, so sandboxed programs can still cause denial of service;
&lt;li&gt; it requires a &lt;a href="http://plash.beasts.org/wiki/PlashGlibc"&gt;custom glibc&lt;/a&gt;, which can be a pain to build;
&lt;li&gt; it changes the API/ABI that sandboxed programs see in some small but significant ways:
  &lt;ul&gt;
    &lt;li&gt; some syscalls are effectively disabled; programs must go through libc for these calls, which stops statically linked programs from working;
    &lt;li&gt; /proc/self doesn't work, and Plash's architecture makes it hard to emulate /proc.
  &lt;/ul&gt;
&lt;/ol&gt;

Interface changes mean some programs require patching, e.g. to not depend on /proc.  If there were more people behind Plash, these interface changes wouldn't be a big problem.  These problems can be addressed, with work.  But Plash hasn't really caught on, so the manpower isn't there.

&lt;p&gt;
NaCl also breaks the ABI - it breaks it totally.  Code must be recompiled.  However, NaCl provides bigger benefits in return.  It allows programs to be deployed in new contexts: on Windows; in a web browser.  It is more secure than Plash, because it can block network access and limit the amount of memory a process can allocate.  Also, because NaCl mediates access more completely, it would be easier to emulate interfaces like /proc.

&lt;p&gt;
NaCl isn't only useful as a browser plugin.  We could use it as a general purpose OS security mechanism.  We could have GNU/Linux programs running on Windows (without the Linux bit).

&lt;p&gt;
Currently NaCl does not support all the features you'd need in a modern OS.  In particular, dynamic linking.  NaCl doesn't yet support loading code beyond an initial statically linked ELF executable.  But we can add this.  I am making a start at &lt;a href="http://groups.google.com/group/native-client-discuss/browse_thread/thread/5f710e87e0a2721f/4485969260e8fbe3?"&gt;porting glibc&lt;/a&gt;, along with its dynamic linker.  After all, I have ported glibc once before!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-2353650276868239348?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/2353650276868239348/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=2353650276868239348' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/2353650276868239348'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/2353650276868239348'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2009/01/what-does-nacl-mean-for-plash.html' title='What does NaCl mean for Plash?'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-176827863545917206</id><published>2008-12-17T13:31:00.001Z</published><updated>2009-12-01T20:03:22.389Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='automated-testing'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Helper for monkey patching in tests</title><content type='html'>Following on from my post on &lt;a href="http://lackingrhoticity.blogspot.com/2008/11/tempdirtestcase-python-unittest-helper.html"&gt;TempDirTestCase&lt;/a&gt;, here is another Python test case helper which I introduced at work.  Our codebase has quite a few test cases which use monkey patching.  That is, they temporarily modify a module or a class to replace a function or method for the duration of the test.

&lt;p&gt;
For example, you might want to monkey patch &lt;tt&gt;time.time&lt;/tt&gt; so that it returns repeatable timestamps during the test.  We have quite a lot of test cases that do something like this:

&lt;pre&gt;
class TestFoo(unittest.TestCase):

    def setUp(self):
        self._old_time = time.time
        def monkey_time():
            return 0
        time.time = monkey_time

    def tearDown(self):
        time.time = self._old_time

    def test_foo(self):
        # body of test case
&lt;/pre&gt;

Having to save and restore the old values gets tedious, particularly if you have to monkey patch several objects (and, unfortunately, there are a few tests that monkey patch &lt;em&gt;a lot&lt;/em&gt;).  So I introduced a monkey_patch() method so that the code above can be simplified to:

&lt;pre&gt;
class TestFoo(TestCase):

    def test_foo(self):
        self.monkey_patch(time, "time", lambda: 0)
        # body of test case
&lt;/pre&gt;

(OK, I'm cheating by using a lambda the second time around to make the code look shorter!)

&lt;p&gt;
Now, monkey patching is not ideal, and I would prefer not to have to use it.  When I write new code I try to make sure that it can be tested without resorting to monkey patching.  So, for example, I would parameterize the software under test to take &lt;tt&gt;time.time&lt;/tt&gt; as an argument instead of getting it directly from the &lt;tt&gt;time&lt;/tt&gt; module.
(here's &lt;a href="http://svn.gna.org/viewcvs/plash/trunk/build-tools/build_log.py?rev=874&amp;view=auto"&gt;an example&lt;/a&gt;).

&lt;p&gt;
But sometimes you have to work with a codebase where most of the code is not covered by tests and is structured in such a way that adding tests is difficult.  You could refactor the code to be more testable, but that risks changing its behaviour and breaking it.  In that situation, monkey patching can be very useful.  Once you have some tests, refactoring can become easier and less risky.  It is then easier to refactor to remove the need for monkey patching -- although in practice it can be hard to justify doing that, because it is relatively invasive and might not be a big improvement, and so the monkey patching stays in.

&lt;p&gt;
Here's the code, an extended version of the base class from the &lt;a href="http://lackingrhoticity.blogspot.com/2008/11/tempdirtestcase-python-unittest-helper.html"&gt;earlier post&lt;/a&gt;:

&lt;pre&gt;
import os
import shutil
import tempfile
import unittest

class TestCase(unittest.TestCase):

    def setUp(self):
        self._on_teardown = []

    def make_temp_dir(self):
        temp_dir = tempfile.mkdtemp(prefix="tmp-%s-" % self.__class__.__name__)
        def tear_down():
            shutil.rmtree(temp_dir)
        self._on_teardown.append(tear_down)
        return temp_dir

    def monkey_patch(self, obj, attr, new_value):
        old_value = getattr(obj, attr)
        def tear_down():
            setattr(obj, attr, old_value)
        self._on_teardown.append(tear_down)
        setattr(obj, attr, new_value)

    def monkey_patch_environ(self, key, value):
        old_value = os.environ.get(key)
        def tear_down():
            if old_value is None:
                del os.environ[key]
            else:
                os.environ[key] = old_value
        self._on_teardown.append(tear_down)
        os.environ[key] = value

    def tearDown(self):
        for func in reversed(self._on_teardown):
            func()
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-176827863545917206?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/176827863545917206/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=176827863545917206' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/176827863545917206'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/176827863545917206'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/12/helper-for-monkey-patching-in-tests.html' title='Helper for monkey patching in tests'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-2298739297148000356</id><published>2008-11-26T19:54:00.002Z</published><updated>2008-11-26T20:01:52.604Z</updated><title type='text'>Shell features</title><content type='html'>Here are two features I would like to see in a Unix shell:

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Timing:&lt;/strong&gt; The shell should record how long each command takes.  It should be able to show the start and stop times and durations of commands I have run in the past.
&lt;li&gt;&lt;strong&gt;Finish notifications:&lt;/strong&gt; When a long-running command finishes, the task bar icon for the shell's terminal window should flash, just as instant messaging programs flash their task bar icon when you receive a message.  If the terminal window is tabbed, the tab should be highlighted too.
&lt;/ul&gt;

You can achieve the first with the &lt;tt&gt;time&lt;/tt&gt; command, sure, but sometimes I start a command without knowing in advance that it will be long-running.  It's hard to add the timer in afterwards.  Also, Bash's &lt;tt&gt;time&lt;/tt&gt; builtin stops working if you suspend a job with Ctrl-Z.  It would be simpler if the shell collected this information by default.

&lt;p&gt;
The second feature requires some integration between the shell and the terminal.  This could be done via some new terminal escape sequence or perhaps using the WINDOWID environment variable that gnome-terminal appears to pass to its subprocesses.  But actually, I would prefer if the shell provided its own terminal window.  There would be more scope for combining GUI and CLI features that way, such as displaying filename completions or (more usefully) command history in a pop-up window.

&lt;p&gt;
I have seen a couple of attempts to do that.  &lt;a href="http://hotwire-shell.org"&gt;Hotwire&lt;/a&gt; is one, but it is too different from Bash for my tastes.  I would like a GUI shell that initially looks and can be used just like gnome-terminal + Bash.  &lt;a href="http://personal.atl.bellsouth.net/v/c/vcato/gsh/"&gt;Gsh&lt;/a&gt; is closer to what I have in mind, but it is quite old, written in Tcl/Tk and C, and not complete.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-2298739297148000356?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/2298739297148000356/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=2298739297148000356' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/2298739297148000356'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/2298739297148000356'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/11/shell-features.html' title='Shell features'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-300366081212107187</id><published>2008-11-22T16:40:00.002Z</published><updated>2009-12-01T20:01:50.853Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='automated-testing'/><title type='text'>TempDirTestCase, a Python unittest helper</title><content type='html'>I have seen a lot of unittest-based test cases written in Python that create temporary directories in setUp() and delete them in tearDown().

&lt;p&gt;
Creating temporary directories is such a common thing to do that I have a base class that provides a make_temp_dir() helper method.  As a result, you often don't have to define setUp() and tearDown() in your test cases at all.

&lt;p&gt;
I have ended up copying this into different codebases because it's easier than making the codebases depend on an external library for such a trivial piece of code.  This seems to be quite common: lots of Python projects provide their own test runners based on unittest.

&lt;p&gt;
Here's the code:

&lt;pre&gt;
import shutil
import tempfile
import unittest

class TempDirTestCase(unittest.TestCase):

    def setUp(self):
        self._on_teardown = []

    def make_temp_dir(self):
        temp_dir = tempfile.mkdtemp(prefix="tmp-%s-" % self.__class__.__name__)
        def tear_down():
            shutil.rmtree(temp_dir)
        self._on_teardown.append(tear_down)
        return temp_dir

    def tearDown(self):
        for func in reversed(self._on_teardown):
            func()
&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-300366081212107187?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/300366081212107187/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=300366081212107187' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/300366081212107187'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/300366081212107187'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/11/tempdirtestcase-python-unittest-helper.html' title='TempDirTestCase, a Python unittest helper'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-6666814769092103806</id><published>2008-10-27T19:55:00.003Z</published><updated>2009-12-07T15:27:25.468Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='packaging'/><title type='text'>Making relocatable packages with JHBuild</title><content type='html'>I have been revisiting &lt;a href="http://article.gmane.org/gmane.comp.file-systems.zero-install.devel/2004"&gt;an experiment I began back in March&lt;/a&gt; with building GNOME with &lt;a href="http://live.gnome.org/Jhbuild"&gt;JHBuild&lt;/a&gt;.  I wanted to see if it would be practical to use JHBuild to package all of GNOME with &lt;a href="http://zero-install.sourceforge.net"&gt;Zero-Install&lt;/a&gt;.  The main issue with packaging anything with Zero-Install is relocatability.

&lt;p&gt;
If you build and install an autotools-based package with
&lt;pre&gt;
./configure --prefix=/FOO &amp;&amp; make install
&lt;/pre&gt;
the resulting files installed under /FOO will often have the pathname /FOO embedded in them, sometimes in text files, other times compiled into libraries and executables.
This is a problem for Zero-Install because it runs under a normal user account and wants to install files into a user's home directory under ~/.cache.  Currently if a program is to be packaged with Zero-Install it must be relocatable via environment variables.  Compiling pathnames in is no good (at least without an environment variable override) because you don't know in advance where the program will be installed.

&lt;p&gt;
I found a few cases where pathnames get compiled in:
&lt;ul&gt;
&lt;li&gt; text: pkg-config .pc files
&lt;li&gt; text: libtool .la files
&lt;li&gt; text: shell scripts generated from .in files, such as gtkdocize, intltoolize and gettextize
&lt;li&gt; binary: rpaths added by libtool
&lt;/ul&gt;

&lt;p&gt;
It is possible to handle these individual cases.  Zero-Install's &lt;a href="http://zero-install.sourceforge.net/make-headers.html"&gt;make-headers&lt;/a&gt; tool will fix up pkg-config .pc files.  libtool .la files can apparently just be removed on Linux without any adverse effects.  libtool could be modified to not use rpaths (unfortunately --disable-rpath doesn't seem to work), which are overridden by LD_LIBRARY_PATH anyway.  gtkdocize et al could be modified.  But that sounds like a lot of work.  I'd like to get something working first.

&lt;p&gt;
In revisiting this I hoped that the only cases that would matter would be text files.  It would be easy to do a search and replace inside text files to relocate packages.  The idea would be to build with
&lt;pre&gt;
/configure --prefix=/FAKEPREFIX
make install DESTDIR=/tempdest
&lt;/pre&gt;
and then rewrite /FAKEPREFIX to (say) /home/fred/realprefix.  In a text file, changing the length of a pathname and changing the size of the file usually doesn't matter, but doing this to an ELF executable would completely screw the executable up.  This search-and-replace trick would be a hack, but it would be worth trying.

&lt;p&gt;
It turned out that Evince (which I was using as a test case) embeds the pathname /FAKEPREFIX/share/applications/evince.desktop in its own executable, and if this file doesn't exist, it segfaults on startup.

&lt;p&gt;
Then it occurred to me that I could rewrite filenames inside binary files without changing the length of the filename: just pad the filenames out to a fixed length at the start.

&lt;p&gt;
So the idea now is to build with something like:
&lt;pre&gt;
/configure --prefix=/home/bob/builddir/gtk-XXXXXXX
make install
&lt;/pre&gt;
and, when installing the files on another machine, rewrite
&lt;pre&gt;
/home/bob/builddir/gtk-XXXXXXXXXXXXXXXXXXX
&lt;/pre&gt;
to
&lt;pre&gt;
/home/fred/.cache/0install.net/gtk-XXXXXXX
&lt;/pre&gt;

&lt;p&gt;
Just make sure you start off with enough padding to allow the package to be relocated to any path a user is likely to use in practice.

&lt;p&gt;
This is even hackier than rewriting filenames inside text files, but it's very simple!

&lt;p&gt;
This is partly inspired by &lt;a href="http://nixos.org"&gt;Nix&lt;/a&gt;, which does something similar, but with a bit more complexity.  Nix will install a package under (something like) /nix/store/&amp;lt;hash&gt;, where &amp;lt;hash&gt; is (roughly) a cryptographic hash of the binary package's contents.  But packages like Evince contain embedded filenames referring to their own contents, so Nix will build it with:
&lt;pre&gt;
./configure --prefix=/nix/store/&amp;lt;made-up-hash&gt;
make install
&lt;/pre&gt;
where &amp;lt;made-up-hash&gt; is chosen randomly.  Afterwards, the output is rewritten to replace &amp;lt;made-up-hash&gt; with the real hash of the output, but there is some cleverness to discount &amp;lt;made-up-hash&gt; from affecting the real hash.

&lt;p&gt;
(Earlier versions of Nix used the hash of the build input to identify packages rather than the hash of the build output.  This avoided the need to do rewriting but didn't allow a package's contents to be verified based on its hash name.)

&lt;p&gt;
The fact that Nix uses this scheme successfully indicates that filename rewriting in binaries works, and filenames are not being checksummed or compressed or encoded in weird ways, which is good.

&lt;p&gt;
My plan now is:
&lt;ul&gt;
&lt;li&gt; Extend JHBuild to build packages into fixed-length prefixes and produce Zero-Install feeds.  My work so far is on &lt;a href="https://code.launchpad.net/~mrs/+junk/jhbuild-relocatability"&gt;this Bazaar branch&lt;/a&gt;.
&lt;li&gt; Extend Zero-Install to do filename rewriting inside files in order to relocate packages.
&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-6666814769092103806?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/6666814769092103806/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=6666814769092103806' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6666814769092103806'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/6666814769092103806'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/10/making-relocatable-packages-with.html' title='Making relocatable packages with JHBuild'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-7643546437004104031</id><published>2008-09-18T12:42:00.004+01:00</published><updated>2009-12-01T20:03:22.389Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Attribute access in format strings in Python 3.0</title><content type='html'>Here is another problem with securing Python 3.0: &lt;a href="http://www.python.org/dev/peps/pep-3101/"&gt;PEP 3101&lt;/a&gt; has extended format strings so that they contain an attribute access syntax.  This makes the format() method on strings too powerful.  It exposes unrestricted use of getattr, so it can be used to read private attributes.

&lt;p&gt;
For example, the expression &lt;tt&gt;"{0._foo}".format(x)&lt;/tt&gt; is equivalent to &lt;tt&gt;str(x._foo)&lt;/tt&gt;.

&lt;p&gt;
CapPython &lt;em&gt;could&lt;/em&gt; work around this, but it would not be simple.  It could block the "format" attribute (that is, treat it as a private attribute), although that is jarring because this word does not even look special.  Lots of code will want to use format(), so we would have to rewrite this to a function call that interprets format strings safely.  Having to do rewriting increases complexity.  And if our safe format string interpreter is written in Python, it will be slower than the built-in version.

&lt;p&gt;
My recommendation would be to take out the getattr-in-format-strings feature before Python 3.0 is released.  Once the release has happened it would be much easier to add such a feature than to take it out.

&lt;p&gt;
It is a real shame that Python 3 has not adopted E's &lt;a href="http://www.erights.org/elang/grammar/quasi-overview.html"&gt;quasi-literals&lt;/a&gt; feature, which has been around for a while.  Not only are quasi-literals more secure than PEP 3101's format strings, they are more general (because they allow any kind of subexpression) and could be faster (because more can be done at compile time).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-7643546437004104031?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/7643546437004104031/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=7643546437004104031' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7643546437004104031'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/7643546437004104031'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/09/attribute-access-in-format-strings-in.html' title='Attribute access in format strings in Python 3.0'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-5810066586285153696</id><published>2008-09-14T19:10:00.001+01:00</published><updated>2009-12-01T20:03:22.390Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>CapPython, unbound methods and Python 3.0</title><content type='html'>CapPython needs to block access to method functions.  Luckily, one case is already covered by Python's unbound methods, but they are going away in Python 3.0.

&lt;p&gt;
Consider the following piece of Python code:

&lt;pre&gt;
class C(object):

    def f(self):
        return self._field

x = C()
&lt;/pre&gt;

&lt;p&gt;
In Python, methods are built out of functions.  "def" always defines a function.  In this case, the function f is defined in a class scope, so the function gets wrapped up inside a class object, making it available as a method.

&lt;p&gt;
There are three ways in which we might use function f:

&lt;ul&gt;
&lt;li&gt; Via instances of class C as a normal method, e.g. x.f().  This is the common case.  The expression x.f returns a &lt;strong&gt;bound method&lt;/strong&gt;, which wraps the instance x and the function f.

&lt;li&gt; Via class C, e.g. C.f(x).  The expression C.f returns an &lt;strong&gt;unbound method&lt;/strong&gt;.  If you call this unbound method with C.f(y), it first checks that y is an instance of C.  If that is the case, it calls f(x).  Otherwise, it raises a TypeError.

&lt;li&gt; Directly, as a function, assuming you can get hold of the unwrapped function.  There are several ways to get hold of the function:
&lt;ul&gt;
  &lt;li&gt; x.f.im_func or C.f.im_func.  Bound and unbound methods make the function they wrap available via an attribute called "im_func".
  &lt;li&gt; In class scope, "f" is visible directly as a variable.
  &lt;li&gt; C.__dict__["f"]
&lt;/ul&gt;

&lt;/ul&gt;

&lt;p&gt;
CapPython allows the first two but aims to block direct use of method functions.

&lt;p&gt;
In CapPython, attribute access is restricted so that you can only access private attributes (those starting with an underscore) via a "self" variable inside a method function.  For this to work, access to methods functions must be restricted.  Function f should only ever be used on instances of C and its subclasses.

&lt;p&gt;
Suppose that constraint was violated.  If you could get hold of the unwrapped function f, you could apply it to an object y of any type, and f(y) would return the value of the private attribute, y._field.  That would violate encapsulation.

&lt;p&gt;
To enforce encapsulation, CapPython blocks the paths for getting hold of f that are listed above, as well as some others:

&lt;ul&gt;
&lt;li&gt; "im_func" is treated as a private attribute, even though it doesn't start with an underscore.
&lt;li&gt; In class scope, reading the variable f is forbidden.  Or, to be more precise, if variable f is read, f is no longer treated as a method function, and its access to self._field is forbidden.
&lt;li&gt; __dict__ is a private attribute, so the expression C.__dict__ is rejected.
&lt;li&gt; Use of __metaclass__ is blocked, because it provides another way of getting hold of a class's __dict__.
&lt;li&gt; Use of decorators is restricted.
&lt;/ul&gt;

&lt;p&gt;
Bound methods and unbound methods both wrap up function f so that it can be used safely.

&lt;p&gt;
However, this is changing in Python 3.0.  Unbound methods are being removed.  This means that C.f simply returns function f.  If CapPython is going to work on Python 3.0, I am afraid it will have to become a lot more complicated.  CapPython would have to apply rewriting to class definitions so that class objects do not expose unwrapped method functions.  Pre-3.0 CapPython has been able to get away without doing source rewriting.

&lt;p&gt;
Pre-3.0 CapPython has a very nice property:  It is possible for non-CapPython-verified code to pass classes and objects into verified CapPython code without allowing the latter to break encapsulation.  The non-verified code has to be careful not to grant the CapPython code unsafe objects such as "type" or "globals" or "getattr", but the chances of doing that are fairly low, and this is something we could easily lint for.  However, if almost &lt;em&gt;every class&lt;/em&gt; in Python 3.0 provides access to objects that break CapPython's encapsulation (that is, method functions), so that the non-CapPython code must wrap every class, the risks of combining code in this way are significantly increased.

&lt;p&gt;
Ideally, I'd like to see this change in Python 3.0 reverted.  Unbound methods were scheduled for removal in a one-liner in &lt;a href="http://www.python.org/dev/peps/pep-3100/"&gt;PEP 3100&lt;/a&gt;.  This originated in a &lt;a href="http://www.artima.com/weblogs/viewpost.jsp?thread=86641"&gt;comment on Guido van Rossum's blog&lt;/a&gt; and &lt;a href="http://mail.python.org/pipermail/python-dev/2005-January/050625.html"&gt;a follow-on thread&lt;/a&gt;.  The motivation seems to be to simplify the language, which is often good, but not in this case.  However, I'm about 3 years too late, and Python 3.0 is scheduled to be released in the next few weeks.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-5810066586285153696?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/5810066586285153696/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=5810066586285153696' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5810066586285153696'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/5810066586285153696'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-python-30.html' title='CapPython, unbound methods and Python 3.0'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>9</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-8420441188602214785</id><published>2008-09-14T13:00:00.003+01:00</published><updated>2008-09-14T13:13:06.181+01:00</updated><title type='text'>Dealing with modules and builtins in CapPython</title><content type='html'>In my previous post, &lt;a href="http://lackingrhoticity.blogspot.com/2008/08/introducing-cappython.html"&gt;Introducing CapPython&lt;/a&gt;, I wrote that CapPython "does not yet block access to Python's builtin functions such as open, and it does not yet deal with Python's module system".

&lt;p&gt;
Dealing with modules and builtins has turned out to be easier than I thought.

&lt;p&gt;
At the time, I had in mind that CapPython could work like the &lt;a href="http://code.google.com/p/joe-e/"&gt;Joe-E&lt;/a&gt; verifier.  Joe-E is a static verifier for an object-capability subset of Java.  If your Java code passes Joe-E, you can compile it, put it in your CLASSPATH, and load it with the normal Java class loader.  If your Joe-E code uses any external classes, Joe-E statically checks that these classes (and the methods used on them) have been whitelisted as safe.

&lt;p&gt;
I had envisaged that CapPython would work in a similar way.  Module imports would be checked against a whitelisted set.  Uses of builtin functions would be checked: &lt;tt&gt;len&lt;/tt&gt; would be allowed, but &lt;tt&gt;open&lt;/tt&gt; would be rejected.  CapPython code would be loaded through Python's normal module loader; code would be installed by adding it to PYTHONPATH as usual.  (One difference from Joe-E is that CapPython would not be able to statically block the use of a method from a particular class or interface, because Python is not statically typed.  CapPython can block methods only by their name.)

&lt;p&gt;
But there are some problems with this approach:

&lt;ul&gt;

&lt;li&gt; This provides a way to block builtins such as getattr and open, but not to replace them.  We would want to change getattr to reject (at runtime) attribute names starting with "&lt;tt&gt;_&lt;/tt&gt;".  We could introduce a &lt;tt&gt;safe_getattr&lt;/tt&gt; function, but we'd have to change code to import and use a differently named function.

&lt;/li&gt;&lt;li&gt; It makes it hard to modify or replace modules.

&lt;/li&gt;&lt;li&gt; In turn, that makes it hard to subset modules.  Suppose you want to allow &lt;tt&gt;os.path.join&lt;/tt&gt;, but not the rest of &lt;tt&gt;os&lt;/tt&gt;.  Doing this via a static check is awkward; it would have to block use of &lt;tt&gt;os&lt;/tt&gt; as a first class value.

&lt;/li&gt;&lt;li&gt; It makes it harder to apply rewriting to code, if it turns out that CapPython needs to be a rewriter and not just a verifier.

&lt;/li&gt;&lt;li&gt; It would require reasoning about Python's module loader.

&lt;/li&gt;&lt;li&gt; It's not clear who does the checking and who initiates the loading, so there could be a risk that the two are not operating on the same code.

&lt;/li&gt;&lt;li&gt; It relies on global state like PYTHONPATH, sys.modules and the filesystem.

&lt;/li&gt;&lt;li&gt; It doesn't let us instantiate modules multiple times.  Sometimes you want to instantiate a module multiple times because it contains mutable state, or because you want the instantiations to use different imports.  Some Python libraries have global state that would be awkward to remove, such as Twisted's default reactor (an event dispatcher).  Joe-E rejects global state by rejecting static variables, but this is much harder to do in Python.

&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;
There is a simpler, more flexible approach, which is closer to what E and Caja/Cajita do: implement a custom module loader, and load all CapPython code through that.  All imports that a CapPython module does would also go through the custom module loader.

&lt;p&gt;
It just so happens that Python provides enough machinery for that to work.  Python's &lt;tt&gt;exec&lt;/tt&gt; interface makes it possible to execute source code in a custom top-level scope, and that scope does not have to include Python's builtins if you set the special &lt;tt&gt;__builtins__&lt;/tt&gt; variable in the scope.  Behind the scenes, Python's "import" statement is implemented via a function called &lt;tt&gt;__import__&lt;/tt&gt; which the interpreter looks up from a module's top-level scope as &lt;tt&gt;__builtins__["__import__"]&lt;/tt&gt;, so it can be replaced on a per-module basis.

&lt;p&gt;
Rather than trying to statically verify that Python's default global scope (which includes its default unsafe module importer and unsafe builtins) is used safely, we just substitute a safe scope - "&lt;a href="http://www.eros-os.org/pipermail/e-lang/2008-August/012838.html"&gt;scope substitution&lt;/a&gt;", as Mark Miller referred to it.

&lt;p&gt;
This is now implemented in CapPython.  It has a "safe eval" function and a module loader.  The aim is that the module loader will be able to run as untrusted CapPython code.  Currently the module loader uses &lt;tt&gt;open&lt;/tt&gt; to read files, so it has the run of the filesystem, but eventually that will be tamed.

&lt;p&gt;
I am surprised that custom module loaders are not used more often in Python.

&lt;p&gt;
I imagine Joe-E could work the same way, because Java allows you to create new class loaders.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-8420441188602214785?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/8420441188602214785/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=8420441188602214785' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8420441188602214785'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8420441188602214785'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builtins-in.html' title='Dealing with modules and builtins in CapPython'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-8784272214286759372</id><published>2008-08-07T07:30:00.001+01:00</published><updated>2008-08-11T19:52:23.522+01:00</updated><title type='text'>Introducing CapPython</title><content type='html'>Python is not a language that provides encapsulation.  That is, it does not enforce any difference between the private and public parts of an object.  All attributes of an object are public from the language's point of view.  Even functions are not encapsulated: you can access the internals of a function through the attributes func_closure, func_globals, etc.

&lt;p&gt;
However, Python has a convention for private attributes of objects which is widely used.  It's written down in &lt;a href="http://www.python.org/dev/peps/pep-0008/"&gt;PEP 0008&lt;/a&gt; (from 2001).  Attributes that start with an underscore are private.  (Actually PEP 0008 uses the term "non-public" but let's put that aside for now.)

&lt;p&gt;
CapPython proposes that we enforce this convention by defining a subset of Python to enforce it.  The hope is that this subset could be an object-capability language.  Hopefully we can do this in such a way that you can get encapsulation by default and still have fairly idiomatic Python code.

&lt;p&gt;
The core idea is that private attributes may only be accessed through "self" variables.  (We have to expand the definition of "private attribute" to include attributes starting with "func_" and some other prefixes that are used for Python built-in objects.)

&lt;p&gt;
As an example, suppose we want to implement a read-only wrapper around dictionary objects:

&lt;pre&gt;
class FrozenDict(object):
    def __init__(self, dictionary):
        self._dict = dictionary
    def get(self, key):
        return self._dict.get(key)
    # This is incomplete: there are other methods in the dict interface.
&lt;/pre&gt;

You can do this:
&lt;pre&gt;
&gt;&gt;&gt; d = FrozenDict({"a": 1})
&gt;&gt;&gt; d.get("a")
1
&gt;&gt;&gt; d.set("a", 2)
AttributeError
&lt;/pre&gt;

but the following code is statically rejected:
&lt;pre&gt;
&gt;&gt;&gt; d._dict
&lt;/pre&gt;
because &lt;code&gt;_dict&lt;/code&gt; is a private attribute and &lt;code&gt;d&lt;/code&gt; is not a "self" variable.

&lt;p&gt;
A self variable is a variable that is the first argument of a method function.  A method function is a function defined on a class (with some restrictions to prevent method functions from escaping and being used in ways that would break encapsulation).

&lt;p&gt;
We also have to disallow all assignments to attributes (both public and private) except through "self".  This is a harsher restriction.  Otherwise a recipient of a FrozenDict could modify the object:

&lt;pre&gt;
def my_function(key):
    return "Not the dictionary item you expected"
d.get = my_function
&lt;/pre&gt;

and the FrozenDict instance would no longer be frozen.

&lt;p&gt;
This scheme has some nice properties.  As with lambda-style object definitions in E, encapsulation is enforced statically.  No type checking is required; it's just a syntactic check.  No run-time checks need to be added.

&lt;p&gt;
Furthermore, instance objects do not need to take any special steps to defend themselves; they are encapsulated by default.  We don't need to wrap all objects to hide their private attributes (which is the approach that some attempts at a safer Python have taken).  Class definitions do not need to inherit from some special base class.  This means that TCB objects can be written in normal Python and passed into CapPython safely; they are defended by default from CapPython code.

&lt;p&gt;
However, class objects are not encapsulated by default.  A class object has at least two roles: it acts as a constructor function, and it can be used to derive new classes.  The new classes can access their instance objects' private attributes (which are really "protected" attributes in Java terminology - one reason why PEP 0008 does not use the word "private").  So you might want to make a class "final", as in not inheritable.  One way to do that is to wrap the class so that the constructor is available, but the class itself is not:

&lt;pre&gt;
class FrozenDict(object):
    ...
def make_frozen_dict(*args):
    return FrozenDict(*args)
&lt;/pre&gt;

The function &lt;code&gt;make_frozen_dict&lt;/code&gt; is what you would export to other modules, while &lt;code&gt;FrozenDict&lt;/code&gt; would be closely-held.

&lt;p&gt;
Maybe this wrapping should be done by default so that the class is encapsulated by default, but it's not yet clear how best to do so, or how the default would be overridden.

&lt;p&gt;
I have started writing a static verifier for CapPython.  The code is &lt;a href="https://launchpad.net/cappython"&gt;on Launchpad&lt;/a&gt;.  It is not yet complete.  It does not yet block access to Python's builtin functions such as &lt;code&gt;open&lt;/code&gt;, and it does not yet deal with Python's module system.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-8784272214286759372?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/8784272214286759372/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=8784272214286759372' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8784272214286759372'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8784272214286759372'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/08/introducing-cappython.html' title='Introducing CapPython'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3888382143307542639.post-8144945872399458155</id><published>2008-08-05T14:45:00.000+01:00</published><updated>2008-08-05T19:23:35.133+01:00</updated><title type='text'>Four Python variable binding oddities</title><content type='html'>Python has some strange variable binding semantics.  Here are some examples.

&lt;p&gt;
&lt;strong&gt;Oddity 1:&lt;/strong&gt; If Python were a normal lambda language, you would expect the expression &lt;code&gt;x&lt;/code&gt; to be equivalent to &lt;code&gt;(lambda: x)()&lt;/code&gt;.  I mean &lt;code&gt;x&lt;/code&gt; to be a variable name here, but you would expect the equivalence to hold if &lt;code&gt;x&lt;/code&gt; were any expression.  However, there is one context in which the two are not equivalent: class scope.
&lt;pre&gt;x = 1
class C:
    x = 2
    print x
    print (lambda: x)()&lt;/pre&gt;Expected output:
&lt;pre&gt;2
2&lt;/pre&gt;Actual output:
&lt;pre&gt;2
1&lt;/pre&gt;

There is a fairly good reason for this.

&lt;p&gt;
&lt;strong&gt;Oddity 2:&lt;/strong&gt; This is also about class scope.  If you're familiar with Python's list comprehensions and generator expressions, you might expect list comprehensions to be just a special case of generators that evaluates the sequence up-front.

&lt;pre&gt;
x = 1
class C:
    x = 2
    print [x for y in (1,2)]
    print list(x for y in (1,2))
&lt;/pre&gt;

Expected output:
&lt;pre&gt;
[2, 2]
[2, 2]
&lt;/pre&gt;

Actual output:
&lt;pre&gt;
[2, 2]
[1, 1]
&lt;/pre&gt;

This happens for a mixture of good reasons and bad reasons.  List comprehensions and generators have different variable binding rules.  Class scopes are somewhat odd, but they are at least consistent in their oddness.  If list comprehensions and generators are brought into line with each other, you would actually expect to get this output:
&lt;pre&gt;
[1, 1]
[1, 1]
&lt;/pre&gt;

Otherwise class scopes would not behave as consistently.


&lt;p&gt;
&lt;strong&gt;Oddity 3:&lt;/strong&gt;

&lt;pre&gt;
x = "top"
print (lambda: (["a" for x in (1,2)], x))()
print (lambda: (list("a" for x in (1,2)), x))()
&lt;/pre&gt;

Expected output might be:
&lt;pre&gt;
(['a', 'a'], 'top')
(['a', 'a'], 'top')
&lt;/pre&gt;

Or if you're aware of list comprehension oddness, you might expect it to be:

&lt;pre&gt;
(['a', 'a'], 2)
(['a', 'a'], 2)
&lt;/pre&gt;
(assuming this particular ordering of the "print" statements)

But it's actually:
&lt;pre&gt;
(['a', 'a'], 2)
(['a', 'a'], 'top')
&lt;/pre&gt;

If you thought that you can't assign to a variable in an expression in Python, you'd be wrong.  This expression:
&lt;pre&gt;
[1 for x in [100]]
&lt;/pre&gt;
is equivalent to this statement:
&lt;pre&gt;
x = 100
&lt;/pre&gt;

&lt;strong&gt;Oddity 4:&lt;/strong&gt; Back to class scopes again.

&lt;pre&gt;
x = "xtop"
y = "ytop"
def func():
    x = "xlocal"
    y = "ylocal"
    class C:
        print x
        print y
        y = 1
func()
&lt;/pre&gt;

Naively you might expect it to print this:
&lt;pre&gt;
xlocal
ylocal
&lt;/pre&gt;
If you know a bit more you might expect it to print something like this:
&lt;pre&gt;
xlocal
Traceback ... UnboundLocalError: local variable 'y' referenced before assignment
(or a NameError instead of an UnboundLocalError)
&lt;/pre&gt;
Actually it prints this:
&lt;pre&gt;
xlocal
ytop
&lt;/pre&gt;

I think this is the worst oddity, because I can't see a good use for it.

For comparison, if you replace "class C" with a function scope, as follows:

&lt;pre&gt;
x = "xtop"
y = "ytop"
def func():
    x = "xlocal"
    y = "ylocal"
    def g():
        print x
        print y
        y = 1
    g()
func()
&lt;/pre&gt;
then you get:
&lt;pre&gt;
xlocal
Traceback ... UnboundLocalError: local variable 'y' referenced before assignment
&lt;/pre&gt;

I find that more reasonable.


&lt;p&gt;
&lt;strong&gt;Why bother?&lt;/strong&gt;
These issues become important if you want to write a verifier for an object-capability subset of Python.  Consider an expression like this:

&lt;pre&gt;
(lambda: ([open for open in (1,2)], open))()
&lt;/pre&gt;

It could be completely harmless, or it might be so dangerous that it could give the program that contains it the ability to read or write any of your files.  You'd like to be able to tell.  This particular expression is harmless.  Or at least it is harmless until a new release of Python changes the semantics of list comprehensions...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3888382143307542639-8144945872399458155?l=lackingrhoticity.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://lackingrhoticity.blogspot.com/feeds/8144945872399458155/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3888382143307542639&amp;postID=8144945872399458155' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8144945872399458155'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3888382143307542639/posts/default/8144945872399458155'/><link rel='alternate' type='text/html' href='http://lackingrhoticity.blogspot.com/2008/08/4-python-variable-binding-oddities.html' title='Four Python variable binding oddities'/><author><name>Mark Seaborn</name><uri>http://www.blogger.com/profile/08046205947658697263</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry></feed>
