Sunday 24 May 2015

Passing FDs/handles between processes on Unix and Windows -- a comparison

Handles on Windows are analogous to file descriptors (FDs) on Unix, and both can be passed between processes. However, the way in which handles/FDs can be passed between processes is quite different on Unix and Windows.

In this blog post I'll explain the difference. You might find this useful if you are familiar with systems programming on either Unix or Windows but not both.

Similarities

I'll first explain what's the same on Unix and Windows. Both OSes have a distinction between FD/handle numbers and FD/handle objects.

On both Windows and Unix, each process has its own FD/handle table which maps from FD/handle numbers to FD/handle objects:

  • FD numbers are indexes into the FD table. On Unix, an FD number is an int.

    Windows uses the HANDLE type for handle numbers. Though HANDLE is typedef'd to void *, a HANDLE is really just a 32-bit index. (Windows does not use a HANDLE as a pointer into the process's address space.)

  • FD objects are what FD numbers map to. User code never gets to see FD objects directly: it can only manipulate them via FD numbers. Multiple FD numbers can map to the same FD object. FD objects are (mostly) reference counted.

    All of this applies to handles on Windows too.

Note: Some people use the alternative terminology that a "file description" refers to the underlying object while "file descriptor" refers to the number. I prefer to add "number" or "object" as a suffix as the way to disambiguate -- it is more explicit, and the term "file description" is not often used.

Differences

The key difference between Unix and Windows is this:

On Unix, FD objects can be sent via sockets in messages. On Windows, handle objects cannot be sent in messages; only handle numbers can.

Windows fills this gap by allowing one process to read or modify another process's handle table synchronously using the DuplicateHandle() API. Using this API involves one process dealing with another process's handle numbers.

In contrast, Unix has no equivalent to DuplicateHandle(). A Unix process's FD table is private to the process. Consequently, on Unix it is much rarer for a process to have dealings with another process's FD numbers.

On Windows, to send a handle to another process, the sender will generally call two system calls:

  • Firstly, the sender must call DuplicateHandle() to copy the handle to the destination process. This requires the sender to have a process handle for the destination process. DuplicateHandle() will return a handle number indexing into the destination process's handle table.

  • Secondly, the sender must communicate the handle number to the destination process, e.g. by sending a message containing the number via a pipe using WriteFile().

On Unix, to send a handle to another process, the sender will call just one system call, sendmsg(), for sending a message via a Unix domain socket:

  • The sender's call to sendmsg() specifies an FD number to send via the cmsg/SCM_RIGHTS interface. The kernel translates this to an FD object, and copies the reference to this object into the socket's in-kernel message buffer.

  • The receiver can call recvmsg() to receive the message. recvmsg() will remove the FD object from the socket's message buffer and add the FD object to the calling process's FD table, allocating a new FD number within that table.

Implications

  • Cycles and GC: On Unix, it is possible to create a reference cycle from a socket to itself, via the socket's own message buffer. This means it is possible that a socket FD object is only referenced by its own message buffer. Reference counting alone wouldn't be enough to free an unreachable socket such as that. Linux therefore has an in-kernel garbage collector (GC) that handles freeing unused socket FD objects when there are reference cycles.

    In contrast, Windows doesn't need a GC for handling cycles between handles.

  • Process handles: Windows has a notion of process handles, which are used by the DuplicateHandle() API. Unix does not have an equivalent concept of process FDs as standard, although FreeBSD has process FDs (added as part of the Capsicum API).

  • Sandboxing: It is easier to handle sandboxed processes with the Unix model than with the Windows model.

    On Unix, it is easy for mutually distrusting processes to exchange FDs, because the kernel handles translating FD numbers between processes.

    On Windows, if two processes want to exchange handles, one will generally have a process handle for the other process, which gives the first process control over the latter process. An alternative is to use a broker process for exchanging handles. Chrome's Windows sandbox implements this for exchanging handles between sandboxed processes (see BrokerDuplicateHandle() in content/common/sandbox_win.cc).

  • Namespace hazard: Windows' handle-passing model creates a potential hazard: If DuplicateHandle() gives you a HANDLE that's an index into another process's handle table, you must be careful not to treat it as an index into your process's handle table. For example, don't call CloseHandle() on it, otherwise you might close an unrelated handle.

    It is much rarer for this hazard to arise on Unix.

  • Unix emulation: The difference between the FD/handle-passing models makes it tricky to emulate Unix domain sockets on Windows.

    An accurate emulation would need to implement the semantics of how an FD object can be stored in the message buffer of a socket that might be read from concurrently by multiple processes.

    Cygwin implements an emulation of Unix domain sockets on Windows, but it currently does not implement FD-passing.

(Updated on 2015/07/03 [previous version]: Corrected the post to say that Cygwin does not implement FD-passing.)

3 comments:

Anonymous said...

cygwin emulates unix local sockets using loopback tcp/ip socket connections (!), apparently with a small protocol on top with a random secret key for some basic level of security. Yes, this completely negates any performance and some security advantages and useful properties of unix local sockets over tcp/ip sockets on cygwin, but cygwin is basically there to facilitate running unix/linux/posix code on windows at all, not ultimate performance, I suppose. Note in particular actually passing descriptors over descriptors (a very useful and increasingly popular trick on modern unix/linux as it can amount to a simple and strong Capability-based Security model) apparently doesn't currently work!

AF_UNIX (AF_LOCAL) sockets are not available in Winsock. They are implemented in Cygwin by using local AF_INET sockets instead. This is completely transparent to the application. Cygwin's implementation also supports the getpeereid BSD extension. However, Cygwin does not yet support descriptor passing.

Anonymous said...

I needed to pass file descriptors on Linux and found a lot of questions about how to do this but few useful examples. So, I've put a copy of my code on github: https://github.com/vomlehn/hord-priv. The code includes a test program that uses pipes to communicate between two processes.

Mark Seaborn said...

@Anonymous #1: Thanks for pointing that out about Cygwin. I have corrected the post.