Wednesday, 17 December 2008

Helper for monkey patching in tests

Following on from my post on TempDirTestCase, here is another Python test case helper which I introduced at work. Our codebase has quite a few test cases which use monkey patching. That is, they temporarily modify a module or a class to replace a function or method for the duration of the test.

For example, you might want to monkey patch time.time so that it returns repeatable timestamps during the test. We have quite a lot of test cases that do something like this:

import time
import unittest

class TestFoo(unittest.TestCase):

    def setUp(self):
        self._old_time = time.time
        def monkey_time():
            return 0
        time.time = monkey_time

    def tearDown(self):
        time.time = self._old_time

    def test_foo(self):
        # body of test case

Having to save and restore the old values gets tedious, particularly if you have to monkey patch several objects (and, unfortunately, there are a few tests that monkey patch a lot). So I introduced a monkey_patch() method so that the code above can be simplified to:

class TestFoo(TestCase):

    def test_foo(self):
        self.monkey_patch(time, "time", lambda: 0)
        # body of test case

(OK, I'm cheating by using a lambda the second time around to make the code look shorter!)

Now, monkey patching is not ideal, and I would prefer not to have to use it. When I write new code I try to make sure that it can be tested without resorting to monkey patching. So, for example, I would parameterize the software under test to take time.time as an argument instead of getting it directly from the time module. (here's an example).
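
As a rough sketch of that parameterized style: the code under test receives a clock function as a constructor argument, defaulting to time.time, so a test can pass in a fake without touching the time module. The Stamper class and its get_time parameter are hypothetical names invented for this illustration, not taken from our codebase.

```python
import time
import unittest


class Stamper(object):
    # Hypothetical class under test: it takes its clock as an argument
    # instead of calling time.time directly.

    def __init__(self, get_time=time.time):
        self._get_time = get_time

    def stamp(self, message):
        # Prefix the message with the current time in whole seconds.
        return "%d %s" % (self._get_time(), message)


class TestStamper(unittest.TestCase):

    def test_stamp(self):
        # Inject a fake clock; the time module itself is left untouched.
        stamper = Stamper(get_time=lambda: 0)
        self.assertEqual(stamper.stamp("hello"), "0 hello")
```

Production code can still write Stamper() and get the real clock, so only the tests need to know about the extra argument.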

But sometimes you have to work with a codebase where most of the code is not covered by tests and is structured in such a way that adding tests is difficult. You could refactor the code to be more testable, but that risks changing its behaviour and breaking it. In that situation, monkey patching can be very useful. Once you have some tests, refactoring can become easier and less risky. It is then easier to refactor to remove the need for monkey patching -- although in practice it can be hard to justify doing that, because it is relatively invasive and might not be a big improvement, and so the monkey patching stays in.

Here's the code, an extended version of the base class from the earlier post:

import os
import shutil
import tempfile
import unittest

class TestCase(unittest.TestCase):

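    def setUp(self):
        # Cleanup callbacks, run in reverse order by tearDown().
        # Subclasses that override setUp() must call this one first.
        self._on_teardown = []

    def make_temp_dir(self):
        temp_dir = tempfile.mkdtemp(prefix="tmp-%s-" % self.__class__.__name__)
        def tear_down():
            shutil.rmtree(temp_dir)
        self._on_teardown.append(tear_down)
        return temp_dir

    def monkey_patch(self, obj, attr, new_value):
        # The attribute must already exist, because its old value is
        # saved here and restored on teardown.
        old_value = getattr(obj, attr)
        def tear_down():
            setattr(obj, attr, old_value)
        self._on_teardown.append(tear_down)
        setattr(obj, attr, new_value)

    def monkey_patch_environ(self, key, value):
        old_value = os.environ.get(key)
        def tear_down():
            if old_value is None:
                # The variable was unset before the test. Use pop() so
                # that this still works if the test already removed it.
                os.environ.pop(key, None)
            else:
                os.environ[key] = old_value
        self._on_teardown.append(tear_down)
        os.environ[key] = value

    def tearDown(self):
        # Undo in reverse order, so that patching the same attribute
        # twice still ends with the original value restored.
        for func in reversed(self._on_teardown):
            func()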
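
To see the helper in action, here is a small self-contained sketch. It repeats only the monkey_patch() part of the base class above so that it runs on its own; the TestFrozenClock class and the check at the end are made up for illustration.

```python
import time
import unittest


class TestCase(unittest.TestCase):
    # Trimmed copy of the base class above: monkey_patch() only.

    def setUp(self):
        self._on_teardown = []

    def monkey_patch(self, obj, attr, new_value):
        old_value = getattr(obj, attr)
        def tear_down():
            setattr(obj, attr, old_value)
        self._on_teardown.append(tear_down)
        setattr(obj, attr, new_value)

    def tearDown(self):
        for func in reversed(self._on_teardown):
            func()


class TestFrozenClock(TestCase):

    def test_time_is_patched(self):
        # Inside the test, time.time() returns the fake value.
        self.monkey_patch(time, "time", lambda: 0)
        self.assertEqual(time.time(), 0)


real_time = time.time
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFrozenClock)
result = unittest.TextTestRunner().run(suite)
# After the run, tearDown() has put the original function back.
restored = time.time is real_time
```

Because the callbacks run in reverse order, patching the same attribute twice in one test still ends with the original value in place.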

6 comments:

Zooko said...

Here is a similar tool that I use:

pyutil.fileutil.NamedTemporaryDirectory

Zooko said...

Oh wait, I was thinking of your earlier post about temporary directories. *Here* is a tool that I use that has to do with timestamps:

pyutil.repeatable_random

Mark Seaborn said...

Be careful about using __del__ to delete files like this. Changes in GC behaviour could break your code. If you get the filename from the object and then discard the object, the file could have been deleted when you come to refer to it by filename.

What I'd like to do is replace filenames with file objects - an experiment that would be easy to do with CapPython. open() and os.path.join() would be replaced so that they worked on file objects. Then it would be safe to garbage collect temporary directories.

It would be interesting to see how much code survives such a change.

Zooko said...

Are you warning that the file could already have been deleted before the shutdown() method attempts to delete it? That's okay. Or are you warning that a different file could have taken its place and then shutdown() would delete the wrong file? That shouldn't happen due to the sufficiently large random name.

Zooko said...

P.S. I totally agree about a more "object oriented" file API. There are several alternative file APIs out there for Python, but I haven't looked at them. Have you?

Mark Seaborn said...

I mean that if you do something like this:

dirpath = NamedTemporaryDirectory().name
os.mkdir(os.path.join(dirpath, "subdir"))

Won't the temporary directory have been deleted by __del__ by the time you do mkdir(), when using reference counting? In which case mkdir() will fail.

You can do:

temp = NamedTemporaryDirectory()
os.mkdir(os.path.join(temp.name, "subdir"))

But this is relying on CPython's refcounting behaviour.