Solve tough problems with spikes

Sometimes I’m approaching a problem where I lack some understanding that would let me start with nice little unit tests. So instead, I start with a high-level functional test. Then I start getting the code to work by any means necessary. I do this without writing any more tests. I just cowboy-code my way to a solution that works. Extreme Programming calls this a Spike Solution.

When my functional test is green, I have much more understanding. I’ve been googling, looking up and using new libraries, and usually have a better idea of what a clean solution might look like. This is when I throw my code away.

Well, most of it. I keep the functional test. And if I have any particularly tricky code I might want for reference, I keep it separate from the project code, but available to refer to if I need it. Then I jump down into my unit tests and try to make my functional test green again using a proper test-driven approach.
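
Such a kept functional test can be as small as a single end-to-end assertion. A minimal sketch, assuming a hypothetical build_report function and fixture file:

def test_builds_summary_report_from_csv(self):
  # Hypothetical high-level check kept after a spike: it pins down the
  # observable result without saying anything about how it is computed.
  report = build_report("fixtures/sales.csv")
  self.assertEqual(report.total_sales, 1250)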

It can be very hard to discard working code, but when I think back to every time I’ve lost some writing work unintentionally – an essay, a blog post, a homework assignment – the second draft was always better. I think the same is true for writing code.

Python’s patch decorator is a code smell

I’m a big fan of using mocks as a testing/design tool. But if I find myself reaching for patch instead of Mock in Python, I usually stop and rethink my design.

I consider the use of patch in tests to be a code smell. It means the test code is not using my internal API. It’s reaching into the private implementation details of my object.

For example, I recently needed a helper function for creating users on a third-party service with a set of default values. I could have written it like this:

from services import UserService
from settings import SERVICE_CONF


def create_user_with_defaults(**attributes):
  defaults = { "name": "test" }
  defaults.update(attributes)

  service = UserService(**SERVICE_CONF)
  return service.create_user(**defaults)

This would get the job done. And because this is Python, I can test it without hitting real services using @patch:

@patch("users.helpers.UserService")
def test_creates_user_with_defaults_on_user_service(self, MockUserService):
  user_service = MockUserService.return_value
  
  # execution:
  user = create_user_with_defaults()
  
  # verification:
  user_service.create_user.assert_called_once_with(name="test")
  self.assertEqual(user, user_service.create_user.return_value)

But look at the verification step: there is nothing in the execution step about user_service, yet that’s what I’m asserting against. My tests have knowledge about private implementation details of the thing they’re testing. That’s bad news.

I prefer my tests to be normal consumers of my internal APIs. This forces me to keep my APIs easy to use and flexible. @patch lets me get around issues like tight coupling by hijacking my hard-coded dependencies.

Here is how I actually implemented the helper function:

def create_user_with_defaults(service, **attributes):
  defaults = { "name": "test" }
  defaults.update(attributes)
  return service.create_user(**defaults)

I didn’t even need to import anything! This is how I would test it:

def test_creates_user_with_defaults_on_user_service(self):
  user_service = Mock()
  
  # execution:
  user = create_user_with_defaults(user_service)
  
  # verification:
  user_service.create_user.assert_called_once_with(name="test")
  self.assertEqual(user, user_service.create_user.return_value)

Now compare the verification to the execution. Instead of patching the internal workings of the module, I’m explicitly passing in a mock object. I can do this because the function no longer depends on the concrete implementation of the user service; it depends on an abstraction*: some object, passed in by the caller, that conforms to a certain interface. So it makes sense that my test verifies the interaction with that interface.

This means my test is now a normal consumer of my function, and my desire to avoid patch led me to a design that is more flexible. This became clear as soon as I wanted to create some test users in the REPL. I happily created an instance of the UserService that uses the settings for our sandbox and passed it to my function.
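
A sketch of that REPL session, where SANDBOX_CONF stands in for whatever the sandbox settings are actually called:

>>> from services import UserService
>>> from settings import SANDBOX_CONF
>>> from users.helpers import create_user_with_defaults
>>> service = UserService(**SANDBOX_CONF)
>>> create_user_with_defaults(service, name="sandbox test")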

*See the Dependency Inversion Principle (the D in SOLID).

The Only Three Reasons My Unit Test Should Fail

When I practice TDD, my goal is to design a system with a test suite that encourages good design and enables confident refactoring. One of the ways I do that is by keeping in mind when a test should fail. Ideally, I only want to see test failures for one of these three reasons:

  • The unit’s not built yet
  • The unit’s public interface changes
  • The unit’s behavior changes

I consider these rules. If none of them is true, I must not see a test failure. If any of them is true, I must see a test failure.

Reason 1: The unit’s not built yet

Of course a test for a thing that doesn’t exist will fail. But this is a failure that I must see for each test at least once. It is how I know the test is actually testing what I think it’s testing. It’s surprisingly easy to write a test for something after it’s built, but write it in a way that it can never actually fail (and is therefore useless). This is one of the reasons why I prefer to write my tests first.
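
For example, here is a hypothetical slip in the helper test from the previous section. One misspelled Mock assertion is enough to produce a test that can never fail:

def test_creates_user_with_defaults_on_user_service(self):
  user_service = Mock()
  create_user_with_defaults(user_service)
  # Typo: called_once_with is not an assertion method, so Mock silently
  # auto-creates it and returns another Mock. Nothing is ever checked, and
  # this test stays green no matter what arguments create_user receives.
  user_service.create_user.called_once_with(name="test")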

Reason 2: Public Interface Changes

If I am testing an object’s method, and the way I init the object or call the method changes, my test should fail. This seems obvious, but notice that I only want my tests to fail if my public interface changes. I don’t want my tests to depend on private implementation details. This means my tests are a consumer of my internal API. In other words, my tests interact with my code the same way my production code interacts with it. This is what puts good design pressure on my system and is the driving force behind the idea that “code that’s easy to test is easy to write/read/change”. It also keeps my code easier to refactor, which I’ll get to later.
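
For example, with a hypothetical Mailer class, adding a required constructor argument is a public interface change, and I want existing tests to fail because of it:

class Mailer(object):
  # Adding this required argument changes the public interface: every
  # consumer of Mailer, including the tests, now has to provide an smtp_host.
  def __init__(self, smtp_host):
    self.smtp_host = smtp_host

  def send(self, to, body):
    ...

A test that still constructs Mailer() with no arguments fails with a TypeError, which is exactly the failure I want to see.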

Reason 3: Behavior Changes

If the behavior of my unit changes, my test should fail. This also seems obvious, but again, I want to dig into what this means for my system.

For example, consider this unit:

class Greeter(object):
  def greet(self, name):
    return "Hello %s" % name

My test might look something like this:

def test_greet_says_hello_to_name(self):
  greeter = Greeter()
  self.assertEqual(greeter.greet("Justin"), "Hello Justin")

If greet changes to write to a display instead of returning a string, this test will fail. That’s good because this is a change in behavior. That change will require a change to any consumer of the unit, and my test is a consumer of that unit just like the production code is.

Confident Refactoring

I use these rules to keep good design pressure on my system and keep things easy to refactor. For example, let’s say we do want to write to a display in that Greeter class. One implementation might look like this:

import sys


class Greeter(object):
  def greet(self, name):
    sys.stdout.write("Hello %s" % name)

I often see code like this tested with a mock in this way:

@patch('greeter.sys.stdout')
def test_greet_says_hello_to_name(self, stdout):
  greeter = Greeter()
  greeter.greet("Justin")
  stdout.write.assert_called_once_with("Hello Justin")

But this breaks one of my rules: this test is coupled to a private implementation detail of my unit. There’s nothing in my public interface that conveys my usage of sys.stdout, yet my test knows to mock exactly that module. If that implementation detail changes, my test will fail, even though my public interface and my behavior have not changed.

This makes my code hard to refactor. When I’m refactoring, I only want to look at the implementation. I want to be free to change private details without fear of breaking tests (and therefore breaking production code) as long as I’m not changing the behavior or the public interface. That’s not true here. If I change my usage of sys.stdout to something else, I will get a surprising test failure.
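
For example, swapping sys.stdout.write for a direct write to the stdout file descriptor still prints the same greeting and keeps the same public interface, yet the patched test above fails because the mocked write is never called. A hypothetical sketch of that refactoring:

import os


class Greeter(object):
  def greet(self, name):
    # Same observable behavior, same public interface, different private
    # implementation detail. The test that patches greeter.sys.stdout now
    # fails because sys.stdout.write is never called.
    os.write(1, ("Hello %s" % name).encode())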

What does this mean? I can’t think of a good way to test this without mocking, but I don’t want to use a mock that breaks a rule, so in this case, I think my test is telling me my design needs improvement. I would choose to fix this by promoting the display object to the public interface. That lets me inject a mock in my test without breaking a rule:

class Greeter(object):
  def __init__(self, display):
    self.display = display

  def greet(self, name):
    self.display.write("Hello %s" % name)

Now my test can still mock the display, but it does so by being a normal consumer of the object, just the same as any production code:

def test_greet_says_hello_to_name(self):
  display = Mock()
  greeter = Greeter(display)
  greeter.greet("Justin")
  display.write.assert_called_once_with("Hello Justin")

Now it is clear that if I change how I print to a display, I will need to update any consumer of the greeter object (including the tests), because the display object is part of the public interface. The rules have led me to a better design that uses the Dependency Inversion Principle.

Only letting my tests use the public interface of my units has given those units more flexibility outside of my tests, and it has made clear to me what is and is not a safe refactoring.