Mocks as a Design Tool

Many people see mocks as a necessary evil for isolating their test code from third-party dependencies and the outside world (the database, network, filesystem, etc.). But in the paper “Mock Roles, Not Objects”, some of the first people to write about mocks describe them as a TDD tool for discovering good interactions between your objects (i.e. designing good types). Mocks are much more powerful, and their costs more reasonable, when they are used as a design tool and not just as a convenience for isolating your tests.

Note: “Mock” is a loaded word often used to describe any type of test double, but this article will be speaking about mocks in the strict sense. If you don’t know what that means, first read The Little Mocker by Uncle Bob. It’s the best explanation I’ve seen of the different types of test doubles. Further note: all of this will also apply to some implementations of spies.

To understand how mock objects can be used as a design tool, it helps to think about object-oriented programming as being all about messaging. In OOP, we don’t just have procedures that we can call; we have objects that we can ask questions of or give commands to. Those questions and commands are messages that we send to the object. When you write my_model.save(), try thinking of it as telling the my_model object to save itself.

So if OOP is about messages, what are mock objects used for? Verifying messages! You should use a mock when you are testing something that interacts with another object, and you want to verify that you have told that other object to do something – i.e. assert that you sent it a particular message. And when you are writing your test first, you literally get to make up what that message looks like.

This is how mocks are used as a design tool in TDD. You work outside-in: start at a high level, and delegate details to lower levels. Mock those lower levels because right now you only care about telling them what to do. You’ll worry about how they do it later, when you’re ready to test that level.
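Here is a minimal sketch of that workflow. The register_user function, the mailer collaborator, and its send_welcome message are all names I made up for this illustration. Nothing implements mailer yet, so the test is where the message gets designed.

```python
import unittest
from unittest.mock import Mock


def register_user(email, mailer):
    """High-level code: decide what to tell the collaborator, not how it works."""
    # Hypothetical message, designed from the sender's point of view.
    mailer.send_welcome(email)


class RegisterUserTest(unittest.TestCase):
    def test_it_tells_the_mailer_to_welcome_the_new_user(self):
        # The mailer has no implementation yet; the mock stands in for it.
        mailer = Mock()

        register_user("ada@example.com", mailer)

        # Verify the message that was sent, not how it will be handled.
        mailer.send_welcome.assert_called_once_with("ada@example.com")
```

Once this passes, the signature of send_welcome is settled, and whatever eventually implements the mailer can be test-driven on its own.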

In other words, you design your messages from the perspective of the message sender, the perspective that cares most about what you want that object to do, and least about how that object does it. This leads to messages that are simple and communicate well. And that leads to an object API that is simple and communicates well.

When done right, it feels like cheating. Your high-level tests almost feel like they aren’t testing anything. That’s good. These high-level tests aren’t about verifying algorithms or reducing bugs. They are about designing your messages, as part of a TDD process for designing code that is easy to understand and maintain. This high-level code is easy to test because it’s easy to understand. It also leads to low-level code that is easy to test and understand, because you’ve shaken out all the object collaboration in the higher levels, leaving simple procedures that can be tested without mocks.

But you only get these design benefits if you own the API of the object you’re mocking. You may have heard that you should not mock what you don’t own. Some libraries even strictly enforce this rule. But what does that mean? Why is it important?

When you “mock something you don’t own”, like a third-party dependency or something in the standard library, you can’t let your tests help you decide what the messages should be, because those choices have already been made. So if you only use mocks in this way, all you get is isolation, which should be a side effect of mocking, and none of the design benefits. And that leads to pain, because mocks have high costs. They give you plenty of rope to hang yourself with: increased coupling between test and implementation, potential for “false positives”, and increased setup costs. Many people don’t like mocks for these reasons, and if you aren’t using them primarily to design messages, I agree: they aren’t worth it.

So how do you mitigate those costs? What exactly should you do when you have an external dependency? What does this all look like in practice? I’m still writing about those topics and more, and planning to release it as a series about mocking and TDD. If you’d like to be emailed when it is complete, subscribe to my newsletter. In the meantime, try using mocks to design the interactions between your objects. Used in this way, they can become a powerful part of your TDD tool belt.

tdubs: better test doubles for python

A couple of things have been bothering me about Python’s unittest.mock:

Problem 1: Stubs aren’t Mocks

Here’s a function (that is stupid and dumb because this is an example):

def get_next_page(repo, current_page):
    return repo.get_page(current_page + 1)

If I want to test this with unittest.mock, it would look like this:

def test_it_gets_next_page_from_repo(self):
    repo = Mock()
    next_page = get_next_page(repo, current_page=1)
    self.assertEqual(next_page, repo.get_page.return_value)
    repo.get_page.assert_called_with(2)

What bothers me is that I’m forced to use a mock when what I really want is a stub. What’s the difference? A stub is a test double that provides canned responses to calls. A mock is a test double that can verify what calls are made.

Look at the implementation of get_next_page. To test this, all I really need is a canned response to repo.get_page(2). But with unittest.mock, I can only give a canned response for any call to repo.get_page. That’s why I need the last line of my test to verify that I called the method with a 2. It’s that last line that bothers me.
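A quick sketch of what I mean, using nothing but unittest.mock. The page numbers are arbitrary:

```python
from unittest.mock import Mock

repo = Mock()

# The same canned object comes back no matter which arguments are passed...
assert repo.get_page(2) is repo.get_page.return_value
assert repo.get_page(99) is repo.get_page.return_value

# ...so only an explicit call assertion can tell page 2 from page 99.
repo.get_page.assert_called_with(99)
```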

If I’m writing tests that explicitly assert that specific calls were made, I prefer those to be verifying commands, not queries. For example, imagine I have some code that looks like this:

# ...
article.publish()
# ...

with tests like this:

def test_it_publishes_the_article(self):
    article.publish.assert_called_once_with()

Now the assertion in my test feels right. I’m telling the article to publish, so my test verifies that I sent the publish message to the article. My tests are verifying that I sent a command: I triggered some behavior that’s implemented elsewhere. Feels good. But wait…

Problem 2: Public API conflicts

Here’s the other problem. Imagine I had a typo in my test:

def test_it_publishes_the_article(self):
    article.publish.assertt_called_once_with()

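Notice the extra “t” in “assert”? I hope so, because this test can pass even if article.publish is never called. That’s because any attribute accessed on a unittest.mock.Mock instance, misspelled or not, is itself another Mock instance that can be called. (Since Python 3.5, Mock raises an AttributeError for attribute names beginning with “assert” unless you pass unsafe=True, which catches this exact typo but not the underlying conflict.)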

The problem here is that Python’s mocks have their own public API, but they are supposed to be stand-ins for other objects that themselves have a public API. This causes conflicts. Have you ever tried to mock an object that has a name attribute? Then you’ve felt this pain (passing name as a Mock kwarg doesn’t stub a name attribute like you think it would; instead it names the mock).
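The name conflict is easy to reproduce. This sketch uses only unittest.mock; the "alice" value is just a placeholder:

```python
from unittest.mock import Mock

# Passing name= to the constructor names the mock itself (it shows up in the repr)...
user = Mock(name="alice")
assert isinstance(user.name, Mock)  # ...so user.name is another Mock, not "alice".

# To stub a real `name` attribute, set it after the mock has been created.
user = Mock()
user.name = "alice"
assert user.name == "alice"
```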

Doesn’t autospec fix this problem?

autospec is an annoying bandage over this problem. It doesn’t fit into my normal TDD flow where I use the tests to tease out a collaborator’s public API before actually writing it.
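To make that friction concrete, here is a small sketch using create_autospec from unittest.mock. The Article class and its publish and archive methods are hypothetical names for illustration:

```python
from unittest.mock import create_autospec


class Article:
    # autospec copies this class's API, so the class has to be written first.
    def publish(self):
        raise NotImplementedError


article = create_autospec(Article, instance=True)
article.publish()
article.publish.assert_called_once_with()

# Messages the real class does not define are rejected...
try:
    article.archive()
except AttributeError:
    rejected = True
else:
    rejected = False

assert rejected  # ...including ones I am still designing in the test.
```

The safety is real, but it only works once the collaborator exists, which is the opposite of the order I work in.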

Solution: tdubs

I decided to write my own test double library to fix these problems, and I am very happy with the result. I called it tdubs. See the README for installation and usage instructions. In this post I’m only going to explain the parts that solve the problems I described above.

In tdubs, stubs and mocks are explicit. If you want to give canned responses to queries, use a Stub. If you want to verify commands, use a Mock. (If you want to do both, rethink your design, though it’s technically possible with a Mock.)

A Stub can provide responses that are specific to the arguments passed in. This lets you create true stubs. In the example above, using tdubs I could have stubbed my repo like this:

repo = Stub('repo')
calling(repo.get_page).passing(2).returns(next_page)

and I would not need to verify my call to repo.get_page, because I would only get my expected next page object if I pass 2 to the method.
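To show the idea without relying on any library, here is a hand-rolled stub with argument-specific responses. This is not how tdubs is implemented, and PageRepoStub is a name invented for this sketch:

```python
class PageRepoStub:
    """Hand-rolled stub: canned responses keyed by the argument passed in."""

    def __init__(self, pages):
        self._pages = pages

    def get_page(self, number):
        # An unexpected argument raises KeyError instead of returning a canned value.
        return self._pages[number]


def get_next_page(repo, current_page):
    return repo.get_page(current_page + 1)


next_page = object()
repo = PageRepoStub({2: next_page})

# No call verification needed: only get_page(2) can produce next_page.
assert get_next_page(repo, current_page=1) is next_page
```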

With tdubs, there’s no chance of false positives due to typos or API conflicts, because tdubs doubles have no public attributes. For example, you don’t verify commands directly on a tdubs Mock, you use a verification object:

verify(my_mock).called_with(123)

After hammering out the initial implementation to solve these specific problems, I ended up really liking the way my tests read and the type of TDD flow that tdubs enabled. I’ve been using it for my own projects since then and I think it’s ready to be used by others. So if you’re interested, visit the README and try it out. I’d love some feedback.

Python’s patch decorator is a code smell

I’m a big fan of using mocks as a testing/design tool. But if I find myself reaching for patch instead of Mock in Python, I usually stop and rethink my design.

I consider the use of patch in tests to be a code smell. It means the test code is not using my internal API. It’s reaching into the private implementation details of my object.

For example, I recently needed a helper function for creating users on a third-party service with a set of default values. I could have written it like this:

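# users/helpers.py (the module path that the patch target in the test refers to)
from services import UserService
from settings import SERVICE_CONF


def create_user_with_defaults(**attributes):
    defaults = {"name": "test"}
    defaults.update(attributes)

    # The dependency is hard-coded: the helper builds its own service.
    service = UserService(**SERVICE_CONF)
    return service.create_user(**defaults)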

This would get the job done. And because this is python, I can test it without hitting real services using @patch:

@patch("users.helpers.UserService")
def test_creates_user_with_defaults_on_user_service(self, MockUserService):
    user_service = MockUserService.return_value

    # execution:
    user = create_user_with_defaults()

    # verification:
    user_service.create_user.assert_called_once_with(name="test")
    self.assertEqual(user, user_service.create_user.return_value)

But look at the verification step: there is nothing in the execution step about user_service, yet that’s what I’m asserting against. My tests have knowledge about private implementation details of the thing they’re testing. That’s bad news.

I prefer my tests to be normal consumers of my internal APIs. This forces me to keep my APIs easy to use and flexible. @patch lets me get around issues like tight coupling by hijacking my hard-coded dependencies.

Here is how I actually implemented the helper function:

def create_user_with_defaults(service, **attributes):
    defaults = {"name": "test"}
    defaults.update(attributes)
    return service.create_user(**defaults)

I didn’t even need to import anything! This is how I would test it:

def test_creates_user_with_defaults_on_user_service(self):
    user_service = Mock()

    # execution:
    user = create_user_with_defaults(user_service)

    # verification:
    user_service.create_user.assert_called_once_with(name="test")
    self.assertEqual(user, user_service.create_user.return_value)

Now compare the verification to the execution. Instead of patching the internal workings of the module, I’m explicitly passing in a mock object. I can do this because the function no longer depends on the concrete implementation of the user service; it depends on an abstraction*: some object that must be passed in and that conforms to a certain interface. So it makes sense that my test verifies the interaction with that interface.
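If you want that abstraction spelled out in code, one option is typing.Protocol. This is only a sketch: UserCreator and InMemoryUserService are hypothetical names, and my real helper has no type hints.

```python
from typing import Any, Protocol


class UserCreator(Protocol):
    """Hypothetical name for the interface the helper depends on."""

    def create_user(self, **attributes: Any) -> Any: ...


def create_user_with_defaults(service: UserCreator, **attributes: Any) -> Any:
    defaults = {"name": "test"}
    defaults.update(attributes)
    return service.create_user(**defaults)


class InMemoryUserService:
    """Illustrative stand-in; it satisfies UserCreator without inheriting from it."""

    def __init__(self):
        self.users = []

    def create_user(self, **attributes):
        self.users.append(attributes)
        return attributes


service = InMemoryUserService()
user = create_user_with_defaults(service, email="a@example.com")
assert user == {"name": "test", "email": "a@example.com"}
```

Anything with a matching create_user method will do: a Mock in tests, an in-memory fake, or the real service.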

This means my test is now a normal consumer of my function, and my desire to avoid patch led me to a design that is more flexible. This became clear as soon as I wanted to create some test users in the REPL. I happily created an instance of the UserService that uses the settings for our sandbox, and passed that in to my function.

*See The Dependency Inversion Principle (the D from SOLID).