tdubs: better test doubles for python

A couple things have been bothering me about python’s unittest.mock:

Problem 1: Stubs aren’t Mocks

Here’s a function (that is stupid and dumb because this is an example):

def get_next_page(repo, current_page):
    return repo.get_page(current_page + 1)

If I want to test this with unittest.mock, it would look like this:

def test_it_gets_next_page_from_repo(self):
    repo = Mock()
    next_page = get_next_page(repo, current_page=1)
    self.assertEqual(next_page, repo.get_page.return_value)
    repo.get_page.assert_called_with(2)

What bothers me is that I’m forced to use a mock when what I really want is a stub. What’s the difference? A stub is a test double that provides canned responses to calls. A mock is a test double that can verify what calls are made.

Look at the implementation of get_next_page. To test it, all I really need is a canned response to repo.get_page(2). But with unittest.mock, the canned response applies to every call to repo.get_page, no matter what arguments are passed. That’s why I need the last line of my test, the one verifying that I called the method with a 2. It’s that last line that bothers me.
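To see the problem concretely, note that a Mock returns the same object no matter what arguments you pass (a quick illustration using only unittest.mock):

from unittest.mock import Mock

repo = Mock()
# Both calls return the same child Mock (repo.get_page.return_value),
# so the canned response can't be tied to a specific argument.
assert repo.get_page(2) is repo.get_page(99)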

If I’m writing tests that explicitly assert that specific calls were made, I prefer those to be verifying commands, not queries. For example, imagine I have some code that looks like this:

# ...
article.publish()
# ...

with tests like this:

def test_it_publishes_the_article(self):
    article.publish.assert_called_once_with()

Now the assertion in my test feels right. I’m telling the article to publish, so my test verifies that I sent the publish message to the article. My tests verify that I sent a command: I triggered some behavior that’s implemented elsewhere. Feels good. But wait…

Problem 2: Public API conflicts

Here’s the other problem. Imagine I had a typo in my test:

def test_it_publishes_the_article(self):
    article.publish.assertt_called_once_with()

Notice the extra “t” in “assert”? I hope so, because this test will pass even if article.publish is never called. That’s because any attribute you access on a unittest.mock.Mock instance is itself another Mock, so calling the misspelled method silently succeeds.

The problem here is that python’s mocks have their own public API, but they are supposed to be stand-ins for other objects that have public APIs of their own. This causes conflicts. Have you ever tried to mock an object that has a name attribute? Then you’ve felt this pain (passing name as a Mock kwarg doesn’t stub a name attribute like you’d think; instead it names the mock).
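Here’s that name collision in action, along with the workaround the unittest.mock docs recommend:

from unittest.mock import Mock

user = Mock(name="Alice")
print(user.name)    # <Mock name='Alice.name' id=...> -- it named the mock instead

# To actually stub a name attribute, you have to set it after construction:
user = Mock()
user.configure_mock(name="Alice")
assert user.name == "Alice"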

Doesn’t autospec fix this problem?

autospec is an annoying bandage over this problem. It doesn’t fit into my normal TDD flow where I use the tests to tease out a collaborator’s public API before actually writing it.
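For reference, here’s roughly what autospec buys you. Note that it needs the real class to exist up front, which is exactly what clashes with a test-first flow:

from unittest.mock import create_autospec

class Article:          # the collaborator must already be written
    def publish(self):
        ...

article = create_autospec(Article, instance=True)
article.publish()
article.publish.assert_called_once_with()   # a real assertion; it passes

try:
    article.publiish()                      # typos are rejected
except AttributeError:
    print("typo caught: not on the spec")

try:
    article.publish("extra")                # signatures are checked too
except TypeError:
    print("bad signature caught")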

Solution: tdubs

I decided to write my own test double library to fix these problems, and I am very happy with the result. I called it tdubs. See the README for installation and usage instructions. In this post I’m only going to explain the parts that solve the problems I described above.

In tdubs, stubs and mocks are explicit. If you want to give canned responses to queries, use a Stub. If you want to verify commands, use a Mock. (Want to do both? Rethink your design, though it’s technically possible with a Mock.)

A Stub can provide responses that are specific to the arguments passed in. This lets you create true stubs. In the example above, using tdubs I could have stubbed my repo like this:

repo = Stub('repo')
calling(repo.get_page).passing(2).returns(next_page)

and I would not need to verify my call to repo.get_page, because I would only get my expected next page object if I pass 2 to the method.
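Put together, the original test could look something like this (a sketch that assumes Stub and calling are importable from tdubs, per the README):

from tdubs import Stub, calling

def test_it_gets_next_page_from_repo(self):
    next_page = object()    # any sentinel object will do
    repo = Stub('repo')
    calling(repo.get_page).passing(2).returns(next_page)

    # No assert_called_with needed: the stub only returns next_page
    # when get_page is called with exactly 2.
    self.assertEqual(get_next_page(repo, current_page=1), next_page)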

With tdubs, there’s no chance of false positives due to typos or API conflicts, because tdubs doubles have no public attributes. For example, you don’t verify commands directly on a tdubs Mock; you use a verification object:

verify(my_mock).called_with(123)
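So the earlier publish test might become something like this (a sketch assuming attribute access on a tdubs Mock yields child mocks, as the Stub example above suggests; publish_article stands in for whatever code sends the command):

from tdubs import Mock, verify

def test_it_publishes_the_article(self):
    article = Mock('article')
    publish_article(article)
    verify(article.publish).called_with()

Since the verification object, not the double, owns that API, a typo in called_with blows up instead of silently passing.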

After hammering out the initial implementation to solve these specific problems, I ended up really liking the way my tests read and the type of TDD flow that tdubs enabled. I’ve been using it for my own projects since then and I think it’s ready to be used by others. So if you’re interested, visit the README and try it out. I’d love some feedback.

How to Practice

In my previous two blog posts (here and here) I mentioned practicing. The posts cover the reasons why we should practice, but they don’t mention how to practice. Here are some resources for that:

Google “code katas.” These are sample programming problems that you can solve over and over in any language. They aren’t about solving hard problems; they’re about practicing and forming good habits.

Similarly, google koans for your language of choice. These may feel like beginner tutorials, but they are about practicing until the nitty-gritty details of your language are committed to muscle memory.

exercism.io offers guided programming problems with pre-written tests for many languages.

The Pragmatic Programmer recommends learning a new language every year. This is great practice even if you don’t get to use those languages in your day job. Try to pick languages in a different paradigm than what you’re used to. This will often help you see better solutions to problems in any language. You can use the previous resources to learn these new languages.

For meditation, google for a mindfulness mobile app so you can have it anywhere and get a little reminder every time you look at your phone. I use The Mindfulness App on iOS. I like it because it offers guided meditations of different lengths, as well as meditation timers I can set to any length I want. That’s great for starting out with 1 or 2 minute meditations.

There are many ways to practice; these are just a few suggestions. However you do it, remember the goal is to train so that when the pressure is on, you default to good code instead of bad.

Meditation as Training

When we train for something, it’s to form a habit. We want some behavior to become instinct. We want to do it without having to decide to do it.

Meditation is practicing focusing your mind on one thing and nothing else. Usually it’s your breathing. It’s training to bring yourself back from distractions, to be present and mindful of the actual surroundings and not the surroundings as you wish they existed. To put your full attention on what you’re doing right now instead of what you want to be doing even ten seconds from now.

That doesn’t mean you never think of the future. It means you’ve trained so that when it’s most important to be in the moment – which is when you’re most likely to not want to be in the moment – your instincts will kick in and you will handle it.

For example, say you need to get some feature out ASAP. Here are two scenarios for doing that:

Scenario 1) You hammer out a solution. You’re thinking about code, sure, but only enough to get the characters into your editor. Your mind is really on the goal: shipping this feature. You run the code and… it doesn’t quite work. Oh, you just need this quick fix. Still doesn’t work. Quicker fix. Still doesn’t work. Agh! Because the goal is just barely out of reach and it feels like it’s moving away from you, you go faster. Quicker and quicker fixes. It finally works! The code is garbage, but it works, and it took way longer than you thought it would. So you back away slowly and don’t touch it (ever) for fear of bringing down that house of cards.

Our brains default to this scenario. It takes training and practice to fix this problem. Google mindfulness meditation and read up on it. A great way to start is by counting breaths: In, out, one. In, out, two. In, out, three. Go up to ten, then go back down to one. Repeat. When you do this, your mind naturally starts wandering: thinking about what you should be doing instead, all the tasks on your plate, that stupid thing you did in the past. This is the part you’re training for: realize your mind is not in the present. Don’t get frustrated that you can’t seem to meditate “right”. Acknowledge that thought. Maybe even sit with it for a couple seconds, and then let it go. Bring yourself back to the breath. If you lost your place, start over. It’s not about finishing, it’s about the process.

If you make this a habit and practice it daily, it will help you the next time you start feeling stressed. Your breath becomes a trigger that brings you into the present so you can do your best work.

Start with 1 minute. It’s very easy to tell yourself you don’t have time to meditate, but everyone can spare 1 minute. When a minute becomes habit, try two. Then three. I’m only up to four minutes myself and am already seeing huge benefits.

Scenario 2) In this scenario you’ve trained for this. Your stress at the beginning of the cycle becomes a trigger and instead of going faster, you stop and focus on what you’re doing right now: I want to get this feature out. Ok, that’s a thought about the future. Acknowledge it and let it pass. What am I doing right now? I’m writing this line of code. Now I’m writing this line of code. Now this line.

In the first scenario, it felt like you were moving faster, but because you were focused on getting somewhere faster, you wrote bad code, which actually made you get there slower. I’m willing to bet the second scenario got you there in roughly the same amount of time, but with much less stress and an end result that has fewer bugs and is more maintainable.

Excuses for bad code

“The client keeps changing requirements!”

“This deadline is unrealistic!”

“This legacy code is too hard to work with!”

Things like these are legitimate problems that are often used as excuses to write bad code. You feel pressure to go faster, so maybe you skip the tests. Maybe you write very long functions. Maybe you let the conditionals get nested into a web of complexity that works for now but will be impossible to untangle later.

You do what you have to do, but don’t fall into the trap of thinking it was totally out of your hands why the code is now hard to maintain. The truth is, when you felt the pressure, you fell back on what was comfortable so you could go faster. If writing bad code is comfortable, you need to practice until writing bad code is uncomfortable.

A professional sports team practices so they don’t fall back on bad form when the pressure is on. A professional programmer should do the same. Your goal should be for good code to be your comfort zone.

That doesn’t mean you can never cut corners when you have to, but do it mindfully. Don’t use one of those excuses to justify your lack of skill. Notice it, and then work on fixing it.

Lots of little objects

I like to use abstractions that lead to lots of small objects, each doing a little job, communicating with each other to do bigger jobs. The common argument I hear against this is that it makes the code hard to change, for two reasons:

1. “The work is spread out into too many places”

The Single Responsibility Principle says that a class should have only one reason to change. The flip side is that if multiple things will change for the same reason, they should probably live in the same place. So if you are making a change that requires edits in lots of different places, either you’re making a very high-level change – in which case this is expected – or your design needs work, because those things probably belonged together.

This doesn’t contradict “lots of little objects”, it provides a counter-balance for taking it too far.

2. “It’s hard to find where I need to make my change”

“I’ll start in my view… ok that’s calling a method in this file… oh that’s calling a method in this file…” and so on.

This is sometimes true, but I prefer it to the alternative: everything happens in one giant method, so it’s easy to find, but I’m terrified to make the change because that method does so much.

I’m willing to pay a bit more upfront time looking for the right place to make the change in exchange for avoiding that anxiety and potential for bugs. Because when I find the right place, it’s only doing one small thing, and I can change it with confidence.

This is the main reason I think using lots of small objects enables change. And the ability to change is the best indicator of a good design.

Sandi Metz has a great talk on this subject.

TDD Rules!

These are some rules I like to follow when doing TDD. You can follow them too! Rules are fun!

  • Write your tests first. If you can’t, spike a solution, throw it away, and try again.
  • Test units in isolation. Use mocks to verify interaction between units. If this makes your tests brittle, refactor.
  • You don’t need to isolate your unit from simple value objects. So use more value objects (see the sketch after this list).
  • If you feel like you can’t keep everything in your head, ask yourself if you really need to keep it all in your head. If you do, you need to refactor.
  • Each branch of logic should be covered by a unit test. If that makes you feel like you have too many tests, your logic is too complicated. Refactor.
  • If you ever feel the need to only run part of the unit test suite, it’s too slow and refactoring is needed.
  • Unit tests should be written as if they are a set of requirements – or “specs” – for the unit being tested.
  • Each test should test one and only one concept. That doesn’t always mean only one assertion.
  • When fixing bugs, make sure there is a test that fails without your fix, and passes with it.
  • Never push commits that contain failing tests. This makes it harder to revert, cherry-pick, and debug problems (e.g. with git bisect).
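Here’s a sketch of that value-object rule, using a hypothetical Money value object and charge_order function. The collaborators are mocked, but the value object is used directly, because real value objects compare by value and need no stubbing:

from dataclasses import dataclass
from unittest.mock import Mock

@dataclass(frozen=True)
class Money:
    amount: int
    currency: str

def charge_order(order, gateway):
    gateway.charge(order.total())

def test_it_charges_the_order_total():
    gateway = Mock()
    order = Mock()
    order.total.return_value = Money(1000, "USD")   # a real value object

    charge_order(order, gateway)

    # Equal Money values compare equal, so this assertion just works.
    gateway.charge.assert_called_once_with(Money(1000, "USD"))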

Solve tough problems with spikes

Sometimes I’m approaching a problem where I lack some understanding that would let me start with nice little unit tests. So instead, I start with a high-level functional test. Then I start getting the code to work by any means necessary. I do this without writing any more tests. I just cowboy-code my way to a solution that works. Extreme Programming calls this a Spike Solution.

When my functional test is green, I have much more understanding. I’ve been googling, looking up and using new libraries, and usually have a better idea of what a clean solution might look like. This is when I throw my code away.

Well, most of it. I keep the functional test. And if I have any particularly tricky code I might want for reference, I keep it separate from the project code, but available to refer to if I need it. Then I jump down into my unit tests and try to make my functional test green again using a proper test-driven approach.

It can be very hard to discard working code, but when I think back to every time I’ve lost some writing work unintentionally – an essay, a blog post, a homework assignment – the second draft was always better. I think the same is true for writing code.

Why is “Legacy” a negative word?

Programming might be the only place where “legacy” is a negative word. It seems backwards that we’re an industry where things tend to get worse instead of better as they get older and more people work on them. How does that happen?

Imagine you’re given a task to make a change to a piece of code that works, but no one wants to touch. It’s a gnarly mess of quick fixes and rushed, untested features that’s impossible to understand.

But you’re relieved because the change you need to make is super tiny. You can dive in, add a new if block to the 500-line method and get outta there.

“Who made that mess? I wish my managers would give me time to rewrite this thing,” you think as you wash your hands of that “legacy code”.

But how did the code get like that? No one sets out to write a giant, confusing method that no one wants to work in. It’s a group effort. It takes lots of little “quick fixes” and “tiny changes” like the one you just made.

It’s not up to your managers to “let” you pay down that “technical debt”. A rewrite won’t fix that problem. To really fix it, we need to change our attitude about code like this.

We need to realize that the problem isn’t some other programmer who wrote this big thing. It’s us. We all worked together to make this mess because no one thinks their one small change is causing the problem.

When we finish working on a piece of the system, it shouldn’t just do what we want, it should also be a little bit easier for the next person to come in and make more changes. Our job isn’t just to build new things, it’s to enable change. To do that, we should leave each piece of code we touch a little better than we found it.

Uncle Bob calls this The Boy Scout Rule. Martin Fowler calls it Opportunistic Refactoring.

Imagine working on a team where this is the norm. Imagine working on a code base that gets better as it ages. Imagine if “legacy” was a good thing.

That doesn’t mean there will never be a mess. But it means the trend will be towards cleaner code. Quick hacks and technical debt will no longer be excuses to continue cutting corners. They will be strategic decisions made when the trade-off is worth it. And we don’t need to get management approval to pay down that debt, because it won’t be a huge undertaking. It will be how we normally work: pay it down a little bit at a time as we continue working on the codebase.

Let’s start leaving behind a good legacy.

Python’s patch decorator is a code smell

I’m a big fan of using mocks as a testing/design tool. But if I find myself reaching for patch instead of Mock in python, I usually stop and rethink my design.

I consider the use of patch in tests to be a code smell. It means the test code is not using my internal API; it’s reaching into the private implementation details of my object.

For example, I recently needed a helper function for creating users on a third-party service with a set of default values. I could have written it like this:

from services import UserService
from settings import SERVICE_CONF


def create_user_with_defaults(**attributes):
    defaults = {"name": "test"}
    defaults.update(attributes)

    service = UserService(**SERVICE_CONF)
    return service.create_user(**defaults)

This would get the job done. And because this is python, I can test it without hitting real services using @patch:

@patch("users.helpers.UserService")
def test_creates_user_with_defaults_on_user_service(self, MockUserService):
    user_service = MockUserService.return_value

    # execution:
    user = create_user_with_defaults()

    # verification:
    user_service.create_user.assert_called_once_with(name="test")
    self.assertEqual(user, user_service.create_user.return_value)

But look at the verification step: there is nothing in the execution step about user_service, yet that’s what I’m asserting against. My tests have knowledge about private implementation details of the thing they’re testing. That’s bad news.

I prefer my tests to be normal consumers of my internal APIs. This forces me to keep my APIs easy to use and flexible. @patch lets me get around issues like tight coupling by hijacking my hard-coded dependencies.

Here is how I actually implemented the helper function:

def create_user_with_defaults(service, **attributes):
    defaults = {"name": "test"}
    defaults.update(attributes)
    return service.create_user(**defaults)

I didn’t even need to import anything! This is how I would test it:

def test_creates_user_with_defaults_on_user_service(self):
    user_service = Mock()

    # execution:
    user = create_user_with_defaults(user_service)

    # verification:
    user_service.create_user.assert_called_once_with(name="test")
    self.assertEqual(user, user_service.create_user.return_value)

Now compare the verification to the execution. Instead of patching the internal workings of the module, I’m explicitly passing in a mock object. I can do this because the function no longer depends on the concrete implementation of the user service; it depends on an abstraction*: some object, supplied by the caller, that conforms to a certain interface. So it makes sense that my test verifies the interaction with that interface.

This means my test is now a normal consumer of my function, and my desire to avoid patch led me to a design that is more flexible. That became clear as soon as I wanted to create some test users in the REPL: I happily created an instance of UserService using the settings for our sandbox and passed it into my function.
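That kind of session would look something like this (SANDBOX_CONF is a hypothetical name for those sandbox settings):

>>> from services import UserService
>>> from users.helpers import create_user_with_defaults
>>> service = UserService(**SANDBOX_CONF)   # SANDBOX_CONF: your sandbox settings
>>> create_user_with_defaults(service, name="repl test")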

*See The Dependency Inversion Principle (the D from SOLID).

Do you need DI in dynamic languages?

I’m often told dependency injection isn’t needed in dynamic languages. DHH wrote an article a couple years ago that sums the argument up nicely: Dependency injection is not a virtue. Read that and come back here for my 3-years-later response.

He uses this as an example:

def publish!
  self.update published_at: Time.now
end

He says DI folks would shiver at hard-coding a call to Time.now, and I’m assuming he thinks they would inject a clock or something to more easily test it. He argues (accurately) that it’s silly to make that more complex just for testing, and you should do something like this:

Time.stub(:now) { Time.new(2012, 12, 24) }
article.publish!
assert_equal 24, article.published_at.day

I do shiver at hard-coding Time in the publish! method, but I also shiver at stubbing global, built-in classes like that…

I prefer to think about Dependency Inversion first, and then Dependency Injection if that’s a way to accomplish it.

So when I see this:

def publish!
  self.update published_at: Time.now
end

I do shiver, because hard-coding like that is well established to be a code smell. Now, a smell does not mean it’s wrong in every case, but it means I should examine this. I’m not even thinking about tests right now, I’m thinking about inverting this dependency. Here’s how I’d do it:

def publish!(time)
  self.update published_at: time
end

I’ve inverted the dependency. No dependency injection framework needed. And no stubbing globals or bringing in complex gems (like Timecop) to test it:

article.publish! Time.new(2012, 12, 24)
assert_equal 24, article.published_at.day

The fact that it’s easier to test wasn’t the goal; it was a side effect of a better design. Now the system is ready to let users schedule publish dates in the future, for example.

So just because a dynamic language lets us hijack objects to make testing easier doesn’t mean we should completely ignore good practices. I’d argue that this is better code, regardless of what language I happened to write it in.

And if you miss the old API? Just give it a default:

def publish!(time = Time.now)
  self.update published_at: time
end
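Since most of this blog’s examples are in python, it’s worth noting the same trick needs one twist there: python evaluates default arguments once, at definition time, so a sketch like this uses a None sentinel instead (Article and published_at stand in for the equivalent model):

from datetime import datetime

class Article:
    def publish(self, time=None):
        # time=datetime.now() as a default would freeze the timestamp
        # when the function is defined, so resolve it per call instead.
        self.published_at = time if time is not None else datetime.now()

With that in place, article.publish(datetime(2012, 12, 24)) works for tests and future scheduling alike, and article.publish() keeps the old behavior.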