This document is a draft style guide for writing Python tests (and Django tests). It is meant to be expanded and refined over time.
Many of the recommendations stand in stark contrast to conventions for good code. This is deliberate: tests are not regular code and have a much different set of risks and rewards.
If you're curious why a particular practice is recommended, get someone to defend or explain it (there's a chance it's wrong and needs to be updated). That said, a lot of these recommendations come from specific mistakes and scars earned over many years of work, so don't be surprised if they're passionately defended.
Start with PEP8 and strive to understand why it makes the recommendations it does. We like nearly all of it except for the Maximum Line Length recommendation.
Configure the Python linting plugin for your editor and use it every time without exception.
Tests should (almost) always accompany code we write professionally. We're fortunate to work in an environment where testing is actively encouraged, so don't waste that privilege. We get away with all sorts of other efficiencies and laziness because of our sincere dedication to pragmatic and practical testing.
Every package should have setup instructions for a developer in the README and should be testable using tox after setup. Use 100% test success to confirm your setup is complete.
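A minimal tox.ini sketch for such a package (the envlist, deps, and command here are assumptions for illustration, not any real project's configuration):

```ini
; Hypothetical tox.ini -- adjust envlist and deps to match the package.
[tox]
envlist = py27

[testenv]
deps =
    -rrequirements.txt
commands =
    nosetests
```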
Strive to never push a failing test to master. If you want to push failing tests to another branch, that's fine (in fact often encouraged).
We run our tests continuously with Jenkins (every morning and after every commit), but you still need to run them locally before pushing code.
Some people do TDD often, sometimes, or rarely. Figure out when it's an effective habit for you.
We are fortunate in that we have a smart and effective QA team. Some functionality simply cannot be effectively tested in an automated way, which can feel frustrating to a developer. Before spending too much time on a set of tests of limited benefit, consider whether the issue would be better covered by QA instead.
Do not be shy about removing tests as soon as they start to become false or unhelpful.
For regressions, you should always make sure you can reproduce the issue in a browser as a typical user before writing a single line of code (test or otherwise). After you can reproduce it, try to get a failing test to reliably reproduce it (if possible). Only then is it a good time to start writing the fix.
No developer should ever rely on tests alone. Before resolving an issue, make sure you have walked through it at least once in a browser as a typical user.
How do you know if it is actually testing anything if the assert never failed?
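One way to build that confidence is to break the expectation on purpose and watch the assert fire before restoring it. A minimal sketch (the `add` function here is hypothetical):

```python
def add(a, b):
    """Hypothetical function under test."""
    return a + b


def test_add():
    """add() should return the sum of its two arguments."""
    result = add(2, 3)
    # Temporarily change 5 to 6 and re-run: if the test still passes,
    # it was never actually testing anything.
    assert result == 5, "expected 5, got {}".format(result)


test_add()
```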
When tests fail they should tell you exactly why.
Yes:
response = self.client.post('/api/v1/data/', data=SAMPLE)
assert response.status_code == status.HTTP_201_CREATED, "expected HTTP_201, got HTTP_{} data: {}".format(response.status_code, response.data)
This results in a message which explains exactly what was expected and why the test failed:
> assert response.status_code == status.HTTP_201_CREATED, "expected HTTP_201, got HTTP_{} data: {}".format(response.status_code, response.data)
E AssertionError: expected HTTP_201, got HTTP_400 data: {'name': [u'This field is required.']}
E assert 400 == 201
E + where 400 = <rest_framework.response.Response object at 0x1070bbb10>.status_code
E + and 201 = status.HTTP_201_CREATED
No:
response = self.client.post('/api/v1/data/', data=SAMPLE)
assert response.status_code == status.HTTP_201_CREATED
This results in a message which isn't very helpful in diagnosing the test failure:
> assert response.status_code == status.HTTP_201_CREATED
E AssertionError: assert 201 == 400
E + where 400 = <rest_framework.response.Response object at 0x1070bbb10>.status_code
E + and 201 = status.HTTP_201_CREATED
Read through the "One Assert per Test" section of Robert Martin's Clean Code. In fact, read the entire chapter 😉
Tests are generally structured to mirror the file layout of the modules they are testing. It is OK to group tests for small modules or to separate targeted tests for a single module across many files.
Imports should be grouped in the following order (extends PEP8 rules):
- Python standard library imports
- Public library imports
- Django imports
- Testing library imports
- Internal library imports
- Local application imports
Each import line must be in alphabetical order inside its group.
import datetime
import inspect
import logging
import os
import uuid
from lxml import etree
from lxml.html import document_fromstring, html5parser
from django.contrib.auth import get_user_model
from django.core.urlresolvers import reverse
from django.test import TestCase
from django.test.client import Client
from django.views.defaults import server_error
import mock
from nose import SkipTest
from nose.tools import assert_equals, assert_in
import nest.models
from nest.tests.test_helper import create_document
from entice.models import UserSubject
from heron.decorators import anyone_allowed
from heron.models import Trial
from heron.urls import urlpatterns
from tools.urls import traverse_urls
Our tests are often the first thing other developers read to try to understand our code. They'll also often be the first thing we read when something is on fire and we need to fix a bug. Strive for readability in tests.
Every single test should have a human-readable docstring. The docstring should follow the pattern of the rest of the module.
See the PEP guidelines for more info on docstring conventions.
Tests are often best when they focus on what behavior "should" happen in terms of important actors in the system. A good mental trick to keep tests in this style is to use "should" in every docstring.
In addition, try to use a known actor or object as the subject of the docstring.
Keep the number of words before should (or "should not") to a minimum to improve clarity.
'''The Manage Users page should show a button to revoke access to the site for each User'''
'''The Manage Users page should show Users with obscured emails if they are associated with a demo Account'''
'''A Tutorial should be able to exist in multiple Groups'''
'''A European User should not be required to enter a state or province'''
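Put together, a test in this style might look like the following sketch. The `required_address_fields` helper and the country codes are hypothetical stand-ins, not code from our projects:

```python
def required_address_fields(country):
    """Hypothetical validation helper; returns the fields a User must fill in."""
    fields = ["street", "city", "postal_code"]
    if country not in ("FR", "DE", "ES"):  # hypothetical European country codes
        fields.append("state_or_province")
    return fields


def test_european_user_skips_state():
    '''A European User should not be required to enter a state or province'''
    fields = required_address_fields("FR")
    assert "state_or_province" not in fields, "unexpected fields: {}".format(fields)


test_european_user_skips_state()
```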
A reader may have no idea what is happening inside your test. Help them by binding to descriptive variables.
Yes:
reset_url = reverse('django.contrib.auth.views.password_reset_confirm', kwargs={'uidb36': self.user.id, 'token': token})
absolute_reset_url = "http://{}.{}{}".format(self.account.subdomain, settings.BASE_SITE, reset_url)
expected = "Set your personal password: {}".format(absolute_reset_url)
self.assertOutboxContainsBody(expected)
No:
self.assertOutboxContainsBody(
'Set your personal password: http://{subdomain}.{base_site}{url}'.format(
subdomain=self.account.subdomain, base_site=settings.BASE_SITE,
url=reverse('django.contrib.auth.views.password_reset_confirm',
kwargs={'uidb36': self.user.id, 'token': token})))
We always bind an expected value because then the reader has no doubt of what is happening inside our asserts. Conventions like this allow us to focus on what's different about the code instead of being distracted by eccentricities of each implementation.
Yes:
expected = [12, 19]
self.assertEquals(expected, results)
No:
self.assertEquals(40, num_lessons)
Yes:
self.assertEquals(expected, topic_toc_links)
self.assertEquals(expected, output)
No:
self.assertEquals(results, expected)
Unlike regular code, tests are often stronger when they have explicit expected values despite the cost. This can improve readability ("What does this JSON response actually look like in practice?") and ensure that tests don't accidentally test nothing.
Yes:
expected = [
    "2001-09-01",
    "2002-12-30",
    "2003-08-20",
    "2003-08-29",
]
self.assertEquals(expected, dates)
No:
expected = [lesson.started for lesson in self.lessons] # Oops, self.lessons was empty and I just tested nothing
self.assertEquals(expected, dates)
Use logging instead of print in your code.
Learn how to make the logger work for you rather than against you.
This should appear in every one of your files:
import logging
log = logging.getLogger(__name__)
More people can read and write CSS selectors than XPath or alternatives.
Every single test that relies on specific HTML markup must use/add a CSS class starting with t- to the required HTML elements.
This declarative style allows the HTML and CSS to be refactored without impacting tests (even extremely specific ones).
num_revoke_buttons = len(nodes.cssselect(".t-user-table .t-revoke"))
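Our suites do this with lxml's cssselect, as above. To show the idea in a self-contained way, here is a sketch using only the Python 3 standard library's html.parser; the markup and class names are hypothetical:

```python
from html.parser import HTMLParser


class TestHookCounter(HTMLParser):
    """Counts elements carrying a given t- test-hook class.

    Stdlib stand-in for lxml's cssselect, purely to illustrate the
    declarative t- convention.
    """

    def __init__(self, hook):
        HTMLParser.__init__(self)
        self.hook = hook
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.hook in classes:
            self.count += 1


MARKUP = """
<table class="users t-user-table">
  <tr><td><button class="btn t-revoke">Revoke</button></td></tr>
  <tr><td><button class="btn t-revoke">Revoke</button></td></tr>
</table>
"""

counter = TestHookCounter("t-revoke")
counter.feed(MARKUP)
num_revoke_buttons = counter.count  # 2
```

Because the selector targets only the t- hooks, a designer can rename `btn` or restructure the table without breaking the test.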
Templatetags are often much easier to test (especially in isolation) than the equivalent functional test.
We introduce randomness to our tests to make sure we're handling a range of inputs sanely.
The two most common patterns for introducing randomness are using uuid.uuid4() for string values and random.randint() for numeric values.
self.epub = nest.models.EpubArchive.objects.create(identifier=str(uuid.uuid4()))
bit = self.epub.htmlfile_set.create(filename="first.html", virtual_pages=random.randint(1, 100))
how_many = random.randint(10, 100)
for i in range(how_many):
...
Warning: Randomness means that some test failures will become hard to reproduce. This is usually an acceptable cost.
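One mitigation is to log the seed so a failing run can be replayed exactly. A sketch of that idea (the SEED environment variable and the surrounding convention are assumptions, not an existing project practice):

```python
import logging
import os
import random

log = logging.getLogger(__name__)

# Replay a failing run exactly by re-running with, e.g., SEED=12345.
seed = int(os.environ.get("SEED", random.randint(0, 2 ** 32 - 1)))
random.seed(seed)
log.warning("Using random seed %s", seed)

values = [random.randint(1, 100) for _ in range(5)]
```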
When setting up expected values or string inputs, try to remember to include strange Unicode characters.
expected = [u"R",
            u"å",
            u"n",
            u"d",
            u"☃",
            u"ɯ"]
- General, but excellent guidance on writing unit tests (language agnostic): Ch. 9 "Unit Tests" from Clean Code
- A gentle introduction to the feel of testing (Django focus): Test-Driven Web Development with Python
- The classic description of TDD (Java examples, oh well): Test Driven Development: By Example
- Another classic focused on explaining how testing works in the real world (more Java examples, oh well): Test Driven Development: By Example
- Recent take on testing lessons from a mature project: Testing Django Projects at Scale, a PyCon CA 2013 video
- Thoughts on mocking versus faking: Stop mocking, start testing, a writeup with links to a PyCon 2012 video
- A presentation to convince you to start testing: Getting Started Testing your Python