edgi-govdata-archiving/web-monitoring-processing

Tests should use VCR or a fixture server and avoid connecting to third-party services

Closed this issue · 5 comments

Some tests connect to third-party services, which we should avoid because it can cause intermittent and confusing failures in CI (like this one for @vbanos: #233 (comment)). It also prevents running the tests locally without a network connection, which is no fun either.

Our web_monitoring.db tests use VCR for this:

# This stashes web-monitoring-dbserver responses in JSON files (one per test)
# so that an actual server does not have to be running.
cassette_library_dir = str(Path(__file__).parent / Path('cassettes'))
db_vcr = vcr.VCR(
    serializer='json',
    cassette_library_dir=cassette_library_dir,
    record_mode='once',
    match_on=['uri', 'method'],
)
global_stash = {}  # used to pass info between tests
# Refers to real data that is part of the 'seed' dataset in web-monitoring-db
URL = 'https://www3.epa.gov/climatechange/impacts/society.html'
SITE = 'site:EPA - www3.epa.gov'
AGENCY = 'EPA'
PAGE_ID = '3d068c64-967a-4ec7-af49-f8fa0f19e6f1'
TO_VERSION_ID = '795e6ff4-fcc0-444c-9f31-2b156f7dd4d4'
VERSIONISTA_ID = '13708349'
# This is used in new Versions that we add.
TIME = datetime(2017, 11, 15, tzinfo=timezone.utc)
NEW_VERSION_ID = '06620776-d347-4abd-a423-a871620299a9'
# This only matters when re-recording the tests for vcr.
AUTH = {'url': "http://localhost:3000",
        'email': "seed-admin@example.com",
        'password': "PASSWORD"}


def test_missing_creds():
    try:
        env = os.environ.copy()
        os.environ.clear()
        with pytest.raises(MissingCredentials):
            Client.from_env()
        os.environ.update({'WEB_MONITORING_DB_URL': AUTH['url'],
                           'WEB_MONITORING_DB_EMAIL': AUTH['email'],
                           'WEB_MONITORING_DB_PASSWORD': AUTH['password']})
        Client.from_env()  # should work
    finally:
        os.environ.update(env)


@db_vcr.use_cassette()
def test_list_pages():

And our utils tests use mock requests for simpler cases:

def test_retryable_request_retries():
    with requests_mock.Mocker() as mock:
        mock.get('http://test.com', [{'text': 'bad', 'status_code': 503},
                                     {'text': 'good', 'status_code': 200}])
        response = retryable_request('GET', 'http://test.com', backoff=0)
        assert response.ok
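
The third option named in the title, a fixture server, is not shown above. As a rough sketch only (standard library, not code from this repo), a test could start a throwaway HTTP server on localhost and point the code under test at it. The names `FIXTURES`, `FixtureHandler`, `fixture_server`, and `fetch` are hypothetical, and the canned page body is made up for illustration.

```python
import threading
from contextlib import contextmanager
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

# Hypothetical canned responses, keyed by request path.
FIXTURES = {'/society.html': b'<html><body>fixture page</body></html>'}


class FixtureHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = FIXTURES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header('Content-Type', 'text/html; charset=utf-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


@contextmanager
def fixture_server():
    # Port 0 asks the OS for any free port, so parallel test runs don't clash.
    server = HTTPServer(('127.0.0.1', 0), FixtureHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f'http://127.0.0.1:{server.server_port}'
    finally:
        server.shutdown()
        server.server_close()
        thread.join()


def fetch(url):
    # An empty ProxyHandler ignores any proxy configured in the environment.
    opener = build_opener(ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.status, response.read()


def test_fetches_page_from_fixture_server():
    with fixture_server() as base_url:
        status, body = fetch(base_url + '/society.html')
        assert status == 200
        assert b'fixture page' in body
```

Compared with VCR or `requests_mock`, this exercises a real HTTP round trip while never leaving the machine, at the cost of a little more setup.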

First, we need to identify which tests this affects. So far:

We need to be better about making sure future PRs that add new tests do this, too.
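
One way to catch this automatically, rather than relying on review, might be to make tests fail whenever they try to reach a non-local host. The following is a minimal standard-library sketch under that assumption, not something this repo is known to use; `block_external_network`, `NetworkBlockedError`, and `ALLOWED_HOSTS` are hypothetical names.

```python
import socket
from contextlib import contextmanager

# Assumption: only loopback addresses are acceptable targets during tests.
ALLOWED_HOSTS = {'127.0.0.1', '::1', 'localhost'}


class NetworkBlockedError(RuntimeError):
    """Raised when code under test tries to reach a non-local host."""


@contextmanager
def block_external_network():
    real_connect = socket.socket.connect

    def guarded_connect(self, address):
        # IPv4/IPv6 addresses are tuples whose first item is the host;
        # other address types (e.g. Unix socket paths) pass through.
        if isinstance(address, tuple) and address[0] not in ALLOWED_HOSTS:
            raise NetworkBlockedError(
                f'test attempted to connect to {address[0]!r}')
        return real_connect(self, address)

    # Note: this sketch only guards connect(), not connect_ex() or UDP sends.
    socket.socket.connect = guarded_connect
    try:
        yield
    finally:
        socket.socket.connect = real_connect
```

Wrapping the test session in this context manager (for example from an autouse pytest fixture) would turn an accidental third-party request into an immediate, clearly labeled failure.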

Update: we need to re-check whether this issue still applies to anything in this repo after #638.

This is indeed no longer relevant.