mozilla-releng/balrog

requests can map to deleted releases for short periods of time

A new traceback showed up in Sentry recently, in which a request tried to retrieve a Release that didn't exist:

IndexError: list index out of range
  File "flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "flask/views.py", line 84, in view
    return self.dispatch_request(*args, **kwargs)
  File "flask/views.py", line 149, in dispatch_request
    return meth(*args, **kwargs)
  File "auslib/web/views/client.py", line 57, in get
    release, update_type = AUS.evaluateRules(query)
  File "auslib/AUS.py", line 99, in evaluateRules
    release = dbo.releases.getReleases(name=rule['mapping'], limit=1)[0]

In this case, the rule in question was the main Firefox release rule, and the Release the request tried to load was Firefox-50.1.0-build2-prod. At the time of the request, that release rule already pointed at Firefox-50.1.0-build2, and Firefox-50.1.0-build2-prod no longer existed. After some digging with jlund, I discovered that he had changed the mapping of that Rule and deleted Firefox-50.1.0-build2-prod in quick succession. Because Rules are cached, there was a short window in which requests used the cached Rule (which still pointed at Firefox-50.1.0-build2-prod) but did not have that Release cached. The Release lookup therefore returned an empty list, and taking its first element raised the IndexError above.
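To make the timing concrete, here is a minimal, self-contained sketch of the race. It does not use Balrog's code: TTLCache, simulate_stale_rule_lookup, and the in-memory dicts standing in for the database are all hypothetical, and a fake clock replaces real time. Only the 30s Rule cache lifetime and the Release names come from this report.

```python
import time


class TTLCache:
    """Tiny time-based cache (hypothetical stand-in for the public app's Rule cache)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (loaded_at, value)

    def get(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: the backing store is not consulted
        value = loader()
        self._entries[key] = (now, value)
        return value


def simulate_stale_rule_lookup():
    now = [0.0]  # fake clock so the example runs instantly
    # In-memory dicts standing in for the database tables (assumption).
    rules = {"release": {"mapping": "Firefox-50.1.0-build2-prod"}}
    releases = {"Firefox-50.1.0-build2-prod": {}, "Firefox-50.1.0-build2": {}}
    rule_cache = TTLCache(ttl=30, clock=lambda: now[0])

    def load_rule():
        return dict(rules["release"])  # copy, as if freshly read from storage

    def get_releases(name):
        # Like the lookup in the traceback: an empty list when nothing matches.
        return [releases[name]] if name in releases else []

    rule_cache.get("release", load_rule)  # a request caches the old Rule

    # Admin side: remap the Rule, then delete the old Release right away.
    rules["release"]["mapping"] = "Firefox-50.1.0-build2"
    del releases["Firefox-50.1.0-build2-prod"]

    now[0] = 10.0  # inside the 30s window: the cached Rule is stale
    stale_rule = rule_cache.get("release", load_rule)
    try:
        get_releases(stale_rule["mapping"])[0]
        failed_in_window = False
    except IndexError:
        failed_in_window = True

    now[0] = 31.0  # cache entry expired: the new mapping is picked up
    fresh_rule = rule_cache.get("release", load_rule)
    get_releases(fresh_rule["mapping"])[0]  # succeeds against the new Release
    return failed_in_window, fresh_rule["mapping"]
```

Calling simulate_stale_rule_lookup() reports that the lookup failed inside the cache window and that the Rule resolves to Firefox-50.1.0-build2 once the cached entry has expired.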

This is a pretty rare occurrence, but it is definitely possible to hit again. We only cache Rules for 30s, so that is the longest we could stay in this state.

There's no obvious easy fix for this. We can't prevent people from deleting Releases that are still pointed at by a cached Rule, because the admin app doesn't know anything about the caches on the public side.

One thing we might be able to try is to ensure that the mappings (aka Releases) of all cached Rules are always cached in the public app. This could be tricky, though, and might carry a big performance penalty.
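One possible shape for that idea, as a rough sketch only: load a Rule's Release at the moment the Rule is cached and store both in the same cache entry, so they expire together. The class name RuleAndReleaseCache and the load_rule / load_release callables are hypothetical and are not part of Balrog. The sketch also ignores the memory and load cost of holding a Release for every cached Rule, which is the performance concern mentioned above.

```python
import time


class RuleAndReleaseCache:
    """Caches each Rule together with the Release it maps to (hypothetical)."""

    def __init__(self, ttl, load_rule, load_release, clock=time.monotonic):
        self.ttl = ttl
        self.load_rule = load_rule        # callable: rule_id -> rule dict
        self.load_release = load_release  # callable: release name -> release
        self.clock = clock
        self._entries = {}  # rule_id -> (loaded_at, rule, release)

    def get(self, rule_id):
        now = self.clock()
        entry = self._entries.get(rule_id)
        if entry is None or now - entry[0] >= self.ttl:
            # Load the Rule and its mapping in one step, so a cached Rule
            # never outlives the cached copy of the Release it points at.
            rule = self.load_rule(rule_id)
            release = self.load_release(rule["mapping"])
            entry = (now, rule, release)
            self._entries[rule_id] = entry
        return entry[1], entry[2]
```

With this arrangement, a request inside the cache window would keep serving the old Rule and the old Release together, even if the Release has since been deleted, and would move to the new pair once the entry expires.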

(Imported from https://bugzilla.mozilla.org/show_bug.cgi?id=1325605)