Simple. Powerful. Persistent LRU caching for the requests library.
- Documentation: https://cache_requests.readthedocs.org
- Open Source: https://github.com/bionikspoon/cache_requests
- Python version agnostic: tested against Python 2.7, 3.3, 3.4, 3.5 and PyPy
- MIT license
- Drop-in decorator for the requests library.
- Automatic timer-based expiration on stored items (optional).
- Backed by Yahoo's powerful `redislite`.
- Scalable with redis. Optionally accepts a `redis` connection.
- Exposes the powerful underlying `Memoize` decorator to decorate any function.
- Tested with high coverage.
- Lightweight. Simple logic.
- Lightning fast.
- Jump start your development cycle.
- Collect and reuse entire response objects.
At the command line, via either pip or easy_install:
$ pip install cache_requests
$ easy_install cache_requests
Or, if you have virtualenvwrapper installed:
$ mkvirtualenv cache_requests
$ pip install cache_requests
Uninstall
$ pip uninstall cache_requests
To use cache_requests in a project:

import cache_requests

For example, as a drop-in replacement for requests:

>>> from cache_requests import Session
>>> requests = Session()
# from python-requests.org
>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
u'{"type":"User"...'
>>> r.json()
{u'private_gists': 419, u'total_private_repos': 77, ...}
- `method.ex` sets the default expiration (seconds) for new cache entries.
- `method.redis` creates the connection to the `redis` or `redislite` database. By default this is a `redislite` connection. However, a `redis` connection can be dropped in for easy scalability.
- `ex` is shared between request methods. It can be accessed via `Session.cache.ex` or `Session.get.ex`, where `get` is the `requests.get` method.
- By default, requests that return an error will not be cached. This can be overridden by replacing the `Session.cache.set_cache_cb` callback. The callback takes the response object as an argument and returns a `bool`.
from cache_requests import Session

requests = Session()
requests.cache.set_cache_cb = lambda _: False
- By default, only safe methods are cached (`get`, `head`, `options`). Each method can be set up to be cached using the `Session.cache` config option.
- These settings are accessed through the Session object as `Session.cache.[method name]`.
- They can be overridden with the `Session.cache.all` setting.
For example:

from cache_requests import Session

requests = Session()

requests.cache.delete = True
# cached, only called once.
requests.delete('http://google.com')
requests.delete('http://google.com')

requests.cache.delete = False
# not cached, called twice.
requests.delete('http://google.com')
requests.delete('http://google.com')
# cache ALL methods
requests.cache.all = True
# don't cache any methods
requests.cache.all = False
# Use individual method cache options.
requests.cache.all = None
| Method  | Cached |
| ------- | ------ |
| get     | True   |
| head    | True   |
| options | True   |
| post    | False  |
| put     | False  |
| patch   | False  |
| delete  | False  |
| all     | None   |
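The tri-state behavior of `Session.cache.all` implied by the table above can be sketched as a small resolver. This is an illustrative sketch only; `is_cached` and `PER_METHOD_DEFAULTS` are hypothetical names, not part of the cache_requests API.

```python
# Sketch of the tri-state cache resolution implied by the table above.
# These names are illustrative, not cache_requests API.
PER_METHOD_DEFAULTS = {
    'get': True, 'head': True, 'options': True,
    'post': False, 'put': False, 'patch': False, 'delete': False,
}


def is_cached(method, cache_all=None, overrides=None):
    """Return whether a request method would be cached.

    cache_all=True caches everything, False caches nothing,
    and None falls back to the per-method settings.
    """
    if cache_all is not None:
        return cache_all
    settings = dict(PER_METHOD_DEFAULTS, **(overrides or {}))
    return settings.get(method, False)


print(is_cached('get'))                                  # safe method, cached by default
print(is_cached('delete'))                               # unsafe method, not cached
print(is_cached('delete', overrides={'delete': True}))   # cached once enabled
print(is_cached('get', cache_all=False))                 # cache_all wins over per-method
```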
- Cache Busting: use the keyword `bust_cache=True` in a memoized function to force reevaluation.
- Conditionally Set Cache: use the keyword `set_cache` to provide a callback. The callback takes the result of the function as an argument and must return a `bool`. Alternatively, `True` and `False` can be used directly.
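A minimal in-memory sketch of what those two keywords mean (not the library's redis-backed implementation; `memoize` here is a hypothetical stand-in for `Memoize`):

```python
import functools


def memoize(set_cache=lambda result: True):
    """Simplified memoizer mimicking the bust_cache / set_cache semantics."""
    def decorator(func):
        store = {}

        @functools.wraps(func)
        def wrapper(*args, bust_cache=False, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            if bust_cache:
                store.pop(key, None)  # force reevaluation
            if key not in store:
                result = func(*args, **kwargs)
                if not set_cache(result):
                    return result  # callback vetoed caching this result
                store[key] = result
            return store[key]
        return wrapper
    return decorator


calls = []


@memoize(set_cache=lambda result: result is not None)
def fetch(x):
    calls.append(x)
    return x * 2


fetch(3)
fetch(3)                   # served from cache, fetch body not re-run
fetch(3, bust_cache=True)  # forces a real call
```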
- Scenario:
- Working on a project that uses a 3rd party API or service.
- Things you want:
- A cache that persists between sessions and is lightning fast.
- Ability to rapidly explore the API and its parameters.
- Ability to inspect and debug response content.
- Ability to focus on progress.
- Perfect transition to a production environment.
- Things you don't want:
- Dependency on network and server stability for development.
- Spamming the API. Especially APIs with limits.
- Responses that change in non-meaningful ways.
- Burning energy with copypasta or fake data to run pieces of your program.
- Slow responses.
Make a request one time. Cache the results for the rest of your work session.
import os

if os.environ.get('ENV') == 'DEVELOP':
    from cache_requests import Session

    requests = Session(ex=60 * 60)  # Set expiration, 60 min
else:
    import requests
# strange, complicated request you might make
headers = {"accept-encoding": "gzip, deflate, sdch", "accept-language": "en-US,en;q=0.8"}
payload = dict(sourceid="chrome-instant", ion="1", espv="2", ie="UTF-8", client="ubuntu",
q="hash%20a%20dictionary%20python")
response = requests.get('http://google.com/search', headers=headers, params=payload)
# spam to prove a point
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
# tweak your query, we're exploring here
payload = dict(sourceid="chrome-instant", ion="1", espv="2", ie="UTF-8", client="ubuntu",
q="hash%20a%20dictionary%20python2")
# do you see what changed? the caching tool did.
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
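The trick above works because the cache key is derived from the full argument signature, so any change in headers or params yields a fresh request. One common way such caches derive keys (the library's exact scheme may differ) is to hash the serialized arguments:

```python
import hashlib
import pickle


def make_key(*args, **kwargs):
    # Serialize the full call signature; any change in any argument
    # produces a different key, hence a cache miss and a fresh request.
    blob = pickle.dumps((args, sorted(kwargs.items())))
    return hashlib.md5(blob).hexdigest()


a = make_key('http://google.com/search', q='python')
b = make_key('http://google.com/search', q='python')
c = make_key('http://google.com/search', q='python2')
print(a == b, a == c)  # True False
```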
Automatically expire old content.
- How often? After a day? A week? A month? 100% of this logic is built in with the `Session.cache.ex` setting.
- Effectively, it can manage all of the time-based rotation.
- Perfect if there's more data than your API caps allow.
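The `ex` semantics can be pictured with a minimal time-based cache. This is a sketch only; the real storage is redis/redislite, which handles expiry natively.

```python
import time


class ExpiringCache:
    """Tiny sketch of ex-style expiry: entries die after `ex` seconds."""

    def __init__(self, ex=3600):
        self.ex = ex
        self._store = {}

    def set(self, key, value):
        # Record the value together with its expiration deadline.
        self._store[key] = (value, time.monotonic() + self.ex)

    def get(self, key):
        value, deadline = self._store.get(key, (None, 0))
        if time.monotonic() >= deadline:
            self._store.pop(key, None)  # expired: drop the entry, report a miss
            return None
        return value


cache = ExpiringCache(ex=0.05)  # 50 ms, just for demonstration
cache.set('k', 'v')
print(cache.get('k'))  # 'v' while fresh
time.sleep(0.06)
print(cache.get('k'))  # None after expiry
```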
One line of code to use a full `redis` database.
- Try `redislite`; it can handle quite a bit. The `redislite` API used by this module is 1:1 with the `redis` package. Just replace the connection parameter/config value. `redis` is a drop-in:
import redis

from cache_requests import Session

connection = redis.StrictRedis(host='localhost', port=6379, db=0)
requests = Session(connection=connection)
- Everything else just works. There's no magic required.
import redis

from cache_requests import Session

connection = redis.StrictRedis(host='localhost', port=6379, db=0)
ex = 7 * 24 * 60 * 60  # 1 week

requests = Session(ex=ex, connection=connection)

for i in range(1000):
    payload = dict(q=i)
    response = requests.get('http://google.com/search', params=payload)
    print(response.text)
from cache_requests import Memoize

@Memoize(ex=15 * 60)  # 15 min expiration (default: 60 min)
def amazing_but_expensive_function(*args, **kwargs):
    print("You're going to like this")
Tools used in rendering this package: