sparp stands for Simple Parallel Asynchronous Requests in Python.
Find async or await confusing, and just want to process a list of requests? Then this is the package for you.
Install it directly from git:
python3 -m pip install git+https://github.com/fredo838/sparp.git
Pin your version to a specific commit by replacing gitsha with the commit SHA:
python3 -m pip install git+https://github.com/fredo838/sparp.git@gitsha
import sparp
configs = [{'method': 'get', 'url': 'https://www.google.com'} for _ in range(10000)]
results = sparp.sparp(configs, max_outstanding_requests=len(configs))
print(results[0].keys())
# dict_keys(['text', 'status_code', 'json', 'elapsed'])
If the request itself errors (similar to how requests.get would raise an exception instead of returning some good or bad status code), the resulting payload will be:
print(results[0].keys())
# dict_keys(['error_message'])
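Since a result is either a response payload or an error payload, it can help to separate the two before further processing. The following is a minimal sketch, not part of sparp itself: split_results is a hypothetical helper, and sample_results is hand-written stand-in data that only mimics the keys shown above (the exact value types sparp returns are an assumption here).

```python
def split_results(results):
    """Separate completed responses from requests that raised an error."""
    completed, errored = [], []
    for index, result in enumerate(results):
        # Errored requests carry only an 'error_message' key.
        if "error_message" in result:
            errored.append((index, result["error_message"]))
        else:
            completed.append((index, result))
    return completed, errored


# Illustrative stand-in data, not real sparp output.
sample_results = [
    {"text": "ok", "status_code": 200, "json": None, "elapsed": 0.12},
    {"error_message": "Cannot connect to host"},
]

completed, errored = split_results(sample_results)
print(len(completed), len(errored))
```

The index kept alongside each entry is simply the position in the results list.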
results = sparp.sparp(
configs, # list of request configs. See below
max_outstanding_requests=1000, # max number of concurrent requests alive at the same time. Should be in [0, len(configs)]. Using len(configs) guarantees you won't bottleneck the processing.
time_between_requests=0, # minimum amount of time between two requests
ok_status_codes=[200], # status codes that are deemed "success"
stop_on_first_fail=False, # whether to stop and return (not error) when a "failed" response is encountered
disable_bar=False, # set to True to not print anything
attempts=1, # number of times to try the request (must be at least 1)
retry_status_codes=[429], # status codes to attempt a retry on
aiohttp_client_session_kwargs={}, # additional kwargs to initialize aiohttp.ClientSession with
print_kwargs={"end":"\r"} # additional kwargs to pass to the 'print' function for printing the progress bar
)
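As a sketch of what the configs argument can look like, the generator below yields one request config at a time. The make_configs helper, the example.com URL, and the headers are assumptions made up for illustration; the dict keys (method, url, headers) are keyword arguments accepted by aiohttp.ClientSession.request.

```python
def make_configs(user_ids, base_url="https://example.com/api/users"):
    """Lazily yield one request config per user id (hypothetical endpoint)."""
    for user_id in user_ids:
        yield {
            "method": "get",
            "url": f"{base_url}/{user_id}",
            "headers": {"Accept": "application/json"},
        }


# Materialize a few configs just to inspect them.
preview = list(make_configs(range(3)))
print(preview[0]["url"])
```

A fresh generator such as make_configs(range(1000)) could then be passed as configs in place of a list.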
- each config in configs should be able to be passed to aiohttp.ClientSession.request(**config)
- configs should preferably be a list of dicts, but you can also use a generator, so if you want to make your request as soon as you have created your config, you can.
- max_outstanding_requests is a mandatory parameter, but what should you use? We create a consumer coroutine (read: a while loop that makes requests) for every item in range(max_outstanding_requests), so the ideal value is just above the "actual" maximum number of requests that will be active at the same time, but we don't know that beforehand. So, as a rule of thumb:
  - try 100; if that is not fast enough, make it 1000; if that is still not fast enough, use len(configs).
  - using len(configs) ensures you won't bottleneck your application, but know that this creates len(configs) coroutines (so those while loops), so it should not be too large, let's say <100000.
  - if the url you call cannot scale beyond 1000 requests, then using values higher than 1000 will only hurt performance.
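To make the consumer-coroutine idea concrete, here is a minimal sketch of the general pattern using only asyncio. It is not sparp's actual implementation: fake_request stands in for a real HTTP call, and the consumer, run, and tracker names are assumptions for illustration. It shows why the number of consumers caps the number of requests in flight.

```python
import asyncio


async def fake_request(config):
    """Stand-in for a real HTTP request (no network involved)."""
    await asyncio.sleep(0.01)
    return {"status_code": 200, "url": config["url"]}


async def consumer(queue, results, tracker):
    """A 'while loop that makes requests' until the queue is empty."""
    while True:
        try:
            index, config = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        tracker["active"] += 1
        tracker["peak"] = max(tracker["peak"], tracker["active"])
        results[index] = await fake_request(config)
        tracker["active"] -= 1


async def run(configs, max_outstanding_requests):
    queue = asyncio.Queue()
    for item in enumerate(configs):
        queue.put_nowait(item)
    results = [None] * queue.qsize()
    tracker = {"active": 0, "peak": 0}
    # One consumer per allowed outstanding request.
    consumers = [
        consumer(queue, results, tracker)
        for _ in range(max_outstanding_requests)
    ]
    await asyncio.gather(*consumers)
    return results, tracker["peak"]


configs = [{"method": "get", "url": f"https://example.com/{i}"} for i in range(20)]
results, peak = asyncio.run(run(configs, max_outstanding_requests=5))
print(len(results), peak)
```

In this sketch the peak number of simultaneous requests never exceeds the number of consumers, which is why a value that is too low becomes a bottleneck and a value far above what the server can handle adds only overhead.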