python/asyncio

gather unexpectedly schedules coroutines in undeterministic order

JustinTArthur opened this issue · 9 comments

If you pass awaitables or coroutines to asyncio.gather, the function takes responsibility for wrapping them in futures and scheduling their execution. The order in which they are scheduled for execution is non-deterministic due to gather coercing the arguments to a unique set to ensure the same coroutine object isn't undertaken more than once.

Example:

import asyncio

order = []
@asyncio.coroutine
def append_a_thing(thing):
    order.append(thing)
    return thing

loop = asyncio.get_event_loop()
task = asyncio.gather(append_a_thing(1), append_a_thing(2), append_a_thing(3), append_a_thing(4))
loop.run_until_complete(results)

print("Things appended were {}.".format(task.result()))
print("Order of execution was {}.".format(order))

This might print:

Things appended were [1, 2, 3, 4].
Order of execution was [3, 1, 2, 4].

Either this is cool, and we should just warn the programmer that their coroutines may be scheduled (and executed) in any order their Python interpreter chooses…
or it's not cool and we should schedule tasks in the order expected by the programmer, while maintaining that unique awaitables and coroutines are only scheduled for execution once.

I'll submit a pull request for either scenario and someone else can decide what's cool.

Random execution order is totally correct from my perspective.
User usually don't care on order of coroutines starting.
Moreover, after first await the order becomes undetermenistic.

Most real coroutines wait for something first, so this scenario is pretty artificial. I see nothing wrong with the current semantics. I also don't think there's a need to call this out in the docs -- this is just life in the async lane.

A couple of real-world use cases I had in mind were heuristic search and usage simulation.

In heuristic search, if your coroutines were ordered by the quality of their heuristic solution, you'd want them to execute in close to that order and you may want their execution to be concurrent as to not waste any opportunity cost to waiting. Example assuming heuristic function closeness_to_my_solution:

goal = None

@asyncio.coroutine
def search(query):
    global goal
    if goal:
        return
    result = yield from aiodb.find(query)
    if goal_criteria(result):
        goal = result
    return result

queries_itinerary = sorted(queries, key=closeness_to_my_solution)
attempts = asyncio.gather(*(search(query) for query in queries_itinerary))
loop.run_until_complete(attempts)
if goal:
    print("Found what we were looking for: {}".format(goal))
    print("Attempts leading up to it: {}".format(a for a in attempts.result() if a is not None))

In usage simulation, you might be trying to simulate typical web traffic on a site for a stress test and you have an order you want the resources requested in that better simulates what a real user of your site might attempt to do and what it's browser might try to do (e.g. load 6 resources at a time). The results you gather might be response time.

These are not impossible tasks today... you can add them to the loop explicitly before calling gather, but it'd be convenient.

Reading the docs this behavior was really not expected. The current documentation for asyncio.gather specifically states (relevant part in bold):

If all the tasks are done successfully, the returned future’s result is the list of results (in the order of the original sequence, not necessarily the order of results arrival)

Simplified example from my library:

for i in range(n):
    send_request()

futures = [loop.create_task(wait_for_reply()) for i in range(n)]
values = await asyncio.gather(*futures, loop=loop)

Without wrapping each wait_for_reply() in create_task(), the order of replies is non-deterministic. I would probably have hunted ghosts for hours more if not encountering this issue ticket, so thanks for that. I'd suggest updating the current documentation to make this behavior clearer.

Agreed that life in the async lane requires dealing with these kinds of subtleties! Although, I don't think it is a completely artificial issue. In a hybrid world, when async functions/methods will be run next to blocking functions, executing them in order matters.

import time
import asyncio

def delay():
    time.sleep(1)

async def async_delay():
    await asyncio.sleep(1)

async def afuncwrapper(func):
    return func()

task1 = async_delay()
task2 = afuncwrapper(delay)
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(*[task1, task2]))

If task1 is started before task2 this takes 1 second, and if task2 is started before task1 it takes 2 seconds.

The real world example is a server trying to run a set of received commands concurrently. Some of those commands may be passed off to other servers with an await and some may be run locally in a blocking manner. Order can be maintained with work arounds, but being able to just call .gather as above is cleaner.

I don't think it is a needed fix, but we got bit by misreading the docs and making poor assumptions. A little note in the docs may save others time.

thanks. done.

pkch commented

@JustinTArthur I didn't quite understand your examples:

In heuristic search, if your coroutines were ordered by the quality of their heuristic solution, you'd want them to execute in close to that order and you may want their execution to be concurrent as to not waste any opportunity cost to waiting.

I assume that your search function is non-blocking (otherwise, why use event loop in the first place). But at the same time, you do gain some advantage from controlling in which order they are started. Since the event loop runs in a single thread, I am guessing your search function forwards the task to a pool of parallel workers - and if the entire pool is fully utilized, it puts the task in a wait queue. But in this case, you don't need an event loop here; you can simply use concurrent.futures.ProcessPoolExecutor.

I think the only situation where the order in which tasks are started matters is when the code needs mix blocking and non-blocking calls as in @mwfrojdman and @ssolari examples:

The real world example is a server trying to run a set of received commands concurrently. Some of those commands may be passed off to other servers with an await and some may be run locally in a blocking manner.

I feel it's cleaner to keep blocking and non-blocking calls separate. But asyncio doesn't (and cannot) prohibit such mixed usage. Since functions like gather don't preserve order, I suppose asyncio doesn't consider it to be in good style and so doesn't want to go out of its way to support it?