maxfischer2781/asyncstdlib

Add str.join(…, AsyncIterable) support

bryzgaloff opened this issue · 2 comments

Hi @maxfischer2781, and thank you for a handy lib! For my use case, it is missing a str.join variant that accepts an AsyncIterable as an argument. Here is the behaviour I expect:

async def generate_strings() -> typing.AsyncGenerator[str, None]:
    yield 'Hello'
    yield 'world!'

print(await asyncstdlib.join(' ', generate_strings()))  # prints 'Hello world!'

Thank you in advance! :)

I don't see how an async implementation could improve things over the naive case of realizing the list and using the regular str.join; the main advantage of str.join is its linear runtime, which naturally requires the entire iterable to be known. Internally, str.join will create a static list before joining.

I'm not completely opposed to having a str.join async variant, but the semantics would have to be clear. I would like the tools in asyncstdlib to have a clear "async advantage" to not mislead people into thinking there is some optimisation when there isn't.
Please clarify what semantics you would expect in terms of waiting or eagerly using the input.


If you just want to "join strings from an async iterable", the straightforward way is to create the list yourself:

print(" ".join([s async for s in generate_strings()]))

This has optimal total runtime and good memory consumption.
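For reference, a self-contained version of this one-liner might look as follows. It uses only the standard library; the `main` coroutine and the call to `asyncio.run` are assumptions added here to make the snippet runnable on its own.

```python
import asyncio
import typing


async def generate_strings() -> typing.AsyncGenerator[str, None]:
    yield 'Hello'
    yield 'world!'


async def main() -> str:
    # Collect every item into a list first, then let str.join
    # concatenate them in one pass.
    return " ".join([s async for s in generate_strings()])


result = asyncio.run(main())
print(result)  # prints 'Hello world!'
```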

If you want to handle any iterable, sync or async, use asyncstdlib.list (here with asyncstdlib imported as a):

print(" ".join(await a.list(generate_strings())))

This is very similar to the above, adding a bit of overhead in return for generality.

If you want to "iteratively and eagerly join strings from an async iterable", use asyncstdlib.reduce:

def str_join(sep, any_iterable):
    return a.reduce(lambda x, y: x + sep + y, any_iterable)

print(await str_join(" ", generate_strings()))

This has better memory consumption but quadratic total runtime.
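To make the cost of the eager variant visible without depending on asyncstdlib, here is a rough standard-library sketch of the same idea. `eager_join` is a hypothetical helper written for illustration, not a library function, and it differs from the reduce version in one assumed detail: it returns an empty string for an empty input, mirroring str.join.

```python
import asyncio
import typing


async def generate_strings() -> typing.AsyncGenerator[str, None]:
    yield 'Hello'
    yield 'world!'


async def eager_join(sep: str, items: typing.AsyncIterable[str]) -> str:
    # Hypothetical helper for illustration; not part of asyncstdlib.
    iterator = items.__aiter__()
    try:
        result = await iterator.__anext__()
    except StopAsyncIteration:
        return ""  # assumption: behave like str.join on an empty input
    async for item in iterator:
        # Only the accumulated string is kept, but each step may copy it,
        # which is where the quadratic worst-case runtime comes from.
        result = result + sep + item
    return result


print(asyncio.run(eager_join(" ", generate_strings())))  # prints 'Hello world!'
```

Only one accumulated string is alive at a time, so peak memory stays close to the size of the final result, at the price of repeated copying.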

Hi Max, thank you for a detailed response 👍

I suggested an "async" str.join as syntactic sugar on top of functions your library already provides. I understand your concern about a "clear async advantage", but your list implementation, for example, is also just syntactic sugar with no such advantage:

return [element async for element in aiter(iterable)]

A clear difference between the two cases is that your list is named exactly after the built-in list, while there is no good name for an async str.join (strjoin maybe?). The rest looks similar to me, though I respect your right as the author to decide where the border should be.

Feel free to close the issue if you think this does not follow your library's philosophy 👌

To me personally, await a.join(' ', generate_strings()) just looks a little more native than ' '.join(await a.list(…)).
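Until such a function exists, the sugar discussed in this thread can be kept on the user side. The sketch below is one possible shape for it, using only the standard library; `join` is a hypothetical name, not an asyncstdlib API, and the `hasattr` check for `__aiter__` is an assumed way of accepting both sync and async iterables.

```python
import asyncio
import typing


async def generate_strings() -> typing.AsyncGenerator[str, None]:
    yield 'Hello'
    yield 'world!'


async def join(
    sep: str,
    iterable: typing.Union[typing.Iterable[str], typing.AsyncIterable[str]],
) -> str:
    # Hypothetical user-side helper; not part of asyncstdlib.
    if hasattr(iterable, "__aiter__"):
        # Async input: collect all items first, then join in one pass.
        return sep.join([s async for s in iterable])
    # Sync input: defer directly to str.join.
    return sep.join(iterable)


print(asyncio.run(join(' ', generate_strings())))  # prints 'Hello world!'
```

This carries no "async advantage" beyond readability: it still waits for the whole input before producing any output.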