Possibly inaccurate doc about the use of partition
Arkham opened this issue · 9 comments
Hi all,
I was following this section of the Flow documentation regarding partition: https://hexdocs.pm/flow/Flow.html#module-partitioning
If I run this code which doesn't have the partition step:
defmodule Test do
  def run do
    {:ok, stream} =
      "roses are red\nviolets are blue\n"
      |> StringIO.open()

    stream
    |> IO.binstream(:line)
    |> Flow.from_enumerable()
    |> Flow.flat_map(&String.split(&1, " "))
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, & &1 + 1)
    end)
    |> Enum.to_list()
  end
end
I should receive something like:
[{"roses", 1}, {"are", 1}, {"red", 1}, {"violets", 1}, {"are", 1}, {"blue", 1}]
But instead I see this:
[{"are", 2}, {"blue\n", 1}, {"red\n", 1}, {"roses", 1}, {"violets", 1}]
That's because the input is too small: everything is sent in a single batch to a single producer/consumer, which can then count it correctly. Can you please send a PR that adds this clarification to the docs? Thank you!
Of course, do you think there is any way to show the advantage of using 'partition' in a simpler piece of code?
Unfortunately, you can only specify the max_demand when you use partition, so I just added a paragraph in the doc to explain that this can happen.
@Arkham you can specify max_demand on from_enumerable. :) Can you please give it a try?
I gave it a quick try locally and I got this by passing max_demand: 1 to from_enumerable:
[{"are", 1}, {"red\n", 1}, {"roses", 1}, {"are", 1}, {"blue\n", 1}, {"violets", 1}]
Aha, that's really cool, I can add that to the doc. Should I remove the comment then?
Closing this in favor of the PR anyway. :)
I think you can keep your comment and show an example with max_demand: 1 to illustrate how you can reproduce it. :)
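For anyone landing here later, a minimal sketch of what such an example could look like. It is not the text that ended up in the docs. It assumes Flow ~> 1.0 is fetched with Mix.install (Elixir 1.12 or later); the module and function names (PartitionDemo, without_partition, with_partition) are made up for illustration, and it uses String.split/1 rather than splitting on " " so the trailing newlines are dropped from the words.

```elixir
# Assumption: Flow ~> 1.0 from Hex; Mix.install needs Elixir 1.12+.
Mix.install([{:flow, "~> 1.0"}])

defmodule PartitionDemo do
  @text "roses are red\nviolets are blue\n"

  # Hypothetical helper: reads the text line by line and splits it into words.
  # max_demand: 1 makes each stage ask for a single line at a time, so the two
  # lines can end up on different stages even though the input is tiny.
  defp words do
    {:ok, device} = StringIO.open(@text)

    device
    |> IO.binstream(:line)
    |> Flow.from_enumerable(max_demand: 1)
    |> Flow.flat_map(&String.split/1)
  end

  # Counts words per stage and collects every stage's map entries into a list.
  defp count(flow) do
    flow
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, &(&1 + 1))
    end)
    |> Enum.to_list()
  end

  # No partition: each stage counts only the lines it happened to receive,
  # so the same word may show up once per stage.
  def without_partition, do: count(words())

  # With partition: words are routed by hash, so a given word always reaches
  # the same stage and is counted in one place.
  def with_partition, do: words() |> Flow.partition() |> count()
end

IO.inspect(PartitionDemo.without_partition(), label: "without partition")
IO.inspect(PartitionDemo.with_partition(), label: "with partition")
```

Whether the unpartitioned run actually splits "are" into two entries depends on how many stages are started and how the lines are dispatched, so treat that output as likely rather than guaranteed. The partitioned run should always report "are" exactly once with a count of 2, although the order of the entries can vary.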