Possibly inaccurate doc about the use of partition

Question

Arkham opened this issue 7 years ago · 9 comments

Hi all,

I was following this section of the Flow documentation regarding partition: https://hexdocs.pm/flow/Flow.html#module-partitioning

If I run this code which doesn't have the partition step:

defmodule Test do
  def run do
    {:ok, stream} =
      "roses are red\nviolets are blue\n"
      |> StringIO.open()

    stream
    |> IO.binstream(:line)
    |> Flow.from_enumerable()
    |> Flow.flat_map(&String.split(&1, " "))
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, & &1 + 1)
    end)
    |> Enum.to_list()
  end
end

According to the docs, I should receive something like:

[{"roses", 1}, {"are", 1}, {"red", 1}, {"violets", 1}, {"are", 1}, {"blue", 1}]

But instead I see this:

[{"are", 2}, {"blue\n", 1}, {"red\n", 1}, {"roses", 1}, {"violets", 1}]

Answer 1 · 2017-05-11T15:07:06.000Z

That's because the input is too small: everything is sent in a single batch to a single producer/consumer, which can then count it correctly. Can you please send a PR that adds this clarification to the docs? Thank you!

Answer 2 · 2017-05-11T15:08:37.000Z

Of course, do you think there is any way to show the advantage of using 'partition' in a simpler piece of code?

Answer 3 · 2017-05-11T15:32:17.000Z

Set max_demand to 1 or 2 maybe?

-- José Valim, Founder and Director of R&D, www.plataformatec.com.br

Answer 4 · 2017-05-12T15:06:42.000Z

Unfortunately, you can only specify the max_demand when you use partition, so I just added a paragraph in the doc to explain that this can happen.

Answer 5 · 2017-05-12T15:45:54.000Z

@Arkham you can specify max_demand on from_enumerable. :) Can you please give it a try?

Answer 6 · 2017-05-12T15:47:32.000Z

I gave it a quick try locally and I got this by passing max_demand: 1 to from_enumerable:

[{"are", 1}, {"red\n", 1}, {"roses", 1}, {"are", 1}, {"blue\n", 1}, {"violets", 1}]
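For anyone who wants to reproduce this, here is a minimal sketch of the original snippet with max_demand: 1 passed to from_enumerable. It assumes Elixir 1.12+ (for Mix.install) and a Flow 1.x release from Hex. The stages: 2 option is an addition not mentioned above, included only so the result depends less on how many cores the machine has. The order of the entries, and which stage receives which line, are not guaranteed, so the output can differ between runs.

```elixir
# Assumption: Elixir 1.12+ and the :flow package (1.x) available from Hex.
Mix.install([{:flow, "~> 1.0"}])

defmodule Test do
  def run do
    {:ok, stream} = StringIO.open("roses are red\nviolets are blue\n")

    stream
    |> IO.binstream(:line)
    # max_demand: 1 makes each stage ask for one line at a time, so the two
    # lines can land on different stages. stages: 2 is an assumption added
    # here; the thread only mentions max_demand.
    |> Flow.from_enumerable(max_demand: 1, stages: 2)
    |> Flow.flat_map(&String.split(&1, " "))
    # Without Flow.partition/1, each stage keeps its own map, so the same
    # word may be counted separately on each stage.
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, &(&1 + 1))
    end)
    |> Enum.to_list()
  end
end

result = Test.run()
IO.inspect(result)
```

When the two lines are handled by different stages, "are" shows up as two separate entries with a count of 1 each, which is the behaviour the docs describe.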

Answer 7 · 2017-05-12T15:48:32.000Z

Aha, that's really cool, I can add that to the doc. Should I remove the comment then?

Answer 8 · 2017-05-12T15:48:33.000Z

Closing this in favor of the PR anyway. :)

Answer 9 · 2017-05-12T15:49:54.000Z

I think you can keep your comment and show an example with max_demand: 1 to illustrate how you can reproduce it. :)
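For contrast, a sketch of the same pipeline with Flow.partition/1 added before the reduce, still using max_demand: 1. It assumes Elixir 1.12+ (for Mix.install) and a Flow 1.x release from Hex; the module name PartitionedTest is made up for this illustration. Because partition routes equal words to the same stage, each word should end up with a single, complete count no matter how the lines were split across stages upstream.

```elixir
# Assumption: Elixir 1.12+ and the :flow package (1.x) available from Hex.
Mix.install([{:flow, "~> 1.0"}])

# PartitionedTest is a hypothetical name used only for this sketch.
defmodule PartitionedTest do
  def run do
    {:ok, stream} = StringIO.open("roses are red\nviolets are blue\n")

    stream
    |> IO.binstream(:line)
    |> Flow.from_enumerable(max_demand: 1)
    |> Flow.flat_map(&String.split(&1, " "))
    # partition/1 hashes each word so that equal words always reach the
    # same reducing stage.
    |> Flow.partition()
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, &(&1 + 1))
    end)
    |> Enum.to_list()
  end
end

# The emission order across stages is not guaranteed, so sort before printing.
result = PartitionedTest.run() |> Enum.sort()
IO.inspect(result)
# → [{"are", 2}, {"blue\n", 1}, {"red\n", 1}, {"roses", 1}, {"violets", 1}]
```

The trailing "\n" on "red" and "blue" is still there because the words are split on spaces only, as in the original snippet.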