bitwalker/combine

Error thrown in `Combine.Parsers.Base.pair_right_impl/3`

yonkeltron opened this issue · 6 comments

Hello, Paul! I may have found a bug, but I also might be using the library wrong. Either way, I don't know if it's write for Combine to throw an error like this.

I ran into the following problem:

iex(1)> str = """
...(1)> PANDA "bamboo", CURRY "noodle"
...(1)> """
"PANDA \"bamboo\", CURRY \"noodle\"\n"
iex(2)> Combine.parse(str, Example.separated_property_pairs())
** (FunctionClauseError) no function clause matching in anonymous fn/1 in Combine.Parsers.Base.pair_right_impl/3

    The following arguments were given to anonymous fn/1 in Combine.Parsers.Base.pair_right_impl/3:

        # 1
        [:__ignore, :__ignore, ["CURRY", "noodle"]]

    (combine) lib/combine/parsers/base.ex:234: anonymous fn/1 in Combine.Parsers.Base.pair_right_impl/3
    (combine) lib/combine/parsers/base.ex:154: Combine.Parsers.Base.pipe_impl/3
    (combine) lib/combine/parsers/base.ex:315: Combine.Parsers.Base.many1_impl/2
    (combine) lib/combine/parsers/base.ex:347: Combine.Parsers.Base.many_impl/2
    (combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
    (combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
    (combine) lib/combine.ex:52: Combine.parse/2

The parser code looks like this:

defmodule Example do
  @moduledoc "Module showing off an issue I've encountered with `Combine`"

  use Combine

  @doc "Entrypoint here!"
  def separated_property_pairs do
    sep_by1(property_pair(), comma_and_whitespace())
  end
  
  def property_pair do    
    sequence([property_name(), ignore(whitespace()), property_value()])
  end

  def property_value do
    quoted_string()
    |> label("PropertyValue")
  end
  
  def property_name do
    ~r{[A-Z]}
    |> word_of()
    |> label("PropertyName")
  end
  
  def comma_and_whitespace do
    ","
    |> char()
    |> ignore()
    |> many(whitespace())
    |> ignore()
  end

  def quoted_string(previous \\ nil) do
    previous
    |> between(char("\""), words(), char("\""))
    |> label("QuotedStringValue")
    |> map(&Enum.join(&1, ""))
  end

  def words do
    many(either(word(), whitespace()))
  end
  
  def whitespace do
    [space(), tab(), newline()] |> choice()
  end
end

Fixed in master, will be going out in 0.10.0 today

Looks like this introduced another problem, I believe!

Gives me:

iex(1)> Combine.parse(Example.str(), Example.key_val_pairs())
** (CaseClauseError) no case clause matching: %Combine.ParserState{column: 19, error: nil, input: "\n", labels: [:key_value, :key_name, :key_value, :key_name], line: 2, results: [:__ignore, "noodle flower", :__ignore, :__ignore, "Tree"], status: :ok}
    (combine) lib/combine/parsers/base.ex:490: Combine.Parsers.Base.preserve_ignored_impl/2
    (combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
    (combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
    (combine) lib/combine/parsers/base.ex:333: Combine.Parsers.Base.many1_impl/2
    (combine) lib/combine/parsers/base.ex:365: Combine.Parsers.Base.many_impl/2
    (combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
    (combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
    (combine) lib/combine.ex:52: Combine.parse/3

From this code:

defmodule Example do
  @moduledoc "Main parser for spaceship documents"

  use Combine

  def str do
    """
Panda: bamboo curry noodle
Tree: noodle flower
"""
  end

  @doc "Entry point!"
  def key_val_pairs(previous \\ nil) do
    ignored_newline = newline() |> ignore()
    
    previous
    |> sep_by1(key_val_pair(), ignored_newline)
  end

  def key_val_pair(previous \\ nil) do
    previous
    |> word()
    |> label(:key_name)
    |> ignore(char(":"))
    |> ignore(many(spacing()))
    |> words()
    |> label(:key_value)
    |> ignore(many(spacing()))
  end

  def words(previous) do
    previous
    |> many(either(word(), spacing()))
    |> map(fn ws -> Enum.join(ws, "") end)
  end
  
  def spacing do
    [space(), tab()] |> choice()
  end

  def whitespace do
    [spacing(), newline()] |> choice()
  end
end

Huh, I'm getting this:

Including tags: [line: "282"]
Excluding tags: [:test]



  1) test issue #41 (Combine.Test)
     test/integration_test.exs:274
     match (=) failed
     code:  assert [] = Combine.parse(str, Example2.key_val_pairs())
     right: {:error, "Expected `key_value` at line 1, column 6."}
     stacktrace:
       test/integration_test.exs:284: (test)



Finished in 0.2 seconds
7 tests, 1 failure, 6 skipped

Randomized with seed 903572

Where Example2 is the module definition you provided, and str is the string the module defined. Is the code you pasted the same that caused the problem?

Yes, but the string might be different. Try the below:

defmodule Example do
  @moduledoc "Example code which causes a problem"

  use Combine

  def str do
    """
Panda: bamboo curry noodle
Tree: noodle flower
"""
  end

  @doc "Entry point!"
  def cause_a_problem do
    Combine.parse(str(), key_val_pairs())
  end

  def key_val_pairs(previous \\ nil) do
    ignored_newline = newline() |> ignore()

    previous
    |> sep_by1(key_val_pair(), ignored_newline)
    |> label("properties")
  end

  def key_val_pair(previous \\ nil) do
    previous
    |> word()
    |> label("key_name")
    |> ignore(char(":"))
    |> ignore(many(spacing()))
    |> words()
    |> label("key_value")
    |> ignore(many(spacing()))
  end

  def words(previous) do
    previous
    |> many(either(word(), spacing()))
    |> map(fn ws -> Enum.join(ws, "") end)
  end

  def spacing do
    [space(), tab()] |> choice()
  end

  def whitespace do
    [spacing(), newline()] |> choice()
  end
end

And the tests:

defmodule ExampleTest do
  use ExUnit.Case

  # passes
  test "key_val_pair" do
    assert ["Panda", "Bamboo"] == Combine.parse("Panda: Bamboo", Example.key_val_pair())
  end

  # fails with error
  test "key_val_pairs" do
    expected = [["Panda", "bamboo curry noodle"], ["Tree", "noodle flower"]]
    actual = Combine.parse(Example.str(), Example.key_val_pairs())
    assert expected == actual
  end
end

In IEx:

Erlang/OTP 20 [erts-9.0] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Interactive Elixir (1.5.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Example.cause_a_problem()
** (CaseClauseError) no case clause matching: %Combine.ParserState{column: 19, error: nil, input: "\n", labels: ["key_value", "key_name", "key_value", "key_name"], line: 2, results: [:__ignore, "noodle flower", :__ignore, :__ignore, "Tree"], status: :ok}
    (combine) lib/combine/parsers/base.ex:490: Combine.Parsers.Base.preserve_ignored_impl/2
    (combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
    (combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
    (combine) lib/combine/parsers/base.ex:333: Combine.Parsers.Base.many1_impl/2
    (combine) lib/combine/parsers/base.ex:365: Combine.Parsers.Base.many_impl/2
    (combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
    (combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
    (combine) lib/combine/parsers/base.ex:606: Combine.Parsers.Base.label_impl/3

Alright, so after reviewing, I've found the source of the problem. However it's not an easy fix since it will require breaking changes (which is fine, just not ideal). The problem is that the pipe parser flattens out the parsers it is given - many parsers use pipe underneath, including sequence, all of the pair_*, between, and more. I can't quite recall the specific reason this decision was made, but in my current frame of mind it was a mistake. I guess I haven't run into this due to the way I define my parsers.

The change required is to make pipe push a new list of results into the parser state, this way we can handle the results predictably - then flatten the results out at the end. Which brings up another issue I just found, which is that the new keyword labelling stuff doesn't do the right thing, so that will need to be fixed as part of this effort as well.

That said, as a workaround for you (I think) - be aware that with sep_by, pair_* and some of the other functions, you do not need to use ignore - as it already handles ignoring the part which is not needed (e.g. the separator in sep_by). If you have a sequence of parsers to apply, I would use sequence to wrap them, rather than piping them one into the other, which is why you are running into issues with parsers getting a number of results they don't expect.

In the meantime I'll work on the changes I described above.

Yeah, I can confirm that keyword labelling doesn't work the way you'd expect. I'd personally expect it to work much like Parslet's output works. Let me know when I can test and I'll give it a whirl so I can provide feedback.

Also, the sequence trick worked great, except now my tree is super dense and nested! Nothing a little pattern matching can't handle, though...