Error thrown in `Combine.Parsers.Base.pair_right_impl/3`
yonkeltron opened this issue · 6 comments
Hello, Paul! I may have found a bug, but I also might be using the library wrong. Either way, I don't know if it's write for Combine to throw an error like this.
I ran into the following problem:
iex(1)> str = """
...(1)> PANDA "bamboo", CURRY "noodle"
...(1)> """
"PANDA \"bamboo\", CURRY \"noodle\"\n"
iex(2)> Combine.parse(str, Example.separated_property_pairs())
** (FunctionClauseError) no function clause matching in anonymous fn/1 in Combine.Parsers.Base.pair_right_impl/3
The following arguments were given to anonymous fn/1 in Combine.Parsers.Base.pair_right_impl/3:
# 1
[:__ignore, :__ignore, ["CURRY", "noodle"]]
(combine) lib/combine/parsers/base.ex:234: anonymous fn/1 in Combine.Parsers.Base.pair_right_impl/3
(combine) lib/combine/parsers/base.ex:154: Combine.Parsers.Base.pipe_impl/3
(combine) lib/combine/parsers/base.ex:315: Combine.Parsers.Base.many1_impl/2
(combine) lib/combine/parsers/base.ex:347: Combine.Parsers.Base.many_impl/2
(combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
(combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
(combine) lib/combine.ex:52: Combine.parse/2
The parser code looks like this:
defmodule Example do
@moduledoc "Module showing off an issue I've encountered with `Combine`"
use Combine
@doc "Entrypoint here!"
def separated_property_pairs do
sep_by1(property_pair(), comma_and_whitespace())
end
def property_pair do
sequence([property_name(), ignore(whitespace()), property_value()])
end
def property_value do
quoted_string()
|> label("PropertyValue")
end
def property_name do
~r{[A-Z]}
|> word_of()
|> label("PropertyName")
end
def comma_and_whitespace do
","
|> char()
|> ignore()
|> many(whitespace())
|> ignore()
end
def quoted_string(previous \\ nil) do
previous
|> between(char("\""), words(), char("\""))
|> label("QuotedStringValue")
|> map(&Enum.join(&1, ""))
end
def words do
many(either(word(), whitespace()))
end
def whitespace do
[space(), tab(), newline()] |> choice()
end
end
Fixed in master, will be going out in 0.10.0 today
Looks like this introduced another problem, I believe!
Gives me:
iex(1)> Combine.parse(Example.str(), Example.key_val_pairs())
** (CaseClauseError) no case clause matching: %Combine.ParserState{column: 19, error: nil, input: "\n", labels: [:key_value, :key_name, :key_value, :key_name], line: 2, results: [:__ignore, "noodle flower", :__ignore, :__ignore, "Tree"], status: :ok}
(combine) lib/combine/parsers/base.ex:490: Combine.Parsers.Base.preserve_ignored_impl/2
(combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
(combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
(combine) lib/combine/parsers/base.ex:333: Combine.Parsers.Base.many1_impl/2
(combine) lib/combine/parsers/base.ex:365: Combine.Parsers.Base.many_impl/2
(combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
(combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
(combine) lib/combine.ex:52: Combine.parse/3
From this code:
defmodule Example do
@moduledoc "Main parser for spaceship documents"
use Combine
def str do
"""
Panda: bamboo curry noodle
Tree: noodle flower
"""
end
@doc "Entry point!"
def key_val_pairs(previous \\ nil) do
ignored_newline = newline() |> ignore()
previous
|> sep_by1(key_val_pair(), ignored_newline)
end
def key_val_pair(previous \\ nil) do
previous
|> word()
|> label(:key_name)
|> ignore(char(":"))
|> ignore(many(spacing()))
|> words()
|> label(:key_value)
|> ignore(many(spacing()))
end
def words(previous) do
previous
|> many(either(word(), spacing()))
|> map(fn ws -> Enum.join(ws, "") end)
end
def spacing do
[space(), tab()] |> choice()
end
def whitespace do
[spacing(), newline()] |> choice()
end
end
Huh, I'm getting this:
Including tags: [line: "282"]
Excluding tags: [:test]
1) test issue #41 (Combine.Test)
test/integration_test.exs:274
match (=) failed
code: assert [] = Combine.parse(str, Example2.key_val_pairs())
right: {:error, "Expected `key_value` at line 1, column 6."}
stacktrace:
test/integration_test.exs:284: (test)
Finished in 0.2 seconds
7 tests, 1 failure, 6 skipped
Randomized with seed 903572
Where Example2
is the module definition you provided, and str
is the string the module defined. Is the code you pasted the same that caused the problem?
Yes, but the string might be different. Try the below:
defmodule Example do
@moduledoc "Example code which causes a problem"
use Combine
def str do
"""
Panda: bamboo curry noodle
Tree: noodle flower
"""
end
@doc "Entry point!"
def cause_a_problem do
Combine.parse(str(), key_val_pairs())
end
def key_val_pairs(previous \\ nil) do
ignored_newline = newline() |> ignore()
previous
|> sep_by1(key_val_pair(), ignored_newline)
|> label("properties")
end
def key_val_pair(previous \\ nil) do
previous
|> word()
|> label("key_name")
|> ignore(char(":"))
|> ignore(many(spacing()))
|> words()
|> label("key_value")
|> ignore(many(spacing()))
end
def words(previous) do
previous
|> many(either(word(), spacing()))
|> map(fn ws -> Enum.join(ws, "") end)
end
def spacing do
[space(), tab()] |> choice()
end
def whitespace do
[spacing(), newline()] |> choice()
end
end
And the tests:
defmodule ExampleTest do
use ExUnit.Case
# passes
test "key_val_pair" do
assert ["Panda", "Bamboo"] == Combine.parse("Panda: Bamboo", Example.key_val_pair())
end
# fails with error
test "key_val_pairs" do
expected = [["Panda", "bamboo curry noodle"], ["Tree", "noodle flower"]]
actual = Combine.parse(Example.str(), Example.key_val_pairs())
assert expected == actual
end
end
In IEx:
Erlang/OTP 20 [erts-9.0] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (1.5.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Example.cause_a_problem()
** (CaseClauseError) no case clause matching: %Combine.ParserState{column: 19, error: nil, input: "\n", labels: ["key_value", "key_name", "key_value", "key_name"], line: 2, results: [:__ignore, "noodle flower", :__ignore, :__ignore, "Tree"], status: :ok}
(combine) lib/combine/parsers/base.ex:490: Combine.Parsers.Base.preserve_ignored_impl/2
(combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
(combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
(combine) lib/combine/parsers/base.ex:333: Combine.Parsers.Base.many1_impl/2
(combine) lib/combine/parsers/base.ex:365: Combine.Parsers.Base.many_impl/2
(combine) lib/combine/parsers/base.ex:163: Combine.Parsers.Base.do_pipe/3
(combine) lib/combine/parsers/base.ex:152: Combine.Parsers.Base.pipe_impl/3
(combine) lib/combine/parsers/base.ex:606: Combine.Parsers.Base.label_impl/3
Alright, so after reviewing, I've found the source of the problem. However it's not an easy fix since it will require breaking changes (which is fine, just not ideal). The problem is that the pipe
parser flattens out the parsers it is given - many parsers use pipe
underneath, including sequence
, all of the pair_*
, between
, and more. I can't quite recall the specific reason this decision was made, but in my current frame of mind it was a mistake. I guess I haven't run into this due to the way I define my parsers.
The change required is to make pipe
push a new list of results into the parser state, this way we can handle the results predictably - then flatten the results out at the end. Which brings up another issue I just found, which is that the new keyword labelling stuff doesn't do the right thing, so that will need to be fixed as part of this effort as well.
That said, as a workaround for you (I think) - be aware that with sep_by
, pair_*
and some of the other functions, you do not need to use ignore
- as it already handles ignoring the part which is not needed (e.g. the separator in sep_by
). If you have a sequence of parsers to apply, I would use sequence
to wrap them, rather than piping them one into the other, which is why you are running into issues with parsers getting a number of results they don't expect.
In the meantime I'll work on the changes I described above.
Yeah, I can confirm that keyword labelling doesn't work the way you'd expect. I'd personally expect it to work much like Parslet's output works. Let me know when I can test and I'll give it a whirl so I can provide feedback.
Also, the sequence
trick worked great, except now my tree is super dense and nested! Nothing a little pattern matching can't handle, though...