Parsing a multi-line conditional expression causes exception - Unexpected token Token('QMARK', '?')
kartikp10 opened this issue · 7 comments
Hello!
I have observed that parsing a ternary (conditional) expression fails when the expression spans multiple lines. For example:
data "aws_iam_policy_document" "sample" {
  source_json = (
    length(var.sample_value) > 0
    ? data.aws_iam_policy_document.sample_reader.json
    : ""
  )
}
Loading this file with hcl2.load(file) results in this exception:
Traceback (most recent call last):
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 59, in get_action
return states[state][token.type]
KeyError: 'QMARK'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/kpande/Downloads/sampletf/hcl2_test.py", line 4, in <module>
dict = hcl2.load(file)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/hcl2/api.py", line 9, in load
return loads(file.read())
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/hcl2/api.py", line 18, in loads
return hcl2.parse(text + "\n")
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/lark.py", line 464, in parse
return self.parser.parse(text, start=start)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parser_frontends.py", line 115, in parse
return self._parse(token_stream, start)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parser_frontends.py", line 63, in _parse
return self.parser.parse(input, start, *args)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 35, in parse
return self.parser.parse(*args)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 88, in parse
action, arg = get_action(token)
File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 66, in get_action
raise UnexpectedToken(token, expected, state=state, puppet=puppet)
lark.exceptions.UnexpectedToken: Unexpected token Token('QMARK', '?') at line 4, column 5.
Expected one of:
* __ANON_1
* __ANON_0
* RPAR
* __ANON_2
If, however, the expression is written on a single line, parsing works fine.
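Since the single-line form parses, one possible stopgap is to fold the continuation lines back onto the previous line before handing the text to the parser. The sketch below is only an illustration: `join_ternary_lines` is a hypothetical helper, not part of python-hcl2, and it works on raw text, so it would also rewrite heredocs, strings, or comments that happen to contain a line starting with `?` or `:`.

```python
import re

# Hypothetical helper, not part of python-hcl2.
# Matches a line break (plus surrounding spaces/tabs) when the next line
# starts with "?" or ":", i.e. a continuation of a conditional expression.
_CONTINUATION = re.compile(r"[ \t]*\r?\n[ \t]*(?=[?:])")


def join_ternary_lines(text: str) -> str:
    """Fold lines that begin with '?' or ':' onto the preceding line.

    Assumption: the input has no heredocs, strings, or comments with a
    line starting with '?' or ':'; those would be folded as well.
    """
    return _CONTINUATION.sub(" ", text)


source = """\
locals {
  v = (
    true
    ? 1
    : 0
  )
}
"""

# The conditional now sits on one line: "    true ? 1 : 0"
print(join_ternary_lines(source))
```

The folded text can then be passed to hcl2.loads in place of the original. Note that this changes line numbers, so any positions reported in later parse errors will refer to the folded text rather than the file on disk.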
Also running into this. I wonder if it's as simple as editing https://github.com/amplify-education/python-hcl2/blob/master/hcl2/hcl2.lark#L12 to be
conditional : expression new_line_or_comment? "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression
Yeah this should do it
conditional : expression new_line_or_comment? "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression new_line_or_comment?
It seems that adding the first new_line_or_comment? confuses the parser: every time an expression is followed by a new line, it expects a question mark. I assume that's because the LALR parser looks only one token ahead.
As an alternative, I tried adding new_line_or_comment? to the end of the expression rule so it looks like:
?expression : (expr_term | operation | conditional) new_line_or_comment?
but then other parts of the parsing break.
Any ideas? @ianvonseggern1 @kartikp10
Yeah I noticed the same when I tried it :( Unfortunately I'm really not familiar with lark so I have very few ideas about what to try next
The PR was merged. This issue can be closed now.
I'm still able to reproduce this on 4.3.4:
hcl2.loads("""\
locals {
  v = (
    true
    ? 1
    : 0
  )
}
""")
produces
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser_state.py", line 77, in feed_token
action, arg = states[state][token.type]
KeyError: 'QMARK'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/homebrew/lib/python3.9/site-packages/hcl2/api.py", line 27, in loads
tree = hcl2.parse(text + "\n")
File "/opt/homebrew/lib/python3.9/site-packages/lark/lark.py", line 658, in parse
return self.parser.parse(text, start=start, on_error=on_error)
File "/opt/homebrew/lib/python3.9/site-packages/lark/parser_frontends.py", line 104, in parse
return self.parser.parse(stream, chosen_start, **kw)
File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 42, in parse
return self.parser.parse(lexer, start)
File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 88, in parse
return self.parse_from_state(parser_state)
File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 111, in parse_from_state
raise e
File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 102, in parse_from_state
state.feed_token(token)
File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser_state.py", line 80, in feed_token
raise UnexpectedToken(token, expected, state=self, interactive_parser=None)
lark.exceptions.UnexpectedToken: Unexpected token Token('QMARK', '?') at line 4, column 5.