amplify-education/python-hcl2

Parsing a multi-line conditional expression causes exception - Unexpected token Token('QMARK', '?')

kartikp10 opened this issue · 7 comments

Hello!

I have observed that parsing a ternary expression fails if the expression spans multiple lines. For example:

data "aws_iam_policy_document" "sample" {
  source_json = (
    length(var.sample_value) > 0
    ? data.aws_iam_policy_document.sample_reader.json
    : ""
  )
}

Loading this file with hcl2.load(file) raises this exception:

Traceback (most recent call last):
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 59, in get_action
    return states[state][token.type]
KeyError: 'QMARK'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/kpande/Downloads/sampletf/hcl2_test.py", line 4, in <module>
    dict = hcl2.load(file)
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/hcl2/api.py", line 9, in load
    return loads(file.read())
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/hcl2/api.py", line 18, in loads
    return hcl2.parse(text + "\n")
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/lark.py", line 464, in parse
    return self.parser.parse(text, start=start)
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parser_frontends.py", line 115, in parse
    return self._parse(token_stream, start)
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parser_frontends.py", line 63, in _parse
    return self.parser.parse(input, start, *args)
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 35, in parse
    return self.parser.parse(*args)
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 88, in parse
    action, arg = get_action(token)
  File "/Users/kpande/.pyenv/versions/3.9.4/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 66, in get_action
    raise UnexpectedToken(token, expected, state=state, puppet=puppet)
lark.exceptions.UnexpectedToken: Unexpected token Token('QMARK', '?') at line 4, column 5.
Expected one of:
	* __ANON_1
	* __ANON_0
	* RPAR
	* __ANON_2

If, however, the expression is on a single line, parsing works fine.
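
Since the single-line form parses, one possible stopgap is to flatten line breaks inside parentheses before handing the text to the parser. The following is only a minimal sketch: `collapse_parenthesized_newlines` is a hypothetical helper, not part of python-hcl2, and it assumes the input has no comments or heredocs inside parentheses and no nested quotes inside string interpolations.

```python
def collapse_parenthesized_newlines(text: str) -> str:
    """Replace line breaks that occur inside "(" ... ")" with spaces.

    Hypothetical helper, not part of python-hcl2. It skips over
    double-quoted strings so parentheses inside them are ignored.
    Assumption: no comments or heredocs appear inside parentheses,
    and interpolations do not contain nested double quotes.
    """
    out = []
    depth = 0          # current nesting level of round parentheses
    in_string = False  # currently inside a double-quoted string?
    escaped = False    # previous character was a backslash in a string?
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch == "\n" and depth > 0:
            out.append(" ")  # join the continuation line onto the current one
        else:
            out.append(ch)
    return "".join(out)


source = """locals {
  v = (
    true
    ? 1
    : 0
  )
}
"""

# The conditional now sits on one line; braces and other newlines are untouched.
flattened = collapse_parenthesized_newlines(source)
print(flattened)
```

The flattened text could then be passed to hcl2.loads in place of the original. Line numbers in any later error message will refer to the flattened text, not the original file.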

Also running into this. I wonder if it's as simple as editing https://github.com/amplify-education/python-hcl2/blob/master/hcl2/hcl2.lark#L12 to be

conditional : expression new_line_or_comment? "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression

Yeah this should do it

conditional : expression new_line_or_comment? "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression new_line_or_comment?

Seems like adding the first new_line_or_comment? confuses the parser: every time there's an expression followed by a new line, it expects a question mark. I assume it's because the LALR parser only looks 1 token ahead.

As another method I tried adding the new_line_or_comment? to the end of the expression rule so it looks like:

?expression : (expr_term | operation | conditional) new_line_or_comment?

but then some of the parsing also breaks.

Any ideas? @ianvonseggern1 @kartikp10

Yeah I noticed the same when I tried it :( Unfortunately I'm really not familiar with lark so I have very few ideas about what to try next

Raised a PR to fix this - #128

@kartikp10 @ianvonseggern1 @arielkru

The PR was merged. This issue can be closed now.

I'm still able to reproduce this on 4.3.4:

hcl2.loads("""\
locals {
  v = (
    true
    ? 1
    : 0
  )
}
""")

produces

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser_state.py", line 77, in feed_token
    action, arg = states[state][token.type]
KeyError: 'QMARK'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/lib/python3.9/site-packages/hcl2/api.py", line 27, in loads
    tree = hcl2.parse(text + "\n")
  File "/opt/homebrew/lib/python3.9/site-packages/lark/lark.py", line 658, in parse
    return self.parser.parse(text, start=start, on_error=on_error)
  File "/opt/homebrew/lib/python3.9/site-packages/lark/parser_frontends.py", line 104, in parse
    return self.parser.parse(stream, chosen_start, **kw)
  File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 42, in parse
    return self.parser.parse(lexer, start)
  File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 88, in parse
    return self.parse_from_state(parser_state)
  File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 111, in parse_from_state
    raise e
  File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser.py", line 102, in parse_from_state
    state.feed_token(token)
  File "/opt/homebrew/lib/python3.9/site-packages/lark/parsers/lalr_parser_state.py", line 80, in feed_token
    raise UnexpectedToken(token, expected, state=self, interactive_parser=None)
lark.exceptions.UnexpectedToken: Unexpected token Token('QMARK', '?') at line 4, column 5.