rust plugin: wrong type annotation for args
namiwang opened this issue · 5 comments
Hi, I'm trying to implement an experimental parser with a custom tokenizer, in Rust, via syntax-cli.
Here's a simple grammar (partially ripped from Ruby's):
0. $accept -> program
-----------------------
1. program -> top_compstmt
2. top_compstmt -> top_stmts opt_terms
3. top_stmts -> top_stmt
4. top_stmt -> stmt
5. stmt -> expr
6. expr -> arg
7. arg -> primary
8. primary -> literal
9. literal -> numeric
10. numeric -> simple_numeric
11. simple_numeric -> tINTEGER
12. opt_terms -> terms
13. term -> tNL
14. terms -> term
┌────────────────┬───────────┐
│ Symbol │ First set │
├────────────────┼───────────┤
│ $accept │ tINTEGER │
├────────────────┼───────────┤
│ program │ tINTEGER │
├────────────────┼───────────┤
│ top_compstmt │ tINTEGER │
├────────────────┼───────────┤
│ top_stmts │ tINTEGER │
├────────────────┼───────────┤
│ top_stmt │ tINTEGER │
├────────────────┼───────────┤
│ stmt │ tINTEGER │
├────────────────┼───────────┤
│ expr │ tINTEGER │
├────────────────┼───────────┤
│ arg │ tINTEGER │
├────────────────┼───────────┤
│ primary │ tINTEGER │
├────────────────┼───────────┤
│ literal │ tINTEGER │
├────────────────┼───────────┤
│ numeric │ tINTEGER │
├────────────────┼───────────┤
│ simple_numeric │ tINTEGER │
├────────────────┼───────────┤
│ tINTEGER │ tINTEGER │
├────────────────┼───────────┤
│ opt_terms │ tNL │
├────────────────┼───────────┤
│ terms │ tNL │
├────────────────┼───────────┤
│ term │ tNL │
├────────────────┼───────────┤
│ tNL │ tNL │
└────────────────┴───────────┘
And some productions:
...
top_compstmt
: top_stmts opt_terms {
|$1: Node; $2: Token| -> Node;
$$ = Node::Dummy;
}
;
...
Would produce handlers like:
enum SV {
Undefined,
_0(Token),
_1(Node)
}
...
fn _handler2(&mut self) -> SV {
// Semantic values prologue.
let mut _1 = pop!(self.values_stack, _1);
let mut _2 = pop!(self.values_stack, _0);
let __ = Node::Dummy;
SV::_1(__)
}
...
The issue I encountered is that at the beginning of top_compstmt (aka _handler2), the values stack looks like:
[
_1(Dummy),
_0(Token { kind: 15, value: "\n", ... })
]
This seems legit to me: the first entry is the reduced result value for top_stmt <- ... <- tINTEGER, and the second one is the result of opt_terms <- ... <- tNL.
Then the statement let mut _1 = pop!(self.values_stack, _1); assumes a _1(Node) is on top of the stack, while in reality the top of the stack is _0(Token) — hence the issue.
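To illustrate, here's a hand-written sketch (not the generated code; the payload types are stand-ins) of what I believe the stack looks like at that point — values are pushed left-to-right, so the value for the rightmost symbol ends up on top:

```rust
// Sketch of the values stack at the start of _handler2.
// Payloads are simplified stand-ins for Token and Node.
#[derive(Debug, PartialEq)]
enum SV {
    _0(&'static str), // Token (here: just its text)
    _1(&'static str), // Node (here: just a label)
}

fn main() {
    let mut values_stack: Vec<SV> = Vec::new();
    values_stack.push(SV::_1("Dummy")); // reduced value for top_stmts ($1)
    values_stack.push(SV::_0("\n"));    // token behind opt_terms ($2)

    // LIFO: the top of the stack is the token, not the node.
    assert_eq!(values_stack.pop(), Some(SV::_0("\n")));
    assert_eq!(values_stack.pop(), Some(SV::_1("Dummy")));
}
```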
So do you think this is an issue in syntax-cli or somewhere else in my implementation? Thanks.
@namiwang, thanks for reporting. The Rust plugin is currently experimental and might have potential bugs. If you could attach an isolated example of a smaller grammar with tokenizer rules that shows the issue, it'll be easier to debug.
Hi! I just put together a smaller demo, adapted from example/calc-ast.rs.g.
%lex
%%
\s+ /* skip whitespace */ return "";
";" return "SEMI";
\d+ return "NUMBER";
"+" return "+";
"*" return "*";
"(" return "(";
")" return ")";
/lex
%left +
%left *
%{
#[derive(Debug)]
pub enum Node {
Literal(i32),
Binary {
op: &'static str,
left: Box<Node>,
right: Box<Node>,
},
Stmt {
expr: Box<Node>
}
}
%}
%%
Stmt
: Expr Terminator {
|$1: Node; $2: Token| -> Node;
$$ = Node::Stmt {
expr: Box::new($1)
}
}
;
Terminator
: SEMI
;
Expr
: Expr + Expr {
// Types of used args ($1, $2, ...), and return type:
|$1: Node; $3: Node| -> Node;
$$ = Node::Binary {
op: "+",
left: Box::new($1),
right: Box::new($3),
}
}
| ( Expr ) {
$$ = $2;
}
| NUMBER {
|| -> Node;
let n = yytext.parse::<i32>().unwrap();
$$ = Node::Literal(n);
};
And the log:
thread 'parser' panicked at 'called `Option::unwrap()` on a `None` value', libcore/option.rs:345:21
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::print
at libstd/sys_common/backtrace.rs:71
at libstd/sys_common/backtrace.rs:59
2: std::panicking::default_hook::{{closure}}
at libstd/panicking.rs:211
3: std::panicking::default_hook
at libstd/panicking.rs:227
4: <std::panicking::begin_panic::PanicPayload<A> as core::panic::BoxMeUp>::get
at libstd/panicking.rs:463
5: std::panicking::try::do_call
at libstd/panicking.rs:350
6: std::panicking::try::do_call
at libstd/panicking.rs:328
7: core::ptr::drop_in_place
at libcore/panicking.rs:71
8: core::ptr::drop_in_place
at libcore/panicking.rs:51
9: <std::collections::hash::map::RandomState as core::hash::BuildHasher>::build_hasher
at /Users/travis/build/rust-lang/rust/src/libcore/macros.rs:20
10: dummy::Tokenizer::to_token
at src/lib.rs:447
11: dummy::Tokenizer::get_next_token
at src/lib.rs:363
12: dummy::Tokenizer::get_next_token
at src/lib.rs:360
13: dummy::Tokenizer::_lex_rule6
at src/lib.rs:632
14: parser::parser
at tests/parser.rs:9
15: parser::__test::TESTS::{{closure}}
at tests/parser.rs:6
16: core::ops::function::FnOnce::call_once
at /Users/travis/build/rust-lang/rust/src/libcore/ops/function.rs:223
17: <F as alloc::boxed::FnBox<A>>::call_box
at libtest/lib.rs:1451
at /Users/travis/build/rust-lang/rust/src/libcore/ops/function.rs:223
at /Users/travis/build/rust-lang/rust/src/liballoc/boxed.rs:638
18: panic_unwind::dwarf::eh::read_encoded_pointer
at libpanic_unwind/lib.rs:105
test parser ... FAILED
OK, I think the issue is the wrong pop order. In particular, in your example, the generated handler:
fn _handler1(&mut self) -> SV {
// Semantic values prologue.
let mut _1 = pop!(self.values_stack, _1);
let mut _2 = pop!(self.values_stack, _0);
let __ = Node::Stmt {
expr: Box::new(_1)
};
SV::_1(__)
}
should pop the token first. As a quick fix to test, change it in the generated file to:
fn _handler1(&mut self) -> SV {
// Semantic values prologue.
let mut _2 = pop!(self.values_stack, _0);
let mut _1 = pop!(self.values_stack, _1);
let __ = Node::Stmt {
expr: Box::new(_1)
};
SV::_1(__)
}
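For illustration, here's a minimal, hypothetical reconstruction of how such a pop! macro behaves (the actual macro generated by syntax-cli may differ): it only unwraps when the expected SV variant is on top, so popping in the wrong order panics on a None, matching the reported behavior.

```rust
// Hypothetical reconstruction of a variant-checked pop! macro.
// Payload types are simplified stand-ins for Token and Node.
#[derive(Debug)]
enum SV {
    _0(String), // Token payload (stand-in)
    _1(i32),    // Node payload (stand-in)
}

macro_rules! pop {
    ($stack:expr, _0) => {
        (match $stack.pop() {
            Some(SV::_0(v)) => Some(v),
            _ => None, // wrong variant on top -> unwrap() panics
        })
        .unwrap()
    };
    ($stack:expr, _1) => {
        (match $stack.pop() {
            Some(SV::_1(v)) => Some(v),
            _ => None,
        })
        .unwrap()
    };
}

fn main() {
    // Stack for `Stmt : Expr Terminator`: the Expr node ($1) was pushed
    // first, so the Terminator token ($2) sits on top.
    let mut values_stack = vec![SV::_1(42), SV::_0(";".to_string())];

    // Fixed order: pop the token on top ($2) before the node ($1).
    let _2: String = pop!(values_stack, _0);
    let _1: i32 = pop!(values_stack, _1);
    assert_eq!(_2, ";");
    assert_eq!(_1, 42);

    // Popping in the original order (node first) would find SV::_0 on
    // top, take the None arm, and panic on unwrap().
}
```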
I'll take a look into this later; it should be a simple fix, and a PR would be appreciated as well in the meantime. Thanks for catching this!
OK, fixed in 7153b3a and released in v0.1.2.
Please let me know if you see any issues!
Thanks!