SamyaDaleh/CL-Toolbox

Earley Tree Generation Takes Too Long

Closed this issue · 5 comments

G = <N, T, S, P>
N = {S, N1}
T = {t0}
S = S
P = {S -> ε, N1 -> S, S -> t0 t0, N1 -> t0 S, S -> N1 N1}

w= t0 t0 t0

This takes for eternity, but it doesn't look that hard. Please check if there can bes something improved. I know there is a trade-off between runtime and correctness. Is there a way to put ... in a tree when tree generation stops? Is there a more feasable breakoff point?

Another famous example. How many trees should there be when parsing the empty string with this grammar?

G = <N, T, S, P>
N = {S}
T = {}
S = S
P = {S -> S S, S -> ε}

There are 7 or 49 depending on which rule is used to generate the first axiom.

Maybe have a tree limit per item? Who needs more than 10 trees anyway?

We can do better.
Is it possible if we have a recursion like (S ε) and (S (S ε) (S ε)) that we stop exactly here? We can examine the problem in that small example and double-check with the bigger one. Should be applied to the Earley Complete Rule I suspect.

Also please check if something similar can be done for TD. Not regarding left-recursion of course, but there are also tree generations that take very long. Maybe create a new ticket for TD when Earley is fixed.

Fixed. Good job, was easier tha I thought.