BioJulia/Automa.jl

Same char used in two different patterns of an alternation


I don't know if this is a bug, but I was expecting the patterns in an alternation to be tried in order and to be mutually exclusive. However, it looks like a single char can be consumed by more than one of the patterns that make up an alternation.

As an example, consider the following:

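# Assumed setup (Automa v0.8 style), not part of the original snippet
import Automa
import Automa.RegExp: @re_str
const re = Automa.RegExp
# Pattern 1: the literal "aa", tagged with the :twoas exit action
aa = re"aa"
aa.actions[:exit] = [:twoas]
# Pattern 2: a single 'a' or 'b', tagged with the :aorb exit action
ab = re"[ab]"
ab.actions[:exit] = [:aorb]
# Zero or more repetitions of either pattern
expr = re.rep(aa | ab)
machine = Automa.compile(expr)
Automa.execute(machine, raw"aa")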

It returns (0, [:aorb, :twoas]), so the first "a" is used in both aa and ab.

This is causing me some trouble, since I would like to have a fallback that acts on chars only if they are not consumed by any other pattern. I think I can see a workaround, but I wanted to check first whether this is the intended behavior.

I am using Automa v0.8.2.

This is indeed a bug. The correct behaviour is to throw an error, because the expression is ambiguous. Consider: Which is the correct behaviour for "aa": [:twoas] or [:aorb, :aorb]? Impossible to decide.
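
To make the ambiguity concrete, here is a minimal sketch in plain Julia that enumerates every way an input can be split into "aa" tokens and single [ab] tokens. It does not use Automa and is not how Automa works internally; the function name `parses` is hypothetical, and the code assumes ASCII input.

```julia
# Enumerate every way to split `input` into tokens, where each token is either
# the literal "aa" (tagged :twoas) or a single 'a' or 'b' (tagged :aorb).
# Hypothetical helper for illustration only; assumes ASCII input.
function parses(input::AbstractString)
    # The empty string has exactly one parse: no tokens at all.
    isempty(input) && return [Symbol[]]
    results = Vector{Symbol}[]
    # Alternative 1: consume the literal "aa".
    if startswith(input, "aa")
        for rest in parses(input[3:end])
            push!(results, [:twoas; rest])
        end
    end
    # Alternative 2: consume one character from [ab].
    if input[1] in ('a', 'b')
        for rest in parses(input[2:end])
            push!(results, [:aorb; rest])
        end
    end
    return results
end

println(parses("aa"))
```

For "aa" this yields two distinct parses, [:twoas] and [:aorb, :aorb], which is why no single action sequence can be called correct.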

It turns out this has already been fixed, but since throwing an error is a breaking change, the fix won't land until v0.9.