JuliaSymbolics/Metatheory.jl

regression in simplification compared to v1

simeonschaub opened this issue · 12 comments

The following returns missing on both v2.0.1 and latest master:

julia> using Metatheory
1

julia> algebra_rules = @theory a b c begin
           a * (b * c) == (a * b) * c
           a + (b + c) == (a + b) + c

           a + b == b + a
           a * (b + c) == (a * b) + (a * c)
           (a + b) * c == (a * c) + (b * c)
           
           -a == -1 * a
           a - b == a + -b
           1 * a == a

           0 * a => 0
           a + 0 == a
           
           a::Number * b == b * a::Number
           a::Number * b::Number => a * b
           a::Number + b::Number => a + b
       end;

julia> @areequal algebra_rules a - (a + b) -b
missing

If I explicitly add Metatheory@v1 though, this returns true, as expected. Any ideas?

Investigating now

It's becoming quite nasty to debug. I believe we now need visualization of the saturation process in order to understand what rules are being applied and what rules are not in each iteration step.

Well, you're missing a - a => 0 in the theory, if you add it, it works. Question is, how did it know that? It seems that MT 1.0 was doing magic then.

Shouldn't this work out to a + -a -> 1 * a + -1 * a -> (1 + -1) * a -> 0 * a -> 0?

Can you try now on latest main?

I still get missing unfortunately

changing a + 0 == a to a + 0 --> a solved it locally for me.
the thing is that in 2.0 the Schedulers.BackoffScheduler got more efficient, but it is still a naive algorithm. a + 0 == a will produce a lot of matches and thus the rule will be banned in most iterations by the exponential backoff algorithm, because every single enode in the egraph will match, and it will spam the buffer. There should be some optimization done for "always matching" rules.

Yeah, I think that works as a workaround for me for now

Can close?

Yeah, feel free to close. I don't know much about this package's internals but could the heuristic be taught to differentiate between the two directions in repeatedly applying the rules? In this case, it ideally would have realized that replacing a with a + 0 is often a bad idea but replacing a + 0 with a is generally worthwhile.

Yeah, feel free to close. I don't know much about this package's internals but could the heuristic be taught to differentiate between the two directions in repeatedly applying the rules? In this case, it ideally would have realized that replacing a with a + 0 is often a bad idea but replacing a + 0 with a is generally worthwhile.

Any (theoretical) contribution to heuristics would be really appreciated. For the internals, you can check out https://dl.acm.org/doi/pdf/10.1145/3434304 - a short and interesting read. The heuristics are defined in src/EGraphs/Schedulers.jl