dubiousconst282/DistIL

Implement Value Numbering / CSE / Redundancy Elimination

Closed this issue · 0 comments

It seems that even in native compilers, value numbering is only applied at the lowest IR level and completely ignore/misses well known "pure-ish" library calls: https://godbolt.org/z/hvc7MdeGq

Opportunities would include string calls (they are immutable), collection lookups (obviously we'll need to account side-effects such as in-between mutation calls), but I believe it can be generalized with some sort of interprocedural analysis that identifies functions which never changes global memory or otherwise cause visible side effects.

It's not possible to directly extend the LVN into a dominator based VN for non-pure instructions (e.g. memory loads and function calls), because a traversal over the dominator tree doesn't exactly represent an actual execution path. We might instead rely on a available-expressions data-flow analysis for this (which can be further extended to more significant optimizations such as code motion, see Eng. a Compiler section 10.3).
Perhaps a even simpler approach based on a upward CFG traversal is enough, see Simple and Efficient SSA construction.

TODOs

  • Initial LVN implementation
  • Global VN/CSE
  • Invalidate location load tags for other instructions: newobj, intrinsics(ld/st/init obj)
    • Might be a good idea to pessimistically invalidate them for unknown instructions
  • Find more bugs

Resources