exo-lang/exo

Pattern matching discussion

yamaguchi1024 opened this issue · 3 comments

Python's pattern matching doesn't support any extension of semantic equivalence (there's a draft for it). The only double-underscore method you can specify is __match_args__ , which lets you specify which value you want to pattern match against. Therefore, without user-facing IR, we'll end up doing a structural matching on internal cursors, which is not possible because we don't want to expose internal cursors to users. Let's look at how consumer_tile in blur/unsharp masking example can be implemented using different pattern matching options.

  • Option 1: using Python's pattern match
    API Cursor can be replaced by a user-facing IR wrapper, maybe revive the stale QAST.
import exo.QAST as e

def consumer_tile(p: Proc, cons_c : ForSeqCursor):
    match cons_c.type():
        case e.For(e.For(e.For(_) as gc) as cc):
            p = reorder(cc, gc)
            cons_c = cc
 
    p = split(p, cons_c, 32)
    return vectorize(p, cons_c, 16)
  • Option 2: write custom pattern match syntax
    I think this complements the isinstance(..) way of doing things well.
def consumer_tile(p: Proc, cons_c : ForSeqCursor):
    child, gchild = exo_match(cursor, """
                                      for _ in _:
                                      $0{ for _ in _:
                                        $1{ _ }$ }$
                                      """)
    if child and gchild:
        p = reorder(child, gchild)
        cons_c = child

    p = split(p, cons_c, 32)
    return vectorize(p, cons_c, 16)

Thanks for opening this issue! Just echoing some of the things from the synchronous meeting:

  • @jrk mentioned that attempting to overload things with Python might be doomed. I generally agree: there is only so far that we can overload the pythonic syntax before things stop making sense.
  • I'm personally a fan of the exo_match operator. It gives us a lot of control on the syntax we expose and looks a lot more natural.

@gilbo might have more thoughts on this so pinging here.

gilbo commented

Yes, I was suggesting the exo_match possibility. One implementation idea I had in mind (and why I put in the weird $ tokens) would proceed in two steps.

  1. Scan the string for special tokens, record their position, and erase them from the string.
  2. Feed the cleaned up string through the Python parser to get a Python AST.
  3. Use the lexical location of the removed tokens to identify nodes of the parsed AST.

This has the benefit of adhering to the "Let's parse Python patterns using the Python parser" convention already used by Exo. (This convention has the additional virtue of preventing us from straying too far from normal, expected Python syntax)

A third possibility between option 1 (which can be made more powerful using dunder methods that hook into pattern matching) and option 2 (which requires using annoying multi-line strings) is to somehow perform an AST hijack. The downside of this option is that you need something like a function definition (not merely a statement) that you can decorate with some @exo_match decorator. I briefly considered this but then decided it would probably be worse than the multi-line string.

gilbo commented

By two steps, I obviously meant 3. LOL