RustPython/Parser

Add `Node` union

MichaReiser opened this issue · 5 comments

Add a new Node union that is an enum over all node types. This can be useful when implementing methods that can operate on any node. For example, Ruff's formatter has (roughly) the API format(node: AnyNode) -> String

I haven't figured out the full requirements yet, and I don't know yet if we'll need both the owned and reference versions:

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Node {
	IfStmt(ast::IfStmt),
	ConstantExpr(ast::ConstantExpr),
	...
}

#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash, is_macro)]
enum NodeRef<'a> {
	IfStmt(&'a ast::IfStmt),
	ConstantExpr(&'a ast::ConstantExpr)
	...
}

Note: An alternative naming more in line with Path and PathBuf would be to name the types NodeBuf (owned) and Node (reference)

The enums should have the following basic methods:

  • if_stmt(self) -> Option<ast::IfStmt>
  • as_if_stmt(&self) -> Option<&ast::IfStmt>
  • const is_if_stmt(&self) -> bool

Node could also implement AsRef that returns a NodeRef

I may have time to work on this sometime soon but I wanted to put this up for discussion first to get feedback.

  1. Is it not covered by Node trait?
    If you need a concrete type of union, it will not be covered. If you need only a common method interface, it will be enough.

  2. Does it need to include all of the nodes, not the top level nodes? I think the top level ast.AST node in Python matches to Rust

pub enum AST {
    Stmt(ast::Stmt),
    Expr(ast::Expr),
    ...
}

Then IfStmt and ConstantExpr will be under each variant. Do you need to flatten all the nodes?

Is it not covered by Node trait?
In our use case, we need to call the formatting function for the specific node variant, extract all fields and then perform the formatting. We could implement this with the node trait in a suboptimal way if it has a downcast_ref(type) -> Option<T> method that downcasts the node to the given type. However, this would require that Ruff calls downcast_ref for every possible type until it finds the right type. Having an enum would allow us to implement this static-dispatch to the right node.

Does it need to include all of the nodes, not the top level nodes? I think the top level ast.AST node in Python matches to Rust

I haven't thought about it much but we can probably do either and both approaches have pros and cons:

  • top-level: Fewer variants to handle per level.
  • flat:
    • Easier to get to a specific node without nested matches (you can use as_constant_expr directly
    • Should be smaller because we avoid nested enums.

Because adding top-level nodes to the root type AST makes consistent interface, how about adding it first and see if it is enough or not?
We can add a new one if it is not enough.

What do you mean by top level? Do you mean the 'Stmt' and Expr (ond others) enums or something else?

We can also add this to ruff_python_ast first and upstream it when we've figured out the api and you believe that this would be useful for RustPython too

In ast module of python, nodes directly inheriting AST. (= T.__base__ == ast.AST) This is the full list.

  • Mod
  • Stmt
  • Expr
  • ExprContext
  • Boolop
  • Operator
  • Unaryop
  • Cmpop
  • Comprehension
  • Excepthandler
  • Arguments
  • Arg
  • Keyword
  • Alias
  • Withitem
  • MatchCase
  • Pattern
  • TypeIgnore