google/latexify_py

Package name trimming as a NodeTransformer

odashi opened this issue · 5 comments

Prefixes of well-known packages are trimmed by the following code block in function_codegen.py:

# Removes common prefixes: math.sqrt -> sqrt
# TODO(odashi): This process can be implemented as a NodeTransformer.
for prefix in constants.PREFIXES:
    if func_str.startswith(f"{prefix}."):
        func_str = func_str[len(prefix) + 1 :]
        break

As I noted in the TODO, this can be implemented as a NodeTransformer preprocessor. This change would make the behavior more flexible.

For example, the AST of math.sqrt(x) is:

Call(
    func=Attribute(
        value=Name(id='math', ctx=Load()),
        attr='sqrt',
        ctx=Load()),
    args=[
        Name(id='x', ctx=Load())],
    keywords=[])

and we can implement a NodeTransformer that rewrites it to:

Call(
    func=Name(id='sqrt', ctx=Load()),
    args=[
        Name(id='x', ctx=Load())],
    keywords=[])

The modification can be restricted to the func subtree.
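A minimal sketch of such a transformer, using only the standard ast module, might look like the following. The class name PrefixTrimmer and the PREFIXES set are hypothetical stand-ins (the real list lives in constants.PREFIXES), and the sketch only handles single-name prefixes such as math, not dotted ones.

```python
import ast

# Hypothetical prefix set for illustration; the project keeps its own
# list in constants.PREFIXES.
PREFIXES = {"math", "numpy", "np"}


class PrefixTrimmer(ast.NodeTransformer):
    """Rewrites calls such as math.sqrt(x) to sqrt(x)."""

    def visit_Call(self, node: ast.Call) -> ast.AST:
        # Visit children first so that nested calls are rewritten too.
        self.generic_visit(node)
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in PREFIXES
        ):
            # Replace only the func subtree: Attribute(math, sqrt) -> Name(sqrt).
            node.func = ast.copy_location(
                ast.Name(id=func.attr, ctx=ast.Load()), func
            )
        return node


tree = ast.parse("math.sqrt(x) + np.exp(math.log(y))", mode="eval")
trimmed = ast.fix_missing_locations(PrefixTrimmer().visit(tree))
result = ast.unparse(trimmed)  # ast.unparse requires Python 3.9+
untouched = ast.unparse(PrefixTrimmer().visit(ast.parse("foo.bar(x)", mode="eval")))
```

Because only visit_Call is overridden, attribute accesses outside a call's func position are left as they are, which matches restricting the change to the func subtree.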

Just found out about this lib and was playing around with it. The trimming on functions amazed me, but I noticed that package constants don't have the prefixes trimmed. Is this on purpose? Should I open an issue for it?

This is not a problem for my current work, but I thought it was a nice observation to make here.

Example: [image]

@Eric-Mendes Thanks!

My first thought about the strategy is as follows:

  1. If the parser finds that an identifier has a prefix (in your case, np), it checks whether the prefix refers to a module.
  2. If it is a module, the parser tries to obtain the real name of the module (np -> numpy).
  3. If the real module name is in the predefined list (e.g., ["math", "numpy", "tensorflow", "torch"]), the parser replaces the original identifier by removing the prefix (np.inf -> inf).

I think this should also work for the case you mentioned.
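The three steps above could be sketched roughly as follows. This is only an illustration under assumptions: ModulePrefixTrimmer, TRIMMABLE_MODULES, and the namespace argument (for example, the __globals__ of the function being converted) are hypothetical names, and the standard math module imported under the alias m stands in for np so the example runs without third-party packages.

```python
import ast
import math
import types

# Hypothetical allow-list of real module names, mirroring the example in step 3.
TRIMMABLE_MODULES = {"math", "numpy", "tensorflow", "torch"}


class ModulePrefixTrimmer(ast.NodeTransformer):
    """Drops a module prefix from attributes: m.inf -> inf when m is math."""

    def __init__(self, namespace: dict):
        # Maps identifiers to live objects, e.g. a function's __globals__.
        self._namespace = namespace

    def visit_Attribute(self, node: ast.Attribute) -> ast.AST:
        self.generic_visit(node)
        if not isinstance(node.value, ast.Name):
            return node  # Dotted prefixes (a.b.c) are not handled here.
        target = self._namespace.get(node.value.id)
        # Steps 1 and 2: is the prefix a module, and what is its real name?
        if not isinstance(target, types.ModuleType):
            return node
        # Step 3: trim only when the real name is in the allow-list.
        if target.__name__ not in TRIMMABLE_MODULES:
            return node
        return ast.copy_location(ast.Name(id=node.attr, ctx=node.ctx), node)


# "m" is an alias of math (trimmed); "t" is an alias of types (not listed).
namespace = {"m": math, "t": types}
tree = ast.parse("m.sqrt(x) + m.inf + t.y", mode="eval")
trimmed = ast.fix_missing_locations(ModulePrefixTrimmer(namespace).visit(tree))
result = ast.unparse(trimmed)  # ast.unparse requires Python 3.9+
```

Resolving the alias through the namespace rather than by string comparison is what lets np and numpy be treated the same way; it also means constants and functions are trimmed by the same rule.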

It's been a while since my last compilers class in college, but can I try to work on this, @odashi?

@Eric-Mendes Yeah feel free to try it. The task is to implement a new NodeTransformer class in src/latexify/transformers that returns the modified AST. Other classes in the same directory may help to implement it.

Just tried it, @odashi 😄
I hope I'm on the right track.