pwwang/python-varname

varname doesn't play well with type annotations

jaycunningham-8451 opened this issue · 13 comments

Thanks for making this available -- it's a very clever package.

I've noticed that varname doesn't play well with type annotations. Minimal code to reproduce an error:

from varname import varname

class Foo:
    def __init__(self):
        self.id = varname()

foo: Foo = Foo()
# => varname.VarnameRetrievingError: Failed to retrieve the variable name.

It also doesn't like Generic[T]:

from typing import Generic, TypeVar

T = TypeVar("T")

class Foo(Generic[T]):
    def __init__(self):
        self.id = varname()

foo = Foo[int]()
foo.id  # => 'result'

Are there any plans to support this better? I plead ignorance as to the complexity, admittedly, but it would be helpful for my use case.

Sure, I will definitely look into this.

First problem is that an assignment with an annotation is actually a different AST node, an AnnAssign, not related to a regular Assign.

import ast
print(ast.dump(ast.parse("x = 1")))
# Module(body=[Assign(targets=[Name(id='x', ctx=Store())], value=Constant(value=1, kind=None), type_comment=None)], type_ignores=[])
print(ast.dump(ast.parse("x: int = 1")))
# Module(body=[AnnAssign(target=Name(id='x', ctx=Store()), annotation=Name(id='int', ctx=Load()), value=Constant(value=1, kind=None), simple=1)], type_ignores=[])
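To make the difference concrete, here is a minimal sketch (not varname's actual code) of reading the target name from either node type. The helper name assignment_target_names is made up for illustration; the key point is that Assign has a list in targets while AnnAssign has a single target.

```python
import ast


def assignment_target_names(source):
    """Return the plain names assigned by the first statement in `source`.

    Hypothetical helper for illustration only; handles both node types.
    """
    node = ast.parse(source).body[0]
    if isinstance(node, ast.Assign):
        # Regular assignment: possibly several targets (a = b = 1)
        targets = node.targets
    elif isinstance(node, ast.AnnAssign):
        # Annotated assignment: exactly one target
        targets = [node.target]
    else:
        raise ValueError("first statement is not an assignment")
    # Keep only simple names; attributes and subscripts are skipped
    return [t.id for t in targets if isinstance(t, ast.Name)]
```

Any code that only checks isinstance(node, ast.Assign) will silently miss the annotated form, which matches the error reported above.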

That's a very good clue to move forward. @alexmojaki Do you want to work on a PR for this? Otherwise, I can take care of this.

I don't, please take care of it. Should be very straightforward, just make sure you find all references to Assign.

Second problem is that Foo[int]() calls _GenericAlias.__call__ which in turn constructs a Foo and calls __init__, so when varname looks back one frame it's looking at the wrong line. varname(caller=2) solves the problem.
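The extra frame can be seen without varname at all. The following is a small sketch that assumes CPython 3.7 or later, where subscripted generics are called through a Python-level __call__ in typing; the helper functions make_plain and make_generic are invented for the demonstration.

```python
import sys
from typing import Generic, TypeVar

T = TypeVar("T")


class Foo(Generic[T]):
    def __init__(self):
        one_back = sys._getframe(1)  # whoever called the constructor
        two_back = sys._getframe(2)  # that caller's caller
        # Record (module name, function name) of the immediate caller
        self.one_back = (one_back.f_globals.get("__name__"),
                         one_back.f_code.co_name)
        self.two_back_name = two_back.f_code.co_name


def make_plain():
    return Foo()       # __init__ is one frame away from here


def make_generic():
    return Foo[int]()  # typing's __call__ sits in between
```

With Foo() the immediate caller is make_plain, but with Foo[int]() the immediate caller is a function inside typing and the user's code is one frame further back, which is why caller=2 is needed in that case.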

No worries. I will take care of it.
Your explanation of the second problem makes sense.

Thank you for the hints.

@alexmojaki
The first one is straightforward. However, the second one is a little bit tricky.

The intermediate frame is exactly what you said:

Traceback(filename='/home/pwwang/miniconda3/lib/python3.7/typing.py', 
          lineno=678, 
          function='__call__', 
          code_context=['        result = self.__origin__(*args, **kwargs)\n'], 
          index=0)

Nothing in this frame reliably tells us that the call comes from typing or that Generic[T] is involved.
If we could confirm that, we could go back one more frame, which is the one where the assignment actually is.

Someone could just as well write their own function containing self.__origin__(*args, **kwargs) on purpose and call varname inside it to get the name that the return value is assigned to.

What I meant was that @jay-h-cunningham should use caller=2. The feature exists for this kind of situation.

I was thinking of supporting both ways (with and without type annotations), but then I realized that the Foo class was designed to be initialized that way. That is the solution.

I assume you mean allowing both Foo() and Foo[int]() which would require different values for caller. I think this is a good example of how trying to specify a value for caller is often difficult or impossible. An alternative strategy is to mark certain functions as functions that varname should 'ignore', as in "I never want the varname from here". So you might write something like:

import typing
import varname

varname.ignore(typing._GenericAlias.__call__)

And then when varname is looking for the caller, it skips frames with the code object from there.
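A rough sketch of that skipping logic, to show the idea rather than how varname would implement it. IGNORED_CODES and first_unignored_caller are hypothetical names, and the sketch assumes a CPython where typing._GenericAlias.__call__ is a plain Python function with a __code__ attribute.

```python
import sys
import typing
from typing import Generic, TypeVar

# Code objects whose frames should never count as "the caller" (assumption)
IGNORED_CODES = {typing._GenericAlias.__call__.__code__}


def first_unignored_caller(depth=2):
    """Walk back from the caller's caller, skipping ignored frames."""
    frame = sys._getframe(depth)
    while frame is not None and frame.f_code in IGNORED_CODES:
        frame = frame.f_back
    return frame


T = TypeVar("T")


class Foo(Generic[T]):
    def __init__(self):
        # Name of the function that really asked for a Foo
        self.created_in = first_unignored_caller().f_code.co_name


def make_plain():
    return Foo()


def make_generic():
    return Foo[int]()
```

Both Foo() and Foo[int]() now resolve to the user's own function, without the class having to know which caller value applies.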

This gets more complicated when the function has been decorated and you don't have easy access to the original. In that case one possibility is to pass the qualified function name as a string, maybe something like:

import typing
import varname

varname.ignore(typing, "_GenericAlias.__call__")

executing can usually get the qualified name from a code object, although now of course many caveats apply like source code being available and the function name being unique.

Similarly instead of writing this:

def function():
    # I know at which stack depth this will be called
    return varname(caller=3)

def function1():
    return function()

def function2():
    return function1()

func = function2()  # func == 'func'

You might write:

def function():
    return varname()

@ignore
def function1():
    return function()

@ignore
def function2():
    return function1()

func = function2()  # func == 'func'
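For illustration, a decorator like that could be sketched as below. This is only an assumption about how it might work: ignore, _IGNORED_CODES and calling_function_name are made-up names, and the sketch reports the name of the first non-ignored calling function instead of doing the full variable-name lookup.

```python
import sys

_IGNORED_CODES = set()


def ignore(func):
    """Hypothetical decorator: register func's code object as skippable."""
    _IGNORED_CODES.add(func.__code__)
    return func


def calling_function_name():
    # Frame 0 is this helper and frame 1 is the function that asked,
    # so the search starts at frame 2 and skips ignored frames
    frame = sys._getframe(2)
    while frame is not None and frame.f_code in _IGNORED_CODES:
        frame = frame.f_back
    return frame.f_code.co_name if frame is not None else None


def function():
    return calling_function_name()


@ignore
def function1():
    return function()


@ignore
def function2():
    return function1()


def user_code():
    return function2()
```

The wrappers can be added or removed without touching function, which is the advantage over a hard-coded caller=3.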

The biggest concern that comes to mind is uniqueness when we go the __qualname__ route. Decorators are good since there is no ambiguity. However, some of the intermediate calls may not be accessible, so the __qualname__ approach is still needed. But we have to make sure it's not too aggressive.

@jay-h-cunningham
v0.5.6 adds an argument ignore to ignore some intermediate calls. Now you can simply write your classes as follows:

import typing
from varname import varname

class Foo:
    def __init__(self):
        # ignore any intermediate calls from typing module
        self.id = varname(ignore=[typing])

foo = Foo()
# foo.id == 'foo'
foo: Foo = Foo()
# foo.id == 'foo'

import typing
from typing import Generic, TypeVar

T = TypeVar("T")

class Foo(Generic[T]):
    def __init__(self):
        self.id = varname(ignore=[typing])

foo = Foo()
# foo.id == 'foo'
foo = Foo[int]()
# foo.id == 'foo'
foo: Foo = Foo[int]()
# foo.id == 'foo'

Feel free to reopen this issue if you still have related questions, or open another one if you have other issues.

@jay-h-cunningham As of v0.6.0, you don't need that ignore=[typing] anymore. Calls from standard libraries will be automatically ignored.