Filters: `--async` and `--function` and `--class` and various typing ones

Question

simonw opened this issue 2 years ago · 16 comments

The --signatures option turns out to be a pretty great way to start navigating a new codebase.

It might be useful to be able to filter by types of content. Potentially the following:

--async - return async function (and method) definitions
--function - just functions
--class - just classes
--method - just class methods

These would be additive, so the following:

symbex --signatures --method --function

Would return all methods and all functions.

But... what would this do?

symbex -s --async

Would it return all async functions AND async methods? If so, would combining it with --function or --method not make a difference?

Or should there be a --async-method filter that's different from --async (which only gets async functions)?

Answer 1 · 2023-06-19T18:33:18.000Z

I'll prototype it and play with it and see how it feels.

Answer 2 · 2023-06-19T18:43:02.000Z

This is a pretty fun prototype:

% symbex --async -d ../datasette -s | head -n 20 
# File: /Users/simon/Dropbox/Development/datasette/test-plugins/register_output_renderer.py Line: 6
async def can_render(datasette, columns, rows, sql, query_name, database, table, request, view_name)

# File: /Users/simon/Dropbox/Development/datasette/test-plugins/register_output_renderer.py Line: 26
async def render_test_all_parameters(datasette, columns, rows, sql, query_name, database, table, request, view_name, data)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 6
async def can_render(datasette, columns, rows, sql, query_name, database, table, request, view_name)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 26
async def render_test_all_parameters(datasette, columns, rows, sql, query_name, database, table, request, view_name, data)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 60
async def render_response(request)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/memory_name.py Line: 6
async def startup(datasette)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/show_open_files.py Line: 6
async def show_open_files()

Needs a bit more thought, then tests and docs.

Answer 3 · 2023-06-19T18:43:26.000Z

I didn't implement --method yet, not completely convinced it is necessary.

Answer 4 · 2023-06-19T18:45:10.000Z

A --typed filter that just returns things that have type signatures - and a --untyped one that returns things without type signatures - might be neat too.

Answer 5 · 2023-06-19T20:14:51.000Z

Maybe this:

--typed - any function with at least one of its arguments or return using type annotations
--untyped - not a single type annotation
--fully-typed - all arguments plus the return value have type annotations (a subset of --typed)
--partially-typed - some but not all of the arguments and return value have annotations
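
A minimal sketch of how those four categories could be derived from a function's AST node. This is not symbex's implementation; the names `Counts`, `count_annotations` and `classify` are hypothetical, and it assumes the return value counts as one more "slot" alongside the arguments.

```python
import ast
from typing import NamedTuple, Set


class Counts(NamedTuple):
    slots: int  # every argument plus one for the return value
    typed: int  # how many of those slots carry an annotation


def count_annotations(fn) -> Counts:
    """Count annotated arguments and return value on a (async) function node."""
    a = fn.args
    params = a.posonlyargs + a.args + a.kwonlyargs
    # *args and **kwargs are optional single nodes rather than lists
    params += [p for p in (a.vararg, a.kwarg) if p is not None]
    typed = sum(1 for p in params if p.annotation is not None)
    typed += int(fn.returns is not None)
    return Counts(slots=len(params) + 1, typed=typed)


def classify(fn) -> Set[str]:
    """Return the filter categories (hypothetical labels) a function falls into."""
    c = count_annotations(fn)
    if c.typed == 0:
        return {"untyped"}
    # fully-typed and partially-typed are both subsets of typed
    return {"typed", "fully-typed" if c.typed == c.slots else "partially-typed"}
```

Under these assumptions every function lands in exactly one of untyped, partially-typed or fully-typed, with typed covering the last two.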

If you are working on a project and trying to add types to every single function, you can iterate on it using this to find the functions that still need some work:

symbex --untyped --partially-typed --signatures

Answer 6 · 2023-06-19T20:18:59.000Z

Maybe --method isn't necessary because you can use a '*.*' selector instead?

Or maybe --method is a shortcut that adds the '*.*' selector?

It would be slightly surprising that while most of these filter options add together, --method --async would filter methods to just the async ones.

For that reason I think using *.* to filter methods may be more consistent.

Answer 7 · 2023-06-20T09:43:50.000Z

Confirmed, this works already to get all async methods:

symbex -d ../datasette -s --async '*.*'

Answer 8 · 2023-06-20T10:03:45.000Z

I built a messy prototype of the typing ones and I like them a lot:

symbex --typed -s

# File: tests/example_symbols.py Line: 50
def func_type_annotations(a: int, b: str) -> bool

# File: tests/example_symbols.py Line: 94
def function_with_non_pep_0484_annotation(x: ?, xx: ?, yy: ?, y: ?, zz: float) -> ?

# File: tests/example_symbols.py Line: 104
def complex_annotations(code: str, symbols: Iterable[str]) -> List[Tuple[(AST, Optional[str])]]

# File: symbex/lib.py Line: 11
def find_symbol_nodes(code: str, filename: str, symbols: Iterable[str]) -> List[Tuple[(AST, Optional[str])]]

# File: symbex/lib.py Line: 35
def code_for_node(code: str, node: AST, class_name: str, signatures: bool) -> Tuple[(str, int)]

# File: symbex/lib.py Line: 66
def match(name: str, symbols: Iterable[str]) -> bool

# File: symbex/lib.py Line: 91
def function_definition(function_node: AST)

# File: symbex/lib.py Line: 193
def annotation_definition(annotation: AST) -> str

# File: symbex/lib.py Line: 241
def annotation_summary(node: AST) -> AnnotationSummary

symbex --untyped -s

# File: tests/example_symbols.py Line: 10
def func_positional_args(a, b, c)

# File: tests/example_symbols.py Line: 15
async def async_func(a, b, c)

# File: tests/example_symbols.py Line: 20
def func_default_args(a, b=2, c=3)

...

Answer 9 · 2023-06-20T10:36:25.000Z

Am I right that these should always be OR and not AND?

I just tried this and it didn't give me what I expected:

symbex --async --typed -s

Turns out I expected just typed async functions, but I got back ALL async functions and then ALL typed functions.

Answer 10 · 2023-06-20T10:36:52.000Z

I'm going to try these as AND rather than OR and see how that feels.

Answer 11 · 2023-06-20T10:45:01.000Z

Yes, AND for filters feels better. Switching to that.
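
A rough sketch of what AND semantics means here, assuming each filter flag maps to a predicate over an AST node. `PREDICATES` and `matches` are hypothetical names, not symbex's code, and the "typed" predicate is deliberately simplified to plain positional arguments and the return value.

```python
import ast

FUNCS = (ast.FunctionDef, ast.AsyncFunctionDef)

# Hypothetical mapping from filter name to a predicate on an AST node.
PREDICATES = {
    "async": lambda n: isinstance(n, ast.AsyncFunctionDef),
    "function": lambda n: isinstance(n, FUNCS),
    "class": lambda n: isinstance(n, ast.ClassDef),
    # Simplified: only looks at plain positional arguments and the return.
    "typed": lambda n: isinstance(n, FUNCS)
    and (
        n.returns is not None
        or any(arg.annotation is not None for arg in n.args.args)
    ),
}


def matches(node, active):
    """AND semantics: the node must satisfy every active filter."""
    return all(PREDICATES[name](node) for name in active)
```

With this shape, `--async --typed` keeps only nodes that are both async and typed. Replacing `all` with `any` gives the OR behavior described earlier, where every async function and every typed function comes back.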

Answer 12 · 2023-06-20T10:49:13.000Z

This method is failing the test for --fully-typed:

class ClassForTypedTests:
    def method_fully_typed(self, a: int, b: str) -> bool:
        pass

Because self has no type. So I need a special case check for class methods.

Answer 13 · 2023-06-20T10:50:56.000Z

I'm going to refactor AnnotationSummary - it looks like this:

AnnotationSummary = namedtuple(
    "Summary", ["num_arguments", "num_typed", "return_is_typed"]
)

I'm going to change it to this:

class TypeSummary:
    fully: bool
    partially: bool

Answer 14 · 2023-06-20T11:05:58.000Z

Quoting my earlier comment: "Because self has no type. So I need a special case check for class methods."

That caused another problem: now --typed is returning any method with a self argument because I wrote code that says "treat that as typed" - even though there are no type annotations on anything else.

Answer 15 · 2023-06-20T11:11:37.000Z

Here's a fun trick:

 llm --system 'Suggest several ways to refactor this' '    if (
        not symbols
        and not signatures
        and not async_
        and not function
        and not class_
        and not typed
        and not untyped
        and not partially_typed
        and not fully_typed
    ):
        ctx = click.get_current_context()
        click.echo(ctx.get_help())
        ctx.exit()'

Output:

There are several ways to refactor this code:

Consolidate all the boolean checks into a list and use all() to check if all items are False:

checks = [
    not symbols, not signatures, not async_,
    not function, not class_, not typed,
    not untyped, not partially_typed,
    not fully_typed
]

if all(checks):
    ctx = click.get_current_context()
    click.echo(ctx.get_help())
    ctx.exit()

Use De Morgan's Law to simplify the boolean checks:

if (
    symbols or signatures or async_ or function or
    class_ or typed or untyped or partially_typed or
    fully_typed
):
    pass
else:
    ctx = click.get_current_context()
    click.echo(ctx.get_help())
    ctx.exit()

Use any() with the negation of the boolean checks:

if not any([
    symbols, signatures, async_, function,
    class_, typed, untyped, partially_typed,
    fully_typed
]):
    ctx = click.get_current_context()
    click.echo(ctx.get_help())
    ctx.exit()

All three options achieve the same result, so it's up to personal preference on which one to choose.

Answer 16 · 2023-06-20T11:21:48.000Z

One last thought: an __init__() method should not need a return type annotation to count as fully typed.