Filters: `--async` and `--function` and `--class` and various typing ones
simonw opened this issue · 16 comments
The --signatures
option turns out to be a pretty great way to start navigating a new codebase.
It might be useful to be able to filter by types of content. Potentially the following:
--async
- return async function (and method) definitions--function
- just functions--class
- just classes--method
- just class methods
These would be additive, so the following:
symbex --signatures --method --function
Would return all methods and all functions.
But... what would this do?
symbex -s --async
Would it return all async functions AND async methods? If so, would combining it with --function
or --method
not make a difference?
Or should there be a --async-method
filter that's different from --async
(which only gets async functions)?
I'll prototype it and play with it and see how it feels.
This is a pretty fun prototype:
% symbex --async -d ../datasette -s | head -n 20
# File: /Users/simon/Dropbox/Development/datasette/test-plugins/register_output_renderer.py Line: 6
async def can_render(datasette, columns, rows, sql, query_name, database, table, request, view_name)
# File: /Users/simon/Dropbox/Development/datasette/test-plugins/register_output_renderer.py Line: 26
async def render_test_all_parameters(datasette, columns, rows, sql, query_name, database, table, request, view_name, data)
# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 6
async def can_render(datasette, columns, rows, sql, query_name, database, table, request, view_name)
# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 26
async def render_test_all_parameters(datasette, columns, rows, sql, query_name, database, table, request, view_name, data)
# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 60
async def render_response(request)
# File: /Users/simon/Dropbox/Development/datasette/pugins-older/memory_name.py Line: 6
async def startup(datasette)
# File: /Users/simon/Dropbox/Development/datasette/pugins-older/show_open_files.py Line: 6
async def show_open_files()
Needs a bit more thought, then tests and docs.
I didn't implement --method
yet, not completely convinced it is necessary.
A --typed
filter that just returns things that have type signatures - and a --untyped
one that returns things without type signatures - might be neat too.
Maybe this:
--typed
- any function with at least one of its arguments or return using type annotations--untyped
- not a single type annotation--fully-typed
- all arguments plus the return value have type annotations (a subset of--typed
)--partially-typed
- some but not all of the arguments and return value have annotations
If you are working on a project and trying to add types to every single function, you can iterate on it using this to find the functions that still need some work:
symbex --untyped --partially-typed --signatures
Maybe --method
isn't necessary because you can use a '*.*'
selector instead?
Or maybe --method
is a shortcut for that adds the '*.*'
selector?
It would be slightly surprising that while most of these filter options add together, --method --async
would filter methods to just the async ones.
For that reason I think using *.*
to filter methods may be more consistent.
Confirmed, this works already to get all async methods:
symbex -d ../datasette -s --async '*.*'
I built a messy prototype of the typing ones and I like them a lot:
symbex --typed -s
# File: tests/example_symbols.py Line: 50
def func_type_annotations(a: int, b: str) -> bool
# File: tests/example_symbols.py Line: 94
def function_with_non_pep_0484_annotation(x: ?, xx: ?, yy: ?, y: ?, zz: float) -> ?
# File: tests/example_symbols.py Line: 104
def complex_annotations(code: str, symbols: Iterable[str]) -> List[Tuple[(AST, Optional[str])]]
# File: symbex/lib.py Line: 11
def find_symbol_nodes(code: str, filename: str, symbols: Iterable[str]) -> List[Tuple[(AST, Optional[str])]]
# File: symbex/lib.py Line: 35
def code_for_node(code: str, node: AST, class_name: str, signatures: bool) -> Tuple[(str, int)]
# File: symbex/lib.py Line: 66
def match(name: str, symbols: Iterable[str]) -> bool
# File: symbex/lib.py Line: 91
def function_definition(function_node: AST)
# File: symbex/lib.py Line: 193
def annotation_definition(annotation: AST) -> str
# File: symbex/lib.py Line: 241
def annotation_summary(node: AST) -> AnnotationSummary
symbex --untyped -s
# File: tests/example_symbols.py Line: 10
def func_positional_args(a, b, c)
# File: tests/example_symbols.py Line: 15
async def async_func(a, b, c)
# File: tests/example_symbols.py Line: 20
def func_default_args(a, b=2, c=3)
...
Am I right that these should always be OR and not AND?
I just tried this and it didn't give me what I expected:
symbex --async --typed -s
Turns out I expected just typed async functions, but I got back ALL async functions and then ALL typed functions.
I'm going to try these as AND rather than OR and see how that feels.
Yes, AND for filters feels better. Switching to that.
This method is failing the test for --fully-typed
:
class ClassForTypedTests:
def method_fully_typed(self, a: int, b: str) -> bool:
pass
Because self
has no type. So I need a special case check for class methods.
I'm going to refactor AnnotationSummary
- it looks like this:
AnnotationSummary = namedtuple(
"Summary", ["num_arguments", "num_typed", "return_is_typed"]
)
I'm going to change it to this:
class TypeSummary:
fully: bool
partially: bool
)
Because
self
has no type. So I need a special case check for class methods.
That caused another problem: now --typed
is returning any method with a self
argument because I wrote code that says "treat that as typed" - even though there are no type annotations on anything else.
Here's a fun trick:
llm --system 'Suggest several ways to refactor this' ' if (
not symbols
and not signatures
and not async_
and not function
and not class_
and not typed
and not untyped
and not partially_typed
and not fully_typed
):
ctx = click.get_current_context()
click.echo(ctx.get_help())
ctx.exit()'
Output:
There are several ways to refactor this code:
- Consolidate all the boolean checks into a list and use
all()
to check if all items are False:checks = [ not symbols, not signatures, not async_, not function, not class_, not typed, not untyped, not partially_typed, not fully_typed ] if all(checks): ctx = click.get_current_context() click.echo(ctx.get_help()) ctx.exit()
- Use De Morgan's Law to simplify the boolean checks:
if ( symbols or signatures or async_ or function or class_ or typed or untyped or partially_typed or fully_typed ): pass else: ctx = click.get_current_context() click.echo(ctx.get_help()) ctx.exit()
- Use
any()
with the negation of the boolean checks:if not any([ symbols, signatures, async_, function, class_, typed, untyped, partially_typed, fully_typed ]): ctx = click.get_current_context() click.echo(ctx.get_help()) ctx.exit()All three options achieve the same result, so it's up to personal preference on which one to choose.
One last thought: an __init__()
method should not need a return type annotation to qualify as not breaing the fully typed
rule.