pylint-dev/astroid

Replace inference tip cache with a single reference to the last inference tip result

Closed this issue · 3 comments

The inference tip cache is byzantine: it still omits part of the proper cache key (**kwargs), and for that reason we can't rewrite it as an LRU cache:

def _inference_tip_cached(func: InferFn[_NodesT]) -> InferFn[_NodesT]:
    """Cache decorator used for inference tips."""

    def inner(
        node: _NodesT,
        context: InferenceContext | None = None,
        **kwargs: Any,
    ) -> Generator[InferenceResult, None, None]:
        partial_cache_key = (func, node)
        if partial_cache_key in _CURRENTLY_INFERRING:
            # If through recursion we end up trying to infer the same
            # func + node we raise here.
            raise UseInferenceDefault
        try:
            yield from _cache[func, node, context]
            return
        except KeyError:
            # Recursion guard with a partial cache key.
            # Using the full key causes a recursion error on PyPy.
            # It's a pragmatic compromise to avoid so much recursive inference
            # with slightly different contexts while still passing the simple
            # test cases included with this commit.
            _CURRENTLY_INFERRING.add(partial_cache_key)
            result = _cache[func, node, context] = list(func(node, context, **kwargs))
            # Remove recursion guard.
            _CURRENTLY_INFERRING.remove(partial_cache_key)

        yield from result

    return inner

Since a cache with a maxsize of 1 still produces most of the performance upside, explore replacing the cache with a single reference to the last inference tip result. Then also provide a way to clear that result from clear_inference_tip_cache().

See discussion at #2181.
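
A minimal sketch of what the single-reference approach could look like (hypothetical module-level name _last_result; a real implementation would keep the recursion guard and the InferFn/InferenceContext typing from inference_tip.py):

from collections.abc import Generator
from typing import Any, Callable

# Hypothetical single-slot cache: (cache key, cached results) or None.
_last_result: tuple[tuple[Any, ...], list[Any]] | None = None


def _inference_tip_cached(func: Callable[..., Any]) -> Callable[..., Any]:
    """Cache only the most recent inference tip result."""

    def inner(
        node: Any,
        context: Any = None,
        **kwargs: Any,
    ) -> Generator[Any, None, None]:
        global _last_result
        key = (func, node, context)
        if _last_result is not None and _last_result[0] == key:
            yield from _last_result[1]
            return
        result = list(func(node, context, **kwargs))
        _last_result = (key, result)  # overwrite: effectively maxsize=1
        yield from result

    return inner


def clear_inference_tip_cache() -> None:
    """Drop the single cached reference."""
    global _last_result
    _last_result = None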

Measure the performance difference when linting large packages like home-assistant.
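
For the measurement, timing a full pylint run on each branch is probably enough (a rough harness; assumes pylint is installed and a home-assistant checkout sits at homeassistant/):

import subprocess
import time

# Lint a large package once per branch and compare wall-clock times.
start = time.perf_counter()
subprocess.run(["python", "-m", "pylint", "homeassistant/"], check=False)
print(f"lint took {time.perf_counter() - start:.1f}s")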

A start:

diff --git a/astroid/inference_tip.py b/astroid/inference_tip.py
index 9eda5b4f..c2b8ed7a 100644
--- a/astroid/inference_tip.py
+++ b/astroid/inference_tip.py
@@ -6,7 +6,6 @@
 
 from __future__ import annotations
 
-from collections import OrderedDict
 from collections.abc import Generator
 from typing import Any, TypeVar
 
@@ -19,9 +18,9 @@ from astroid.typing import (
     TransformFn,
 )
 
-_cache: OrderedDict[
+_cache: dict[
     tuple[InferFn[Any], NodeNG, InferenceContext | None], list[InferenceResult]
-] = OrderedDict()
+] = {}
 
 _CURRENTLY_INFERRING: set[tuple[InferFn[Any], NodeNG]] = set()
 
@@ -61,10 +60,12 @@ def _inference_tip_cached(func: InferFn[_NodesT]) -> InferFn[_NodesT]:
             # test cases included with this commit.
             _CURRENTLY_INFERRING.add(partial_cache_key)
             try:
-                # May raise UseInferenceDefault
-                result = _cache[func, node, context] = list(
-                    func(node, context, **kwargs)
-                )
+                result = list(func(node, context, **kwargs))
+            except UseInferenceDefault:
+                raise
+            else:
+                _cache.clear()
+                _cache[func, node, context] = result
             finally:
                 # Remove recursion guard.
                 try:
@@ -72,9 +73,6 @@ def _inference_tip_cached(func: InferFn[_NodesT]) -> InferFn[_NodesT]:
                 except KeyError:
                     pass  # Recursion may beat us to the punch.
 
-                if len(_cache) > 64:
-                    _cache.popitem(last=False)
-
         # https://github.com/pylint-dev/pylint/issues/8686
         yield from result  # pylint: disable=used-before-assignment
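
(With _cache.clear() before every insertion the plain dict never holds more than one entry, so the OrderedDict import and the manual popitem() eviction removed in the last hunk become dead weight.)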

Or, it's possible #529 should be investigated first.

Checked a couple of packages and there is still a speed cost to reducing the cache size from 64 to 1, so I'm not planning to change this ATM.
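
For anyone who wants to reproduce that check: a rough, hypothetical way to count how much reuse a size-1 cache would forfeit is to swap an instrumented mapping in for the module-level _cache before a lint run (none of these names exist in astroid):

from collections import OrderedDict


class CountingCache(OrderedDict):
    """Counts cache hits that only a multi-entry cache can serve."""

    deep_hits = 0    # hits on an entry other than the most recently stored one
    last_key = None  # key of the most recent insertion

    def __setitem__(self, key, value):
        type(self).last_key = key
        super().__setitem__(key, value)

    def __getitem__(self, key):
        value = super().__getitem__(key)  # raises KeyError on a miss
        if key != type(self).last_key:
            type(self).deep_hits += 1
        return value

Since inner() looks _cache up as a module global at call time, rebinding astroid.inference_tip._cache to a CountingCache() before linting and printing CountingCache.deep_hits afterwards should give a rough count of the extra inference a single-reference cache would redo.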