Improve `inferred` interface
ostr00000 opened this issue · 7 comments
Steps to reproduce
- Consider following code:
node = extract_node(
"""
if hasattr(typing, 'some_future_function'):
future_function = typing.some_future_function
else:
def future_function(arg: int) -> bool:
return False
future_function
"""
)
inferred_any = node.inferred()[0]
assert inferred_any is not util.Uninferable
Current behavior
This code fail, because it inferred
function return inferred values in order of its occurrence in analyzed code.
Expected behavior
A NodeNG
interface should have a method to take only one best inferred value.
Considered changes
- If function
inferred
find any nonUninferable
value, then remove allUninferable
from returning list:- probably many tests will fail,
- maybe sometimes we want to know about uncertainty inferred value,
- In function
inferred
before returning, we can sort values with most "valuable" values at the beginning:- some tests fail, because previously there wasn't such assumption about order,
- Add another method
def inferred_best(self) -> InferenceResult
toNodeNG
, where it returns single inferred value, possibly avoid Uninferable.
I think the 3rd option is the best and I prepared some code, but before creating pull request I want to know others opinions.
Rationale
It is important for other functionality (especially in pylint-dev/pylint#7565 - where inferred value is considered as Uninferable
despite having another better value).
inferred()[0]
is a frequent code (about 60 occurrences in astroid and pylint), so it seems reasonable to replace it with inferred_best()
- I tested such a change locally, and it seems ok.
python -c "from astroid import __pkginfo__; print(__pkginfo__.version)"
output
5.16.0dev0
I think this make sense if we want to decreases false positives in pylint. It should not help too much because we're supposed not to raise when we can't infer already. When you say you tested did you mean you ran pylint's tests with the modified code ? I would be very surprised if it did not lead at least to some behavior changes.
When you say you tested did you mean you ran pylint's tests with the modified code ?
Yes, pylint and astroid test. I created PR to run tests in astroid.
In pylint there are only 2 possible replacement locations - one of them is the code used by mentioned pylint issue.
I'm not sure if I agree with the proposal. It feels like we are hiding results with magic that makes the behaviour less explicable.
Uninferables
aren't incorrect or useless, they show that the inference "engine" has genuine difficulty with understanding the code. I think using this in pylint
only increases the risk of creating false positives as something that is now indeterminable would be considered determined by just taking part of the inference result.
I don't think we should support this a public API.
Well, I am ok if you reject it. But the problem still remains.
In a Python typing system, the expression future_function
from "Steps to reproduce" in this issue is Union[Any, Callable[[int], bool]]
. Currently, when we are considering the result of inferred
, we take only Any
type. In the ideal world, we should consider all possible union types. I do not understand why we take inferred()[0]
instead of for example inferred()[-1]
? - this is what I would call It feels like we are hiding results with magic
.
For me, the result is "random" - if we swap if-branches order in user code, we got totally different result.
Of course, we can still keep the result of such inference randomly, but then issues similar to pylint-dev/pylint#7565 are unable to resolve.
Prioritizing any "know type" over Uninferable
is in my opinion the most desired behavior. Alternatively, maybe we should choose Uninferable
over "know type", to make this function more deterministic?
The first change that I considered while solving pylint-dev/pylint#7565 was to make this sorting right here. But then I thought: "wait, that should be in interface of NodeNG
", so here is this issue.
I agree that Uninferables
aren't incorrect or useless. We have information whenever the type is fully/partially inferable from inferred
function. But the real use case of replaced calls in this PR is to get something better than Any
/Uninferable
.
Actually, now when I considered this typing topic deeper, I think that in the ideal world, instead of using inferred()[0]
, we should use a loop over all possible values. Obviously, this is totally impractical (both for library code and from the user point of view). This is the reason why Union
is not allowed as a return type in typeshed
(typeshed convention: avoid union return types
). Additional explanation:
an operation is okay for a union type if it is valid for all items of the union
python/mypy#1693 (comment)
Summing up:
- We should decide about determinism of
inferred
(+ maybe some comment in docs?), - Is function
inferred_best
improve the usefulness of pylint or make it worse? - How the original issue that I was going to fix pylint-dev/pylint#7565 should be solved if this PR will be rejected?
Well, I am ok if you reject it. But the problem still remains.
In a Python typing system, the expression
future_function
from "Steps to reproduce" in this issue isUnion[Any, Callable[[int], bool]]
. Currently, when we are considering the result ofinferred
, we take onlyAny
type. In the ideal world, we should consider all possible union types. I do not understand why we takeinferred()[0]
instead of for exampleinferred()[-1]
? - this is what I would callIt feels like we are hiding results with magic
. For me, the result is "random" - if we swap if-branches order in user code, we got totally different result.
This is why we have safe_infer
. Ideally we would always check all results of infer()
or use safe_infer
to make sure we are using the "full" result and not [0]
or [-1]
.
I know there are still cases in the code where we don't do this, but those should ideally be refactored.
Of course, we can still keep the result of such inference randomly, but then issues similar to PyCQA/pylint#7565 are unable to resolve. Prioritizing any "know type" over
Uninferable
is in my opinion the most desired behavior. Alternatively, maybe we should chooseUninferable
over "know type", to make this function more deterministic?
I think we shouldn't prioritize anything. We should either check that all results are of the same type or indicate that there is ambiguity. See safe_infer
as mentioned above.
The first change that I considered while solving PyCQA/pylint#7565 was to make this sorting right here. But then I thought: "wait, that should be in interface of
NodeNG
", so here is this issue. I agree thatUninferables
aren't incorrect or useless. We have information whenever the type is fully/partially inferable frominferred
function. But the real use case of replaced calls in this PR is to get something better thanAny
/Uninferable
. Actually, now when I considered this typing topic deeper, I think that in the ideal world, instead of usinginferred()[0]
, we should use a loop over all possible values. Obviously, this is totally impractical (both for library code and from the user point of view). This is the reason whyUnion
is not allowed as a return type intypeshed
(typeshed convention:avoid union return types
). Additional explanation:an operation is okay for a union type if it is valid for all items of the union
python/mypy#1693 (comment)
That explanation of not using Union
is irrelevant here and we shouldn't care too much about typing anyway. pylint
doesn't concern itself with type annotations but only with what it can infer statically.
Summing up:
1. We should decide about determinism of `inferred` (+ maybe some comment in docs?),
As said above, let's not do determinism but allow the user/library to determine the degree of determinism they want. astroid
/pylint
can't get that degree right 100% of the time so we should not try to do so.
2. Is function `inferred_best` improve the usefulness of pylint or make it worse?
Use safe_infer
or consider all results of infer()
3. How the original issue that I was going to fix [typing_extensions assert_never's Never not supported pylint#7565](https://github.com/PyCQA/pylint/issues/7565) should be solved if this PR will be rejected?
This is a different issue again. It is expected that code that isn't available on Python 3.10
will show up as Uninferable
. We don't maintain a list of things that are added in different version so that you can still lint Python 3.10
code with a 3.7
interpreter. The suggested fix should be applied so that pylint
ran with 3.11
behaves correctly, but we don't do "backwards compatibility" of linting new features on older interpreters (that would be way too much work to maintain).
maybe we should choose Uninferable over "know type"
This is why we have safe_infer.
It seems that this is what I overlooked.
pylint doesn't concern itself with type annotations but only with what it can infer statically.
Yes, I know it, but the inferred types (union of types) are still subject to typing laws, but I see you are aware of it (Ideally we would always (...) make sure we are using the "full" result and not [0] or [-1]
).
It is expected that code that isn't available on Python 3.10 will show up as Uninferable
It makes sense. Although, from this sentence it can be concluded that pylint will never support a "compatibility" library. By compatibility, I mean a lot of if-statements with hasattr
, import redirections and alternative declarations (ex. typing_extensions
).
The only code without errors is when we use the newest version of a redirected library (ex. typing
), which make the compatibility library a bit useless.
Anyway, I am closing this issue along with PR, since this I have no more concerns. Thanks for a clean explanation!
It makes sense. Although, from this sentence it can be concluded that pylint will never support a "compatibility" library. By compatibility, I mean a lot of if-statements with
hasattr
, import redirections and alternative declarations (ex.typing_extensions
). The only code without errors is when we use the newest version of a redirected library (ex.typing
), which make the compatibility library a bit useless.
You could look into using py-version
. We try to maintain backwards compatibility for newer interpreters for old versions. So, we warn when you use something that is not available in the lower bound version you set with py-version
. It also has its flaws and is not implemented for 100% of the checks (although the most important ones are), but it might help out here!
Thanks for a clean explanation!
Thanks for the compliment! And thank you for opening the issue and engaging in a meaningful discussion. Even though we haven't helped you directly it is also good for us to see how people use pylint
and what limitations they run into.