IDAES/idaes-pse

Add Pylint checks forcing Pyomo rules to use only local variables

dallan-keylogic opened this issue · 5 comments

While debugging issues with collocation discretization for dynamics, we found that nonintuitive behavior of Python closures when defining functions with nonlocal variables can cause major problems. The behavior at issue is this:

In [1]: string = "foo"

In [2]: def print_foo():
   ...:     print(string)
   ...:

In [3]: string = "bar"

In [4]: print_foo()
bar

The behavior I expected was that when print_foo is defined, it would freeze the value of string to be what it was when print_foo was defined. Instead, it is still updated from the outer context. This can cause a ton of trouble when Pyomo rules are created using nonlocal variables. For example, in the Separator, a rule is defined using multiple variables from outside the local scope:

image

The same problem occurs: the nonlocal variables change in a loop, the equations are generated correctly the first time, but then are incorrectly generated at additional timepoints by PyomoDAE.

To prevent this problem in the future, I propose that checks be added (via Pylint?) that any function with rule in the name, and any functions with the @Constraint or @Expression decorators be forced to use only local variables.

I've searched for "closures" in the Pylint 2.12.2 docs and cell-var-from-loop seems to be the only match: https://pylint.readthedocs.io/en/v2.12.2/technical_reference/features.html#variables-checker-messages

image

Unfortunately, I'm not sure if that check can be used as-is for this purpose. I'll try to look into (a) searching for similar checks that might have been added in later versions of Pylint, and (b) how the check is implemented in Pylint, to get a sense of how hard it would be to adapt.

I think that check would work to prevent the sorts of problems we're getting from these instances. Being more paranoid than that might be good practice in general, but if we can get most of the value from something built in, no need to put in a huge amount of effort.

I agree with the general principle that "something is better than nothing" when it comes to these checks.

My doubt about this specific check is whether it might be too narrow to be useful: from its name and description (i.e. without having performed tests or looked at how it's implemented), my understanding is that it only triggers if the closure is defined in a for loop, and not in other scopes (such as the if block in the initial example).

The original example is actually nested inside a big for loop, which is causing the problem. It's just too big to actually paste in (look at the PR I opened if you want to see what's going on in context).

Ooh, I see, thanks for clarifying. In this case, I agree it makes sense to at least try enabling the check and see what happens (possibly as part as #1242 itself?).