Discussion : Improving handling of co_flags

Question

Closed this issue 7 years ago · 14 comments

Currently the flags for a code object need to be specified manually as a single integer. This approach offers maximum flexibility but is also error prone, and I believe it could be improved. The following are the 'official' flags (i.e. excluding future flags such as CO_FUTURE_BARRY_AS_BDFL and CO_FUTURE_GENERATOR_STOP).

From dis.COMPILER_FLAG_NAMES

  1 OPTIMIZED
  2 NEWLOCALS
  4 VARARGS
  8 VARKEYWORDS
 16 NESTED
 32 GENERATOR
 64 NOFREE
128 COROUTINE
256 ITERABLE_COROUTINE
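
To make the table concrete, here is a minimal sketch that decodes an integer co_flags value back into names using dis.COMPILER_FLAG_NAMES. The helper name flag_names and the sample function are assumptions made for illustration; the exact set of names printed depends on the Python version.

```python
import dis


def flag_names(co_flags):
    """Return the names of the bits set in co_flags, lowest bit first.

    Hypothetical helper, for illustration only.
    """
    return [
        name
        for bit, name in sorted(dis.COMPILER_FLAG_NAMES.items())
        if co_flags & bit
    ]


def example(*args, **kwargs):
    return args, kwargs


# The output includes VARARGS and VARKEYWORDS; other names vary by version.
print(flag_names(example.__code__.co_flags))
```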

Among these we can identify roughly three families (please correct me if I got this wrong):

- flags completely independent of the underlying code: NEWLOCALS, VARARGS, VARKEYWORDS, ITERABLE_COROUTINE
- flags completely dependent on the underlying code: OPTIMIZED, NOFREE, GENERATOR
- in-between flags:
  - NESTED: applies only to function-like code defined inside another function (and honestly I have difficulty understanding what it does...)
  - COROUTINE: can be obvious if GET_AWAITABLE is used, but that is not always the case

Furthermore COROUTINE and ITERABLE_COROUTINE are incompatible.
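
An incompatibility like this is the kind of rule a higher-level construct could enforce. A minimal sketch, using the numeric values from the table above; has_conflict and check_flags are hypothetical helpers, not part of bytecode or byteplay:

```python
# Numeric values taken from the flag table in this issue.
CO_COROUTINE = 128
CO_ITERABLE_COROUTINE = 256


def has_conflict(flags):
    """Return True if both mutually exclusive coroutine bits are set."""
    exclusive = CO_COROUTINE | CO_ITERABLE_COROUTINE
    return (flags & exclusive) == exclusive


def check_flags(flags):
    """Return flags unchanged, or raise ValueError on a conflict."""
    if has_conflict(flags):
        raise ValueError("COROUTINE and ITERABLE_COROUTINE cannot both be set")
    return flags
```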

I therefore believe it would be useful to be able to:

- manually specify the values of the first kind through a higher-level construct
- have the proper values of the second kind computed automatically
- specify whether the code is nested (byteplay does this by passing a keyword argument to to_code) and whether the code comes from a function (byteplay only guesses here)
- force the coroutine behavior or have it inferred

Looking at how byteplay handles this, there are a number of attributes on the code object itself that allow specifying the flags (plus the from_function keyword argument of to_code). Moreover, generator behavior can be forced.

I do not have a specific implementation in mind, but I think that keeping the flag logic in a separate class that differentiates between default values (from the original code or guessed) and values forced by the user may help. The conversion to an int would obviously require the code.
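
One possible shape for such a class, as a rough sketch only: FlagState and its methods are hypothetical names, and the bits inferred from the code are passed in as a plain int rather than computed here.

```python
class FlagState:
    """Keep default flag bits separate from bits forced by the user.

    Hypothetical class, for illustration only.
    """

    def __init__(self, defaults=0):
        self._defaults = defaults  # bits taken from the original code or guessed
        self._forced = {}          # bit -> bool, explicit user overrides

    def force(self, bit, value):
        """Force a single bit on (True) or off (False)."""
        self._forced[bit] = bool(value)

    def unforce(self, bit):
        """Drop a user override so the default or inferred value applies."""
        self._forced.pop(bit, None)

    def to_int(self, inferred=0):
        """Combine defaults, inferred bits, and overrides into one int."""
        flags = self._defaults | inferred
        for bit, value in self._forced.items():
            if value:
                flags |= bit
            else:
                flags &= ~bit
        return flags


state = FlagState(defaults=1 | 2)   # OPTIMIZED | NEWLOCALS from the original code
state.force(32, True)               # force GENERATOR on
state.force(1, False)               # force OPTIMIZED off
result = state.to_int(inferred=64)  # pretend NOFREE was inferred from the code
print(result)                       # 98 (NEWLOCALS | GENERATOR | NOFREE)
```

User overrides are applied last, so they always win over both the defaults and whatever was inferred.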

What do people think?

Answer 1 · 2017-01-09T13:36:53.000Z

I propose to experiment with creating a new bytecode.Flags class.

Maybe bytecode could accept both Flags and int types for flags (convert int to Flags implicitly)?
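
A minimal sketch of what accepting both types could look like. It assumes a Flags class built on enum.IntFlag (Python 3.6+); both Flags as written here and the as_flags helper are hypothetical, not the library's actual API.

```python
import enum


class Flags(enum.IntFlag):
    # Subset of the flag table from this issue, for illustration.
    OPTIMIZED = 1
    NEWLOCALS = 2
    VARARGS = 4
    VARKEYWORDS = 8


def as_flags(value):
    """Accept either a Flags instance or a plain int and return Flags."""
    if isinstance(value, Flags):
        return value
    # bool is a subclass of int; reject it to catch accidental True/False.
    if isinstance(value, int) and not isinstance(value, bool):
        return Flags(value)
    raise TypeError("flags must be an int or Flags, not %s" % type(value).__name__)
```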

Answer 2 · 2017-02-27T12:33:41.000Z

Why not use enum.IntFlag instead of enum.IntEnum?

Answer 3 · 2017-02-27T14:14:32.000Z

Because IntFlag is new in 3.6.

Answer 4 · 2017-02-27T14:34:42.000Z

Maybe import an external enum implementation on 3.5 and earlier?

Answer 5 · 2017-02-27T15:05:00.000Z

I would like @Haypo's point of view before adding an external dependency, especially one that would be version dependent.

Answer 6 · 2017-02-27T15:34:23.000Z

Sorry, I don't know enum.IntFlag well. Could you please show some examples compared to Matthieu's class?

Answer 7 · 2017-02-27T15:59:58.000Z

It is similar to IntEnum, but supports combinations. It is just a bit mask with a fancy representation.

>>> import enum
>>> class Flags(enum.IntFlag):
...     CO_OPTIMIZED             = 0x00001  # noqa
...     CO_NEWLOCALS             = 0x00002  # noqa
...     CO_VARARGS               = 0x00004  # noqa
... 
>>> Flags.CO_OPTIMIZED|Flags.CO_NEWLOCALS
<Flags.CO_NEWLOCALS|CO_OPTIMIZED: 3>
>>> print(Flags.CO_OPTIMIZED|Flags.CO_NEWLOCALS)
Flags.CO_NEWLOCALS|CO_OPTIMIZED
>>> int(Flags.CO_OPTIMIZED|Flags.CO_NEWLOCALS)
3
>>> Flags(3)
<Flags.CO_NEWLOCALS|CO_OPTIMIZED: 3>
>>> (Flags.CO_OPTIMIZED|Flags.CO_NEWLOCALS) & Flags.CO_NEWLOCALS
<Flags.CO_NEWLOCALS: 2>
>>> (Flags.CO_OPTIMIZED|Flags.CO_NEWLOCALS) & ~Flags.CO_NEWLOCALS
<Flags.CO_OPTIMIZED: 1>

Answer 8 · 2017-02-27T16:13:52.000Z

If we use enum.IntFlag, we need to write Matthieu's logic to compute "implicit flags" somewhere else, near ConcreteBytecode.to_code().

How do you represent "no flag" (flags=0) using IntFlag?

Answer 9 · 2017-02-27T16:29:19.000Z

You can just use Flags(0), or add an explicit name for the value 0.

>>> class Flags(enum.IntFlag):
...     NO_FLAGS = 0
...     CO_OPTIMIZED = 1
...     CO_NEWLOCALS = 2
... 
>>> Flags(0)
<Flags.NO_FLAGS: 0>
>>> print(Flags(0))
Flags.NO_FLAGS
>>> Flags.NO_FLAGS|Flags.CO_OPTIMIZED
<Flags.CO_OPTIMIZED: 1>

Answer 10 · 2017-02-27T18:41:55.000Z

The issue I see is that there will be no way to let the user override the flag inference in such a case (what the _forced dict is currently used for).

Answer 11 · 2017-03-03T11:15:38.000Z

I addressed some of the comments in #20, but we need to decide on a representation of the flags before I can finish. If we go with IntFlag, we need a third-party library to provide it for Python < 3.6 (or we rewrite it). One possible way to handle the inference would be a helper function that returns a new flag value from a flag value and a bytecode (I think this can generalize to non-concrete bytecode); it would then be up to the user to update the flags after modifying the bytecode. Otherwise we keep the two classes we have and I just need to address the last comments.
After thinking about it a bit, I think using IntFlag would probably lead to the least confusing situation (the None/True/False values of inferable flags are not really obvious). Once we settle on a provider of IntFlag, I can update my patch.
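
A rough sketch of such a helper, under several assumptions: it works on a plain code object through dis rather than on a bytecode object, Flags and infer_flags are hypothetical names, only NOFREE and GENERATOR are recomputed, and the generator check is a heuristic rather than what the compiler does.

```python
import dis
import enum


class Flags(enum.IntFlag):
    # Values from the flag table in this issue.
    OPTIMIZED = 1
    NEWLOCALS = 2
    VARARGS = 4
    VARKEYWORDS = 8
    NESTED = 16
    GENERATOR = 32
    NOFREE = 64
    COROUTINE = 128
    ITERABLE_COROUTINE = 256


def infer_flags(code, flags):
    """Return a new Flags value with the code-dependent bits recomputed."""
    value = int(flags)  # plain int arithmetic, then convert back at the end

    # NOFREE holds exactly when the code has no free or cell variables.
    if code.co_freevars or code.co_cellvars:
        value &= ~int(Flags.NOFREE)
    else:
        value |= int(Flags.NOFREE)

    # Heuristic: a yield opcode marks a generator, unless the caller already
    # declared the code to be a coroutine (coroutine bytecode may yield too).
    if not value & int(Flags.COROUTINE):
        opnames = {instr.opname for instr in dis.get_instructions(code)}
        if opnames & {"YIELD_VALUE", "YIELD_FROM"}:
            value |= int(Flags.GENERATOR)
        else:
            value &= ~int(Flags.GENERATOR)
    return Flags(value)


def plain():
    return 1


def gen():
    yield 1


def outer():
    x = 1

    def inner():
        return x

    return inner
```

With this design the user keeps full control: the helper never mutates anything, so overriding the inference amounts to editing the returned value before passing it on.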

Answer 12 · 2017-03-03T11:54:34.000Z

Sorry, I don't have a strong opinion on the flags API. I'll let you decide with @serhiy-storchaka :-)

Answer 13 · 2017-03-03T12:08:21.000Z

@serhiy-storchaka, are you aware of any backport of the Python 3.6 enum module? Otherwise I fear I will have to copy-paste it from CPython, which is not ideal.

Answer 14 · 2017-11-19T21:04:38.000Z

Closed by #23