datetime extension module isolation
Closed this issue · 22 comments
PEP-687 is about to be implemented to the _datetime
module.
Two approaches are proposed at present. I would like Steering Council to choose one.
-
(regular)
Multi-phase-init with per-module heaptypes (and C-API).
abi: Store a reference (to the C-API struct) inPyInterpreterState
. -
(special)
Multi-phase-init with per-interpreter heaptypes (and C-API).
abi: Store at least 7 heaptypes' references inPyInterpreterState
.
The following shows the replacements of _datetime
heaptypes (multi-phase init module) when running test_datetime
in the refleak test:
main interp sequence | 1.per-module | 2.per-interp |
---|---|---|
import module | heaptypes_A | heaptypes_A |
module reload (new module) | heaptypes_B (new) | (no change) |
module reload (new module) | heaptypes_C (new) | (no change) |
interp finish |
The approach (2) avoids possible regressions1 caused by mixing modules with forbidden hacky usage23. Technically, (2) does not make sense (1) should be enough. In my opinion, unknown regressions should not be hidden by (2). Also, (2) will be a bad official example of isolation. However, there seems to be a case where maintaining (1) is not good for mental health, which is also serious.
Related issue/PR: gh-117398, gh-118357
Thanks
UPDATED: a few rewords
Footnotes
FYI, _datetime
code owners are not involved in PEP-687 implementation (no response).
Intentionally reloading modules isn't forbidden usage. It's typically weird usage, but it's not forbidden. We include explicit workarounds in the import machinery so the interpreter doesn't die when extension modules using single-phase init are reloaded, and the impact of those workarounds results in behaviour similar to what @vstinner has proposed in gh-117398.
The benefit of adopting an approach like the one Victor suggests in a case like this one is that it allows "support per-interpreter imports" to be separated from "generate completely new heap types when reloading the module" (in this particular case, it isn't clear the latter is actually desirable behaviour, and it is definitely backwards incompatible behaviour).
The extra pointer to a substruct in PyInterpreterState
isn't ideal, but getting rid of it by migrating to a different C API design that doesn't need it is better deferred until after the migration to multi-phase init (especially since it also makes it much clearer where the problems are actually arising).
gh-118337 fully implements:
1.(regular)
Multi-phase-init with per-module heaptypes (and C-API).
abi: Store a reference (to the C-API struct) in PyInterpreterState.
It passes refleak bot tests where the module is reloaded for a good reason.
As noted in python/cpython#117398 (comment), this may be a test coverage concern moreso than an SC design decision. I doubt the existing test suite explicitly covers the situation we're concerned about (a capsule or C API pointer reference acquired before the reload being used after the reload and crashing the interpreter), so even green CI may not be enough to be certain that full reload support is backwards compatible.
I would llke an SC decision about how much a module can depend on PyInterpreterState
.
_datetime
and other modules are not so different except the presence of a capsule C-API, and the capsule is not so different from the module state.
While you're correct that PyModule_GetState
and PyCapsule_GetPointer
are both technically exposed to the same pointer invalidation problem if a C module is reloaded, there's a key difference in their intended usage model (albeit not a well-documented one, since the distinction never mattered when single-phase init was the only option and hence there was no true reloading of extension modules):
PyModule_GetState
is for the module itself to use. If the module gets genuinely reloaded, references to the old module state will be replaced by references to the new module state. Other code should NOT be callingPyModule_GetState
, and if it is, and the offending code breaks on module reload, the response is going to be "Don't do that then".- By contrast,
PyCapsule_GetPointer
is explicitly an export API. Once another module has received a capsule object from the exporting module, there isn't any expectation that reloading the exporting module might invalidate previously exported capsule instances (and under single phase initialisation, they don't get invalidated, since the extension modules are never truly reloaded). Code that is using the capsule API can break through no fault of its own if we don't preserve capsule pointer stability across module reloads.
Don't get me wrong, I think you've identified a genuine functional gap in the multi-phase init feature here: the existence of the capsule API means that there really should be a public C API for extension modules using multi-phase init to define and manage per-interpreter-instance state in addition to the existing support for defining per-module-instance state. While the standard library can work around this problem by making use of PyInterpreterState
, third party modules don't have that luxury.
In the specific case of migrating datetime
to multi-phase init, though, it's OK for the sequence of events to be:
- migrate
datetime
usingPyInterpreterState
to maintain pointer stability across reloads for the capsule export API and the public C API - subsequently use
datetime
and its capsule API as a motivating example for adding a public API to allow multi-phase init extension modules to define and access per-interpreter-instance state in a general way (this could potentially be as simple as a new internal state dict keyed by module name, akin to thesys.modules
cache, but not exposed via a Python API) - adjust any standard library modules affected by this problem to use the new public per-interpreter state management API rather than private module specific workarounds
While the standard library can work around this problem by making use of
PyInterpreterState
, third party modules don't have that luxury.
What's wrong with using PyInterpreterState_GetDict
for this? (Personally I use it to stash a reference to my module so that I have something to PyModule_GetState
on later.)
Threading while reloading is probably a nightmare, but it's always going to be a nightmare. Ignoring that, PyDict_SetItemString(...)
during module exec seems safe enough?
- By contrast,
PyCapsule_GetPointer
is explicitly an export API.
If datetime
API includes the direct use of PyCapsule_GetPointer()
, a capsule should be held until the interpreter finishes (by using PyInterpreterState_GetDict()
?). The API struct could be updated by memcpy()
or something on every module-reload. (I'm honestly not so worried about C-API here)
Per-module heaptypes should be enough, but may be too technical may not be intuitive. I would like to know the rule and the exception.
@erlend-aasland also pointed out this past discussion of the external-resource-management variant of this problem (which leads to needing the ability to set up per-process locks rather than per-interpreter ones): https://discuss.python.org/t/a-new-c-api-for-extensions-that-need-runtime-global-locks/20668
Did @erlend-aasland point out that the problem applies to _datetime
?
The linked discussion was not about _datetime
, IIRC. However, such an API could have been used to solve the probablems with the encapsulated _datetime
C API in a nice way, I think.
Right, the specific problem in the _datetime
case is ensuring pointer stability across module reloads for both its C API and its capsule API.
There is a similar-but-distinct problem that comes up when managing external (non-Python) resources that don't allow reinitialisation within a single process: module reloads need to avoid reinitialising the affected external resources, and instead make them available to the reinitialised instance of the module.
While the referenced thread is specifically about the external resource management variant of the problem, any public API we come up with needs to handle both cases.
Getting back to the original question for the SC here though, I think it boils down to:
- We have established that some extension modules need to preserve some of their state across reloads when migrating to multi-phase initialisation. However, there is not yet a public API for this, so the simplest option for affected modules is to stick with single-phase initialisation until such an API is defined.
- We'd like to make more standard library extension modules compatible with subinterpreters running without a shared GIL, which means migrating them to use multi-phase initialisation
- Some standard library extension modules are affected by the need to maintain pointer stability or preserve external resources across reloads. In such cases, the current workaround is to use
PyInterpreterState
fields that point to nested structs until such time as a standardised public mechanism is defined for management of extension module state that needs to be preserved across reloads. - Migrating the
_datetime
module to multi-phase initialisation without ensuring public pointer stability across module reloads doesn't actually cause any CI failures, but this is potentially due to a lack of relevant CI test coverage rather than due to it actually being a safe change.
I think the only SC level decision here is "Is it OK to continue using the PyInterpreterState
workaround to allow standard library extension modules to be migrated to multi-phase initialisation in the absence of a public API for preserving state across reloads?". It doesn't solve the problem for third party modules, but it does move the problem off the critical path for improvements to subinterpreter support.
The specific case of _datetime
is better resolved by adding the missing test coverage to the multi-phase init migration PR and seeing if it fails (and assuming it fails as expected, using the PyInterpreterState
workaround to eliminate the failure).
_datetime
is the last remaining extension module that need to be migrated.
While we can suggest that the new generic C-API is needed, there seems to be no individual discussion (issue) about its design related to _datetime
as well, which I think is one reason for the delay of _datetime
migration.
so the simplest option for affected modules is to stick with single-phase
I think it is OK for third party modules to keep single-phase init. The migration is up to users.
1.We have established that some extension modules need to preserve
3.Some standard library extension modules are affected by the need to maintain pointer stability
What standard library extension modules are you suggesting?
The specific case of
_datetime
is better resolved by adding the missing test coverage to the multi-phase init migration PR and seeing if it fails (and assuming it fails as expected, using the PyInterpreterState workaround to eliminate the failure).
You mean per-interpreter heaptypes for _datetime
to eliminate the failure, right?
If so, do you have a regression case which cannot follow the current convention (consensus)?
@ncoghlan, what do you think of introducing the a new (internal, for now) API, as discussed, and then adopt _datetime
to use it? There should be sufficient time before the 3.14 alpha to get it in.
@erlend-aasland Can you explain how the discussion resolves _datetime
isolation? If it is really effective, why didn't you either propose nor apply it before the 3.12 beta?
[...] why didn't you either propose nor apply it before the 3.12 beta?
I work on CPython on my spare time; I'm not paid to work on CPython.
Then, why didn't you suggest it on your spare time?
I don't think this is a productive line of conversation; let's drop it.
_datetime
needs rationales if per-interpreter heaptypes are required.
_datetime needs rationales if per-interpreter heaptypes are required.
If you contact me in private, I propose to discuss it with you.
If you contact me in private, I propose to discuss it with you.
I'm here to follow SC because you don't look in good condition now to discuss it.
There are more appropriate avenues for resolving this issue than asking for the SC to make a decision. Those include opening a new issue on the CPython tracker or commenting on an existing relevant issue, engaging in a discussion on discuss.python.org, or reaching out the the C API working group. In all such forums, please keep in mind that the Python Code of Conduct applies to all workspaces in order to keep the conversations respectful and productive.