[BUG] Problem when creating derived Python objects from C++ (inheritance slicing)
mhochsteger opened this issue · 17 comments
EDIT(eric): Adding a tracking list.
- Enumerate possible solutions and drawbacks, e.g.
- #2757 Using
py::wrapper<>
and GC ressurrection - Others...
- #2757 Using
- Identify best path forward, implement.
Issue description
Hello,
Consider deriving a C++ class in Python and overloading virtual functions, including a 'Clone' function.
When passing a derived Python object to C++ without holding a reference (like calling Clone()), the information about the Python type is lost. In this case the wrong base-class function is called. More complex cases may result in a SEGFAULT.
Is there any way to work around/fix this without causing memory leaks?
Thanks!
Reproducible example code
test.cpp
// test.cpp
#include <pybind11/pybind11.h>
#include <Python.h>
#include <iostream>
#include <memory>
using std::cout;
using std::endl;
using std::unique_ptr;
using std::shared_ptr;
using std::make_shared;
namespace py = pybind11;
class Base {
public:
virtual void Print() {
cout << "Base!" << endl;
}
virtual shared_ptr<Base> Clone() {
return make_shared<Base>();
}
};
class PyBase : public Base {
public:
using Base::Base; // Inherit constructors
void Print() override { PYBIND11_OVERLOAD(void, Base, Print, ); }
shared_ptr<Base> Clone() override { PYBIND11_OVERLOAD(shared_ptr<Base>, Base, Clone, ); }
};
PYBIND11_MODULE(libtest, m) {
py::class_<Base, py::wrapper<PyBase>, shared_ptr<Base>>(m, "Base")
.def(py::init<>())
;
m.def("Test", [] ( shared_ptr<Base> b ) {
shared_ptr<Base> d = b->Clone();
// Python object of d is already dead!
d->Print(); // prints "Base!"
});
}
test.py
from libtest import *
class Derived(Base):
def __init__(self):
super().__init__()
def Print(self):
print("Derived!")
def Clone(self):
return Derived()
Test(Derived())
This seems a surprisingly hard problem. The main issue is that the C++ Clone
and the Python Clone
do fundamentally different things: the former returns a new C++ instance, the latter returns a new Python instance. The way referencing works is that the Python instance extends the C++ instance, rather than the other way around, which means casting to the C++ type isn't sufficient to keep the Python instance alive: you can easily end up being the only holder of the C++ instance, as happens here. (I don't think this will segfault: I think overload lookups will simply always fail in such a case since the owning object no longer exists).
The PyBase
class could hold a py::object
reference to the constructed object (you'd have to expand the PYBIND11_OVERLOAD
and change it to save the auto o
--which here is py::object
), but then you've got a circular reference that can't be automatically cleaned up.
As I said, I don't see an easy way around this.
I worked it around by creating a manager class to hold the reference to the derived Python objects. It's ad hoc and ugly, but since I know exactly when to release the derived Python objects, it works for me.
I think this issue duplicates with #1145 . boost.python doesn't have this issue. I think this is a missing piece in pybind11.
Agreed that it would be nice. (Retitling).
Just curious (for some of my WIP PRs related to #1145): Can I ask if you know of an example boost.python code snippet that shows this working?
Also, do you happen to know where in the boost.python code this case is specifically handled? (preventing type slicing between C++ / Python?)
We are currently use a semi-hacky fork of pybind11
that handles this case for the Underactuated Robotics course / textbook; however, if there is a more elegant solution, I'm all for it.
I am not aware of an open source or self-containing example of it, but pretty sure boost.python works great. I didn’t trace that deep to boost.python to learn its mechanism for that feature. I’d guess it somehow holds the reference to the Python incarnation, since C++ types essentially is different from Python.
Gotcha, thanks!
I've briefly skimmed through the Boost.Python
source code for the component I think that's responsible, wrapper<T>
+ wrapper_base
, and I do see self-referencing via PyObject*
, but I have not yet seen any reference count increments to increase Python lifetime to match C++:
https://github.com/boostorg/python/blob/7352c9c0f770633e695aa8f48b647aa7a78e49c7/include/boost/python/detail/wrapper_base.hpp#L78
https://github.com/boostorg/python/blob/7352c9c0f770633e695aa8f48b647aa7a78e49c7/include/boost/python/object/pointer_holder.hpp#L200
Will see if I can dig a wee bit more to figure out if it does handle this, and maybe whip up a test case akin to what's in #1146.
Another possible solution was suggested here: #1049 (comment)
I'll close this for now, as the issue is +2 years old. Do reopen if necessary!
I really like this issue in lieu of the other 6 or so issues that we have for this bug. I'm going to reopen this, and close the others.o
@YannickJadoul mentioned a related class of issue (mismatch between Python and C++ instance lifetimes): #1941
Hey - I've just come across this issue in working on OpenSpiel (https://github.com/deepmind/open_spiel)
The workaround I'm going with for now is that when created, the C++ trampoline class owns the Python object that backs it. The trampoline can then be passed around in C++ land without the Python object disappearing.
Then if / when the object is returned to Python, the C++ object releases the reference held to the Python object, after the pybind side has taken ownership of the Python object.
None of this is very pretty in user code, but could presumably be much nicer if incorporated into pybind11.
Howdy Edward! Sorry this bit ya, but that's cool you got it to work! I think that may be along the lines of what Jason mentioned up at the beginning of the thread (#1333 (comment)); also, I had done a similar thing in our fork, but it is a bit kludgy (and required in-place modifications to detect ownership transfer between C++ and Python) - https://github.com/RobotLocomotion/pybind11/blob/58a368ea8e89638e01a7385cad9ce4345d70003d/include/pybind11/pytypes.h#L1517-L1614
Any chance you're able to share a consolidated repro of your workaround in user code?
If you're affected by this issue, please check out the smart_holder branch, it is solved there:
https://github.com/pybind/pybind11/blob/smart_holder/README_smart_holder.rst
We've just ran into this issue. Whilst we wait for #2839 (or similar) to fix this, I would be very grateful if one of the pybind gods could take a look at our workaround (iterated from the workaround proposed in #1546) to flag any gotchas (tests so far seem to work).
The basic idea is to derive a PySharedPtr
from shared_ptr
, and use this new PySharedPtr
in the signature of .def
ed functions where we need to keep alive the Python object.
A custom type_caster
for PySharedPtr
then returns an aliasing shared_ptr
that wraps the py::object
lifetime but dereferences to the C++ type.
I believe this works by decoupling the originating shared_ptr
of the C++ object and the new shared_ptr
passed to C++ functions, using the Python refcount to link the two.
template <class T>
struct PySharedPtr : std::shared_ptr<T> {
using std::shared_ptr<T>::shared_ptr;
};
namespace pybind11::detail {
template <typename T>
class type_caster<PySharedPtr<T>> : public type_caster_holder<T, std::shared_ptr<T>> {
PYBIND11_TYPE_CASTER(PySharedPtr<T>, _("PySharedPtr"));
using BaseCaster = type_caster_holder<T, std::shared_ptr<T>>;
bool load(pybind11::handle src, bool convert) {
if (!BaseCaster::load(src, convert)) {
return false;
}
auto pyObj = reinterpret_borrow<object>(src);
// Construct a shared_ptr to the py::object
auto pyObjPtr = std::shared_ptr<object>{new object{pyObj}, [](auto pyObjectPtr) {
gil_scoped_acquire gil;
delete pyObjectPtr;
}};
value = PySharedPtr<T>(pyObjPtr, static_cast<T*>(BaseCaster::value));
return true;
}
};
} // namespace pybind11::detail
Gotchas... that's a very difficult question. When I worked on the smart_holder implementation, several subtle issues only became known through extensive testing. Anticipating just based on code and imagination is hard.
After only a quick look:
Minor:
class type_caster<PySharedPtr<T>> : public type_caster_holder<T, std::shared_ptr<T>> {
This is more direct (elides a conditional_t
that is not actually conditional):
class type_caster<PySharedPtr<T>> : public copyable_holder_caster<T, std::shared_ptr<T>> {
IIUC this workaround doesn't solve #1138, i.e. this will only work with class_<T, std::shared_ptr<T>>
, is that a correct understanding?
Why do you prefer a workaround over using the smart_holder branch? — This is for me to understand what's missing.
Thanks for the feedback @rwgk!
-
Correct, this assumes the originating holder is a
shared_ptr
and the.def
ed function wants ashared_ptr
from pybind, but also wants to ensure the associated Python instance remains alive and isn't garbage collected because the trampoline needs it. -
I would love to use the smart_holder branch. However many of the adopters of our library will be integrating within an application already using pybind, in a rather conservative industry, so using anything other than an official release is probably not acceptable. Even upgrading to a new official release would have to be carefully considered.
Looking again for a couple minutes, what I see there looks intriguing, but the way I usually convince myself what I have is working as intended in all corner cases, is to write a bunch of unit tests. To be honest (to myself), usually I overlook a thing or two, and only after I have convinced myself (and reviewers) that the unit tests cover all use cases, and the testing passes with all sanitizers I can get my hands on, do I conclude that I'm probably OK, and count on time to prove me right or wrong. In the latter case, the response is to add more unit tests to cover what I overlooked.
You could mine the smart_holder unit tests (mostly test_class_sh_*) for ideas / starting points.
You may also want to look at this analysis: #2672 (comment)
The explicit and reinterpret casts between s<B>
and s<D>
(shared_ptr
base, derived, as defined there) are, to the best of my knowledge, undefined behavior, although evidently in practice that does not matter. Just something you may want to consider in your (company level) decision to not use the smart_holder branch.
in a rather conservative industry, so using anything other than an official release is probably not acceptable
Google "lives at head" as I often hear my colleagues say, i.e. we seem to be at the opposite ends of a spectrum. Which leads to: sorry I don't have the free energy right now that make the smart_holder branch "official" in some way. I'd love to do that sooner rather than later, but realistically, sooner will only happen with external help.