C++ Model
Say I have the following C++ data structures I wish to expose to Python.
#include <memory>
#include <vector>
struct mystruct
{
    int a, b, c, d, e, f, g, h, i, j, k, l, m;
};
typedef std::vector<std::shared_ptr<mystruct>> mystruct_list;
Boost Python
I can wrap these fairly effectively using boost::python with the following code, easily allowing me to use the existing mystruct (copying the shared_ptr) rather than recreating an existing object.
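#include "mystruct.h"
#include <boost/python.hpp>
using namespace boost::python;

// vector::at is overloaded (const and non-const), so select the const
// overload explicitly before passing it to def().
typedef mystruct_list::const_reference (mystruct_list::*at_fn)(mystruct_list::size_type) const;

BOOST_PYTHON_MODULE(example)
{
    class_<mystruct, std::shared_ptr<mystruct>>("MyStruct", init<>())
        .def_readwrite("a", &mystruct::a);
    // add the rest of the member variables
    class_<mystruct_list>("MyStructList", init<>())
        .def("at", static_cast<at_fn>(&mystruct_list::at), return_value_policy<copy_const_reference>());
    // add the rest of the member functions
}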
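The sharing that this wrapper relies on can be sketched in plain C++, independent of any binding library. This is only an illustration under assumptions: make_list, share_item and demo_sharing are hypothetical helper names, not part of the original code. It shows that copying a shared_ptr out of the list adds an owner but does not copy the mystruct.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct mystruct
{
    int a, b, c, d, e, f, g, h, i, j, k, l, m;
};
typedef std::vector<std::shared_ptr<mystruct>> mystruct_list;

// Hypothetical helper: build a list holding n zero-initialized structs.
mystruct_list make_list(std::size_t n)
{
    mystruct_list list;
    for (std::size_t i = 0; i < n; ++i)
        list.push_back(std::make_shared<mystruct>());
    return list;
}

// Hypothetical helper: return a second owner of element `index`.
// Only the shared_ptr is copied; the mystruct itself is not.
std::shared_ptr<mystruct> share_item(const mystruct_list& list, std::size_t index)
{
    return list.at(index);
}

// Hypothetical demo: a write through the copy is visible through the list.
void demo_sharing()
{
    mystruct_list list = make_list(2);
    std::shared_ptr<mystruct> item = share_item(list, 0);
    assert(item.get() == list.at(0).get()); // same underlying object
    assert(item.use_count() == 2);          // two owners, one mystruct
    item->a = 42;
    assert(list.at(0)->a == 42);
}
```

This is the behaviour the question wants from the Cython wrapper as well: the Python-side object should hold one more shared_ptr to the same mystruct.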
Cython
In Cython, I have no idea how to extract an item from mystruct_list without copying the underlying data, nor how I could initialize MyStruct from the existing shared_ptr<mystruct> without copying all the data over in one form or another.
from libcpp.memory cimport shared_ptr
from libcpp.vector cimport vector
from cython.operator cimport dereference

cdef extern from "mystruct.h" nogil:
    cdef cppclass mystruct:
        int a, b, c, d, e, f, g, h, i, j, k, l, m
    ctypedef vector[shared_ptr[mystruct]] mystruct_list

cdef class MyStruct:
    cdef shared_ptr[mystruct] ptr

    def __cinit__(MyStruct self):
        self.ptr.reset(new mystruct())

    property a:
        def __get__(MyStruct self):
            return dereference(self.ptr).a
        def __set__(MyStruct self, int value):
            dereference(self.ptr).a = value

cdef class MyStructList:
    cdef mystruct_list c
    cdef mystruct_list.iterator it

    def __cinit__(MyStructList self):
        pass

    def __getitem__(MyStructList self, int index):
        # How do I return a MyStruct without copying the underlying `mystruct`?
        pass
I see many possible workarounds, and none of them are very satisfactory:
I could initialize an empty MyStruct and, in Cython, assign over its shared_ptr. However, this would waste an initialized struct for absolutely no reason.
    cdef MyStruct value = MyStruct()
    value.ptr = self.c.at(index)
    return value
I could also copy the data from the existing mystruct to the new mystruct. However, this suffers from similar bloat.
    cdef MyStruct value = MyStruct()
    dereference(value.ptr).a = dereference(self.c.at(index)).a
    return value
I could also expose an init=True flag on each __cinit__ method, which would skip constructing the C++ object when init is False (because one already exists). However, this could cause catastrophic issues: the flag would be exposed to the Python API and would allow dereferencing a null or uninitialized pointer.
    def __cinit__(MyStruct self, bint init=True):
        if init:
            self.ptr.reset(new mystruct())
I could also overload __init__ as the Python-exposed constructor (which would reset self.ptr), but this would compromise memory safety if __new__ were used from the Python layer.
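What these workarounds are reaching for can be pictured with a plain C++ analogue, sketched below. It is not Cython and makes no claim about what Cython supports; MyStructHandle, create, wrap, get, owners and demo_wrap are hypothetical names chosen for illustration. The sketch separates a public "allocate a new struct" path from an internal "adopt an existing shared_ptr" path, and checks for an empty pointer before every access.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

struct mystruct
{
    int a, b, c, d, e, f, g, h, i, j, k, l, m;
};

// Hypothetical C++ analogue of the MyStruct wrapper.
class MyStructHandle
{
public:
    // Public path: allocate a fresh, zero-initialized mystruct.
    static MyStructHandle create()
    {
        return MyStructHandle(std::make_shared<mystruct>());
    }

    // Internal path: adopt an existing pointer without allocating anything.
    static MyStructHandle wrap(std::shared_ptr<mystruct> existing)
    {
        return MyStructHandle(std::move(existing));
    }

    // Checked access: an empty handle throws instead of dereferencing null.
    mystruct& get() const
    {
        if (!ptr_)
            throw std::runtime_error("MyStructHandle is empty");
        return *ptr_;
    }

    // Number of shared_ptr owners of the wrapped struct.
    long owners() const { return ptr_.use_count(); }

private:
    explicit MyStructHandle(std::shared_ptr<mystruct> p) : ptr_(std::move(p)) {}
    std::shared_ptr<mystruct> ptr_;
};

// Hypothetical demo: wrapping shares the struct rather than copying it.
void demo_wrap()
{
    std::shared_ptr<mystruct> original = std::make_shared<mystruct>();
    MyStructHandle handle = MyStructHandle::wrap(original);
    assert(&handle.get() == original.get()); // same object, no copy
    assert(handle.owners() == 2);            // original + handle
}
```

The design choice is that no throwaway mystruct is ever allocated on the adopt path, and a handle that ends up empty fails loudly on access rather than dereferencing a null pointer.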
Bottom-Line
I would love to use Cython, for compilation speed, syntactical sugar, and numerous other reasons, as opposed to the fairly clunky boost::python. I'm looking at pybind11 right now, and it may solve the compilation speed issues, but I would still prefer to use Cython.
Is there any way I can do such a simple task idiomatically in Cython? Thanks.
[...] return dereference(self.c.at(index).get()) work? I.e. retrieve the shared_ptr from the vector, get() the stored pointer and dereference it. Or maybe simply return dereference(self.c.at(index)) (in C++ you can dereference the shared pointer directly). – Ceramist

[...] mystruct instead of a MyStruct. I guess you would need a second constructor def __cinit__(MyStruct self, new_ptr): self.ptr.reset(new_ptr) and then do return MyStruct(self.c.at(index)). – Ceramist

[...] def (unlike a cdef), and initialization functions cannot be cdef-only. If Cython let me define custom constructors with cdef, that would solve everything. Unfortunately, it does not. It's probably doable via the Python C-API, or by overloading __init__, but the docs pretty clearly state the object should be valid when __init__ is called, and __init__ may not be called at all. cython.readthedocs.io/en/latest/src/userguide/… – Pomiculture

[...] __cinit__ plus return MyStruct.__new__(self.c.at(index)) could work. – Ceramist

[...] __new__ was used from the Python layer« You are raising your standards to an unreasonable and ridiculous level. If somebody calls __new__ on the Python level they better know what they are doing. If you want memory safety just rewrite your whole code in Python. – Ceramist

[...] Cannot convert 'type' to Python object. B). Expecting memory safety from choices made in a memory-safe language is not a trivial concern. It's essential. – Pomiculture

[...], which highlights how it prevents initialization without required data (in this case, it needs to know the type). If I force the type with c = a.MyStruct.__new__(MyStruct), and then try to use c, it automatically checks that the struct is invalid before I access any member functions. That is useful memory safety. – Pomiculture

[...] cdef struct, but not with cdef cppclass so maybe I should change the title? Either way, it does not work unless I do manual memory management, since shared_ptr is clearly a cppclass. Either way, this seems to be a major design flaw that I don't see an obvious solution to.... – Pomiculture

[...] __init__ override, I can give you the answer. – Pomiculture