How to create python C++ extension with submodule that can be imported
Asked Answered
J

5

5

I'm creating a C++ extension for python. It creates a module parent that contains a sub-module child. The child has one method hello(). It works fine if I call it as

import parent
parent.child.hello()
> 'Hi, World!'

If I try to import my function it fails

import parent
from parent.child import hello
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ModuleNotFoundError: No module named 'parent.child'; 'parent' is not a package

parent.child
> <module 'child'>

here is my code setup.py

from setuptools import Extension, setup
  
# Define the extension module
extension_mod = Extension('parent',
                          sources=['custom.cc'])

# Define the setup parameters
setup(name='parent',
      version='1.0',
      description='A C++ extension module for Python.',
      ext_modules=[extension_mod],
      )

and my custom.cc

#include <Python.h>
#include <string>

std::string hello() {
    return "Hi, World!";
}

static PyObject* hello_world(PyObject* self, PyObject* args) {
    return PyUnicode_FromString(hello().c_str());
}

static PyMethodDef ParentMethods[] = {
    {nullptr, nullptr, 0, nullptr}
};

static PyMethodDef ChildMethods[] = {
    {"hello", hello_world, METH_NOARGS, ""},
    {nullptr, nullptr, 0, nullptr}
};

static PyModuleDef ChildModule = {
    PyModuleDef_HEAD_INIT,
    "child",
    "A submodule of the parent module.",
    -1,
    ChildMethods,
    nullptr,
    nullptr,
    nullptr,
    nullptr

};

static PyModuleDef ParentModule = {
    PyModuleDef_HEAD_INIT,
    "parent",
    "A C++ extension module for Python.",
    -1,
    ParentMethods,
    nullptr,
    nullptr,
    nullptr,
    nullptr
};

PyMODINIT_FUNC PyInit_parent(void) {
    PyObject* parent_module = PyModule_Create(&ParentModule);
    if (!parent_module) {
        return nullptr;
    }
    PyObject* child_module = PyModule_Create(&ChildModule);
    if (!child_module) {
        Py_DECREF(parent_module);
        return nullptr;
    }

    PyModule_AddObject(parent_module, "child", child_module);

    return parent_module;
}

I install and build with python setup.py build install.

So, how do I make sure that my parent is a package?

My code is a toy example but I actually want both modules defined on C++ level. I don't want to split them into several modules - since they are sharing some C++ code.

I'm hoping for something similar to approach of this answer Python extension with multiple modules

Jawbone answered 10/5, 2023 at 20:41 Comment(4)
Did you have a chance to take a look at nanobind or friend? They make creating C++ extensions considerably easier.Lithopone
Is there a reason you are not using pybind11 or Boost.Python?Doubt
@Doubt yes, I was trying to reorganize some legacy code without drastic changes. pybind11 is definitely nice especially since it's headers only.Jawbone
Maybe not the answer you are looking for, but if you wrap your C++ in plain C, you can use ctypes (docs.python.org/3/library/ctypes.html) to create a python module that wraps your C wrapper. Using that you can structure the python wrapper to access the underlying code however you want including what you describe above.Cincinnati
A
1

Doing this from within the extension is a "simple" matter of emulating the behavior for modules that the import system recognizes as packages. (Depending on context, it might be nicer to provide an import hook that did the same thing from the outside.) Just a few changes are needed:

  1. Make the name of the child "parent.child".
  2. Make parent a package:
    PyModule_AddObject(parent_module, "__path__", PyList_New(0));
    PyModule_AddStringConstant(parent_module, "__package__", "parent");
    
  3. Make child a member of that package:
    PyModule_AddStringConstant(child_module, "__package__", "parent");
    
  4. Update sys.modules as Python would if it had performed the import:
    PyDict_SetItemString(PyImport_GetModuleDict(), "parent.child", child_module);
    

Of course, several of these calls can fail; note that if PyModule_AddObject fails you have to drop the reference to the object being added (which I very much did not do here for clarity).

Amery answered 22/5, 2023 at 2:12 Comment(3)
I think this is what I’m looking for, can’t test right now, hope it works.Jawbone
@Jawbone This is interesting, but really hacky to achieve a simple and common task. I wonder why you are fighting against the Pythonic way of building & publishing your package. If you want to use standard tools like poetry, pip, conda to release your package to PiPy and Anaconda, simply follow the convention of the python community. Is it really so hard to add the __init__.py file and set the correct option in your setup.py / setup.cfg / pyproject.toml?Bandylegged
This method is really useful when, for some reasons, the package (containing sub-modules) must end up as a single binary file in the user's site-package folder.Bandylegged
B
2

The trick is to use packages option.

This is the standard way, as many popular C/C++ python packages like tensorflow, pyarrow meinheld and greenlet are using it. Also it is recommended by the Python Packaging Authority.

The only important points are adding the packages keyword option to setup and naming your module as "parent.child":

# Define the extension module
extension_mod = Extension('parent.child', ... )

setup(
  name='parent', 
  packages=["parent"], 
  ext_modules=[extension_mod], ... 
)

That's it. No need to hack around. Also note that ext_modules is not always required, because you can build the C/C++ assets of sub modules separately. See the setup scripts of tensorflow and pyarrow for examples.

It is important to add __init__.py under the "parent" folder, because , otherwise, it would not be treated as a package.

when you say packages = ['foo'] in your setup script, you are promising that the Distutils will find a file foo/__init__.py relative to the directory where your setup script lives.

If your parent package is located in another directory, use package_dir = {'parent': 'path/to/your/parent_package'}. For huge packages like tensorflow, it is also common to use the helper function find_packages to automatically update the packages' list.

Later, when your parent package grows bigger, sub-modules can be split into separated projects. New child project can refer to the parent using ext_package=['parent'].

Bandylegged answered 22/5, 2023 at 2:50 Comment(2)
What if I have several children?Jawbone
@Jawbone Simply append them to ext_modules=[Extension('parent.child1', ...) , Extension('parent.child2', ... ), ...].Bandylegged
L
1

Option1

If you just want to make it a package. Go to Path\to\Python\Python311\Lib\site-packages\<package_name>. By looking at your source code package_name might look like this: parent-1.0-py3.11-win-amd64.egg. Rename the <package_name> to parent and then create a __init__.py file there and it's done.

Results in:

└── site-packages\
    └── parent\
        ├── __init__.py
        └── parent.py

But, will it do what you want? I don't think so

#The line below works
from parent.parent import child
child.hello()
#The line below will still not work as parent.parent is a module
from parent.parent.child import hello

The second import doesn't work because the python file in this package is only parent.py.

Option2

To do from parent.child import hello you have to make an individual module child which contains function hello then follow the package making process explained in Option 1.

So your result after compilation looks like this:

└── site-packages\
    └── parent\
        ├── __init__.py
        └── child.py

Option 3

Or if don't want to restructure and want a better way to import, use:

from parent import child
child.hello()
#OR
hello = child.hello
hello()

Option 4 (Last Option)

My code is a toy example but I actually want both modules defined on C++ level. I don't want to split them into several modules - since they are sharing some C++ code.

This is more of a manual way
setup.py

from setuptools import Extension, setup
  
# Define the extension module
extension_mod = Extension('main',
                          sources=['custom.cc'])

# Define the setup parameters
setup(name='parent',
      version='1.0',
      description='A C++ extension module for Python.',
      ext_modules=[extension_mod],
      )

Create a structure like this:

└── site-packages\
    └── parent\
        ├── __init__.py
        ├── main.py
        └── child.py

init.py

from . import main

child.py

from . import main
hello = main.child.hello

Now you can do this:

>>> from parent.child import hello
>>> hello()
'Hi, World!'

I personally recommend the third option if there are no strict requirements but if you can't restructure and need the exact structure you may try implementing the last one.

Lock answered 15/5, 2023 at 12:15 Comment(4)
Check out the last part I added. This is the best I can think of.Lock
it has multiple files, but they are all compiled into one shared library. Number of c++ files is irrelevant (they can be merged in one file). Their trick was nice because in the end I have one shred library and several top level modules in itJawbone
Why not use Option 4? You can just add a few lines to .py files, without needing to tweak your .cc codeLock
I can use Option 4 it still looks like a workaround. I would need to re-import everything on Python level. I was hoping that I was missing some trick that would’ve exposed the submodules hierarchy “stored” in the shared library.Jawbone
B
1

I'm not sure if this is the standard way or not but works the same way you want. please correct me if it's not correct.

I deleted the parts related to parent in your cpp code and made child.cpp. This is the structure of the package dir. For setup.py I got the idea from this link

|package
|- parent/
|- - __init__.py
|- - child/
|- - - child.cpp
|- setup.py

This is child.cpp:

#include <Python.h>
#include <string>

std::string hello() {
    return "Hi, World!";
}

static PyObject* hello_world(PyObject* self, PyObject* args) {
    return PyUnicode_FromString(hello().c_str());
}

static PyMethodDef ChildMethods[] = {
    {"hello", hello_world, METH_NOARGS, ""},
    {nullptr, nullptr, 0, nullptr}
};

static PyModuleDef ChildModule = {
    PyModuleDef_HEAD_INIT,
    "child",
    "A submodule of the parent module.",
    -1,
    ChildMethods,
    nullptr,
    nullptr,
    nullptr,
    nullptr

};


PyMODINIT_FUNC PyInit_child(void) {
    PyObject* child_module = PyModule_Create(&ChildModule);
    if (!child_module) {
        return nullptr;
    }

    return child_module;
}

parent/__init__.py is:

from .child import *

And changed setup.py to this:

from setuptools import Extension, setup
  
# Define the extension module
extension_mod = Extension('parent.child',
                          sources=['parent/child/child.cpp'])

# Define the setup parameters
setup(name='parent',
      packages=["parent"],
      version='1.0',
      description='A C++ extension module for Python.',
      ext_modules=[extension_mod],
      )

You can also check this link

Braeunig answered 15/5, 2023 at 12:21 Comment(3)
It is important to note that setuptools will be deprecated soon. PyBind11 is preferrableBaines
@Baines will setuptools be deprecated? or just command line usage of setup.py? I can achieve the same result with pip install .Jawbone
I'm not sure pybind11 and setuptools do the same thing as each other? And is it not distutils that's deprecated?Pelias
C
1

I'll have to admit that I (unsuccessfully) tried a few things (including adding packages=["parent"], in setup.py (which is deprecated BTW: [SO]: 'setup.py install is deprecated' warning shows up every time I open a terminal in VSCode)) before thinking of this:

  1. Take the behavior from (referenced) [Python.Docs]: importlib - Importing a source file directly

  2. Replicate it, at least the relevant parts (like adding the module(s) in [Python.Docs]: sys.modules - thanks @DavisHerring for the shorter variant) in PyInit_parent

    Notes:

    • Although it works, I have a feeling that this is not the most orthodox way, so some might see it as a workaround (gainarie) - there is a chance that in some cases it might behave unexpectedly

    • The @TODOs are not necessarily required (functionally, it works without them), but I placed them there in order to be "by the book" (and they are part of the things I tried earlier). There's the possibility that some other module constants will be required

    • Functions computing namespace names are designed for a nesting level of 1 (maximum). If you need higher ones (e.g. parent.child.grandchild) they have to be adapted (maybe to work with strings organized in tree structures), anyway you've got the gist

I modified your code to a working example.

custom.cc:

#include <string>
#include <vector>

#include <Python.h>


using std::string;
using std::vector;

typedef vector<string> StrVec;
typedef vector<PyObject*> PPOVec;

static const string dotStr = ".";
static const string parentStr = "parent";
static const string childStr = "child";


static string join(const string &outer, const string &inner, const string &sep = dotStr)
{
    return outer + sep + inner;
}


static StrVec join(const string &outer, const StrVec &inners, const string &sep = dotStr)
{
    StrVec ret;
    for (const string &inner : inners) {
        ret.emplace_back(join(outer, inner, sep));
    }
    return ret;
}


static int tweakModules(const string &pkg, const StrVec &names, const PPOVec &mods)
{
    PyObject *dict = PyImport_GetModuleDict();
    if (!dict) {
        return -1;
    }
    int idx = 0;
    for (StrVec::const_iterator it = names.cbegin(); it != names.cend(); ++it, ++idx) {
        if (PyDict_SetItemString(dict, join(pkg, *it).c_str(), mods[idx])) {
            return -2;
        }
    }
    return 0;
}


static void decrefMods(const PPOVec &mods)
{
    for (const PyObject *mod : mods) {
        Py_XDECREF(mod);
    }
}


static std::string hello()
{
    return "Hi, World!";
}

static PyObject* py_hello(PyObject *self, PyObject *args)
{
    return PyUnicode_FromString(hello().c_str());
}


static PyMethodDef parentMethods[] = {
    {nullptr, nullptr, 0, nullptr}
};

static PyMethodDef childMethods[] = {
    {"hello", py_hello, METH_NOARGS, ""},
    {nullptr, nullptr, 0, nullptr}
};


static string childFQ = join(parentStr, childStr);

static PyModuleDef childDef = {
    PyModuleDef_HEAD_INIT,
    //childStr.c_str(),
    childFQ.c_str(),  // @TODO - cfati: Module FQName (rigorousity's sake)
    "A submodule of the parent module.",
    -1,
    childMethods,
    nullptr,
    nullptr,
    nullptr,
    nullptr
};


static PyModuleDef parentDef = {
    PyModuleDef_HEAD_INIT,
    parentStr.c_str(),
    "A C++ extension module for Python.",
    -1,
    parentMethods,
    nullptr,
    nullptr,
    nullptr,
    nullptr
};


PyMODINIT_FUNC PyInit_parent()
{
    PyObject *parentMod = PyModule_Create(&parentDef);
    if (!parentMod) {
        return nullptr;
    }
    PyObject *childMod = PyModule_Create(&childDef);
    if (!childMod) {
        Py_XDECREF(parentMod);
        return nullptr;
    }

    // @TODO - cfati: Add modules and their names here (to be handled automatically)
    StrVec childrenStrs = {childStr};
    PPOVec mods = {childMod};

    if (tweakModules(parentStr, childrenStrs, mods) < 0) {
        decrefMods(mods);
        Py_XDECREF(parentMod);
        return nullptr;
    }

    int idx = 0;
    for (StrVec::const_iterator it = childrenStrs.cbegin(); it != childrenStrs.cend(); ++it, ++idx) {
        if (PyModule_AddObject(parentMod, it->c_str(), mods[idx]) < 0) {  // @TODO - cfati: Add modules under parent
            decrefMods(mods);
            Py_XDECREF(parentMod);
            return nullptr;
        }
    }

    mods.push_back(parentMod);
    for (PyObject *mod : mods) {
        if (PyModule_AddStringConstant(mod, "__package__", parentStr.c_str()) < 0) {  // @TODO - cfati: Set __package__ for each module
            decrefMods(mods);
            return nullptr;
        }
    }
    return parentMod;
}

Output:

(py_pc064_03.10_test0) [cfati@cfati-5510-0:/mnt/e/Work/Dev/StackExchange/StackOverflow/q076222409]> ~/sopr.sh
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[064bit prompt]>
[064bit prompt]> ls
custom.cc  setup.py
[064bit prompt]>
[064bit prompt]> python setup.py build
running build
running build_ext
building 'parent' extension
creating build
creating build/temp.linux-x86_64-cpython-310
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC -I/home/cfati/Work/Dev/VEnvs/py_pc064_03.10_test0/include -I/usr/include/python3.10 -c custom.cc -o build/temp.linux-x86_64-cpython-310/custom.o
custom.cc:24:15: warning: ‘StrVec join(const string&, const StrVec&, const string&)’ defined but not used [-Wunused-function]
   25 | static StrVec join(const string &outer, const StrVec &inners, const string &sep = dotStr)
      |               ^~~~
creating build/lib.linux-x86_64-cpython-310
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -g -fwrapv -O2 build/temp.linux-x86_64-cpython-310/custom.o -L/usr/lib/x86_64-linux-gnu -o build/lib.linux-x86_64-cpython-310/parent.cpython-310-x86_64-linux-gnu.so
[064bit prompt]>
[064bit prompt]> ls
build  custom.cc  setup.py
[064bit prompt]> ls build/lib.linux-x86_64-cpython-310/
parent.cpython-310-x86_64-linux-gnu.so
[064bit prompt]>
[064bit prompt]> PYTHONPATH=${PYTHONPATH}:build/lib.linux-x86_64-cpython-310 python -c "from parent.child import hello;print(hello());print(\"Done.\n\")"
Hi, World!
Done.

References (that might be useful):

Colligate answered 20/5, 2023 at 10:23 Comment(0)
A
1

Doing this from within the extension is a "simple" matter of emulating the behavior for modules that the import system recognizes as packages. (Depending on context, it might be nicer to provide an import hook that did the same thing from the outside.) Just a few changes are needed:

  1. Make the name of the child "parent.child".
  2. Make parent a package:
    PyModule_AddObject(parent_module, "__path__", PyList_New(0));
    PyModule_AddStringConstant(parent_module, "__package__", "parent");
    
  3. Make child a member of that package:
    PyModule_AddStringConstant(child_module, "__package__", "parent");
    
  4. Update sys.modules as Python would if it had performed the import:
    PyDict_SetItemString(PyImport_GetModuleDict(), "parent.child", child_module);
    

Of course, several of these calls can fail; note that if PyModule_AddObject fails you have to drop the reference to the object being added (which I very much did not do here for clarity).

Amery answered 22/5, 2023 at 2:12 Comment(3)
I think this is what I’m looking for, can’t test right now, hope it works.Jawbone
@Jawbone This is interesting, but really hacky to achieve a simple and common task. I wonder why you are fighting against the Pythonic way of building & publishing your package. If you want to use standard tools like poetry, pip, conda to release your package to PiPy and Anaconda, simply follow the convention of the python community. Is it really so hard to add the __init__.py file and set the correct option in your setup.py / setup.cfg / pyproject.toml?Bandylegged
This method is really useful when, for some reasons, the package (containing sub-modules) must end up as a single binary file in the user's site-package folder.Bandylegged

© 2022 - 2024 — McMap. All rights reserved.