How to efficiently build a Python dictionary in C++

For performance reasons I want to port parts of my Python program to C++, so I am trying to write a simple extension for it. The C++ part will build a dictionary, which then needs to be delivered to the Python program.

One way I found is to build a dict-like object in C++, e.g. a boost::unordered_map, and then translate it to Python using Py_BuildValue[1], which can produce Python dicts. But this approach, which as far as I understand involves converting the container into a string representation and back, seems too roundabout to be the most performant solution.

So my question is: what is the most performant way to build a Python dictionary in C++? I saw that Boost has a Python library that supports mapping containers between C++ and Python, but so far I haven't found exactly what I need in its documentation. If there is a way, I would prefer to build a Python dict directly in C++, so that no copying is needed. But if another approach is faster, I'm fine with that too.

Here is the (simplified) C++ code I compile into a .dll/.pyd:

#include <iostream>
#include <string>
#include <Python.h>
#include "boost/unordered_map.hpp"
#include "boost/foreach.hpp"

extern "C"{
typedef boost::unordered_map<std::string, int> hashmap;

static PyObject*
_rint(PyObject* self, PyObject* args)
{
    hashmap my_hashmap; // DO I NEED THIS?
    my_hashmap["a"] = 1; // CAN I RATHER INSERT TO PYTHON DICT DIRECTLY??
    BOOST_FOREACH(hashmap::value_type i, my_hashmap) {
            // INSERT ELEMENT TO PYTHON DICT
    }
    // return PYTHON DICT
}

static PyMethodDef TestMethods[] = {
    {"rint", _rint, METH_VARARGS, ""},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
inittest(void)
{
    Py_InitModule("test", TestMethods);
}

} // extern "C"

I want to use this from Python like so:

import test
new_dict = test.rint()

The dictionary will map strings to integers. Thanks for any help!

Servitude answered 8/12, 2011 at 17:4 Comment(0)
  • Use the CPython API directly:
    PyObject *d = PyDict_New();
    for (...) {
      PyDict_SetItem(d, key, val);  // does not steal the references to key and val
    }
    return d;
  • Or write a Python object that emulates a dict by overriding __setitem__ and __getitem__. In both methods, use your original hashmap, so the container is never copied.
Mistakable answered 8/12, 2011 at 17:12 Comment(1)
One small problem with implementing a type in C that defines __getitem__ is that you will create Python objects each time you access the items, whereas with PyDict_New/PyDict_SetItem you create them once. If values are strings, data has to be copied, so the choice may depend on how the mapping is then used from Python. – Campobello
