I have a C function that mallocs() and populates a 2D array of floats. It "returns" that address and the size of the array. The signature is
int get_array_c(float** addr, int* nrows, int* ncols);
I want to call it from Python, so I use ctypes.
import ctypes
mylib = ctypes.cdll.LoadLibrary('mylib.so')
get_array_c = mylib.get_array_c
I never figured out how to specify argument types with ctypes. I tend to just write a python wrapper for each C function I'm using, and make sure I get the types right in the wrapper. The array of floats is a matrix in column-major order, and I'd like to get it as a numpy.ndarray. But its pretty big, so I want to use the memory allocated by the C function, not copy it. (I just found this PyBuffer_FromMemory stuff in this StackOverflow answer: https://mcmap.net/q/21546/-getting-data-from-ctypes-array-into-numpy)
buffer_from_memory = ctypes.pythonapi.PyBuffer_FromMemory
buffer_from_memory.restype = ctypes.py_object
import numpy
def get_array_py():
nrows = ctypes.c_int()
ncols = ctypes.c_int()
addr_ptr = ctypes.POINTER(ctypes.c_float)()
get_array_c(ctypes.byref(addr_ptr), ctypes.byref(nrows), ctypes.byref(ncols))
buf = buffer_from_memory(addr_ptr, 4 * nrows * ncols)
return numpy.ndarray((nrows, ncols), dtype=numpy.float32, order='F',
buffer=buf)
This seems to give me an array with the right values. But I'm pretty sure it's a memory leak.
>>> a = get_array_py()
>>> a.flags.owndata
False
The array doesn't own the memory. Fair enough; by default, when the array is created from a buffer, it shouldn't. But in this case it should. When the numpy array is deleted, I'd really like python to free the buffer memory for me. It seems like if I could force owndata to True, that should do it, but owndata isn't settable.
Unsatisfactory solutions:
Make the caller of get_array_py() responsible for freeing the memory. That's super annoying; the caller should be able to treat this numpy array just like any other numpy array.
Copy the original array into a new numpy array (with its own, separate memory) in get_array_py, delete the first array, and free the memory inside get_array_py(). Return the copy instead of the original array. This is annoying because it's an ought-to-be unnecessary memory copy.
Is there a way to do what I want? I can't modify the C function itself, although I could add another C function to the library if that's helpful.