Cython - converting list of strings to char **

How can I convert a Python list of Python strings to a NULL-terminated char** so I can pass it to an external C function?

I have:

struct saferun_task:
    saferun_jail   *jail
    saferun_limits *limits

    char **argv
    int stdin_fd  
    int stdout_fd
    int stderr_fd

int saferun_run(saferun_inst *inst, saferun_task *task, saferun_stat *stat)

in a cdef extern block.

I want to convert something like ('./a.out', 'param1', 'param2') to something that I can assign to saferun_task.argv.

How?

Dungaree answered 12/3, 2012 at 10:13 Comment(2)
Check this: groups.google.com/forum/?fromgroups#!searchin/cython-users/char**/cython-users/ldtOV1QwITA/bxL1AtiALkwJ – Necrotomy
Possible duplicate of Fast string array - Cython – Chromaticness

From the Python 2 C API docs:

char* PyString_AsString (PyObject *string)

Returns a null-terminated representation of the contents of string. The pointer refers to the internal buffer of string, not a copy. The data must not be modified in any way. It must not be de-allocated.

I don't have a Cython compiler set up and handy at the moment (I can run this later and check), but this should result in code that looks something like:

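from libc.stdlib cimport malloc, free
from cpython.string cimport PyString_AsString  # Python 2 only

cdef Py_ssize_t i
cdef Py_ssize_t n = len(pystr_list)
# One extra slot for the terminating NULL pointer
cdef char **string_buf = <char **>malloc((n + 1) * sizeof(char*))
if string_buf is NULL:
    raise MemoryError()

for i in range(n):
    string_buf[i] = PyString_AsString(pystr_list[i])
string_buf[n] = NULL

# Do stuff with string_buf as a char**
# ...

free(string_buf)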

The pointer string_buf is now a char ** that points at your original data without copying any strings. Do not modify the data through it: each entry should be treated as a const char* (from docs), and the pointers stay valid only while the Python string objects in pystr_list are alive. If you need to manipulate the strings, you will have to memcpy the data or make new objects which you don't care about trashing in Python, though since you have a tuple of strings I doubt you are editing them.
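
On Python 3, where PyString_AsString is not available, the same idea might look roughly like the sketch below. It assumes UTF-8 is an acceptable encoding for the arguments and that none of them contains an embedded zero byte. The names to_cstring_array and run_with_argv are hypothetical helpers for illustration, not part of Cython or of the saferun library. I have not compiled this, so treat it as a starting point.

```cython
from libc.stdlib cimport malloc, free

cdef char **to_cstring_array(list encoded) except NULL:
    # Build a NULL-terminated char** that borrows the buffers of the
    # bytes objects in `encoded` (no string data is copied).
    cdef Py_ssize_t i
    cdef Py_ssize_t n = len(encoded)
    cdef bytes item
    # One extra slot for the terminating NULL pointer
    cdef char **arr = <char **>malloc((n + 1) * sizeof(char *))
    if arr is NULL:
        raise MemoryError()
    try:
        for i in range(n):
            item = encoded[i]   # raises TypeError if not a bytes object
            arr[i] = item       # pointer into the bytes object's buffer
    except:
        free(arr)
        raise
    arr[n] = NULL
    return arr

def run_with_argv(args):
    cdef char **argv
    # Keep the encoded bytes objects referenced until the C call returns
    encoded = [a.encode('utf-8') for a in args]
    argv = to_cstring_array(encoded)
    try:
        pass  # e.g. assign argv to task.argv and call the C function here
    finally:
        free(argv)
```

Only the pointer array is allocated and freed here. The character data still belongs to the bytes objects in encoded, so that list has to outlive every use of argv on the C side.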

Chickpea answered 9/7, 2012 at 21:6 Comment(1)
PyString_AsString is Python 2 only, so this solution will not work on Python 3. – Viperish

Python is free to keep the internal representation of a string in any format it likes. Hence you have to convert your strings to bytes first, for example with .encode('utf-8') or any other encoding.

Once you have bytes, you can convert them to a pointer simply by assigning the bytes object to a char * variable. Inside Cython code, just do:

s = 'abc'
b = s.encode('utf-8') + b'\x00'
cdef const char * ptr = b

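Notice that in the code above I appended b'\x00' to the bytes, because the bytes content itself does not include a zero byte at the end and C/C++ needs that zero byte when accepting a char * string. (In CPython the internal buffer of a bytes object is always followed by a zero byte anyway, so the explicit terminator is redundant but harmless.)

Similarly, if C/C++ code returns a char *, you can easily convert it back to a string as follows: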

cdef const char * ptr = .... # This pointer is filled-in by C code
b = <bytes>ptr
s = b.decode('utf-8') # Now it contains string

In the code above, notice the conversion from char * to bytes through <bytes>ptr. Cython converts a char * to bytes by scanning for the first zero byte and copying everything before it, so the resulting bytes object does not contain the zero byte.
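
One caveat with this conversion is a NULL pointer: casting it to bytes would typically crash rather than raise an exception. A small guard could look like the sketch below, where c_string_to_str is a hypothetical helper name and UTF-8 is assumed as the encoding used by the C side:

```cython
cdef object c_string_to_str(const char *ptr):
    # Map a NULL pointer to None instead of letting the cast crash
    if ptr is NULL:
        return None
    # <bytes>ptr copies everything before the first zero byte
    return (<bytes>ptr).decode('utf-8')
```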

Now you can also create a char ** array to pass to C/C++, as in the following code. I'm assuming you're compiling a 64-bit binary (with 64-bit pointers):

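# Imports
import numpy as np
from libc.stdint cimport uint64_t

# some_c_func is a placeholder for your C function; it must be declared
# in a cdef extern block and marked nogil to be callable below.

# Cython func
def cython_func():
    cdef Py_ssize_t i
    cdef bytes b
    cdef char *p
    ss = ['ab', 'cde', 'f']
    # bs keeps the bytes objects alive while C code uses the pointers
    bs = [e.encode('utf-8') + b'\x00' for e in ss]
    # The extra zero-filled slot acts as the terminating NULL pointer
    a = np.zeros(len(bs) + 1, dtype = np.uint64)
    cdef uint64_t[:] ca = a
    for i in range(len(bs)):
        b = bs[i]  # typed local, so taking a char * from it is safe
        p = b
        ca[i] = <uint64_t>p

    cdef char ** final_ptr = <char **>&ca[0]

    with nogil:
        some_c_func(final_ptr)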
Dunfermline answered 21/10, 2021 at 11:38 Comment(0)
