Python is free to keep the internal representation of a string in any non-standard format. Hence you have to convert your strings to bytes first, for example with .encode('utf-8')
or any other encoding.
Once you have bytes, you can convert them to a pointer simply by assigning them to a char * variable. Inside Cython code just do:
s = 'abc'
b = s.encode('utf-8') + b'\x00'
cdef const char * ptr = b
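Notice that in the code above I appended b'\x00' to the bytes, because the bytes content itself does not include a zero byte at the end, while C/C++ needs that terminator when it accepts a char * string. (CPython does keep a zero byte after the data of every bytes object, so the explicit append is a defensive measure rather than a strict requirement.)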
Similarly, if the C/C++ code returns a char *, you can convert it back to a string as follows:
cdef const char * ptr = .... # This pointer is filled-in by C code
b = <bytes>ptr
s = b.decode('utf-8') # Now s contains the string
In the code above, notice the conversion from char * to bytes through <bytes>ptr. Cython casts char * to bytes by searching for the first zero byte and copying everything before it, so the resulting bytes do not contain the zero byte.
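The zero-byte behaviour can also be checked from plain Python with the standard ctypes module. This is only a stand-in for the Cython cast, not Cython itself: ctypes.c_char_p, like <bytes>ptr, reads a C string up to the first zero byte. The sample values below are made up for illustration.

```python
import ctypes

s = 'abc'
b = s.encode('utf-8') + b'\x00'   # explicit terminator, as in the Cython snippet

# c_char_p treats the bytes buffer as a C string (char *)
ptr = ctypes.c_char_p(b)

# .value reads up to the first zero byte and returns a new bytes object
back = ptr.value
roundtrip = back.decode('utf-8')

# Anything after an embedded zero byte is invisible to C-string semantics
embedded = ctypes.c_char_p(b'ab\x00cd').value
```

Here back holds the three bytes of 'abc' without the terminator, and embedded stops at the first zero byte, which is why data containing zero bytes cannot be passed as a plain char * string.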
Now you can also create an array of char * pointers (a char **) to pass to C/C++, as in the following code. I'm assuming you're compiling a 64-bit binary (with 64-bit pointers):
# Imports
import numpy as np
cimport numpy as np
cimport cython
from libc.stdint cimport *

# Cython func
def cython_func():
    ss = ['ab', 'cde', 'f']
    # bs must stay alive for as long as C code uses the pointers stored in a
    bs = [e.encode('utf-8') + b'\x00' for e in ss]
    # Each 64-bit slot of the array holds one char * address
    a = np.zeros(len(bs), dtype = np.uint64)
    for i in range(len(bs)):
        a[i] = <uint64_t>(<char *>bs[i])
    cdef uint64_t[:] ca = a
    cdef char ** final_ptr = <char **>&ca[0]
    with nogil:
        some_c_func(final_ptr)
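For comparison, here is a minimal sketch of the same pointer-array idea in plain Python using ctypes instead of Cython and NumPy. It is a different technique from the code above, shown only because it can be run without compiling anything. No C function is called here; the array is simply read back through the char ** view.

```python
import ctypes

ss = ['ab', 'cde', 'f']
# Keep bs alive while the pointers are in use, same as in the Cython version
bs = [e.encode('utf-8') + b'\x00' for e in ss]

# Array type "char *[len(bs)]"; each element points at one bytes buffer
arr = (ctypes.c_char_p * len(bs))(*bs)

# View the array as char **, the type a C function would typically accept
final_ptr = ctypes.cast(arr, ctypes.POINTER(ctypes.c_char_p))

# Reading through the char ** stops each string at its zero byte
read_back = [final_ptr[i].decode('utf-8') for i in range(len(bs))]

# Pointer size in bytes; 8 on a 64-bit build
ptr_size = ctypes.sizeof(ctypes.c_char_p)
```

With ctypes the pointer width is handled by c_char_p itself, so the 64-bit assumption made in the Cython version is not needed here.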