TypeError: argument of type 'char const *'

Asked 3/1, 2014 at 14:20 Answered 17/11, 2016 at 21:0

I'm currently working with Freeswitch and its event socket library (through the mod event socket). For instance:

import logging

from ESL import ESLconnection

cmd = 'uuid_kill %s' % active_call # active_call comes from a Django db and is unicode
con = ESLconnection(config.HOST, config.PORT, config.PWD)
if con.connected():
    e = con.api(str(cmd))
else:
    logging.error('Couldn\'t connect to Freeswitch Mod Event Socket')

As you can see, I had to explicitly cast con.api()'s argument with str(). Without that, the call ends up in the following stack trace:

Traceback (most recent call last):
  [...]
    e = con.api(cmd)
  File "/usr/lib64/python2.7/site-packages/ESL.py", line 87, in api
    def api(*args): return apply(_ESL.ESLconnection_api, args)
TypeError: in method 'ESLconnection_api', argument 2 of type 'char const *'

I don't understand this TypeError: what does it mean? cmd contains a string, so why does casting it with str(cmd) fix it?
Could it be related to Freeswitch's Python API, which is generated through SWIG?

Centuple answered 3/1, 2014 at 14:20 Comment(6)

What is the type of active_call? If it is unicode, then cmd will be a Unicode string, and str(cmd) will convert it to string. You can insert an import pdb; pdb.set_trace() line before the con.api call, and inspect cmd. – Manara 3/1, 2014 at 14:24

active_call is most probably unicode, indeed (comes out of a Django database), but there is nothing in this error message that makes me think about unicode: do you think it might be related ? – Centuple 3/1, 2014 at 14:35

It's certainly related: the API is likely exported through an interface-building framework that only knows how to convert Python string objects to char const * (by calling PyString_AsString on the object). Python 2 Unicode strings do not consist of C chars — they consist of wchar_t, which means they cannot be trivially "cast" to const char *, they need to be converted into a new buffer, which must be allocated, etc. By pre-converting the Unicode string to string in Python, you perform the hard part before the object ever reaches C. – Manara 3/1, 2014 at 16:16

Mhmmm, that would be consistent with some of the FS's python files, mentioning SWIG (swig.org), which describes itself as a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. There still are some things I don't know about, like what's a char const *, a C char or a wchar_t, but imo your comment definitely deserves to be an answer. Please feel free to provide more tech details if you are in the mood ;) (Also I'll add details to my question) – Centuple 3/1, 2014 at 16:50

I wasn't aware you weren't aware of C fundamentals. :) I've now added an answer that explains this in some detail. – Manara 3/1, 2014 at 19:48

Thanks a lot for the great details and this excellent reminder of C fundamentals (yeah, I forgot most of it over time, not proud about that...) Sad I can't upvote more than once ! – Centuple 4/1, 2014 at 2:26

Short answer: cmd likely contains a Unicode string, which cannot be trivially converted to a const char *. The error message might come from a wrapper framework that automates writing Python bindings for C libraries, such as SWIG or ctypes. The framework knows what to do with a byte string, but punts on Unicode strings. Passing str(cmd) helps because it converts the Unicode string to a byte string, from which a const char * value expected by C code can be trivially extracted.

Long answer:

The C type char const *, more customarily spelled const char *, can be read as "pointer to a read-only array of char", char being C's way to spell "byte". When a C function accepts a const char *, it usually expects a "C string", i.e. an array of char values terminated with a null character. Conveniently, Python 2 byte strings (type str) are internally stored as C strings plus some additional information such as the type, the reference count, and the length of the string. Storing the length means it can be retrieved in O(1) time, and it also allows the string to contain null characters itself.
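To make the difference between the two views concrete, here is a minimal sketch using ctypes from the standard library. ctypes is only my choice for illustration; nothing suggests the ESL bindings use it. It shows that a Python byte string carries its own length and may contain null bytes, while a C-style read of the same memory stops at the first null:

```python
import ctypes

data = b"ab\x00cd"  # byte string with an embedded null byte

# Python stores the length alongside the bytes, so len() counts all five.
python_length = len(data)

# create_string_buffer copies the bytes into a C char array and appends a
# trailing null. .value reads the array the way C would: up to the first null.
buf = ctypes.create_string_buffer(data)
c_view = buf.value

print(python_length)
print(c_view)
```

Here c_view holds only the bytes before the embedded null, which is also all that a C function such as strlen would ever see.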

Unicode strings in Python 2 are represented as arrays of Py_UNICODE, which are either 16 or 32 bits wide, depending on the operating system and build-time flags. Such an array cannot be passed to code that expects an array of 8-bit chars — it needs to be converted, typically to a temporary buffer, and this buffer must be freed when no longer needed.
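As a rough illustration of why such an array cannot simply be handed to C as a char array, the sketch below uses a UTF-16-LE encoding as a stand-in for the in-memory layout of a 16-bit Py_UNICODE build. That stand-in is an assumption made for demonstration; the real buffer is not reachable from pure Python this way.

```python
text = u"uuid_kill"

narrow = text.encode("utf-8")    # one byte per ASCII character
wide = text.encode("utf-16-le")  # two bytes per character for this text

# What C code scanning for a terminating null byte would pick up if it
# were handed the wide buffer as though it were an array of char.
seen_by_c = wide.split(b"\x00", 1)[0]

print(len(narrow), len(wide))
print(seen_by_c)
```

Every ASCII character in the wide form is followed by a zero byte, so code that expects a null-terminated array of 8-bit chars would stop right after the first character.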

For example, a simple-minded (and quite unnecessary) wrapper for the C function strlen could look like this:

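#include <Python.h>
#include <string.h>

/* Named py_strlen so it does not clash with the C library's strlen. */
PyObject *py_strlen(PyObject *ignore, PyObject *obj)
{
  if (!PyString_Check(obj)) {
    PyErr_Format(PyExc_TypeError, "string expected, got %s", Py_TYPE(obj)->tp_name);
    return NULL;
  }
  /* Borrow the internal null-terminated buffer; no copy is made. */
  const char *c_string = PyString_AsString(obj);
  size_t len = strlen(c_string);
  return PyInt_FromLong((long) len);
}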

The code simply calls PyString_AsString to retrieve the internal C string stored by every Python string and expected by strlen. For this code to also support Unicode objects (provided it even makes sense to call strlen on Unicode objects), it must handle them explicitly:

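/* Named py_strlen so it does not clash with the C library's strlen. */
PyObject *py_strlen(PyObject *ignore, PyObject *obj)
{
  const char *c_string;
  PyObject *tmp = NULL;  /* owns the encoded copy for Unicode input */
  if (PyString_Check(obj))
    c_string = PyString_AsString(obj);
  else if (PyUnicode_Check(obj)) {
    /* Encode into a new byte string object; NULL means encoding failed. */
    if (!(tmp = PyUnicode_AsUTF8String(obj)))
      return NULL;
    c_string = PyString_AsString(tmp);
  }
  else {
    PyErr_Format(PyExc_TypeError, "string or unicode expected, got %s",
                 Py_TYPE(obj)->tp_name);
    return NULL;
  }
  size_t len = strlen(c_string);
  Py_XDECREF(tmp);  /* release the temporary, if one was created */
  return PyInt_FromLong((long) len);
}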

Note the additional complexity, not only in lines of boilerplate code, but in the different code paths that require different management of a temporary object that holds the byte representation of the Unicode string. Also note that the code had to decide on an encoding when converting a Unicode string to a byte string. UTF-8 is guaranteed to be able to encode any Unicode string, but passing a UTF-8-encoded sequence to a function expecting a C string might not make sense for some uses. In Python 2, the str function uses the default encoding, normally ASCII, to encode the Unicode string, so if the Unicode string actually contained any non-ASCII characters, you would get a UnicodeEncodeError.
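On the Python side, the same encoding decision can be made explicitly before the value ever reaches the wrapper. The following is a minimal sketch; to_byte_string is a hypothetical helper of my own, not part of ESL or SWIG, and the sample values are made up.

```python
def to_byte_string(value, encoding="utf-8"):
    """Return value as a byte string, encoding text explicitly.

    Hypothetical helper for illustration; not part of ESL or SWIG.
    """
    if isinstance(value, bytes):
        return value  # already a byte string (a plain str in Python 2)
    return value.encode(encoding)


ascii_only = u"uuid_kill 1234"
accented = u"caf\xe9"

print(to_byte_string(ascii_only))
print(to_byte_string(accented))

# The ASCII codec cannot represent the accented character.
try:
    to_byte_string(accented, encoding="ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)
```

Under Python 2, as in the question, the call would then presumably become con.api(to_byte_string(cmd)). Whether UTF-8 is the right encoding depends on what the receiving C code expects, so treat that default as an assumption.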

There have been requests to include this functionality in SWIG, but it is unclear from the linked report if they made it in.

Manara answered 3/1, 2014 at 19:47 Comment(0)

I had a similar problem, and I solved it by encoding the command explicitly: cmd = ('uuid_kill %s' % active_call).encode('utf-8')

Electorate answered 17/11, 2016 at 21:0 Comment(0)
