What is visible in an executable built with Cython, in case non-compiled Python code is executed?

import bottle, random, json

app = bottle.Bottle()

@bottle.route('/')
def index():
    return 'hello'

@bottle.route('/random')
def testrand():
    return str(random.randint(0, 100))

@bottle.route('/jsontest')
def testjson():
    x = json.loads('{ "1": "2" }')
    return 'done'

bottle.run()

static const char __pyx_k_1_2[] = "{ \"1\": \"2\" }";
static const char __pyx_k_json[] = "json";
static const char __pyx_k_main[] = "__main__";
static const char __pyx_k_name[] = "__name__";
static const char __pyx_k_test[] = "__test__";
static const char __pyx_k_loads[] = "loads";
static const char __pyx_k_import[] = "__import__";
static const char __pyx_k_cline_in_traceback[] = "cline_in_traceback";

In general, you won't be able to avoid having those strings in the resulting executable. This is just how Python works: the strings are needed at run time.

Consider a simple C program:


void do_nothing(){...}

int main(){
  do_nothing();
  return 0;
}

Suppose we compile it and link it statically. Once the linker is done, the call to do_nothing (let's assume it is neither inlined nor optimized out) is just a jump to a memory address. The name of the function is no longer needed and can be stripped from the resulting executable.

Python works differently: there is no linker, and calls at run time do not go through raw memory addresses. Instead, the Python machinery looks up the functionality by the name of the package/module and of the function, so these names must be available at run time.
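To make the name-based lookup concrete, here is a minimal sketch in plain Python. It uses importlib.import_module and getattr to mimic what happens for `import json` followed by `json.loads(...)`; the variable names are chosen only for illustration and do not come from the Cython-generated code.

```python
import importlib

# Python resolves modules and attributes by name at run time,
# so the strings "json" and "loads" have to exist somewhere in the program.
module_name = "json"
function_name = "loads"

module = importlib.import_module(module_name)  # lookup by module name
loads = getattr(module, function_name)         # lookup by attribute name

result = loads('{ "1": "2" }')
print(result)  # {'1': '2'}
```

Without the two strings there would be nothing to look up, which is why the corresponding constants end up in the compiled executable.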

However, if you are willing to modify the generated C file, you could make the life of the "hacker" somewhat harder.

When a string is needed for calling Python functionality, Cython generates code like the following (e.g. for import json):

static const char __pyx_k_json[] = "json";

static PyObject *__pyx_n_s_json;

static __Pyx_StringTabEntry __pyx_string_tab[] = {
  ...
  {&__pyx_n_s_json, __pyx_k_json, sizeof(__pyx_k_json), 0, 0, 1, 1},
  ...
  {0, 0, 0, 0, 0, 0, 0}
};

static CYTHON_SMALL_CODE int __Pyx_InitGlobals(void) {
  if (__Pyx_InitStrings(__pyx_string_tab) < 0) __PYX_ERR(0, 1, __pyx_L1_error);
...
}
...
__pyx_t_1 = __Pyx_Import(__pyx_n_s_json, 0, 0); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 1, __pyx_L1_error)

One could therefore store "json" as "irnm" (every character shifted by -1) and restore the real name at run time, before __Pyx_InitStrings is called in __Pyx_InitGlobals.

Simply dumping the strings from the executable would then yield only meaningless character combinations. One could even go further and load the real names from somewhere else after the program has started, if that is worth the trouble.
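The character shift itself can be sketched in a few lines of Python. This only illustrates the encoding; in the approach described above, the decoding would have to be written in C inside the generated file, operating on a writable (non-const) buffer before the string table is initialized. The helper name shift is hypothetical.

```python
import importlib


def shift(text: str, offset: int) -> str:
    """Shift every character's code point by offset (hypothetical helper)."""
    return "".join(chr(ord(ch) + offset) for ch in text)


# What would be stored in the binary instead of the real name.
stored = shift("json", -1)
print(stored)  # irnm

# Restored at start-up, before the name is first used for a lookup.
real_name = shift(stored, +1)
module = importlib.import_module(real_name)
print(module.loads('{ "1": "2" }'))  # {'1': '2'}
```

A fixed shift like this is trivially reversible once someone notices it, so it only defeats a naive strings dump.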
