How hard is it to reverse engineer .pyd files?

2

22

After reading How do I protect Python code?, I decided to try a really simple extension module on Windows. I had compiled my own extension module on Linux before, but this was my first time compiling one on Windows. I was expecting to get a .dll file, but instead I got a .pyd file. The docs say the two are much the same, except that a .pyd must have an init[insert-module-name]() function.

Is it safe to assume that they are as hard to reverse engineer as .dll files? If not, how hard are they to reverse engineer on a scale from .pyc files to .dll files?

Dam answered 22/8, 2012 at 14:10 Comment(3)
If it says "Yes, .pyd files are dll’s," what's the point in asking if they are less hard to reverse engineer than dll files? That's still native code...Dielle
@MatteoItalia I am having a hard time understanding how different they actually are. For example, .pyc files are compiled code too, but they are easier to reverse-engineer than dll files.Dam
@yasar11732 .pyc files are not native code though.Jewfish
17

They are, as you already found out, equivalent to DLL files with a certain structure. In principle, they are equally hard to reverse-engineer: they are machine code, need very little metadata, and the code may have been optimized beyond recognition.
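As a rough illustration of "equivalent to DLL files", the first few bytes of a file already tell you which world you are in. The sketch below is only a header sniff, not a parser: the `sniff` helper and its return strings are made up for this example, and the only facts it relies on are that PE images (.dll, .exe, .pyd) start with `MZ`, ELF images start with `\x7fELF`, and a .pyc starts with the running interpreter's `importlib.util.MAGIC_NUMBER`.

```python
import importlib.util


def sniff(header: bytes) -> str:
    """Guess the container format from the leading bytes of a file.

    Hypothetical helper for illustration; it checks magic numbers only.
    """
    if header[:2] == b"MZ":
        # DOS/PE signature: shared by .dll, .exe and .pyd files on Windows.
        return "PE image (.dll, .exe, .pyd)"
    if header[:4] == b"\x7fELF":
        # ELF signature: what a compiled extension (.so) looks like on Linux.
        return "ELF image (.so)"
    if header[:4] == importlib.util.MAGIC_NUMBER:
        # Bytecode cache written by this exact CPython version.
        return "CPython bytecode (.pyc) for this interpreter"
    return "unknown"


# Synthetic headers stand in for real files here.
print(sniff(b"MZ\x90\x00"))
print(sniff(importlib.util.MAGIC_NUMBER + b"\x00" * 12))
```

To try it on a real file, pass `open(path, "rb").read(16)` instead of the synthetic bytes.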

However, the required structure, and the knowledge that many functions will be handling PyObject *s and other well-defined CPython types, may have some effect. It won't really help with mapping the assembly code back to C (if anything, that gets harder because of CPython-specific macros). Code that mostly interacts with Python types will look quite different from code manipulating C structs, and comparatively bloated. This may make it even harder to comprehend, or it may give away code that does nothing interesting, allowing a reverse engineer to skip over it and get to your trade secrets sooner.
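One concrete consequence of the required structure is that the names (and docstrings) of the functions a module exposes to Python are stored as plain strings, so they are visible to anyone who can import the file, even though the function bodies are not. A minimal sketch, assuming CPython and using the standard `math` module as a stand-in for your own compiled module (the `exported_callables` helper is invented for this example):

```python
import inspect
import math  # stand-in for a compiled module; implemented in C in CPython


def exported_callables(module):
    """Return the names of C-implemented callables a module exposes."""
    return sorted(
        name for name, obj in vars(module).items() if inspect.isbuiltin(obj)
    )


names = exported_callables(math)
print(names[:5])          # function names are recoverable without any tooling
print(math.sqrt.__doc__)  # so are docstrings, if the module defines them

# The implementation itself is not available as Python source.
try:
    inspect.getsource(math.sqrt)
    source_available = True
except (TypeError, OSError):
    source_available = False
print("source available:", source_available)
```

So if the exported names themselves give something away, pick neutral ones and leave the docstrings out; the machine code behind them still has to be disassembled the hard way.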

None of these concerns apply to pieces of code which are pure C code (i.e. do not interact with Python). And you probably have a lot of those. So it shouldn't make a significant difference in the end.

Jewfish answered 22/8, 2012 at 14:20 Comment(0)
2

They are basically native code. But because every function has a funny argument list, it might be harder to see what each function does. I would say they are as hard to reverse engineer as a DLL, if not harder.
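For contrast, here is roughly why the .pyc end of the scale is so much easier. A .pyc is a small header followed by a marshalled code object, and the standard library can load and disassemble that directly. This is a sketch, not a decompiler: the `secret` function and its constants are invented for the example, and the marshalled payload is built in memory instead of being read from a real .pyc file.

```python
import dis
import io
import marshal

# Hypothetical "proprietary" source, made up for this example.
SOURCE = "def secret(x):\n    return x * 31 + 7\n"

# Compiling and marshalling produces what a .pyc stores after its header.
code = compile(SOURCE, "secret.py", "exec")
payload = marshal.dumps(code)

# Anyone holding the payload can rebuild the code object...
recovered = marshal.loads(payload)

# ...find the function's code object among the module's constants...
func_code = next(c for c in recovered.co_consts if hasattr(c, "co_code"))

# ...and read names, arguments and instructions straight out of it.
buf = io.StringIO()
dis.dis(func_code, file=buf)
listing = buf.getvalue()

print(func_code.co_name, func_code.co_varnames)
print(listing)
```

Function names, argument names and constants all survive in the bytecode, which is what makes .pyc files so much more approachable than a stripped, optimized native binary.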

Germain answered 22/8, 2012 at 14:17 Comment(0)
