Decompiling a .pyd that contains frozen python objects
I am trying to figure out the best way to decompile a Python .pyd file, but everywhere I look I hit a dead end. There seems to be no program that does this except 'Antifreeze' by Aaron Portnoy and Ali Rizvi-Santiago, as demonstrated in 2008 here. However, the project has long since been lost and abandoned.

I spoke to one of the program's developers (Aaron Portnoy) on Twitter yesterday. Here is the conversation.

So my question is: how would I easily decompile a .pyd containing frozen Python objects?

OR

How would I modify one of the existing decompilers that handle .pyo and .pyc files so that it can decompile a .pyd, as Aaron pointed out? And which one would be the best to start from if this is what I end up doing?

OR

If you have Antifreeze or know where to get it, that would be a miracle. Even one of its developers doesn't know where to get it. I have searched for it for days with no luck.

Agonist answered 25/8, 2014 at 21:16 Comment(1)
Just some additional info about pyd-files: docs.python.org/2/faq/… (Irrupt)

I didn't see this until now. I should note, though, that we didn't "decompile" it, but just disassembled it. Fortunately, knowing the version, decompilation should be trivial because bytecode generation is mostly 1:1 unless it's optimized (-O parameter).

I'm pretty sure I have the lower-level components that compose this on an external hard drive. Although, I'm not sure about the really f*cking awesome UI (that wraps them) which Aaron wrote.

But, essentially, it consisted of scanning the .pyd for a table that was near (or in) an export. That table, which is stored within the .pyd, holds the marshalled Python code, and marshal.loads decodes each object in it back into a native Python object.

At the entry point of the .pyd, there's a copy of that table that looks like the following:

.text:1000102F 014 8B 15 30 20 00 10                       mov     edx, ds:PyImport_FrozenModules
.text:10001035 014 8B F8                                   mov     edi, eax
.text:10001037 014 B9 82 11 00 00                          mov     ecx, 1182h
.text:1000103C 014 BE 88 55 44 11                          mov     esi, offset off_11445588 ; "Pmw"
.text:10001041 014 F3 A5                                   rep movsd

From this, you can infer both the table's location and its size: rep movsd copies ecx dwords, so 0x1182 * 4 bytes starting at off_11445588. Each entry in the table contains a pointer to the marshalled Python, the size, and the naming information that you're looking for. To unmarshal it, you need the same version of Python, and you can just use marshal.loads.

.data:114456B4 28 EF 00 10                                 dd offset str.directcontrolsObserverWalker ; "direct.controls.ObserverWalker"
.data:114456B8 68 69 03 10                                 dd offset unk_10036968
.data:114456BC 9C 0B 00 00                                 dd 0B9Ch
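
As a rough sketch of what "scanning" and "decoding each object" could look like, here is a small Python 3 example. It is not the original Antifreeze code. It assumes a 32-bit build where each entry is three 4-byte fields (name pointer, code pointer, signed size, in the order shown in the dump above), an all-zero entry terminating the table, and a caller-supplied va_to_offset function that maps virtual addresses to file offsets. The helper names and the synthetic "image" built at the bottom are hypothetical stand-ins for a real .pyd.

```python
import marshal
import struct

# name pointer, code pointer, signed size (32-bit little-endian build assumed)
ENTRY = struct.Struct("<IIi")

# `rep movsd` copies ecx dwords: 0x1182 * 4 bytes, i.e. this many 12-byte slots
# (which may include the terminating all-zero slot).
SLOT_COUNT = 0x1182 * 4 // ENTRY.size


def read_cstring(image, offset):
    """Return the NUL-terminated ASCII string starting at a file offset."""
    end = image.index(b"\x00", offset)
    return image[offset:end].decode("ascii")


def iter_frozen(image, table_offset, va_to_offset):
    """Yield (name, is_package, marshalled_bytes) until the all-zero terminator."""
    pos = table_offset
    while True:
        name_va, code_va, size = ENTRY.unpack_from(image, pos)
        if name_va == 0:
            break
        start = va_to_offset(code_va)
        name = read_cstring(image, va_to_offset(name_va))
        # CPython marks frozen packages with a negative size.
        yield name, size < 0, image[start:start + abs(size)]
        pos += ENTRY.size


def build_demo_image(base=0x10000000):
    """Hypothetical stand-in for a .pyd: flat VA mapping, one frozen module."""
    name = b"demo.module\x00"
    code = marshal.dumps(compile("x = 1 + 1", "<frozen>", "exec"))
    name_off = 2 * ENTRY.size            # one real entry plus the terminator
    code_off = name_off + len(name)
    table = ENTRY.pack(base + name_off, base + code_off, len(code))
    table += ENTRY.pack(0, 0, 0)
    return table + name + code, base


image, base = build_demo_image()
modules = {
    name: marshal.loads(blob)            # must run under the matching Python version
    for name, _is_pkg, blob in iter_frozen(image, 0, lambda va: va - base)
}
print(SLOT_COUNT, list(modules))
```

For a real file, the table offset and the address mapping would come from the PE section headers rather than from the flat mapping used here, and marshal.loads would have to run under the same Python version that froze the modules.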

Anyway, once you have the objects, you can disassemble them using the dis.disassemble function from the dis module. I also still have the original assembler/disassembler in one of my projects on GitHub at github.com/arizvisa; just search for antifreeze.
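
A minimal sketch of the disassembly step, assuming Python 3 (where dis.disassemble accepts a file argument) and using a freshly compiled snippet as a stand-in for a blob pulled out of the table. The walk helper is hypothetical. It is there because dis.disassemble only handles the code object it is given, so nested functions and classes stored in co_consts have to be visited explicitly.

```python
import dis
import io
import marshal
import types

# Stand-in for a marshalled blob extracted from the frozen-module table.
source = "def greet(name):\n    return 'hi ' + name\n"
blob = marshal.dumps(compile(source, "<frozen>", "exec"))
code = marshal.loads(blob)


def walk(code_obj):
    """Yield a code object followed by every code object nested in its constants."""
    yield code_obj
    for const in code_obj.co_consts:
        if isinstance(const, types.CodeType):
            yield from walk(const)


buf = io.StringIO()
for obj in walk(code):
    buf.write("--- %s ---\n" % obj.co_name)
    dis.disassemble(obj, file=buf)       # one listing per code object
listing = buf.getvalue()
print(listing)
```

The opcode names in the listing depend on the interpreter version, which is another reason to run this under the same Python that produced the .pyd.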

To insert your objects back into the table, you'd use marshal.dumps and write the result back into the file, although you might need to shift your table and such if the new data doesn't fit where the old data was.
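
Here is a hedged sketch of the simplest write-back case, where the new marshalled object is no larger than the one it replaces, so nothing needs to shift. The patch_in_place helper, the offsets, and the toy "image" are all hypothetical. In a real .pyd you would also update the entry's size field, and anything larger than the old slot would need the relocation work mentioned above.

```python
import marshal


def patch_in_place(image, offset, old_size, new_code):
    """Overwrite one marshalled blob without moving anything else in the file."""
    new_blob = marshal.dumps(new_code)
    if len(new_blob) > old_size:
        raise ValueError("new blob is larger than the old slot; the table would need shifting")
    patched = bytearray(image)
    # Pad with zeros so the file length and all later offsets stay unchanged.
    patched[offset:offset + old_size] = new_blob.ljust(old_size, b"\x00")
    return bytes(patched), len(new_blob)


# Toy image: 4 bytes of "header", the original blob, 4 bytes of "trailer".
old_blob = marshal.dumps(
    compile("value = 'the original, rather long, string constant'", "<frozen>", "exec")
)
image = b"HEAD" + old_blob + b"TAIL"

new_code = compile("value = 2", "<frozen>", "exec")
patched, new_size = patch_in_place(image, 4, len(old_blob), new_code)
print(len(image), len(patched), new_size)
```

The returned new_size is what the entry's size field would be set to. Leaving the old size in place should also load, since marshal.loads ignores extra trailing bytes.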

Also, Python has changed significantly since back then, so some things just aren't relevant anymore.

Eyrir answered 5/6, 2020 at 19:10 Comment(2)
Can you please explain further what you mean by "scanning" and "decode each object"? It would be great if you could provide a simple example, or the code you refer to in your GitHub, wherever that is. (Tirza)
Updated it w/ your suggestions. (Eyrir)
