Is there a way to make dis.dis() print code objects recursively, pre-Python 3.7?

I've been using the dis module to observe CPython bytecode. But lately, I've noticed some inconvenient behavior of dis.dis().

Take this example: I first define a function, multiplier, with a nested function, inner, inside it:

>>> def multiplier(n):
    def inner(multiplicand):
        return multiplicand * n
    return inner

>>> 

I then use dis.dis() to disassemble it:

>>> from dis import dis
>>> dis(multiplier)
  2           0 LOAD_CLOSURE             0 (n)
              3 BUILD_TUPLE              1
              6 LOAD_CONST               1 (<code object inner at 0x7ff6a31d84b0, file "<pyshell#12>", line 2>)
              9 LOAD_CONST               2 ('multiplier.<locals>.inner')
             12 MAKE_CLOSURE             0
             15 STORE_FAST               1 (inner)

  4          18 LOAD_FAST                1 (inner)
             21 RETURN_VALUE
>>>

As you can see, it disassembled the top-level code object fine. However, it did not disassemble inner. It simply showed that it created a code object named inner and displayed the default (uninformative) __repr__() for code objects.

Is there a way I can make dis.dis() print the code objects recursively? That is, if I have nested code objects, it should print the bytecode for all of them rather than stopping at the top-level code object. I'd mainly like this feature for things such as decorators, closures, comprehensions, and generator expressions.

It appears that the latest version of Python, 3.7 alpha 1, has exactly the behavior I want from dis.dis():

>>> def func(a): 
    def ifunc(b): 
        return b + 10 
    return ifunc 

>>> dis(func)
  2           0 LOAD_CONST               1 (<code object ifunc at 0x7f199855ac90, file "python", line 2>)
              2 LOAD_CONST               2 ('func.<locals>.ifunc')
              4 MAKE_FUNCTION            0
              6 STORE_FAST               1 (ifunc)

  4           8 LOAD_FAST                1 (ifunc)
             10 RETURN_VALUE

Disassembly of <code object ifunc at 0x7f199855ac90, file "python", line 2>:
  3           0 LOAD_FAST                0 (b)
              2 LOAD_CONST               1 (10)
              4 BINARY_ADD
              6 RETURN_VALUE 

The What’s New In Python 3.7 article makes note of this:

The dis() function now is able to disassemble nested code objects (the code of comprehensions, generator expressions and nested functions, and the code used for building nested classes). (Contributed by Serhiy Storchaka in bpo-11822.)

However, besides Python 3.7 not being formally released yet, what if you don't want to, or cannot, use Python 3.7? Are there ways to accomplish this in earlier versions of Python, such as 3.5 or 2.7, using the old dis.dis()?

Shuck answered 3/7, 2017 at 4:18

First off, if you need this for anything other than interactive use, I would recommend just copying the code from the Python 3.7 sources and backporting it (hopefully that isn't difficult).

For interactive use, one idea is to use one of the ways of accessing an object by its memory address to grab the code object at the address printed in the dis output.

For example:

>>> def func(a):
...     def ifunc(b):
...         return b + 10
...     return ifunc
>>> import dis
>>> dis.dis(func)
  2           0 LOAD_CONST               1 (<code object ifunc at 0x10cabda50, file "<stdin>", line 2>)
              3 LOAD_CONST               2 ('func.<locals>.ifunc')
              6 MAKE_FUNCTION            0
              9 STORE_FAST               1 (ifunc)

  4          12 LOAD_FAST                1 (ifunc)
             15 RETURN_VALUE

Here I copy-paste the memory address of the code object printed above:

>>> import ctypes
>>> c = ctypes.cast(0x10cabda50, ctypes.py_object).value
>>> dis.dis(c)
  3           0 LOAD_FAST                0 (b)
              3 LOAD_CONST               1 (10)
              6 BINARY_ADD
              7 RETURN_VALUE

WARNING: the ctypes.cast line can segfault the interpreter if you pass it an address that no longer holds a live object (say, because the object has been garbage collected). Some of the other solutions from the above-referenced question may work better (I tried the gc one, but it didn't seem to be able to find code objects).
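
A safer variant of the same idea, as a minimal sketch: rather than casting a raw address, walk the co_consts of a function you still hold a reference to and pick the code object whose id() matches the printed address. This assumes CPython, where id() returns the object's memory address. find_code_by_address is a hypothetical helper name, not part of dis.

```python
import dis
import types

def find_code_by_address(root, address):
    """Return the code object nested in `root` whose id() equals `address`, or None.

    Hypothetical helper; relies on CPython's id() being the memory address.
    """
    stack = [root]
    while stack:
        code = stack.pop()
        if id(code) == address:
            return code
        # Nested code objects are stored among the constants of their parent.
        stack.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return None

def func(a):
    def ifunc(b):
        return b + 10
    return ifunc

# In an interactive session you would paste the address from the dis output.
# Here it is computed so that the example is self-contained.
address = id(next(c for c in func.__code__.co_consts
                  if isinstance(c, types.CodeType)))

inner = find_code_by_address(func.__code__, address)
dis.dis(inner)
```

Because the lookup only ever touches live objects, a stale or mistyped address yields None instead of a crash.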

This also means the approach won't work if you pass dis a string, because the internal code objects will already have been garbage collected by the time you try to access them. You need to either pass it a real Python object or, if you have a string, compile() it first and keep a reference to the result.
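
As a rough illustration of the compile() route (the source string and variable names here are made up for the example): holding the compiled module-level code object in a variable keeps its nested code objects alive, so their printed addresses stay valid.

```python
import dis

source = """
def func(a):
    def ifunc(b):
        return b + 10
    return ifunc
"""

# compile() returns the module-level code object. Keeping it in a variable
# keeps every nested code object alive for later inspection.
module_code = compile(source, "<string>", "exec")
dis.dis(module_code)
```

The nested code objects can then also be reached directly through module_code.co_consts, without going through memory addresses at all.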

Galligaskins answered 22/9, 2017 at 21:35

You could do something like this (Python 3):

import dis

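def recursive_dis(code):
    # Show which code object is being disassembled, then its bytecode.
    print(code)
    dis.dis(code)

    # Nested functions, lambdas, comprehensions and class bodies are stored
    # as code objects in co_consts; disassemble each one in turn.
    for obj in code.co_consts:
        if isinstance(obj, type(code)):
            print()
            recursive_dis(obj)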

https://repl.it/@solly_ucko/Recursive-dis

Note that you have to call it with f.__code__ instead of just f. For example:

def multiplier(n):
    def inner(multiplicand):
        return multiplicand * n
    return inner

recursive_dis(multiplier.__code__)
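
If passing __code__ by hand is inconvenient, a thin wrapper can unwrap the callable first. This is only a sketch for Python 3 versions before 3.7; as_code is a hypothetical helper name, and the handling of bound methods is an addition not covered above.

```python
import dis
import types

def as_code(obj):
    """Return the code object behind a function, bound method, or code object."""
    if isinstance(obj, types.CodeType):
        return obj
    if isinstance(obj, types.MethodType):
        obj = obj.__func__  # unwrap a bound method to its function
    if hasattr(obj, "__code__"):
        return obj.__code__
    raise TypeError("no code object found for %r" % (obj,))

def recursive_dis(obj):
    code = as_code(obj)
    print(code)
    dis.dis(code)
    # Recurse into code objects stored among the constants.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            print()
            recursive_dis(const)

def multiplier(n):
    def inner(multiplicand):
        return multiplicand * n
    return inner

recursive_dis(multiplier)
```

On Python 3.7 and later, dis.dis() already recurses into nested code objects, so this wrapper would print them twice there.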

Ovum answered 14/6, 2019 at 0:33
