Is there a limit to the size of a Python module?

It seems to me that the Python bytecode instruction POP_JUMP_IF_FALSE takes a 1-byte operand, telling it the instruction index to jump to.

Quoting some of the relevant CPython code from ceval.c (comment mine):

case TARGET(POP_JUMP_IF_FALSE): {
    PREDICTED(POP_JUMP_IF_FALSE);
    PyObject *cond = POP();
    int err;
    if (cond == Py_True) {
        Py_DECREF(cond);
        FAST_DISPATCH();
    }
    if (cond == Py_False) {
        Py_DECREF(cond);
        JUMPTO(oparg);  /* <--- this */
        FAST_DISPATCH();
    }

Does this mean a Python module cannot contain more than 255 bytecode instructions? What am I missing here?

Pu answered 20/4, 2019 at 19:1 Comment(4)
Idk about Python internals, maybe this is a short jump like in x86. - Troup
Bytecode operands are two bytes, not one, and there's an extension opcode that supplies an additional two bytes if needed, although I'm not sure if that works with every instruction. Anyway, the limit would be on the size of individual functions, NOT the module as a whole. - Goeselt
Answering the question in the title: yes, if you don't have enough memory. I once tried to run an autogenerated script (it was some JS (I mean... a buttload of JS) transpiled to Python), and I couldn't because the interpreter kept crashing. - Protectionism
@jasonharper: They've been one byte since the wordcode change in 3.6, but the docs haven't quite kept up. They still go up to four bytes with EXTENDED_ARG opcodes, though. - Mitchmitchael

Note: I am no expert in Python and definitely not in interpreting bytecode; this is just what I found after experimenting for a while.

Note: I am using Python 3.7.3. If you use a different version, you might get different disassembly output (credit goes to @dunes for pointing this out).

# module.py
x = 0
while True:
  if x == 0:
    continue

This produces the following instructions (via python3 -m dis module.py):

  1           0 LOAD_CONST               0 (0)
              2 STORE_NAME               0 (x)

  2           4 SETUP_LOOP              14 (to 20)

  3     >>    6 LOAD_NAME                0 (x)
              8 LOAD_CONST               0 (0)
             10 COMPARE_OP               2 (==)
             12 POP_JUMP_IF_FALSE        6

  4          14 JUMP_ABSOLUTE            6
             16 JUMP_ABSOLUTE            6
             18 POP_BLOCK
        >>   20 LOAD_CONST               1 (None)
             22 RETURN_VALUE
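
The same kind of listing can also be obtained from inside Python rather than from the command line. This is only a minimal sketch using the standard dis module; the variable names source and module_code are mine, and the opcode names and offsets it prints will differ between CPython versions.

```python
import dis

source = """\
x = 0
while True:
  if x == 0:
    continue
"""

# compile() only builds the code object; nothing is executed, so the
# infinite loop in the source is harmless here.
module_code = compile(source, "module.py", "exec")

# Each Instruction carries the byte offset, the opcode name and the
# decoded argument (None when the opcode takes no argument).
for instr in dis.get_instructions(module_code):
    print(instr.offset, instr.opname, instr.arg)
```

Calling dis.dis(module_code) instead of the loop prints the familiar column layout shown above.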

At offset 12 is the POP_JUMP_IF_FALSE instruction. After adding a whole bunch of code at the top of the file (I just repeated x = 0 many times):

271        1080 SETUP_LOOP              20 (to 1102)

272     >> 1082 LOAD_NAME                0 (x)
           1084 LOAD_CONST               0 (0)
           1086 COMPARE_OP               2 (==)
           1088 EXTENDED_ARG             4
           1090 POP_JUMP_IF_FALSE     1082

273        1092 EXTENDED_ARG             4
           1094 JUMP_ABSOLUTE         1082
           1096 EXTENDED_ARG             4
           1098 JUMP_ABSOLUTE         1082
           1100 POP_BLOCK
        >> 1102 LOAD_CONST               1 (None)
           1104 RETURN_VALUE

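The compiler added an EXTENDED_ARG instruction at offset 1088, which allows for a bigger operand. Its argument 4 supplies the high byte of the jump target, and the 1-byte operand of POP_JUMP_IF_FALSE supplies the low byte: (4 << 8) | 58 = 1082. So the operand is not limited to 255.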
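
To make the EXTENDED_ARG mechanism concrete, here is a rough sketch of how the prefix is folded into the following instruction's operand. It assumes the 2-byte wordcode layout of CPython 3.6 and later; decode_wordcode and STAND_IN_JUMP are names I made up for illustration, and newer CPython versions interleave extra cache entries in co_code that this sketch does not filter out. The second half uses 300 distinct names instead of a jump, because how jump targets are encoded has changed in later versions, while a name index above 255 needs the prefix regardless.

```python
import dis

EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]


def decode_wordcode(code_bytes):
    """Yield (offset, opcode, full_arg) for 2-byte instructions.

    Each EXTENDED_ARG prefix contributes one more high byte to the
    argument of the instruction that follows it.
    """
    extended = 0
    for offset in range(0, len(code_bytes), 2):
        opcode = code_bytes[offset]
        arg = code_bytes[offset + 1] | extended
        if opcode == EXTENDED_ARG:
            extended = arg << 8  # carry the high bits to the next instruction
        else:
            extended = 0
            yield offset, opcode, arg


# Hand-built bytes mirroring offsets 1088-1090 in the listing above.
# STAND_IN_JUMP is a hypothetical opcode number, not the real POP_JUMP_IF_FALSE.
STAND_IN_JUMP = (EXTENDED_ARG + 1) % 256
sample = bytes([EXTENDED_ARG, 4, STAND_IN_JUMP, 58])
decoded = list(decode_wordcode(sample))
print(decoded[0][2])  # (4 << 8) | 58 == 1082

# A generated module with 300 distinct names: name indexes above 255
# cannot fit in one byte, so the compiler has to emit EXTENDED_ARG.
source = "\n".join(f"x{i} = {i}" for i in range(300))
module_code = compile(source, "<generated>", "exec")
largest_arg = max(arg for _, _, arg in decode_wordcode(module_code.co_code))
print(largest_arg > 255)
```

Up to three EXTENDED_ARG prefixes can be chained, which matches the comment above about operands going up to four bytes.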

Alicaalicante answered 21/4, 2019 at 8:38 Comment(1)
I was wondering why I wasn't getting the same dis output as you. Until 3.5 the size of the arg for an opcode was two bytes. 3.6 introduced an optimisation to make the default size 1 byte and a way to increase the size of arg if needed (the EXTENDED_ARG opcode). This is the issue that introduced the change: bugs.python.org/issue27097, and the commit: github.com/python/cpython/commit/… - Hollandia
