What does "del sys.modules[module]" actually do?

As everyone knows, you can do del sys.modules[module] to delete an imported module. So I was thinking: how is this different from rebinding sys.modules to a new dictionary? An interesting fact is that rebinding sys.modules can't truly delete a module.

# a_module.py
print("a module imported")

Then

import sys

def func1():
    import a_module
    # del sys.modules['a_module']
    sys.modules = {
        k: v for k, v in sys.modules.items() if 'a_module' not in k}
    print('a_module' not in sys.modules)  # True

def func2():
    import a_module

func1()  # a module imported
func2()  # no output here

If I use del sys.modules['a_module'], calling func2() also prints a module imported, which implies that a_module is successfully deleted.

My question is: What does del sys.modules[module] actually do, besides changing the dictionary?

Horsewhip answered 3/4, 2017 at 9:48 Comment(0)

sys.modules is the Python-accessible reference to the canonical data structure for tracking what modules are imported. Removing a name from that dictionary means that any future attempts to import the module will result in Python loading the module from scratch.

In other words, whenever you use an import statement, the first thing Python does is check whether the dictionary that sys.modules references already has an entry for that module. If it does, Python proceeds straight to the next step (binding names in the current namespace) without loading the module again. If you delete entries from sys.modules, then Python won't find the already-loaded module and loads it again.
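The lookup described above can be observed with a throwaway module. This is only a minimal sketch: the module name a_module_demo and the helper import_demo are made up for illustration, and the module file is written to a temporary directory so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "a_module_demo"  # hypothetical module name, used only for this demo


def import_demo():
    # A plain import statement, so the normal sys.modules lookup applies.
    import a_module_demo
    return a_module_demo


with tempfile.TemporaryDirectory() as tmp:
    # Create a tiny module with a visible side effect on load.
    with open(os.path.join(tmp, MODULE_NAME + ".py"), "w") as f:
        f.write("print('a module imported')\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = import_demo()       # not in sys.modules yet: the module is loaded
        cached = import_demo()      # entry found in sys.modules: no second load
        del sys.modules[MODULE_NAME]
        reloaded = import_demo()    # entry is gone: the module is loaded from scratch
    finally:
        # Clean up so the demo leaves no trace in the running interpreter.
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print(cached is first)    # same module object, served from sys.modules
print(reloaded is first)  # a different module object after the del
```

Note that the old module object (first) is not destroyed by the del; it stays alive for as long as something still references it. Only the cache entry is removed.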

Note the careful wording about Python-accessible here. The actual dictionary lives on the Python heap and sys.modules is just one reference to it. You replaced that reference with another dictionary by assigning to sys.modules. However, the interpreter has more references to it; they just are not accessible from Python code, not without ctypes trickery to access the C-API anyway.

Note that this is explicitly documented:

However, replacing the dictionary will not necessarily work as expected and deleting essential items from the dictionary may cause Python to fail.

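From the C-API, you'd have to use PyThreadState_Get()->interp->modules to get the internal reference to the dictionary. This is a CPython implementation detail; the public PyImport_GetModuleDict() function returns the same dictionary.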
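The difference between deleting an entry and rebinding sys.modules can be sketched the same way. The example below is an illustration of CPython behaviour only, and other implementations may differ. The module name rebind_demo_module and the helper import_demo are hypothetical, and the original dictionary is restored in a finally block because leaving sys.modules rebound is exactly what the documentation warns against.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "rebind_demo_module"  # hypothetical module name, used only for this demo


def import_demo():
    # Use an import statement: on CPython it consults the interpreter's own
    # reference to the module dictionary, not the name sys.modules.
    import rebind_demo_module
    return rebind_demo_module


original_modules = sys.modules  # keep the real dictionary so it can be restored

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, MODULE_NAME + ".py"), "w") as f:
        f.write("print('a module imported')\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = import_demo()  # loads the module and prints once
        # Rebind sys.modules to a filtered copy, as in the question.
        sys.modules = {k: v for k, v in sys.modules.items() if k != MODULE_NAME}
        hidden = MODULE_NAME not in sys.modules  # the new dict has no entry
        # The interpreter still holds the original dict, which still has the
        # entry, so on CPython this import is expected not to reload the module.
        second = import_demo()
    finally:
        sys.modules = original_modules  # always restore the real dictionary
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print(hidden)
print(second is first)
```

On CPython the second import should return the object that was loaded first, which matches the "no output here" result in the question.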

Midbrain answered 3/4, 2017 at 9:50 Comment(5)
Thank you. Is this documented somewhere? I didn't know there are "more refs". - Horsewhip
@laike9m: a lot of this is implementation detail; I did a quick deep-dive in the source code. Other Python implementations are likely to handle interpreter state differently. - Midbrain
@laike9m: I note, for example, that the C-API documentation for PyInterpreterState makes no mention of any struct members, and the documentation for sys.modules explicitly warns against replacing the sys.modules reference. - Midbrain
What good does it do for the C-API to have separate references? The "obvious"/natural way for it to work would be for modules to be ref counted like everything else, so if you did del sys.modules[module_name] it would be completely unloaded as long as no other objects held references to it. - Afreet
Also possibly worth drawing a distinction between del sys.modules[foo] and reassigning sys.modules to a different dictionary instance -- the latter seems more likely to cause problems with references on the C side to me, but this is based on how I imagine it should work, not real reading of the sources. - Afreet