Is the import order of extensions in module filenames guaranteed in Python?

Experimentally, I verified that when a compiled extension.pyd (or .so) and a plain extension.py both exist in the same directory, the .pyd file is imported first; the .py file is only imported if the .pyd file is not found:

In [1]: import extension

In [2]: extension.__file__
Out[2]: 'extension.pyd'

In [3]: import glob; glob.glob("extension.py*")
Out[3]: ['extension.py', 'extension.pyd']

Is that guaranteed to be the same for all versions of Python, and can I rely on this to add logic to the .py file that is only executed when the .pyd file is not found?

Schaaf answered 6/5, 2019 at 12:6 Comment(0)

FWIW, I was not able to find a reference stating that extensions must be loaded before py-files, so it is probably safer to treat this as an implementation detail (unless somebody provides a reference), even though this detail has been stable for all versions at least back to 2.7.

When a module is imported, it is first looked up in the cache (i.e. sys.modules) and, if it is not there yet, the finders from sys.meta_path are used. Usually, sys.meta_path consists of BuiltinImporter, FrozenImporter and PathFinder, where PathFinder is responsible for finding modules on disk, i.e. on the Python path.
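To see which finders are active in a given interpreter, one can simply list sys.meta_path. This is a minimal sketch; the helper name meta_path_names is made up for illustration, and the result may contain additional entries if tools such as test runners or packaging hooks have registered their own finders.

```python
import sys


def meta_path_names():
    """Return a readable name for every finder currently on sys.meta_path."""
    # The stock finders are classes and have __name__; third-party finders are
    # often instances, so fall back to the name of their type.
    return [getattr(finder, "__name__", type(finder).__name__)
            for finder in sys.meta_path]


print(meta_path_names())
```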

PathFinder provides some caching to speed up the lookup, but it basically delegates the search to the hooks from sys.path_hooks; an overview can be found, for example, in PEP 302.

Usually, sys.path_hooks consists of zipimporter, which makes importing from zip archives possible, and a wrapped FileFinder, which is the workhorse of the whole import machinery.
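The hooks, and the finder they produce for an ordinary directory, can be inspected in the same way. The sketch below uses pkgutil.get_importer, which asks the hooks in sys.path_hooks for a finder for a given path entry; the helper names hook_names and finder_for are assumptions made for this example.

```python
import os
import pkgutil
import sys


def hook_names():
    """Return a readable name for every hook in sys.path_hooks."""
    return [getattr(hook, "__qualname__", repr(hook)) for hook in sys.path_hooks]


def finder_for(directory):
    """Return the finder that the path hooks produce for a path entry."""
    return pkgutil.get_importer(directory)


print(hook_names())
# For a plain directory this is normally a FileFinder instance.
print(type(finder_for(os.getcwd())).__name__)
```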

FileFinder tries out different suffixes (e.g. .so, .py, .pyc) in a given order, which is established by the _get_supported_file_loaders() function:

def _get_supported_file_loaders():
    """Returns a list of file-based module loaders.
    Each item is a tuple (loader, suffixes).
    """
    extensions = ExtensionFileLoader, _imp.extension_suffixes()
    source = SourceFileLoader, SOURCE_SUFFIXES
    bytecode = SourcelessFileLoader, BYTECODE_SUFFIXES
    return [extensions, source, bytecode]

As one can see:

  • extensions come before source-files (i.e. py-files)
  • source-files come before pyc-files
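The order can also be checked empirically on whatever interpreter is at hand, without building a real extension. The sketch below creates an empty .py file and an empty file carrying the first extension suffix in a temporary directory, then asks PathFinder which one it would pick. Only the spec is looked up; nothing is loaded, so the fake extension file never has to be valid. The function name default_winner and the module name "extension" are made up for this example.

```python
import importlib.machinery as machinery
import pathlib
import tempfile


def default_winner(module_name="extension"):
    """Return the file name the default import machinery would pick."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = pathlib.Path(tmp)
        # Empty placeholder files are enough, because find_spec() only looks
        # at file names and never opens or executes the candidates.
        (directory / (module_name + ".py")).write_text("")
        for suffix in machinery.EXTENSION_SUFFIXES[:1]:
            (directory / (module_name + suffix)).write_bytes(b"")
        spec = machinery.PathFinder.find_spec(module_name, [str(directory)])
        return pathlib.Path(spec.origin).name


print(default_winner())
```

This reports the behaviour of the interpreter it runs on; it does not turn that behaviour into a documented guarantee.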

Obviously, sys.meta_path as well as sys.path_hooks can be manipulated in a way that establishes an arbitrary order of load preferences.
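As an illustration of such a manipulation, FileFinder.path_hook() builds a hook from loader details given in any order. The sketch below lists the source loader first and shows that the .py file is then preferred. It deliberately does not install the hook globally; the names source_first_hook and source_first_winner are assumptions made for this example.

```python
import importlib.machinery as machinery
import pathlib
import tempfile

# A path hook whose finders try source files before extension modules.
source_first_hook = machinery.FileFinder.path_hook(
    (machinery.SourceFileLoader, machinery.SOURCE_SUFFIXES),
    (machinery.ExtensionFileLoader, machinery.EXTENSION_SUFFIXES),
)


def source_first_winner(module_name="extension"):
    """Return the file name picked by a finder built with the custom order."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = pathlib.Path(tmp)
        # Empty placeholders: only the spec is looked up, nothing is loaded.
        (directory / (module_name + ".py")).write_text("")
        for suffix in machinery.EXTENSION_SUFFIXES[:1]:
            (directory / (module_name + suffix)).write_bytes(b"")
        finder = source_first_hook(str(directory))
        spec = finder.find_spec(module_name)
        return pathlib.Path(spec.origin).name


print(source_first_winner())
```

To make such a hook effective for real imports, it would have to be inserted into sys.path_hooks ahead of the default FileFinder hook, and sys.path_importer_cache would have to be cleared so that already cached finders are rebuilt.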

As a personal note: I would try to avoid the situation where py- and so/pyd-files sit next to each other.
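One common way to avoid the clash is to give the compiled module a distinct name and fall back explicitly, so that the result no longer depends on the suffix order. This is only a sketch of that pattern: the module name _extension_native, the function add and the BACKEND flag are hypothetical and not taken from the question.

```python
try:
    # Hypothetical compiled module; its distinct name cannot clash with a .py file.
    from _extension_native import add
    BACKEND = "native"
except ImportError:
    BACKEND = "python"

    def add(a, b):
        """Pure-Python fallback used when the compiled module is unavailable."""
        return a + b


print(BACKEND, add(2, 3))
```

The fallback logic lives in the except branch, so it runs only when the compiled module cannot be imported, which is what the question wanted to achieve through the import order.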

Storms answered 13/5, 2019 at 21:16 Comment(1)
Thanks, that's a well-researched answer! I basically defaulted to the same idea (avoiding this situation) in the absence of a well-defined behavior in the standard. - Schaaf
