Python 3.6 project structure leads to RuntimeWarning

I'm trying to package up my project for distribution, but I'm hitting a RuntimeWarning when I run the module.

I've found a bug report on the Python mailing list which indicates that the RuntimeWarning is new behaviour that was introduced in Python 3.5.2.

Reading through the bug report, it appears that a double import happens, and this RuntimeWarning is correct in alerting the user. However, I don't see what changes I need to make to my own project structure to avoid this issue.

This is the first project that I have attempted to structure "correctly". I would like to have a tidy layout for when I push the code, and a project structure which can be cloned and run easily by others.

I have based my structure mainly on http://docs.python-guide.org/en/latest/writing/structure/.

I have added details of a minimum working example below.

To replicate the issue, I run the main file with python -m:

(py36) X:\test_proj>python -m proj.proj
C:\Users\Matthew\Anaconda\envs\py36\lib\runpy.py:125: RuntimeWarning: 
'proj.proj' found in sys.modules after import of package 'proj', but prior 
to execution of 'proj.proj'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
This is a test project.

Running my tests works fine:

(py36) X:\test_proj>python -m unittest tests.test_proj
This is a test project.
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

A project structure to replicate the issue is as follows:

myproject/
    proj/
        __init__.py
        proj.py
    tests/
        __init__.py
        context.py
        test_proj.py

In the file proj/proj.py:

def main():
    print('This is a test project.')
    raise ValueError

if __name__ == '__main__':
    main()

In proj/__init__.py:

from .proj import main

In tests/context.py:

import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
import proj

Finally, in tests/test_proj.py:

import unittest

from .context import proj


class SampleTestCase(unittest.TestCase):
    """Test case for this sample project"""
    def test_raise_error(self):
        """Test that we correctly raise an error."""
        with self.assertRaises(ValueError):
            proj.main()


if __name__ == '__main__':
    unittest.main()

Can anyone help me correct my project structure to avoid this double-import scenario? Any help with this would be greatly appreciated.

Barberabarberry answered 13/4, 2017 at 13:34 Comment(0)

For this particular case, the double import warning is due to this line in proj/__init__.py:

from .proj import main

What that line means is that by the time the -m switch implementation finishes the import proj step, proj.proj has already been imported as a side effect of importing the parent package.

Avoiding the warning

To avoid the warning, you need to find a way to ensure that importing the parent package doesn't implicitly import the module being executed with the -m switch.

The two main options for resolving that are:

  1. Drop the from .proj import main line (as @John Moutafis suggested), assuming that can be done without breaking API compatibility guarantees; or
  2. Delete the if __name__ == "__main__": block from the proj submodule and replace it with a separate proj/__main__.py file that just does:

    from .proj import main
    main()
    

If you go with option 2, then the command line invocation would also change to just be python -m proj, rather than referencing a submodule.
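
To see the effect of option 2 without touching a real project, here is a minimal sketch that builds a throwaway package and runs it both ways in a child interpreter. The package name demo_cli and the helper run_with_m are made up for this illustration; the layout simply mirrors the one in the question, with a proj/__main__.py added.

```python
import os
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Throwaway package (hypothetical name `demo_cli`) mirroring the question's layout.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_cli"
pkg.mkdir()
(pkg / "__init__.py").write_text("from .proj import main\n")
(pkg / "proj.py").write_text(textwrap.dedent("""\
    def main():
        print('This is a test project.')

    if __name__ == '__main__':
        main()
"""))
(pkg / "__main__.py").write_text("from .proj import main\nmain()\n")

# Make the package importable in the child interpreters and make sure
# no inherited warning filter hides the RuntimeWarning.
env = dict(os.environ, PYTHONPATH=str(root))
env.pop("PYTHONWARNINGS", None)


def run_with_m(target):
    """Run `python -m <target>` in a child process and return its stderr."""
    result = subprocess.run(
        [sys.executable, "-m", target],
        env=env, cwd=str(root), capture_output=True, text=True,
    )
    return result.stderr


submodule_stderr = run_with_m("demo_cli.proj")  # the invocation from the question
package_stderr = run_with_m("demo_cli")         # executes demo_cli/__main__.py

print("warning with -m demo_cli.proj:", "RuntimeWarning" in submodule_stderr)
print("warning with -m demo_cli:", "RuntimeWarning" in package_stderr)
```

On Python 3.5.2 or later, the first invocation should report the warning and the second should not, because demo_cli/__main__.py is never imported as a side effect of importing the package.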

A more backwards compatible variant of option 2 is to add __main__.py without deleting the CLI block from the current submodule, and that can be an especially good approach when combined with DeprecationWarning:

if __name__ == "__main__":
    import warnings
    warnings.warn("use 'python -m proj', not 'python -m proj.proj'", DeprecationWarning)
    main()

If proj/__main__.py is already being used for some other purpose, then you can also do things like replacing python -m proj.proj with python -m proj.proj_cli, where proj/proj_cli.py looks like:

if __name__ != "__main__":
    raise RuntimeError("Only for use with the -m switch, not as a Python API")
from .proj import main
main()

Why does the warning exist?

This warning gets emitted when the -m switch implementation is about to go and run an already imported module's code again in the __main__ module, which means you will have two distinct copies of everything it defines - classes, functions, containers, etc.

Depending on the specifics of the application, this may work fine (which is why it's a warning rather than an error), or it may lead to bizarre behaviour like module level state modifications not being shared as expected, or even exceptions not being caught because the exception handler was trying to catch the exception type from one instance of the module, while the exception raised used the type from the other instance.

Hence the vague "this may result in unpredictable behaviour" warning - if things do go wrong as a result of running the module's top level code twice, the symptoms may be pretty much anything.
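
The "two distinct copies" problem can be reproduced in a few lines. This is only a sketch: the package demo_twice and its AppError class are invented for the example, and runpy.run_module with run_name="__main__" is used as a stand-in for what the -m switch does.

```python
import importlib
import runpy
import sys
import tempfile
import warnings
from pathlib import Path

# Throwaway package (hypothetical name `demo_twice`) with one exception class.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_twice"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "mod.py").write_text("class AppError(Exception):\n    pass\n")
sys.path.insert(0, str(root))

# First copy: a normal import registers demo_twice.mod in sys.modules.
imported = importlib.import_module("demo_twice.mod")

# Second copy: execute the same source again as __main__, roughly what -m does.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    main_globals = runpy.run_module("demo_twice.mod", run_name="__main__")

# The two AppError classes come from separate executions of the module body.
same_class = imported.AppError is main_globals["AppError"]

# A handler written against one copy does not catch the other copy's exception.
try:
    raise main_globals["AppError"]("raised by the __main__ copy")
except imported.AppError:
    handled = True
except Exception:
    handled = False

print("same class object:", same_class)
print("caught by the imported copy's handler:", handled)
print([str(w.message) for w in caught])
```

Both flags should come out False, and the recorded warning is the same RuntimeWarning from the question, since the module was already in sys.modules when runpy went to execute it.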

How can you debug more complex cases?

While in this particular example, the side-effect import is directly in proj/__init__.py, there's a far more subtle and hard to debug variant where the parent package instead does:

import some_other_module

and then it is some_other_module (or a module that it imports) that does:

import proj.proj # or "from proj import proj"

Assuming the misbehaviour is reproducible, the main way to debug these kinds of problems is to run python in verbose mode and check the import sequence:

$ python -v -c "print('Hello')" 2>&1 | grep '^import'
import zipimport # builtin
import site # precompiled from /usr/lib64/python2.7/site.pyc
import os # precompiled from /usr/lib64/python2.7/os.pyc
import errno # builtin
import posix # builtin
import posixpath # precompiled from /usr/lib64/python2.7/posixpath.pyc
import stat # precompiled from /usr/lib64/python2.7/stat.pyc
import genericpath # precompiled from /usr/lib64/python2.7/genericpath.pyc
import warnings # precompiled from /usr/lib64/python2.7/warnings.pyc
import linecache # precompiled from /usr/lib64/python2.7/linecache.pyc
import types # precompiled from /usr/lib64/python2.7/types.pyc
import UserDict # precompiled from /usr/lib64/python2.7/UserDict.pyc
import _abcoll # precompiled from /usr/lib64/python2.7/_abcoll.pyc
import abc # precompiled from /usr/lib64/python2.7/abc.pyc
import _weakrefset # precompiled from /usr/lib64/python2.7/_weakrefset.pyc
import _weakref # builtin
import copy_reg # precompiled from /usr/lib64/python2.7/copy_reg.pyc
import traceback # precompiled from /usr/lib64/python2.7/traceback.pyc
import sysconfig # precompiled from /usr/lib64/python2.7/sysconfig.pyc
import re # precompiled from /usr/lib64/python2.7/re.pyc
import sre_compile # precompiled from /usr/lib64/python2.7/sre_compile.pyc
import _sre # builtin
import sre_parse # precompiled from /usr/lib64/python2.7/sre_parse.pyc
import sre_constants # precompiled from /usr/lib64/python2.7/sre_constants.pyc
import _locale # dynamically loaded from /usr/lib64/python2.7/lib-dynload/_localemodule.so
import _sysconfigdata # precompiled from /usr/lib64/python2.7/_sysconfigdata.pyc
import abrt_exception_handler # precompiled from /usr/lib64/python2.7/site-packages/abrt_exception_handler.pyc
import encodings # directory /usr/lib64/python2.7/encodings
import encodings # precompiled from /usr/lib64/python2.7/encodings/__init__.pyc
import codecs # precompiled from /usr/lib64/python2.7/codecs.pyc
import _codecs # builtin
import encodings.aliases # precompiled from /usr/lib64/python2.7/encodings/aliases.pyc
import encodings.utf_8 # precompiled from /usr/lib64/python2.7/encodings/utf_8.pyc

This particular example just shows the base set of imports that Python 2.7 on Fedora does at startup. When debugging a double-import RuntimeWarning like the one in this question, you'd be searching for the "import proj" and then "import proj.proj" lines in the verbose output, and then looking closely at the imports immediately preceding the "import proj.proj" line.
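
If grepping the verbose output is awkward, another option is to record the call stack at the moment the module is first imported. The following is a sketch under assumptions, not something taken from runpy: ImportTracer is a hypothetical helper, and demo_traced is a throwaway package standing in for proj. It relies only on the standard meta path hook (a finder that returns None hands the import on to the regular finders).

```python
import importlib
import sys
import tempfile
import traceback
from importlib.abc import MetaPathFinder
from pathlib import Path


class ImportTracer(MetaPathFinder):
    """Hypothetical helper: record the call stack when `target` is first imported."""

    def __init__(self, target):
        self.target = target
        self.stacks = []

    def find_spec(self, fullname, path, target=None):
        if fullname == self.target:
            self.stacks.append(traceback.format_stack())
        return None  # never handle the import; defer to the regular finders


# Throwaway package whose __init__.py imports its submodule as a side effect.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_traced"
pkg.mkdir()
(pkg / "__init__.py").write_text("from .mod import main\n")
(pkg / "mod.py").write_text("def main():\n    return 'ok'\n")
sys.path.insert(0, str(root))

tracer = ImportTracer("demo_traced.mod")
sys.meta_path.insert(0, tracer)
try:
    importlib.import_module("demo_traced")
finally:
    sys.meta_path.remove(tracer)

# Keep only the frames that live inside the package itself.
culprits = [frame for frame in tracer.stacks[0] if str(pkg) in frame]
print("".join(culprits))
```

The printed frame should point at the line in demo_traced/__init__.py that triggered the import. In a real project the tracer would have to be installed before the package is imported, for example from a small wrapper script that then calls runpy.run_module.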

Leta answered 13/7, 2017 at 2:43 Comment(5)
Thanks for adding this answer. So, to distill this, the message is: you cannot run a file as an executable module (python -m WHATEVER.FILE) when that file is also automatically being imported within the package (any __init__.py, which is loaded, contains from WHATEVER import FILE). Right? – Bowknot
Essentially, yeah. It's not always wrong (hence why it's only a warning rather than an error), but it does reliably give you two copies of the "same" module under different names, and debugging state management issues arising from that can be challenging. As a result, it's far more reliable to just not do it & instead have a separate submodule provide the command line interface. – Leta
I've also updated my answer to cover how to debug more complex instances of this, where you really want to be able to run python in verbose mode so it prints out the exact import sequence and lets you find where the initial implicit import is happening. – Leta
Thanks @Leta for explanation. With Robot Framework we've encountered this exact problem and I'm now trying to figure out the best way to fix/workaround it. It seems I could restructure our code base to avoid the warning, but it feels like extra work as it seems that the warning isn't really relevant in our case. I'm thus contemplating should I just ignore all RuntimeWarnings by runpy instead. For more details see github.com/robotframework/robotframework/issues/2552 and my separate answer below. – Groschen
much appreciate this answer, had some really silly imported mains in my module that were leading to this error. Moved them to a bin dir and out of the module – Donelson

If you take a look at the double import trap you will see this:

This next trap exists in all current versions of Python, including 3.3, and can be summed up in the following general guideline: “Never add a package directory, or any directory inside a package, directly to the Python path”.

The reason this is problematic is that every module in that directory is now potentially accessible under two different names: as a top level module (since the directory is on sys.path) and as a submodule of the package (if the higher level directory containing the package itself is also on sys.path).
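
The trap described in that quote is easy to demonstrate. A minimal sketch, with the made-up names demo_trap and trap_mod: once both the package directory and its parent are on sys.path, the same file can be imported under two names and ends up as two separate module objects.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package (hypothetical names) with some module-level state.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_trap"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "trap_mod.py").write_text("registry = []\n")

sys.path.insert(0, str(root))  # fine: the directory that contains the package
sys.path.insert(0, str(pkg))   # the trap: the package directory itself

as_submodule = importlib.import_module("demo_trap.trap_mod")
as_top_level = importlib.import_module("trap_mod")

# State changed through one name is invisible through the other.
as_submodule.registry.append("registered via demo_trap.trap_mod")

print("same module object:", as_submodule is as_top_level)
print("registry seen as top-level module:", as_top_level.registry)
```

The two imports should yield different module objects, and the list appended to through one name should still be empty when read through the other.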

In tests/context.py

remove: sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

which probably causes the problem and your code still works as expected.


Edit due to comment:

You can try and change some parts in your code:

  1. proj/__init__.py can be completely empty.
  2. In test_proj.py, change the imports as follows:

    import unittest
    
    from proj import proj
    

PS: I wasn't able to reproduce the warning on Linux with your initial code or with my suggestions either.

Stoush answered 20/4, 2017 at 8:31 Comment(2)
Thanks for your reply. Unfortunately, that didn't resolve the issue. Though your suggestion makes perfect sense, I still see the exact same message after removing the sys.path.insert line. – Barberabarberry
I have added another suggestion to the answer, have a look. – Stoush

@ncoghlan's answer is right. I just want to add to his solution 1 that you only need to remove the import in __init__.py if you execute your package with the -m switch. That boils down to figuring out in __init__.py whether Python was called with the -m switch. sys.flags unfortunately does not contain an entry for the -m switch, but sys.argv seems to contain a single element containing "-m" (I did not, however, figure out whether this behaviour is documented). So change __init__.py in the following way:

import sys
if '-m' not in sys.argv:
    from .proj import main

If you execute the package with the -m switch .proj will not be imported by __init__.py and you avoid the double import. If you import the package from another script .proj is imported as intended. Unfortunately, sys.argv does not contain the argument to the -m switch! So maybe moving the main() function to a separate file is the better solution. But I really like to have a main() function in my modules for quick and simple testing/demonstrations.
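
To check what sys.argv actually holds at each stage, here is a small sketch that prints it from both the package's __init__.py and the executed submodule. The package name demo_argv and the argument extra-arg are placeholders. As far as I can tell, the command line documentation for -m does describe this: the first element of sys.argv is set to "-m" while the module is being located and is replaced by the module's full path before the module itself runs.

```python
import json
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Throwaway package (hypothetical name `demo_argv`); both files report sys.argv.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_argv"
pkg.mkdir()
report = "import json, sys\nprint(json.dumps(sys.argv))\n"
(pkg / "__init__.py").write_text(report)
(pkg / "mod.py").write_text(report)

env = dict(os.environ, PYTHONPATH=str(root))
result = subprocess.run(
    [sys.executable, "-m", "demo_argv.mod", "extra-arg"],
    env=env, cwd=str(root), capture_output=True, text=True, check=True,
)

# First line comes from __init__.py, second from mod.py running as __main__.
init_argv, main_argv = [json.loads(line) for line in result.stdout.splitlines()]
print("during package import:", init_argv)
print("while running the module:", main_argv)
```

Given that, comparing sys.argv[0] == '-m' is a tighter check than '-m' in sys.argv, since the latter would also match a script that merely receives -m as one of its own arguments.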

Unbrace answered 3/9, 2020 at 21:0 Comment(1)
I wanted to say this should be the 'accepted' answer, IMO. It's simple and doesn't require you to change the way you use Python's -m switch to test code. The current accepted answers are just like 1: 'don't use init files' which largely defeats the purpose of packaging (so nope.) Or 2: 'don't use the -m switch' which removes the benefits of having Rust-style tests at the bottom of a module (it's really handy.) I think these proposals are bad. – Shaniqua

If you are certain the warning is not relevant for you, an easy way to avoid it is ignoring RuntimeWarnings by the runpy module that implements the logic behind the -m switch:

import sys
import warnings

if not sys.warnoptions:  # allow overriding with `-W` option
    warnings.filterwarnings('ignore', category=RuntimeWarning, module='runpy')

This obviously may hide relevant warnings as well, but at least at the moment this is the only RuntimeWarning that runpy uses. Alternatively, the filter could be made stricter by specifying a pattern for the message or the line number where the warning must occur, but both of these may break if runpy is edited later.

Actor answered 15/3, 2018 at 17:41 Comment(0)

python -m is a bit tricky. @ncoghlan has already provided detailed information. When we run with python -m, by default all packages within sys.path/PYTHONPATH are imported. If your package has an import statement for anything within the directories on those paths, the above warning occurs. See the picture.

My PYTHONPATH already have the Project directory. Thus when I do

from reader.reader import Reader

The system throws the warning. Thus there is no need for explicit imports if the path is already on the Python path.

Linearity answered 2/9, 2017 at 13:38 Comment(0)

Pekka's answer worked best for me. I modified it to include the message regex. This function will ignore the warning for whatever module or list of modules you pass into it.

import re
import warnings


def ignore_m_warning(modules=None):
    """Ignore the runpy warning that `python -m package.module` emits when the
    module was already imported as a side effect of importing its package."""
    if modules is None:
        return  # nothing to filter
    if isinstance(modules, str):
        modules = [modules]  # accept a single module name as well as a list/tuple

    msg = "'{module}' found in sys.modules after import of package"
    for module in modules:
        # `message` is a regex matched against the start of the warning text,
        # so the literal module name has to be escaped.
        module_msg = re.escape(msg.format(module=module))
        warnings.filterwarnings("ignore", message=module_msg, category=RuntimeWarning, module='runpy')  # ignore -m
Overijssel answered 20/10, 2021 at 1:20 Comment(0)