Avoiding `sys.path.append(..)` for imports
K

5

16

This isn't the first time I am cringing over imports in Python. But I guess this one is an interesting use case, so I thought to ask it here to get a much better insight. The structure of my project is as follows:

sample_project
   - src
        - __init__.py
        - module1
           - __init__.py
           -  utils.py
        - module2
           - __init__.py 
           - models.py
        - app.py

module1 imports methods from module2, and app imports methods from both of them. Also, when you run the app, it needs to create a folder called logs outside of the src folder. There are now two ways to run the app:

  1. From inside src folder flask run app
  2. From outside of src folder flask run src.app

To make sure that I don't get import errors because of the change of the top level module where the app is started, I do this:

import sys
sys.path.append("..")

Is there any better solution to this problem?

Knowledge answered 18/6, 2021 at 10:46 Comment(0)
C
17

The pythonic solution for the import problem is to not add sys.path (or indirectly PYTHONPATH) hacks to any file that could potentially serve as top-level script (incl. unit tests), since this is what makes your code base difficult to change and maintain. Assume you have to reorganize your project structure or rename folders.

Instead, this is what editable installs are made for. They can be achieved in two ways:

  1. pip install --editable <path> (requires a basic setup.py or pyproject.toml)
  2. conda develop <path> (requires the conda-build package)

Either way adds a link to your project in the site-packages folder (typically a path entry in a .pth file rather than a literal symlink), so your local project behaves as if it were fully installed while you continue editing it.
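To make the mechanism less mysterious, here is a minimal sketch of how a path-based link can make a project importable. It is not what pip or conda literally run: it only imitates the idea with a throwaway folder standing in for site-packages and the standard-library call site.addsitedir, which reads .pth files. The names demo_editable_pkg, demo_editable.pth and fake_site_packages are made up for this illustration.

```python
import importlib
import site
import tempfile
from pathlib import Path


def simulate_editable_install():
    """Imitate a path-based editable install inside a temporary directory."""
    # mkdtemp is used (instead of a context manager) so the folder outlives
    # this function and the imported module stays usable afterwards.
    root = Path(tempfile.mkdtemp())

    # A hypothetical project folder containing one package.
    project = root / "sample_project"
    package = project / "demo_editable_pkg"
    package.mkdir(parents=True)
    (package / "__init__.py").write_text(
        "GREETING = 'hello from the project folder'\n"
    )

    # A stand-in for site-packages holding a .pth file that points at the project.
    fake_site_packages = root / "fake_site_packages"
    fake_site_packages.mkdir()
    (fake_site_packages / "demo_editable.pth").write_text(f"{project}\n")

    # addsitedir processes the .pth file and appends the listed path to sys.path.
    site.addsitedir(str(fake_site_packages))
    importlib.invalidate_caches()
    return project, importlib.import_module("demo_editable_pkg")


project_dir, pkg = simulate_editable_install()
print(pkg.GREETING)
```

The package is imported from the project folder itself, so edits to the source are picked up on the next run without reinstalling anything.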

Always remember: KEEP THINGS EASY TO CHANGE

Closestool answered 8/7, 2021 at 10:18 Comment(3)
I don't see how packaging is useful to share project data that has to change frequently. In any other language you can simply import from another folder... In python this is a pain. And the "right way" is simply not right...Epistemic
Editable installs are not the Pythonic solution, either. Structuring your project correctly, following the rules of Python packages and modules, is the correct way to resolve this problem. If you absolutely cannot structure your project within those constraints, then you may be forced to look for an alternative solution. Editable installations with setuptools might be a next-best resort.Isopod
I've never needed to use sys.path.append and have been developing with Python for 10+ years -- services, CLI tools and other applications. You can always set up your project so that your (isolated) Python finds packages without sys hacks. Sometimes I've resorted to having PYTHONPATH=. pytest in a Makefile or similar, but I think even that was only because I didn't know how to do it correctly. I think the answer is mostly correct here, but I'd also say that editable installs are not a requirement.Floorage
S
4

After doing some research (here1, here2, here3, here4, here5, here6), I came up with what I think is the best solution at this time: in each Python file, add the file's own directory to sys.path before the imports. Example code below:

import os
import sys
if os.path.dirname(os.path.abspath(__file__)) not in sys.path:
    sys.path.append(os.path.dirname(os.path.abspath(__file__)))
Supplejack answered 24/12, 2021 at 5:26 Comment(1)
This is not a Pythonic solution, nor is it particularly elegant or maintainable. It is hard to understand what this does.Isopod
I
2

Pythonic Solution: Use packages and modules as they are intended to be used

Ask yourself, why did you want to create a src directory?

I would suggest that more than likely you wanted to follow a convention you knew from another language. (Maybe Java, maybe C, C++, or something else.)

However, if you use Python packages in the way they are intended to be used, there is a far simpler solution.

First, let's review a few key points.

  • To run the Python file main.py, you run it using the Python interpreter like so: python3 main.py.
  • When you run a script this way, the Python interpreter puts the directory containing that script at the front of its search path (sys.path)
  • When you start the interpreter interactively or with python3 -m, it uses the current working directory instead
  • More information can be found in this documentation page: https://docs.python.org/3/tutorial/modules.html
  • One issue with that documentation page is it explains how to modify sys.path from within Python code. The issue with that is it gives developers the idea that this is possible and therefore should be used as a solution to import and path problems when it should not.
  • When it sees an import statement, the Python interpreter searches the directories listed in sys.path (which is initialized from the script or working directory, PYTHONPATH, and the installation defaults) for modules and packages
  • A Python package is a directory with an __init__.py file
  • A Python module is just a regular Python file
  • The __init__.py file marks a directory as a regular package, so the interpreter treats the modules and subpackages inside it as importable under that package name. It can contain initialization code, but it is usually empty.
  • Without an __init__.py, a directory is not a regular package (since Python 3.3 it can still be picked up as a namespace package, but relying on that tends to cause confusion)
  • According to the Python tutorial, this requirement prevents directories with a common name from unintentionally hiding valid modules that occur later on the search path
  • From this we conclude that all local source code should be resolvable from the same directory as the one used to run the target module (main.py)
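The search-path points above can be checked with a small experiment. The sketch below writes a throwaway main.py, runs it from a different working directory, and reports which directory ends up in sys.path[0]. The helper name first_sys_path_entry and the folder names are assumptions made for this illustration, and it assumes the interpreter is not running in a mode that disables this behaviour (such as python -P).

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def first_sys_path_entry(script_dir: Path, working_dir: Path) -> str:
    """Run a tiny script from working_dir and return the sys.path[0] it sees."""
    script = script_dir / "main.py"
    script.write_text("import sys\nprint(sys.path[0])\n")
    result = subprocess.run(
        [sys.executable, str(script)],
        cwd=working_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    script_dir = base / "python-path-test"
    elsewhere = base / "elsewhere"
    script_dir.mkdir()
    elsewhere.mkdir()

    entry = os.path.realpath(first_sys_path_entry(script_dir, elsewhere))
    same_as_script_dir = entry == str(script_dir)
    same_as_cwd = entry == str(elsewhere)

print("sys.path[0] is the script directory:", same_as_script_dir)
print("sys.path[0] is the working directory:", same_as_cwd)
```

This is why imports resolve relative to where the entry-point file lives, not relative to wherever the shell happens to be.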

With that information, you can re-structure your project:

sample_project/
    my_python_package/
        __init__.py
        sub_package_1/
            __init__.py
            utils.py
        sub_package_2/
            __init__.py 
            models.py
    app.py

Run app.py from the directory sample_project: python3 app.py
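As a rough check that this layout needs no path manipulation, the sketch below recreates it in a temporary directory and runs app.py from the project root. Only the directory layout comes from the answer; the file contents (Model, describe) are invented for the illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; every import is absolute from the package root.
FILES = {
    "my_python_package/__init__.py": "",
    "my_python_package/sub_package_1/__init__.py": "",
    "my_python_package/sub_package_1/utils.py": (
        "from my_python_package.sub_package_2.models import Model\n"
        "\n"
        "def describe():\n"
        "    return f'utils sees {Model.name}'\n"
    ),
    "my_python_package/sub_package_2/__init__.py": "",
    "my_python_package/sub_package_2/models.py": (
        "class Model:\n"
        "    name = 'example-model'\n"
    ),
    "app.py": (
        "from my_python_package.sub_package_1.utils import describe\n"
        "\n"
        "print(describe())\n"
    ),
}


def run_app() -> str:
    """Create the layout, run app.py from sample_project, return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "sample_project"
        for relative_path, content in FILES.items():
            target = project / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)
        result = subprocess.run(
            [sys.executable, "app.py"],
            cwd=project,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


output = run_app()
print(output)
```

Neither utils.py nor app.py touches sys.path; the imports work because app.py sits next to the top-level package.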

You can actually go further. If your project becomes very large, it sometimes makes sense to run modules within packages using python3 -m some_package.some_module. Then everything, including app.py, becomes part of a package. I don't think you need this in this particular case, but if you have large numbers of "executable" Python files which are better grouped into a set of directories, then this is the approach to take.
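A small sketch of the difference, under the assumption that the module uses a relative import: the same file is run once with -m from the project root and once directly as a script. The helpers module and the greet function are hypothetical; only the some_package.some_module naming comes from the text above.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical package whose module relies on a relative import.
FILES = {
    "some_package/__init__.py": "",
    "some_package/helpers.py": "def greet():\n    return 'hello from helpers'\n",
    "some_package/some_module.py": "from .helpers import greet\n\nprint(greet())\n",
}


def run_both_ways():
    """Run some_module with -m and as a plain script, from the project root."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        for relative_path, content in FILES.items():
            target = project / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)

        as_module = subprocess.run(
            [sys.executable, "-m", "some_package.some_module"],
            cwd=project, capture_output=True, text=True,
        )
        as_script = subprocess.run(
            [sys.executable, str(Path("some_package") / "some_module.py")],
            cwd=project, capture_output=True, text=True,
        )
        return as_module, as_script


as_module, as_script = run_both_ways()
print("with -m:", as_module.returncode, as_module.stdout.strip())
print("as script:", as_script.returncode)
```

Run with -m, the module knows which package it belongs to, so the relative import resolves; run as a plain file, it has no parent package and the import fails.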

Note that:

  • This solution is simple (bordering on trivial, if not necessarily that obvious)
  • There is no src directory. Forget about src. This works well in other languages, it doesn't fit into the Python model for how a project should be structured
  • You did not need to modify PYTHONPATH
  • You did not need to modify sys.path
  • You did not need to write extra code to be able to resolve imports
  • This solution is easy to understand, it is straightforward and has minimal complexity

An experiment to learn about PYTHONPATH and sys.path

You can find out what PYTHONPATH and sys.path are set to with a short experiment:

$ cd ~
$ mkdir python-path-test
$ touch python-path-test/main.py

# main.py

import os
import sys

print('PYTHONPATH:')
# PYTHONPATH may be unset, and its separator is platform-specific (os.pathsep)
for entry in filter(None, os.environ.get('PYTHONPATH', '').split(os.pathsep)):
    print(entry)

print('sys.path:')
for entry in sys.path:
    print(entry)

$ export PYTHONPATH=`pwd`
$ python3 python-path-test/main.py
PYTHONPATH:
/home/username
sys.path:
/home/username/python-path-test
/home/username
/usr/lib/python311.zip
/usr/lib/python3.11
/usr/lib/python3.11/lib-dynload
/usr/local/lib/python3.11/dist-packages
/usr/lib/python3/dist-packages
/usr/lib/python3.11/dist-packages

Further explanation in regards to other answers

Let me address the issues with the other answers here. All of the answers provided will work, but none of them take the simplest and "most obviously correct" approach.

The reason for this is the "most obviously correct" approach is not that obvious, especially if you come to Python from other languages where things work differently.

Just to say as well - it took me a long time to figure out the solution to the exact same problem which is shown in the question and I only figured out the solution when I went to work for a firm where someone else had figured this out before me.

Also: None of this is really explained on any documentation page anywhere, so it is hardly surprising that most people get it wrong, or do something unnecessarily complex when it isn't needed.

Overview of other solutions:

So far several other solutions have been proposed:

  1. Use setuptools and virtual environments to manage what is known as an "editable install".

I don't like this for two reasons: It is more work than is necessary, and you are pretending that some local source code is a pip package, when it isn't. It just seems like a bizarre thing to do. (This is exactly what I used to do before realizing there is an easier way.)

  2. Write Python code to modify the sys.path or PYTHONPATH environment variable

I don't like this because it is a hack:

  • The PYTHONPATH environment variable is intended to be used to store the locations of installed packages on your system
  • It should be a semi-permanent thing which doesn't change (often)
  • This is similarly the case with sys.path
  • The other reason modifying PYTHONPATH is bad is because you are embedding (hiding) some code within your project which does unexpected things
  • PYTHONPATH should be managed by the Operating System, or at least by the user in a shell
  • In my experience, twiddling things which should be managed by your operating system from within code frequently leads to hard to find bugs and hard to understand code

  3. Modify PYTHONPATH from a shell

This is better than the above proposal of modifying it from within Python code, but it just isn't necessary, for the reasons I explained above.

Isopod answered 9/7 at 9:22 Comment(0)
D
0

Take a look at the Python import system documentation and at the PYTHONPATH environment variable.

When your code does import X.Y, what the Python runtime does is look in each folder on its search path (sys.path, which includes the folders listed in your PYTHONPATH) for a package X (a package simply being a folder containing an __init__.py file) containing a Y module or package.

Most of the time, appending to sys.path is a poor solution. It is better to take care of what your PYTHONPATH is set to: check that it contains your root directory (which contains your top-level packages) and nothing else (except site-packages which i). Then, from wherever you run your commands, imports will work the same (the current working directory, as returned by os.getcwd(), is another problem).

Depending on how you run your Python scripts, . may be the only relevant path in it, so imports depend on your current directory, and you have to append .. if you run from inside one of your top-level packages.

And maybe you should not run your scripts from a directory that is not the root of your project?

TL;DR: a good PYTHONPATH makes for far fewer import errors.
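A minimal sketch of that claim: the same import is attempted from an unrelated working directory, once without PYTHONPATH and once with PYTHONPATH pointing at the project root. The names project_root, toplevel_pkg and VALUE are invented for this illustration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def import_from_elsewhere():
    """Try importing a top-level package from a directory outside the project."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        root = base / "project_root"
        package = root / "toplevel_pkg"
        package.mkdir(parents=True)
        (package / "__init__.py").write_text("VALUE = 42\n")
        elsewhere = base / "elsewhere"
        elsewhere.mkdir()

        code = "import toplevel_pkg; print(toplevel_pkg.VALUE)"
        # Start from the current environment, minus any existing PYTHONPATH.
        env_without = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
        env_with = dict(env_without, PYTHONPATH=str(root))

        without = subprocess.run(
            [sys.executable, "-c", code],
            cwd=elsewhere, env=env_without, capture_output=True, text=True,
        )
        with_path = subprocess.run(
            [sys.executable, "-c", code],
            cwd=elsewhere, env=env_with, capture_output=True, text=True,
        )
        return without.returncode, with_path.stdout.strip()


failed_code, value = import_from_elsewhere()
print("exit code without PYTHONPATH:", failed_code)
print("value with PYTHONPATH:", value)
```

The variable is set for the child process only, which keeps the configuration in the environment and out of the source files.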

Dismissal answered 18/6, 2021 at 18:56 Comment(0)
S
-1

Wrap the flask command in a small script in the sample_project directory and set PYTHONPATH according to your project:

#!/usr/bin/env bash

# Assuming this script lives in sample_project
path=$(dirname "${BASH_SOURCE[0]}")
# Resolve to an absolute path so the script works from any directory
full_path=$(realpath "$path")
export PYTHONPATH="$full_path/src:$PYTHONPATH"

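# The flask CLI takes the application from FLASK_APP (or --app), not as an argument to run
FLASK_APP=app flask run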

You can also switch current directory to a working directory.

But it is best to package your project using setuptools and install it (possibly in development mode), in user space according to PYTHONUSERBASE or in a virtual environment.

Shrovetide answered 26/6, 2021 at 18:53 Comment(0)
