tl;dr
In a nutshell, what we want to do is to inject code that gets executed by the python interpreter before our main code gets executed.
The best way to achieve this is to create a virtualenv and add
sitecustomize.py
in the virtualenv's site-packages.
Demonstration
Let's assume that the application we want to run is called my_app.py
and that its
logger has the same name.
$ cat my_app.py
import logging
logger = logging.getLogger("my_app")
logger.debug("A debug message")
logger.info("An info message")
logger.warning("A warning message")
logger.error("An error message")
Running my_app.py
should only show the messages whose severity is > WARNING
(which
is the default behavior in python's logging).
$ python my_app.py
A warning message
An error message
Now let's create a virtual environment
python3 -m venv my_venv
And let's add sitecustomize.py
to the virtualenv's site packages.
$ cat my_venv/lib/python3.7/site-packages/sitecustomize.py
import logging
# Setup logging for my_app
# We will only setup a console handler
logger = logging.getLogger("my_app")
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setFormatter(
logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
)
logger.addHandler(ch)
Now let's try to run my_app.py
using the virtualenv:
$ ./my_venv/bin/python my_app.py
2019-01-25 16:03:16,815 - my_app - DEBUG - A debug message
2019-01-25 16:03:16,815 - my_app - INFO - An info message
2019-01-25 16:03:16,815 - my_app - WARNING - A warning message
2019-01-25 16:03:16,815 - my_app - ERROR - An error message
And that was all)
We got proper logging without having to modify my_app.py
or write a wrapper!
Now if you want to know why this is the optimal approach, keep on reading.
(Really) long answer
Before understanding why using a virtualenv + sitecustomize.py
is the correct approach
to this problem, we need to make a not so short introduction.
Note: I am going to assume that you create virtual environments using the
venv
module which uses
stdlib's site.py
. The
virtuaelnv
library, uses its own
site.py
and might be doing things slightly differently. Nevertheless, after reading this answer
you should be able to examine if there are any differences between venv
and
vitualenv
and understand how to handle them.
What are site-packages
The short answer is that site-packages
is the place where python installs 3rd party
code (as in non-stdlib code). For more info read
this
and the provided links.
How to inject code?
Python does allow you to customize the python interpreter while it starts up, i.e.
before our main code/script/whatever gets executed. This can be useful for e.g.:
- code-coverage
- profilers
- and to generally inject code, as we need to do for this question.
The way you achieve the injection is by creating/modyfing either sitecustomize.py
or
usercustomize.py
. You can also use a "path configuration file" (i.e. *.pth
) file
with an import statement but I will not cover this case here since:
- These import statements feel like a terrible hack
- I am not convinced that it offers any real advantage compared to
sitecustomize
/usercustomize
.
- I would like to keep things simple
Anyway, if you need more info WRT to path configuration files you can check
PyMOTW and if you want
an example of using them with import statement check this blog
post.
sitecustomize
& usercustomize
So, sitecustomize
and usercustomize
are special files that don't exist by default,
but if we create them python will automatically import them before it starts to execute
our code. We can create these files:
- either on the global site-packages (e.g.
/usr/lib/python3.7/site-packages/
)
- or on the user site-packages (e.g.
~/.local/lib/python3.7/site-packages/
)
sitecustomize
is always imported before usercustomize
. If either file is missing,
the ImportError
is silently ignored.
As a security precaution, if there is a mismatch between user or group id and the
effective id, then user site-packages are disabled
(source).
Moreover, the python interpreter has CLI arguments that either completely disable
site-packages (both system and user
ones) or disable user
site-packages. Assuming that
we don't have an id mismatch and that we don't use any CLI flags, then user
site-packages have higher priority than system site-packages. So if we have both:
~/.local/lib/python3.7/site-packages/sitecustomize.py
/usr/lib/python3.7/site-packages/sitecustomize.py
the first one is the one that will be imported. We can actually check the sys.path
priority by executing the site.py
module:
$ python3 -msite
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/home/username/.local/lib/python3.7/site-packages', # user site-packages
'/usr/lib/python3.7/site-packages', # system site-packages
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: True
The important information here is the value of ENABLE_USER_SITE
. If it is True
then
user site-packages are enabled. If it False
then we can only use global site-packages.
E.g. if we use python -s
:
$ python3 -s -msite
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/usr/lib/python3.7/site-packages',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: False
Note that in this case ENABLE_USER_SITE
is False
.
Just for completeness, let's completely disable the site-packages:
$ python3 -S -msite
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: None
Experiment
In order to better understand this, let's do an experiment. First let's create
usercustomize
sitecustomize
modules in both system and user site-packages.
WARNING: We will be creating files in system site-packages. This will be
interfering with your python distribution. Be careful and REMEMBER to remove
them when we are done.
# system site packages
echo 'print(f"-> {__file__}")' | sudo tee /usr/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | sudo tee /usr/lib/python3.7/site-packages/sitecustomize.py
# user site packages
echo 'print(f"-> {__file__}")' | tee ~/.local/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | tee ~/.local/lib/python3.7/site-packages/sitecustomize.py
Let's also create a python module:
echo 'print("Inside foo")' | tee foo.py
Now let's execute foo.py
:
$ python3 foo.py
-> /home/username/.local/lib/python3.7/site-packages/sitecustomize.py
-> /home/username/.local/lib/python3.7/site-packages/usercustomize.py
Inside foo
As we can see:
- both
sitecustomize
and usercustomize
are being imported
- they are being imported from user site-packages
Now what will happen if we disable user site-packages?
$ python3 -s foo.py
-> /usr/lib/python3.7/site-packages/sitecustomize.py
Inside foo
This time we see that we:
- only
sitecustomize
is being imported. Even though usercustomize
exists in
system site-packages python does not import it! This is important! Keep it mind for when
we discuss virtualenvs! (TIP: this has to do with ENABLE_USER_SITE
; do you remember
what value does it have in this case?)
sitecustomize
is being imported from system site-packages
Finally, if we completely disable site-packages, obviously usercustomize
and
sitecustomize
will be ignored:
$ python3 -S foo.py
Inside foo
What about virtualenvs?
OK, now let's bring virtualenv's into the game too. There are two types of virtualenvs:
- the normal ones
- those created with
--system-site-packages
.
Let's create virtualenvs of both types
python3 -mvenv venv_no_system
python3 -mvenv --system-site-packages venv_system
Let's also create sitecustomize.py
and usercustomize.py
modules into the
virtualenv's site-packages:
echo 'print(f"-> {__file__}")' | tee ./venv_no_system/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | tee ./venv_no_system/lib/python3.7/site-packages/sitecustomize.py
echo 'print(f"-> {__file__}")' | tee ./venv_system/lib/python3.7/site-packages/usercustomize.py
echo 'print(f"-> {__file__}")' | tee ./venv_system/lib/python3.7/site-packages/sitecustomize.py
and let's see the differences:
$ ./venv_no_system/bin/python -msite
/tmp/test/venv_no_system/lib/python3.7/site-packages/sitecustomize.py
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/tmp/test/venv_no_system/lib/python3.7/site-packages',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: False
What do we see here? That on normal virtualenv's ENABLE_USER_SITE
is False
,
which means that:
- user site-packages are being ignored
- only
sitecustomize
is being imported! I.e. we can't use usercustomize
to
inject code!!!
We can also see that instead of our global site-packages, the virtualenv is using its
own one (i.e. /tmp/test/venv_no_system/lib/python3.7/site-packages
).
Now let's repeat this but this time with the virtualenv that uses the system
site-packages:
$ ./venv_system/bin/python -msite
-> /home/username/.local/lib/python3.7/site-packages/sitecustomize.py
-> /home/username/.local/lib/python3.7/site-packages/usercustomize.py
sys.path = [
'/tmp/test',
'/usr/lib/python37.zip',
'/usr/lib/python3.7',
'/usr/lib/python3.7/lib-dynload',
'/tmp/test/venv_system/lib/python3.7/site-packages',
'/home/username/.local/lib/python3.7/site-packages',
'/usr/lib/python3.7/site-packages',
]
USER_BASE: '/home/username/.local' (exists)
USER_SITE: '/home/username/.local/lib/python3.7/site-packages' (exists)
ENABLE_USER_SITE: True
In this case, the behavior is different...
ENABLE_USER_SITE
is True
which means:
- that user site-packages are being enabled
- that
usercustomize
is being imported normally.
But there is also one more difference. In this case we have 3 site-packages
directories. The virtualenv one is the one with the higher priority followed by the
user site-packages while the system site-packages is the last one.
So what to use?
I think that there are three options:
- use the system python installation
- use a normal virtualenv
- use a virtualenv with system site-packages
I think that in the majority of the cases, e.g. on normal servers/desktops, modifying
the system python installation should generally be avoided. At least on *nix, too many
things depend on Python. I would be really reluctant to change its behavior. Possible
exceptions are ephemeral or static "systems" (e.g. inside a container).
As far as virtual environments go, unless we know we are going to need system
site-packages, I think it makes sense to stick with the current practise and use normal
ones. If we adhere to this, then in order to inject code before our script gets
executed we only have one option:
To add a sitecustomize.py
in the virtuaelenv's site packages.
Cleaning up
# remove the virtualenvs
rm -rf my_venv
rm -rf venv_system
rm -rf venv_no_system
# remove our usercustomize.py and sitecustomize.py
sudo rm /usr/lib/python3.7/site-packages/sitecustomize.py
sudo rm /usr/lib/python3.7/site-packages/usercustomize.py
rm ~/.local/lib/python3.7/site-packages/sitecustomize.py
rm ~/.local/lib/python3.7/site-packages/usercustomize.py
# remove the modules
rm foo.py
rm my_app.py
$PYTHONRUNFIRST
but it never really took off: bugs.python.org/issue14803 – Caliber