Equivalent of `package.json` and `package-lock.json` for `pip`
Package managers for JavaScript like npm and yarn use a package.json to specify 'top-level' dependencies, and create a lock-file to keep track of the specific versions of all packages (i.e. top-level and sub-level dependencies) that are installed as a result.

In addition, the package.json allows us to make a distinction between types of top-level dependencies, such as production and development.

For Python, on the other hand, we have pip. I suppose the pip equivalent of a lock-file would be the result of pip freeze > requirements.txt.

However, if you maintain only this single requirements.txt file, it is difficult to distinguish between top-level and sub-level dependencies (you would need e.g. pipdeptree -r to figure those out). This can be a real pain if you want to remove or change top-level dependencies, as it is easy to be left with orphaned packages (as far as I know, pip does not remove sub-dependencies when you pip uninstall a package).
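For example, pipdeptree (a third-party tool) can invert the dependency tree to show which installed packages pulled in each sub-dependency; a minimal sketch:

```shell
# pipdeptree is a third-party package, not part of pip itself
pip3 install pipdeptree

# Default view: top-level packages at the root, dependencies nested below
pipdeptree

# Reversed view (-r): each package is shown with the packages that
# depend on it, which makes orphaned sub-dependencies easier to spot
pipdeptree -r
```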

Now, I wonder: Is there some convention for dealing with different types of these requirements files and distinguishing between top-level and sub-level dependencies with pip?

For example, I can imagine having a requirements-prod.txt which contains only the top-level requirements for the production environment, as the (simplified) equivalent of package.json, and a requirements-prod.lock, which contains the output of pip freeze, and acts as my lock-file. In addition I could have a requirements-dev.txt for development dependencies, and so on and so forth.
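A minimal sketch of that convention (file names as described above; the contents of the .txt files are hypothetical):

```shell
# Install the top-level production dependencies...
pip3 install -r requirements-prod.txt

# ...then snapshot the full environment (sub-dependencies included)
# as the lock file
pip3 freeze > requirements-prod.lock

# For development: production deps plus dev-only tools
pip3 install -r requirements-prod.txt -r requirements-dev.txt
pip3 freeze > requirements-dev.lock
```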

I would like to know if this is the way to go, or if there is a better approach.

p.s. The same question could be asked for conda's environment.yml.
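For conda, a rough analogue of the same split (assuming conda >= 4.7.12, which added the --from-history flag):

```shell
# Only the packages you explicitly asked for
# (rough equivalent of package.json)
conda env export --from-history > environment.yml

# Every installed package at its exact version
# (rough equivalent of a lock file)
conda env export > environment.lock.yml
```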

Interdict answered 5/10, 2018 at 12:25 Comment(1)

There are at least three good options available today:

  1. Poetry uses pyproject.toml and poetry.lock files, much in the same way that package.json and lock files work in the JavaScript world.

    This is now my preferred solution.

  2. Pipenv uses Pipfile and Pipfile.lock, also much like you describe the JavaScript files.

Both Poetry and Pipenv do more than just dependency management. Out of the box, they also create and maintain virtual environments for your projects.
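The day-to-day commands for both tools might look roughly like this (package names are placeholders; --group requires Poetry >= 1.2, older versions used --dev):

```shell
# Poetry: dependencies go into pyproject.toml, exact versions into poetry.lock
poetry add requests            # add a production dependency
poetry add --group dev pytest  # add a development-only dependency
poetry install                 # reproduce the locked environment

# Pipenv: dependencies go into Pipfile, exact versions into Pipfile.lock
pipenv install requests        # add a production dependency
pipenv install --dev pytest    # add a development-only dependency
pipenv sync                    # install exactly what Pipfile.lock says
```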

  3. pip-tools provides pip-compile and pip-sync commands. Here, requirements.in lists your direct dependencies, often with loose version constraints, and pip-compile generates locked-down requirements.txt files from your .in files.

    This used to be my preferred solution. It's backwards-compatible (the generated requirements.txt can be processed by pip) and the pip-sync tool ensures that the virtualenv exactly matches the locked versions, removing things that aren't in your "lock" file.
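A sketch of the pip-tools workflow described above (the requirements.in contents are hypothetical):

```shell
pip3 install pip-tools

# requirements.in lists only direct dependencies, loosely pinned
printf 'requests>=2.28\nflask\n' > requirements.in

# Resolve the full dependency tree and pin every version
pip-compile requirements.in        # writes requirements.txt

# Make the virtualenv match requirements.txt exactly,
# removing anything that is no longer listed
pip-sync requirements.txt
```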

Luganda answered 5/10, 2018 at 12:34 Comment(8)
Thanks for the great answer, which pointed me to this interesting post. However, I am hesitant to adopt pipenv, with its use of virtualenv instead of conda, because I really like (and rely on) conda's ability to manage Python versions.Interdict
That's another point in favour of pip-tools, IMO. It doesn't try to do too much for you.Luganda
And pip-tools also takes care of "it is easy to be left with orphaned packages" since it removes anything that's not in the requirements file.Luganda
Sounds good, I'll have a look at it. Does introduce another dependency though. ;-)Interdict
Yes, to work around limitations of pip itself. There are manual workarounds using just pip but then you're a lot more likely to make a mistake. The fact that pip-compile outputs a pip-compatible requirements.txt file means you can just pip install -r requirements.txt on new machines and then work with pip-tools moving forward. I usually install pip-tools into new environments on creation.Luganda
What I need is this: I'd like to run pip install, have the lock file generated automatically, and commit the lockfile to source control as well. Then, on a new system, I can install the exact same package versions by referencing the lockfile, rather than installing newer versions on the fly.Tunisia
@BMW, are you asking for clarification about something? Both options listed here do what you want, just with slightly different syntax.Luganda
Jumping in here to add a +1 to pip-tools, I like how lightweight it is and how compatible it is with any other tool out there, since its output is just a plain old requirements.txt.Ferryman

I had the same question and came up with a simpler, more generic solution. I am using the well-known requirements.txt for all explicit dependencies and requirements.lock as a list of all packages, including sub-dependencies.

I personally like to manage python, pip and setuptools via the distribution's built-in package manager and to install pip dependencies inside a virtual environment.

Usually you would start by installing all directly required dependencies. This will pull in all sub-dependencies as well. If you are not using a virtual environment, make sure to add the --user flag.

# If you already have a requirements file
pip3 install -r requirements.txt

# If you start from scratch
pip3 install <package>

If you want to upgrade your packages, you have multiple options here as well. Since I am using a virtual environment, I will always update all packages. However, you are free to update only your direct requirements. If they need an update of their dependencies, those will be pulled in as well; everything else will be left untouched.

# Update all outdated packages (excluding pip and setuptools themselves)
pip3 install -r <(pip3 list --outdated --format freeze --exclude pip --exclude setuptools | cut -d '=' -f1) --upgrade

# Update explicitly installed packages, update sub dependencies only if required.
pip3 install -r <(cut -d '=' -f1 requirements.txt) --upgrade

Now we come to the tricky part: Saving back our requirements file. Make sure that the previous requirements file is checked into git, so if anything goes wrong you have a backup.

Remember that we want to differentiate between packages explicitly installed (requirements.txt) and packages including their dependencies (requirements.lock).

If you have not yet set up a requirements.txt, I suggest running the following command. Note that it will not include a package if that package is already required by another installed package. For example, requests will not be listed if another package already depends on it. You might still want to add it manually if your script explicitly relies on such a package.

pip3 list --not-required --format freeze --exclude pip --exclude setuptools > requirements.txt

If you already have a requirements.txt, you can update it by using this sed trick. This keeps all sub-dependencies out of requirements.txt; we will include them only in the requirements.lock in the next step.

# Note: sponge is provided by the moreutils package
pip3 freeze -r requirements.txt | sed -n '/## The following requirements were added by pip freeze:/q;p' | sponge requirements.txt

Finally, we can output all dependencies to a requirements.lock file, which will be our complete list of all packages and versions. If we later need to reproduce an issue, we can always come back to this lock file and run our code with the previously working dependencies.

# It is important to use the -r option here, so pip will differentiate between directly required packages and dependencies.
pip3 freeze -r requirements.txt > requirements.lock
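To restore that state later, the lock file can be fed straight back to pip; a minimal sketch:

```shell
# Recreate the exact environment on a new machine (or after a rollback)
python3 -m venv .venv
. .venv/bin/activate
pip3 install -r requirements.lock
```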
Quintus answered 10/9, 2022 at 7:10 Comment(0)

Rye (a project and package management tool) generates a requirements.lock and requirements-dev.lock:

cd my-project
rye init
rye add pandas
cat requirements.lock

Cf. the documentation.

Rye is now my preferred tool because the generated requirements.lock file is a regular requirements.txt file (only the extension differs), and is thus fully compatible with pip install -r requirements.lock.

Lederhosen answered 4/7 at 9:24 Comment(0)
