2024-04-05 update:
It looks like my tips proved to be useful to many people, but they are not needed anymore. Just use Pixi. It's still alpha, but it works great, and provides the features of the Conda + Poetry setup in a simpler and more unified way. In particular, Pixi supports:
- installing packages both from Conda channels and from PyPI,
- lockfiles,
- creating multiple features and environments (prod, dev, etc.),
- very efficient package version resolution, not just faster than Conda (which is very slow), but in my experience also faster than Mamba, Poetry and pip.
Making a Pixi env look like a Conda env
One non-obvious tip about Pixi is that you can easily make your project's Pixi environment visible as a Conda environment, which may be useful e.g. in VS Code, which allows choosing Python interpreters and Jupyter kernels from detected Conda environments. All you need to do is something like:
ln -s /path/to/my/project/.pixi/envs/default /path/to/conda/base/envs/conda-name-of-my-env
The first path is the path to your Pixi environment, which resides in your project directory, under .pixi/envs, and the second path needs to be within one of Conda's environment directories, which can be found with conda config --show envs_dirs.
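If a script is more convenient than running ln -s by hand, the same step can be sketched in Python. The helper name link_pixi_env and its parameters are hypothetical, and the paths you pass in are placeholders for your project's .pixi/envs/default directory and one of the directories reported by conda config --show envs_dirs. This is a minimal sketch that assumes a POSIX-like system where creating symlinks needs no special privileges.

```python
from pathlib import Path


def link_pixi_env(pixi_env: Path, conda_envs_dir: Path, name: str) -> Path:
    """Expose a Pixi environment inside a Conda envs directory via a symlink.

    Hypothetical helper mirroring: ln -s <pixi_env> <conda_envs_dir>/<name>
    """
    if not pixi_env.is_dir():
        raise FileNotFoundError(f"Pixi environment not found: {pixi_env}")
    link = conda_envs_dir / name
    # Refuse to clobber an existing environment or a stale link.
    if link.exists() or link.is_symlink():
        raise FileExistsError(f"Refusing to overwrite existing path: {link}")
    link.symlink_to(pixi_env.resolve(), target_is_directory=True)
    return link
```

The function refuses to overwrite an existing entry, so re-running it for the same name raises an error instead of silently replacing a real Conda environment.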
Original answer:
I have experience with a Conda + Poetry setup, and it's been working fine. The great majority of my dependencies are specified in pyproject.toml, but when there's something that's unavailable in PyPI, or installing it with Conda is easier, I add it to environment.yml. Moreover, Conda is used as a virtual environment manager, which works well with Poetry: there is no need to use poetry run or poetry shell, it is enough to activate the right Conda environment.
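Since this workflow depends on the right Conda environment being active before any poetry command runs, a small guard can catch mistakes early. The sketch below assumes that conda activate sets the CONDA_DEFAULT_ENV variable to the name of the active environment (for environments activated by prefix it typically holds a path instead). The function name conda_env_is_active is hypothetical, and my_project_env is just the environment name used in this answer.

```python
import os
from typing import Mapping, Optional


def conda_env_is_active(
    expected: str, environ: Optional[Mapping[str, str]] = None
) -> bool:
    """Return True if CONDA_DEFAULT_ENV names the expected environment.

    Hypothetical helper; `environ` defaults to the real process environment
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") == expected
```

A wrapper script could call conda_env_is_active("my_project_env") and exit with a message before invoking poetry install or poetry add.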
Tips for creating a reproducible environment
- Add Poetry, possibly with a version number (if needed), as a dependency in environment.yml, so that you get Poetry installed when you run conda create, along with Python and other non-PyPI dependencies.
- Add conda-lock, which gives you lock files for Conda dependencies, just like you have poetry.lock for Poetry dependencies.
- Consider using mamba which is generally compatible with conda, but is better at resolving conflicts, and is also much faster. An additional benefit is that all users of your setup will use the same package resolver, independent from the locally-installed version of Conda.
- By default, use Poetry for adding Python dependencies. Install packages via Conda if there's a reason to do so (e.g. in order to get a CUDA-enabled version). In such a case, it is best to specify the package's exact version in environment.yml, and after it's installed, to add an entry with the same version specification to Poetry's pyproject.toml (without ^ or ~ before the version number). This will let Poetry know that the package is there and should not be upgraded.
- If you use different channels that provide the same packages, it might not be obvious which channel a particular package will be downloaded from. One solution is to specify the channel for the package using the :: notation (see the pytorch entry below), and another solution is to enable strict channel priority. Unfortunately, in Conda 4.x there is no way to enable this option through environment.yml.
- Note that Python adds user site-packages to sys.path, which may cause lack of reproducibility if the user has installed Python packages outside Conda environments. One possible solution is to make sure that the PYTHONNOUSERSITE environment variable is set to True (or to any other non-empty value).
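To check the last tip on your own machine, the following sketch starts a fresh interpreter with PYTHONNOUSERSITE set and reports the value of site.ENABLE_USER_SITE, the standard-library flag that says whether the user site-packages directory is added to sys.path. The helper name user_site_flag is hypothetical. It is meant as a quick diagnostic, not as part of the setup itself.

```python
import os
import subprocess
import sys
from typing import Optional


def user_site_flag(pythonnousersite: Optional[str]) -> str:
    """Return site.ENABLE_USER_SITE as printed by a fresh interpreter.

    Hypothetical helper: pass a string to set PYTHONNOUSERSITE for the
    child process, or None to leave the variable unset.
    """
    env = dict(os.environ)
    env.pop("PYTHONNOUSERSITE", None)
    if pythonnousersite is not None:
        env["PYTHONNOUSERSITE"] = pythonnousersite
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.ENABLE_USER_SITE)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With the variable set to any non-empty value, for example user_site_flag("True"), the child interpreter is expected to report that user site-packages are disabled. The result with the variable unset depends on how the interpreter was installed, so compare the two on your own system.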
Example environment.yml:
name: my_project_env
channels:
- pytorch
- conda-forge
# We want to have a reproducible setup, so we don't want default channels,
# which may be different for different users. All required channels should
# be listed explicitly here.
- nodefaults
dependencies:
- python=3.10.* # or don't specify the version and use the latest stable Python
- mamba
- pip # pip must be mentioned explicitly, or conda-lock will fail
- poetry=1.* # or 1.1.*, or no version at all -- as you want
- tensorflow=2.8.0
- pytorch::pytorch=1.11.0
- pytorch::torchaudio=0.11.0
- pytorch::torchvision=0.12.0
# Non-standard section listing target platforms for conda-lock:
platforms:
- linux-64
virtual-packages.yml (may be used e.g. when we want conda-lock to generate CUDA-enabled lock files even on platforms without CUDA):
subdirs:
  linux-64:
    packages:
      __cuda: 11.5
First-time setup
You can avoid playing with the bootstrap env and simplify the example below if you have conda-lock, mamba and poetry already installed outside your target environment.
# Create a bootstrap env
conda create -p /tmp/bootstrap -c conda-forge mamba conda-lock poetry='1.*'
conda activate /tmp/bootstrap
# Create Conda lock file(s) from environment.yml
conda-lock -k explicit --conda mamba
# Set up Poetry
poetry init --python=~3.10 # version spec should match the one from environment.yml
# Fix package versions installed by Conda to prevent upgrades
poetry add --lock tensorflow=2.8.0 torch=1.11.0 torchaudio=0.11.0 torchvision=0.12.0
# Add conda-lock (and other packages, as needed) to pyproject.toml and poetry.lock
poetry add --lock conda-lock
# Remove the bootstrap env
conda deactivate
rm -rf /tmp/bootstrap
# Add Conda spec and lock files
git add environment.yml virtual-packages.yml conda-linux-64.lock
# Add Poetry spec and lock files
git add pyproject.toml poetry.lock
git commit
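The versions passed to poetry add --lock in the script above have to be kept in sync with environment.yml by hand. As a rough illustration of how that list could be derived instead, the sketch below scans environment.yml text for exact pins and prints matching Poetry requirements. Everything in it is an assumption made for illustration: the function name poetry_pins, the line-based regular expression (used so the sketch needs no YAML library), the CONDA_TO_PYPI mapping (only the pytorch to torch rename from this answer is included), and the decision to skip python itself.

```python
import re
from typing import List

# Matches exact pins such as "- tensorflow=2.8.0" or "- pytorch::pytorch=1.11.0".
# Wildcard specs such as "python=3.10.*" are deliberately not matched.
_PIN = re.compile(
    r"^\s*-\s*(?:[\w.-]+::)?(?P<name>[\w.-]+)=(?P<version>\d+(?:\.\d+)*)\s*(?:#.*)?$"
)

# Assumption: some packages are named differently on Conda and PyPI.
CONDA_TO_PYPI = {"pytorch": "torch"}

# Assumption: the interpreter is handled by `poetry init --python`, not `poetry add`.
NOT_POETRY_DEPS = {"python"}


def poetry_pins(environment_yml: str) -> List[str]:
    """Return 'name==version' requirements for exactly pinned Conda packages."""
    pins = []
    for line in environment_yml.splitlines():
        match = _PIN.match(line)
        if match is None or match["name"] in NOT_POETRY_DEPS:
            continue
        name = CONDA_TO_PYPI.get(match["name"], match["name"])
        pins.append(f"{name}=={match['version']}")
    return pins
```

For the example environment.yml in this answer, the result could be passed to poetry add --lock as separate arguments. A real project would likely want a proper YAML parser and a more complete name mapping.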
Usage
The above setup may seem complex, but it can be used in a fairly simple way.
Creating the environment
conda create --name my_project_env --file conda-linux-64.lock
conda activate my_project_env
poetry install
Activating the environment
conda activate my_project_env
Updating the environment
# Re-generate Conda lock file(s) based on environment.yml
conda-lock -k explicit --conda mamba
# Update Conda packages based on re-generated lock file
mamba update --file conda-linux-64.lock
# Update Poetry packages and re-generate poetry.lock
poetry update
conda and pip are not able to provide for you – Terrill
poetry is an upgrade of pipenv, not pyenv. For example, it does dependency resolution (figuring out the latest versions of all dependencies that are compatible with each other). – Ghent
mamba alone is the better solution for your use case. Some of conda's weaknesses as a package manager are solved with mamba, especially package resolution speed. That said, I've successfully used conda + poetry for a major ML project. – Resistant