(2021 UPDATE)
TL;DR Use pip, it's the standard package manager and it has shipped with Python since 3.4.
pip
basics
pip is the default package manager for Python
pip is bundled with Python as of 3.4 (earlier 3.x releases needed a separate install)
Usage: python3 -m venv myenv; source myenv/bin/activate; python3 -m pip install requests
Packages are downloaded from pypi.org, the official public python repository
It can install precompiled binaries (wheels) when available, or source (tar/zip archive).
Compiled binaries are important because many packages are mixed Python/C/other with third-party dependencies and complex build chains. They MUST be distributed as binaries to be ready-to-use.
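The usage line above can also be driven from Python. This is a minimal sketch, not an official recipe: the helper names `in_virtualenv` and `pip_install_command` are made up for illustration. The first checks whether the interpreter runs inside a venv, the second builds the `python -m pip install` command for the current interpreter without running it.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


def pip_install_command(package: str) -> list:
    """Build (but do not run) the pip command for this interpreter."""
    # "python -m pip" makes pip install into the interpreter that runs it,
    # which avoids picking up a stray "pip" from somewhere else on PATH.
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print("inside a venv:", in_virtualenv())
    print("command:", " ".join(pip_install_command("requests")))
```

Passing the returned list to `subprocess.run` would perform the actual install; it is left out here so the sketch has no side effects.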
advanced
pip can actually install from any archive, wheel, or git/svn repo...
...located on disk, at an HTTP URL, or on a personal PyPI server.
pip install git+https://github.com/psf/[email protected]
for example (it can be useful for testing patches on a branch).
pip install https://download.pytorch.org/whl/cpu/torch-1.9.0%2Bcpu-cp39-cp39-linux_x86_64.whl
(that wheel is Python 3.9 on Linux).
When installing from source, pip will automatically build the package. (It's not always possible: try building TensorFlow without the Google build system :D)
Binary wheels can be Python-version specific and OS specific; see the manylinux specification to maximize portability.
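To see where the Python version and OS show up, here is a rough sketch that splits a wheel file name into its dash-separated fields (name, version, optional build tag, Python tag, ABI tag, platform tag). The function name `parse_wheel_filename` is only illustrative and the parsing is deliberately simplified; real tools should rely on a maintained parser such as the one in the third-party `packaging` library.

```python
from urllib.parse import unquote, urlparse


def parse_wheel_filename(url_or_name: str) -> dict:
    """Split a wheel file name (or a URL ending in one) into its fields."""
    # Keep only the last path component and decode %2B back to "+".
    filename = unquote(urlparse(url_or_name).path.rsplit("/", 1)[-1])
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: 5 fields, or 6 with a build tag.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename}")
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


if __name__ == "__main__":
    url = ("https://download.pytorch.org/whl/cpu/"
           "torch-1.9.0%2Bcpu-cp39-cp39-linux_x86_64.whl")
    for field, value in parse_wheel_filename(url).items():
        print(f"{field:>8}: {value}")
```

For the torch wheel above, the Python and ABI tags are both `cp39` and the platform tag is `linux_x86_64`, which is why that file only installs on CPython 3.9 on 64-bit Linux.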
conda
You are NOT permitted to use Anaconda or packages from Anaconda repositories for commercial use, unless you acquire a license.
Conda is a third-party package manager from Anaconda, Inc.
It was popularized by Anaconda, a Python distribution that bundles the most common data science libraries, ready to use.
You will use conda when you use Anaconda.
Packages are downloaded from the anaconda repo.
It only installs precompiled packages.
Conda has its own format of packages. It doesn't use wheels.
conda install to install a package.
conda build to build a package.
conda can build the python interpreter (and other C packages it depends on). That's how an interpreter is built and bundled for anaconda.
conda can install and upgrade the Python interpreter itself (pip cannot).
advanced
Historically, the selling point of conda was to support building and installing binary packages, because pip did not support binary packages very well (until wheels and manylinux2010 spec).
Emphasis on building packages. Conda has extensive build settings and it stores extensive metadata, to work with dependencies and build chains.
Some projects use conda to initiate complex build systems and generate a wheel, that is published to pypi.org for pip.
easy_install/egg
- For historical reference only. DO NOT USE
- egg is an abandoned format of package, it was used up to mid 2010s and completely replaced by wheels.
- an egg is a zip archive, it contains python source files and/or compiled libraries.
- eggs are used with easy_install and the first releases of pip.
- easy_install was yet another package manager, that preceded pip and conda. It was removed in setuptools v58.3 (year 2021).
- it too caused a lot of confusion, just like pip vs conda :D
- egg files are slow to load, poorly specified, and OS specific.
- Each egg was set up in a separate directory, so an import mypackage would have to look for mypackage.py in potentially hundreds of directories (how many libraries were installed?). That was slow and not friendly to the filesystem cache.
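The directory-by-directory lookup can be illustrated with a simplified model. This is only a sketch of the idea, not how the real import system is implemented (that one uses finders and caches); `locate_module` and the `lib*.egg` directory names are made up for the example.

```python
import tempfile
from pathlib import Path


def locate_module(name, search_path):
    """Return (location, directories_probed) for a top-level module name."""
    probed = 0
    for directory in search_path:
        probed += 1
        base = Path(directory)
        # A module is either name.py or a package directory with __init__.py.
        for candidate in (base / f"{name}.py", base / name / "__init__.py"):
            if candidate.is_file():
                return candidate, probed
    return None, probed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Simulate 200 egg directories; only the last one holds mypackage.py.
        dirs = [Path(root) / f"lib{i}.egg" for i in range(200)]
        for d in dirs:
            d.mkdir()
        (dirs[-1] / "mypackage.py").write_text("VALUE = 1\n")
        found, probed = locate_module("mypackage", dirs)
        print(found.name, "found after probing", probed, "directories")
```

With one directory per egg, the worst case touches every directory on the search path; installing everything into a single site-packages directory, as wheels do, avoids that.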
Historically, the above three tools were open-source and written in Python.
However the company behind conda updated their Terms of Service in 2020 to prohibit commercial usage, watch out!
Fun fact: The only strictly-required dependency to build the Python interpreter is zlib (a compression library), because compression is necessary to load more packages. Egg and wheel packages are zip files.
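Since wheels are plain zip archives, the standard `zipfile` module can read and write them. A small sketch, assuming a made-up package called `mypackage`; the archive built here only mimics the layout of a wheel and is not a valid, installable one.

```python
import io
import zipfile


def build_fake_wheel() -> bytes:
    """Create an in-memory zip laid out like a tiny wheel (hypothetical package)."""
    buffer = io.BytesIO()
    # ZIP_DEFLATED is the compression method that depends on the zlib module.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("mypackage/__init__.py", "VALUE = 42\n")
        archive.writestr("mypackage-1.0.dist-info/METADATA",
                         "Name: mypackage\nVersion: 1.0\n")
    return buffer.getvalue()


def list_archive(data: bytes) -> list:
    """Return the member names of a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    data = build_fake_wheel()
    print("is a zip file:", zipfile.is_zipfile(io.BytesIO(data)))
    for member in list_archive(data):
        print(member)
```

The same `list_archive` call works on a real `.whl` file read from disk, which is a quick way to inspect what a package actually ships.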
Why so many options?
A good question.
Let's delve into the history of Python and computers. =D
Pure python packages have always worked fine with any of these packagers. The troubles were with not-only-Python packages.
Most of the code in the world depends on C. That is true for the Python interpreter, that is written in C. That is true for numerous Python packages, that are python wrappers around C libraries or projects mixing python/C/C++ code.
Anything that involves SSL, compression, GUI (X11 and Windows subsystems), math libraries, GPU, CUDA, etc... is typically coupled with some C code.
This makes Python libraries hard to package and distribute, because it's not just Python code that can run anywhere. The library must be compiled; compilation requires compilers, system libraries and third-party libraries; and once compiled, the generated binary code only works for the specific system and Python version it was compiled for.
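To get a feel for what "specific system and Python version" means in practice, the interpreter can report the values a binary package has to match. A minimal sketch using only the standard library; the function name `interpreter_fingerprint` is made up for illustration, and the exact tags pip compares are more detailed than this.

```python
import platform
import sys
import sysconfig


def interpreter_fingerprint() -> dict:
    """Collect the basic facts a compiled package must be compatible with."""
    return {
        # Which interpreter this is, e.g. "cpython" or "pypy".
        "implementation": sys.implementation.name,
        # Major.minor version; compiled extensions are usually tied to it.
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        # OS and architecture string as seen by the build tooling.
        "platform": sysconfig.get_platform(),
        # CPU architecture reported by the operating system.
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in interpreter_fingerprint().items():
        print(f"{key:>15}: {value}")
```

Running this on two different machines, or under two Python versions on the same machine, shows why one compiled file cannot serve them all.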
Originally, python could distribute pure-python libraries just fine, but there was little support for distributing binary libraries. In and around 2010 you'd get a lot of errors trying to use numpy or cassandra. It downloaded the source and failed to compile, because of missing dependencies. Or it downloaded a prebuilt package (maybe an egg at the time) and it crashed with a SEGFAULT when used, because it was built for another system. It was a nightmare.
This was resolved by pip and wheels from 2012 onward. It then took many years for people to adopt the tools and for the tools to propagate to stable Linux distributions (many developers rely on /usr/bin/python). The issues with binary packages extended to the late 2010s.
For reference, that's why the first command to run is python3 -m venv myvenv && source myvenv/bin/activate && pip install --upgrade pip setuptools on antiquated systems, because the OS comes with an old python+pip from 5 years ago that's buggy and can't recognize the current package format.
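A quick way to check whether the pip in the current environment is an old one is to read its installed version. A small sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper names are illustrative, and deciding which version counts as too old is left to the reader.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Extract the leading number of a version such as '21.2.4'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    for name in ("pip", "setuptools"):
        version = installed_version(name)
        if version is None:
            print(f"{name}: not installed in this environment")
        else:
            print(f"{name}: {version} (major {major_version(version)})")
```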
Conda worked on their own solution in parallel. Anaconda was specifically meant to make data science libraries easy to use out-of-the-box (data science = C and C++ everywhere), hence they had to come up with a package manager specifically meant to address building and distributing binary packages, conda.
If you install any package with pip install xxx nowadays, it just works. That's the recommended way to install packages and it's built into current versions of Python.
conda/enpkg is targeted at new users who want to get up and running with minimal effort: canopy/anaconda are standalone environments that do not interfere with system python (like venv but more powerful). BTW IPython, not iPython (upper case I) – Tasi
conda install conda pip first? – Springhalt
conda create --name myenv python==3.9 conda pip. Then activate with conda activate myenv, and then you can use pip install. Conda is aware of the pip-installed packages, and will still update packages that it installs, but it will not update the pip-installed packages. – Pirozzo