On Linux Debian, how can I list all installed python pip packages and the size (amount of disk space used) that each one takes up?
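Go to the package's page on PyPI to find the size, e.g. https://pypi.python.org/pypi/pip/json
Then expand releases, find the version, and look up the size (in bytes). Note that this is the size of the downloadable archive, not the space the installed package takes up on disk.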
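The lookup described above can be sketched in Python. This is a minimal sketch, assuming the JSON endpoint https://pypi.org/pypi/<name>/json and its releases → version → list of files layout; the helper names release_file_sizes and fetch_metadata are hypothetical, and the sample data at the bottom is made up so the script runs offline.

```python
import json
import urllib.request


def release_file_sizes(metadata, version):
    """Map each file name of one release to its download size in bytes."""
    files = metadata.get("releases", {}).get(version, [])
    return {f["filename"]: f["size"] for f in files}


def fetch_metadata(package):
    """Download the JSON metadata for a package from PyPI (needs network access)."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Offline demonstration with a hand-written dictionary in the same shape
    # as the PyPI response; the file names and sizes are made up.
    sample = {
        "releases": {
            "1.0": [
                {"filename": "demo-1.0-py3-none-any.whl", "size": 2048},
                {"filename": "demo-1.0.tar.gz", "size": 4096},
            ]
        }
    }
    for filename, size in release_file_sizes(sample, "1.0").items():
        print(f"{size:>8} bytes  {filename}")
```

For a real package, release_file_sizes(fetch_metadata("pip"), "<version>") would list the archives of that release.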
Modified for pip version 18 and above:
pip list \
| tail -n +3 \
| awk '{print $1}' \
| xargs pip show \
| grep -E 'Location:|Name:' \
| cut -d ' ' -f 2 \
| paste -d ' ' - - \
| awk '{print $2 "/" tolower($1)}' \
| xargs du -sh 2> /dev/null \
| sort -hr
This command shows pip packages, sorted by descending order of sizes.
LANG=C pip list | tail -n +3 | awk '{print $1}' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null | sort -hr
and voilà! – Bullheaded
beautifulsoup4 would be installed as bs4.) Looks like currently we don't have a perfect solution unless we do a deep and serious scan (of dist-info or something like that). – Recursion
site-packages according to the name provided in the pip list description (often to do with hyphens and underscores) – Sunderland
site-packages e.g. env/lib/python3.10 and then running the command: du -sh ./site-packages/* | sort -hr – Sunderland
^) on the grep ... part, because there are some packages with edge cases, like scipy, where Name: matches more than one line, and that inserts wrong lines all along the pipeline. – Beauharnais
Could you please try this one (a bit long though, maybe there are better solutions):
$ pip list \
| xargs pip show \
| grep -E 'Location:|Name:' \
| cut -d ' ' -f 2 \
| paste -d ' ' - - \
| awk '{print $2 "/" tolower($1)}' \
| xargs du -sh \
2> /dev/null
the output should look like this:
80K /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/blinker
3.8M /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/docutils
296K /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/ecdsa
340K /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/execnet
564K /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/fabric
1.4M /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/flask
316K /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/httplib2
1.9M /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages/jinja2
...
This should work if the package is installed in Location/Name (Location and Name are taken from the output of pip show <package>).
pip show <package> will show you the location:
---
Metadata-Version: 2.0
Name: Flask
Version: 0.10.1
Summary: A microframework based on Werkzeug, Jinja2 and good intentions
Home-page: http://github.com/mitsuhiko/flask/
Author: Armin Ronacher
Author-email: [email protected]
License: BSD
Location: /home/lord63/.pyenv/versions/2.7.11/envs/py2/lib/python2.7/site-packages
Requires: itsdangerous, Werkzeug, Jinja2
We take the Name and Location fields and join them to get the package's directory, then use du -sh to get the package size.
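The same Name/Location join can be sketched in Python. This is only an illustration of the idea, not the answer's own code: the function names are hypothetical, and keeping only the first Name: line per record is an assumption meant to sidestep the case one comment reports for scipy, where Name: matches more than one line.

```python
import os


def parse_pip_show(text):
    """Return (name, location) pairs from the text printed by `pip show`."""
    pairs = []
    name = None
    for line in text.splitlines():
        # Keep only the first "Name:" line of each record; later ones may
        # come from multi-line license text rather than the real field.
        if line.startswith("Name: ") and name is None:
            name = line[len("Name: "):].strip()
        elif line.startswith("Location: ") and name is not None:
            pairs.append((name, line[len("Location: "):].strip()))
            name = None  # the next "Name:" line starts a new record
    return pairs


def guessed_package_dir(name, location):
    """Join Location and the lower-cased Name, as the shell pipeline does."""
    return os.path.join(location, name.lower())


def directory_size(path):
    """Sum the sizes in bytes of all regular files below path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            file_path = os.path.join(dirpath, filename)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


if __name__ == "__main__":
    # Made-up sample in the shape of `pip show` output.
    sample = "Name: Flask\nVersion: 0.10.1\nLocation: /tmp/site-packages\n"
    for name, location in parse_pip_show(sample):
        path = guessed_package_dir(name, location)
        print(directory_size(path), path)
```

As with the shell version, a package whose directory is not Location/name simply reports a size of 0.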
gsort on Mac OS X from homebrew, because standard sort on Mac does not have the -h flag – Vedette
pip 18.0 which outputs a header, so I added in a tail -n +3 | awk '{print $1}' in between the pip list and pip show – Monkhmer
pip commands with pip3 as I'm on a Mac where pip is used for Python 2 and pip3 for Python 3; then (similar to what @Monkhmer did) I used | sed '1,2d' between pip3 list and xargs pip3 show to remove the 2 header rows in the pip3 list output; then to chop off the full path, I added | sed -E 's/\/Library\/Frameworks\/Python.framework\/Versions\/3.7\/lib\/python3.7\/site-packages\///g' ; then for reverse sort and size in bytes I added | sed -E 's/([0-9]).([0-9])M/\1\200000/g ; s/ +([0-9]+)M/\1000000/g ; s/([0-9]).([0-9])K/\1\200/g ; s/ +([0-9]+)K/\1000/g' | sort -rn – Alberta
New version for new pip list format:
pip2 list --format freeze \
|awk -F = {'print $1'} \
| xargs pip2 show \
| grep -E 'Location:|Name:' \
| cut -d ' ' -f 2 \
| paste -d ' ' - - \
| awk '{print $2 "/" tolower($1)}' \
| xargs du -sh \
2> /dev/null \
|sort -h
pip3 list --format freeze|awk -F = {'print $1'}| xargs pip3 show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null|sort -h
– Munafo
There is a simple Pythonic way to find it out though.
Here is the code. Let's call this file pipsize.py.
import os
import pkg_resources

def calc_container(path):
    # Sum the sizes of all files below path
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size

dists = [d for d in pkg_resources.working_set]

for dist in dists:
    try:
        # Assumes the package lives in <location>/<project name>
        path = os.path.join(dist.location, dist.project_name)
        size = calc_container(path)
        if size/1000 > 1.0:
            print(f"{dist}: {size/1000} KB")
            print("-"*40)
    except OSError:
        print('{} no longer exists'.format(dist.project_name))
When run with python pipsize.py this will print out something like:
pip 21.1.2: 8651.906 KB
----------------------------------------
numpy 1.20.3: 25892.871 KB
----------------------------------------
numexpr 2.7.3: 1627.361 KB
----------------------------------------
zict 2.0.0: 48.54 KB
----------------------------------------
yarl 1.6.3: 1395.888 KB
----------------------------------------
widgetsnbextension 3.5.1: 4609.962 KB
----------------------------------------
webencodings 0.5.1: 54.768 KB
----------------------------------------
wcwidth 0.2.5: 452.214 KB
----------------------------------------
uvicorn 0.14.0: 257.515 KB
----------------------------------------
tzlocal 2.1: 67.11 KB
----------------------------------------
traitlets 5.0.5: 800.71 KB
----------------------------------------
tqdm 4.61.0: 289.412 KB
----------------------------------------
tornado 6.1: 2898.264 KB
None of the above solutions list packages with dashes in their names, because pip converts the dashes to underscores in the folder names:
pip list --format freeze | awk -F = {'print $1'} | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{gsub("-","_",$1); print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null | sort -h
And for Mac users:
pip3 list --format freeze | awk -F = {'print $1'} | xargs pip3 show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | awk '{gsub("-","_",$1); print $2 "/" tolower($1)}' | xargs du -sh 2> /dev/null | sort -h
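The name conversions applied by these pipelines (lower-casing and dash-to-underscore) can be written out as a small Python helper. This is a sketch with hypothetical function names; treating dots like dashes is an extra assumption that the commands above do not make. It still cannot find packages whose folder name is unrelated to the project name, such as beautifulsoup4 being installed as bs4.

```python
import os
import re


def candidate_dir_names(project_name):
    """Return directory names worth trying for a distribution name, in order."""
    lowered = project_name.lower()
    candidates = [
        project_name,                      # name exactly as pip reports it
        lowered,                           # what tolower($1) produces
        lowered.replace("-", "_"),         # what gsub("-", "_", $1) adds
        re.sub(r"[-.]+", "_", lowered),    # also treat dots like dashes (assumption)
    ]
    # Drop duplicates while keeping the order.
    return list(dict.fromkeys(candidates))


def find_package_dir(location, project_name):
    """Return the first candidate directory that exists under location, or None."""
    for name in candidate_dir_names(project_name):
        path = os.path.join(location, name)
        if os.path.isdir(path):
            return path
    return None


if __name__ == "__main__":
    for example in ("Django", "django-q"):
        print(example, "->", candidate_dir_names(example))
```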
Here's how:
pip3 show numpy | grep "Location:"
- this will return path/to/all/packages
du -h path/to/all/packages
- the last line will contain the total size of all packages, in human-readable units
Note: You may put any package name in place of numpy
How
$ du -h -d 1 "$(pip -V | cut -d ' ' -f 4 | sed 's/pip//g')" | grep -vE "dist-info|_distutils_hack|__pycache__" | sort -h
Pros
No need to convert these:
case (Django:django)
hyphen (django-q:django_q)
naming (djangorestframework-gis:rest_framework_gis)
Cons
Dependencies and some unknown directories revealed as well...
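A rough Python counterpart of this directory-level approach is sketched below. It assumes the site-packages directory of the running interpreter (sysconfig's "purelib" path) rather than the one reported by pip -V, also skips .egg-info entries (an addition not in the command above), and, unlike du -d 1, lists top-level single-file modules too. The names tree_size and top_level_sizes are hypothetical.

```python
import os
import sysconfig

SKIP_SUFFIXES = (".dist-info", ".egg-info")
SKIP_NAMES = {"__pycache__", "_distutils_hack"}


def tree_size(path):
    """Apparent size in bytes of a file, or of all regular files below a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            file_path = os.path.join(dirpath, filename)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def top_level_sizes(site_packages):
    """Return (size, name) for each top-level entry, smallest first like `sort -h`."""
    sizes = []
    for entry in os.listdir(site_packages):
        if entry in SKIP_NAMES or entry.endswith(SKIP_SUFFIXES):
            continue
        sizes.append((tree_size(os.path.join(site_packages, entry)), entry))
    return sorted(sizes)


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the running interpreter.
    site_packages = sysconfig.get_paths()["purelib"]
    if os.path.isdir(site_packages):
        for size, name in top_level_sizes(site_packages):
            print(f"{size / 1024:>12.1f} KiB  {name}")
```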
History :
There is no command or application developed for this purpose at the moment, so we need to check manually.
Manual Method I :
du /usr/lib/python3.5/ --max-depth=2 | sort -h
du /usr/lib64/python3.5/ --max-depth=2 | sort -h
This does not include packages/files installed outside those directories; that said, these 2 simple commands cover about 95% of them.
Also, if you have another version of Python installed, you need to adapt the directory.
Manual Method II :
pip list | sed '/Package/d' | sed '/----/d' | sed -r 's/\S+//2' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - - | while read -r name loc; do find "$loc" -maxdepth 1 -iname "$name"; done | xargs du -sh | sort -h
Searches the install directory for the package name, case-insensitively (the find call has to run in the shell, since awk cannot execute it inside $(...))
Manual Method II Alternative I :
pip list | sed '/Package/d' | sed '/----/d' | sed -r 's/\S+//2' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - -| awk '{print $2 "/" tolower($1)}' | xargs du -sh | sort -h
Searches the install directory using the lower-cased package name
Manual Method II Alternative II :
pip list | sed '/Package/d' | sed '/----/d' | sed -r 's/\S+//2' | xargs pip show | grep -E 'Location:|Name:' | cut -d ' ' -f 2 | paste -d ' ' - -| awk '{print $2 "/" $1}' | xargs du -sh | sort -h
Searches the install directory using the package name as-is
Note :
For methods using du, output lines starting with du: cannot access need to be checked manually;
The commands take the install directory and append the package name to it, but sometimes the package name and the directory name are different...
Make it simple :
- Use the first method, then
- Use the second method and manually check only the packages outside the classic Python directories
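Where the package name and the directory name differ, the installed metadata can help with the manual check. The sketch below assumes Python 3.10 or newer, where importlib.metadata.packages_distributions() maps top-level import names to the distributions that provide them; the helper import_names_by_distribution is a hypothetical name that simply inverts that mapping.

```python
from collections import defaultdict
from importlib import metadata


def import_names_by_distribution(packages_to_dists):
    """Invert {import name: [distribution, ...]} into {distribution: [import names]}."""
    result = defaultdict(list)
    for import_name, dist_names in packages_to_dists.items():
        for dist_name in dist_names:
            result[dist_name].append(import_name)
    return {dist: sorted(names) for dist, names in result.items()}


if __name__ == "__main__":
    # packages_distributions() only exists on Python 3.10 or newer.
    if hasattr(metadata, "packages_distributions"):
        mapping = import_names_by_distribution(metadata.packages_distributions())
        for dist_name in sorted(mapping, key=str.lower):
            print(f"{dist_name}: {', '.join(mapping[dist_name])}")
```

Each printed line gives the folder (or module) names to look for under the install directory for one installed distribution.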
On Mac, I navigate to the site-packages folder and do
du -h -d 1 | sort -rh | grep -v "dist-info"
On Linux you may need --max-depth 1 instead of -d 1 (older versions of du), but I think that should work.
You can also just run step 1 by itself for all the currently installed packages: python tool-size.py will total them all up for you.
If you want to know the exact size of a particular pip package including all its dependencies, I've created a little bash and Python combo to achieve this
(based on the excellent package-walking code in the answer above: https://mcmap.net/q/183476/-how-to-see-sizes-of-installed-pip-packages)
Steps :
- create a python script to check all currently installed pip packages
- create a shell script to create a brand new python environment and install package to test, and run the script from step 1
- run shell script
- profit :)
Step 1
create a python script called tool-size.py
#!/usr/bin/env python
import os
import pkg_resources

def calc_container(path):
    # Sum the sizes of all files below path
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size

def calc_installed_sizes():
    dists = [d for d in pkg_resources.working_set]
    total_size = 0
    print("Size of Dependencies")
    print("-"*40)
    for dist in dists:
        # ignore pre-installed pip and setuptools
        if dist.project_name in ["pip", "setuptools"]:
            continue
        try:
            path = os.path.join(dist.location, dist.project_name)
            size = calc_container(path)
            total_size += size
            if size/1000 > 1.0:
                print(f"{dist}: {size/1000} KB")
                print("-"*40)
        except OSError:
            print('{} no longer exists'.format(dist.project_name))
    print(f"Total Size (including dependencies): {total_size/1000} KB")

if __name__ == "__main__":
    calc_installed_sizes()
Step 2
create a bash script called tool-size.sh
#!/usr/bin/env bash
# uncomment to debug
# set -x
rm -rf ~/.virtualenvs/tool-size-tester
python -m venv ~/.virtualenvs/tool-size-tester
# Scripts/activate is the Windows layout; on Linux and macOS use bin/activate
source ~/.virtualenvs/tool-size-tester/Scripts/activate
pip install -q "$1"
python tool-size.py
deactivate
Step 3
run script with package you want to get the size of
tool-size.sh xxx
say for truffleHog3
$ ./tool-size.sh truffleHog3
Size of Dependencies
----------------------------------------
truffleHog3 2.0.6: 56.46 KB
----------------------------------------
smmap 4.0.0: 108.808 KB
----------------------------------------
MarkupSafe 2.0.1: 40.911 KB
----------------------------------------
Jinja2 3.0.1: 917.551 KB
----------------------------------------
gitdb 4.0.7: 320.08 KB
----------------------------------------
Total Size (including dependencies): 1443.81 KB
Starting with Python 3.10 you can get the on-disk sizes of installed Python packages using a script like
import importlib.metadata
for d in importlib.metadata.distributions():
    print(sum(f.locate().stat().st_blocks*512 for f in d.files), d.name)
Or from the command line, on a single line:
python -c 'for d in __import__("importlib.metadata").metadata.distributions(): print(sum(f.locate().stat().st_blocks*512 for f in d.files), d.name)'
It works starting with Python 3.8 if you replace d.name with d.metadata['Name'].
A modified version of Marko Kohtala's answer:
One-liner:
python -c "for d in __import__('importlib.metadata').metadata.distributions(): print('{:>12.3f} KiB {}'.format(sum(0 if not f.locate().is_file() else f.locate().stat().st_size for f in d.files) / 1024, d.name))"
The same, but more readable:
import importlib.metadata
for d in importlib.metadata.distributions():
    d_size = 0
    for f in d.files:
        if f.locate().is_file():
            d_size += f.locate().stat().st_size
    print('{:>12.3f} KiB {}'.format(d_size/1024, d.name))
Example output:
60.752 KiB multipledispatch
318.895 KiB natsort
64329.371 KiB numpy
288.076 KiB packaging
54892.789 KiB pandas
28.006 KiB pandas-flavor
7185.510 KiB pip
77101.011 KiB pyarrow
1088.491 KiB pyjanitor
644.466 KiB python-dateutil
1033.665 KiB pytz
147559.953 KiB scipy
3810.577 KiB setuptools
64.252 KiB six
303.010 KiB tabulate
572.733 KiB tzdata
523.449 KiB wheel
9488.667 KiB xarray
Motivation for this modification:
- uses st_size (size in bytes) instead of st_blocks (size taken on disk), hence works on both Windows and Linux (Python 3.10)
- resilient to missing files (personally, I run into them a lot)
- slightly better formatting
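The difference between the two measures named in the first point can be seen on a single file. This is a small sketch with hypothetical helper names; it relies on st_blocks counting 512-byte units, which is where the factor 512 in the earlier script comes from, and on st_blocks being absent on some platforms such as Windows.

```python
import os
import tempfile


def apparent_size(path):
    """Number of bytes stored in the file, as reported by st_size."""
    return os.stat(path).st_size


def on_disk_size(path):
    """Bytes allocated on disk, or None where st_blocks is unavailable (e.g. Windows)."""
    blocks = getattr(os.stat(path), "st_blocks", None)
    # st_blocks counts 512-byte units.
    return None if blocks is None else blocks * 512


if __name__ == "__main__":
    # Write a small throwaway file and compare the two measures.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"x" * 100)
        path = handle.name
    try:
        print("apparent:", apparent_size(path), "bytes")
        print("on disk: ", on_disk_size(path))
    finally:
        os.remove(path)
```

For many small files the on-disk figure is usually the larger one, because space is allocated in whole blocks.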
Building on @Tirtha's and @AnsonH's answers, here is my version.
It features:
- a line showing the total space,
- a line showing the space taken by all the small libraries,
- table-like formatting that displays everything in decreasing order.
# Run `python pipsize.py` in Terminal to show size of pip packages
# Credits: https://mcmap.net/q/183476/-how-to-see-sizes-of-installed-pip-packages
# Credits: https://gist.github.com/AnsonH/fd634ba4298376f2abd8e00f99b01be8
import os
import pkg_resources

sort_in_descending = True  # Show packages in descending order

def calc_container(path):
    total_size = 0
    for dirpath, _, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size

total_size = 0
max_size = 0
max_dist_length = 0
dists = [d for d in pkg_resources.working_set]
# List of (size, dist) pairs; a dict keyed by size would drop packages of equal size
dists_with_size = []
for dist in dists:
    try:
        max_dist_length = max(max_dist_length, len(str(dist)))
        path = os.path.join(dist.location, dist.project_name)
        size = calc_container(path)
        total_size += size
        max_size = max(max_size, size)
        dists_with_size.append((size, dist))
    except OSError:
        print('{} no longer exists'.format(dist.project_name))

# Sort packages by size
dists_with_size.sort(key=lambda pair: pair[0], reverse=sort_in_descending)

def str_spacer(name: str, max_len: int = max_dist_length) -> str:
    n_spaces = max_len - len(str(name))
    return f"{n_spaces * ' '}"

def human_readable_size(size: int, decimal_places: int = 2, max_unit: str = "PiB"):
    units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB']
    if max_unit not in units:
        raise ValueError(f"specified max unit not in available units. Available units: {units}")
    for unit in units:
        if size < 1024.0 or unit == max_unit:
            break
        size /= 1024.0
    return f"{size:.{decimal_places}f} {unit}"

def table_printer(text: str, size: int):
    print(f"{text} {str_spacer(text)}{human_readable_size(size, max_unit='MiB')}")

# print total statement
table_printer("TOTAL", total_size)
max_size_text = human_readable_size(max_size, max_unit="MiB")
print("=" * (1 + max_dist_length + len(max_size_text)))

# print size for each distro
count_small_libs = 0
small_lib_size = 0
for size, dist in dists_with_size:
    if size/1000000 > 1.0:
        table_printer(dist, size)
    else:
        count_small_libs += 1
        small_lib_size += size

# print remaining size for small distros
small_lib_text = f"{count_small_libs} libs smaller than 1.0 MB"
print()
table_printer(small_lib_text, small_lib_size)
Running the script in python outputs:
TOTAL 1341.58 MiB
==========================================
kaleido 0.2.1 253.34 MiB
torch 1.13.0 232.95 MiB
scipy 1.8.1 93.77 MiB
pyarrow 10.0.0 81.60 MiB
safetensors 0.4.1 1.14 MiB
fsspec 2023.12.2 1.08 MiB
coverage 7.4.0 1.05 MiB
pyod 1.1.2 1.03 MiB
pycparser 2.21 1001.23 KiB
92 libs smaller than 1.0 MB 27.70 MiB
I like @Tirtha's solution. Here's my upgraded version that takes the path to a requirements.txt as an optional argument and only shows the sizes of the packages contained therein.
Useful if you want to know the size of dependencies for a specific project.
import os
import sys
import pkg_resources

# Usage: python3 pipsize.py [requirements.txt]
if len(sys.argv) == 2:
    with open(sys.argv[1], 'r') as file:
        requirements = file.read().splitlines()
else:
    requirements = []

def calc_container(path):
    # Sum the sizes of all files below path
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size

dists = [d for d in pkg_resources.working_set]

for dist in dists:
    # With a requirements file, skip every package not listed in it
    if requirements:
        if dist.project_name not in requirements:
            continue
    try:
        path = os.path.join(dist.location, dist.project_name)
        size = calc_container(path)
        if size/1000 > 1.0:
            print(f"{dist}: {size/1000} KB")
            print("-"*40)
    except OSError:
        print(f"{dist.project_name} no longer exists")
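The script above compares dist.project_name against the raw lines of requirements.txt, so a line such as numpy==1.26.0 would not match. One possible way to reduce each line to a bare name first is sketched below. It is not part of the answer: the function names are hypothetical, only simple requirement lines are handled (no URLs or editable installs), and names are normalized in the way PEP 503 describes.

```python
import re


def normalize(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(lines):
    """Extract normalized project names from simple requirements.txt lines."""
    names = set()
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):   # skip blanks and options such as -r / -e
            continue
        # The name is everything before the first extras, specifier or marker character.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group(0)))
    return names


if __name__ == "__main__":
    sample = ["numpy==1.26.0", "Django>=4.0  # web framework", "python_dateutil"]
    print(sorted(requirement_names(sample)))
```

The filter in the loop would then become a check such as normalize(dist.project_name) not in requirement_names(requirements).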
Here is code to return the total size of installed Python packages in MB, along with the individual package sizes:
import os
import pkg_resources

def calc_container(path):
    # Sum the sizes of all files below path
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    return total_size

dists = [d for d in pkg_resources.working_set]
total_size = 0

for dist in dists:
    try:
        path = os.path.join(dist.location, dist.project_name)
        size = calc_container(path)
        total_size += size
        # Only print packages larger than 1 MB
        if size / (1024*1024) > 1.0:
            print(f"{dist}: {size / (1024*1024):.2f} MB")
            print("-" * 40)
    except OSError:
        print(f"{dist.project_name} no longer exists")

print("Total size of installed packages:", total_size / (1024*1024), "MB")