Can pip (or setuptools, distribute etc...) list the license used by each installed package?
A

12

53

I'm trying to audit a Python project with a large number of dependencies and while I can manually look up each project's homepage/license terms, it seems like most OSS packages should already contain the license name and version in their metadata.

Unfortunately I can't find any options in pip or easy_install to list more than the package name and installed version (via pip freeze).

Does anyone have pointers to a tool to list license metadata for Python packages?

Antin answered 30/9, 2013 at 3:27 Comment(0)
P
30

You can use pkg_resources:

import pkg_resources

def get_pkg_license(pkgname):
    """
    Given a package reference (as from requirements.txt),
    return license listed in package metadata.
    NOTE: This function does no error checking and is for
    demonstration purposes only.
    """
    pkgs = pkg_resources.require(pkgname)
    pkg = pkgs[0]
    for line in pkg.get_metadata_lines('PKG-INFO'):
        # partition() does not raise on lines that lack ': '
        k, _, v = line.partition(': ')
        if k == "License":
            return v
    return None

Example use:

>>> get_pkg_license('mercurial')
'GNU GPLv2+'
>>> get_pkg_license('pytz')
'MIT'
>>> get_pkg_license('django')
'UNKNOWN'
Papillon answered 30/9, 2013 at 3:57 Comment(4)
This works great! It turns out that pkg_resources.working_set is iterable as well, which is helpful for my situation (listing all licenses at once).Antin
You can use the built-in email package to parse the PKG-INFO content, which is a bit more reliable than splitting on the colon. First import email.parser, then set up parser = email.parser.HeaderParser(), and then pkg_info = parser.parsestr('\n'.join(package.get_metadata_lines('PKG-INFO'))), which gives you a Message from which you can get pkg_info['License'].Prophetic
Works great, replacing PKG-INFO with METADATA and catching some errors here and there.V
Swapped pkg_resources.require(pkgname)[0] for pkg_resources.get_distribution(package) to just fetch the package instead of checking for requirements.Cumming
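The email-parser suggestion from the comments can be sketched as follows. This is a minimal illustration, not the commenter's exact code: the function name license_from_metadata_text and the sample metadata are made up for the example, and the function takes the raw PKG-INFO/METADATA text rather than a distribution object so that it runs on its own.

```python
from email.parser import HeaderParser


def license_from_metadata_text(metadata_text):
    """Return the License header from raw PKG-INFO/METADATA text, or None."""
    # Package metadata uses RFC 822-style headers, so the email parser can
    # read it; everything after the first blank line becomes the message body.
    message = HeaderParser().parsestr(metadata_text)
    return message.get("License")


# Hypothetical metadata, for illustration only
sample = (
    "Metadata-Version: 2.1\n"
    "Name: example-package\n"
    "Version: 1.0\n"
    "License: MIT\n"
    "\n"
    "A description that mentions License: GPL in passing.\n"
)
print(license_from_metadata_text(sample))  # prints MIT
```

Unlike a line-by-line split, this ignores "License:" text that happens to appear in the long description, because the description is parsed as the body rather than as headers.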
S
44

Here is a copy-pasteable snippet which will print your installed packages and their licenses.

Requires: prettytable (pip install prettytable)

Code

import pkg_resources
import prettytable

def get_pkg_license(pkg):
    try:
        lines = pkg.get_metadata_lines('METADATA')
    except (KeyError, IOError):
        # No METADATA file available; fall back to PKG-INFO
        lines = pkg.get_metadata_lines('PKG-INFO')

    for line in lines:
        if line.startswith('License:'):
            return line[9:]
    return '(Licence not found)'

def print_packages_and_licenses():
    t = prettytable.PrettyTable(['Package', 'License'])
    for pkg in sorted(pkg_resources.working_set, key=lambda x: str(x).lower()):
        t.add_row((str(pkg), get_pkg_license(pkg)))
    print(t)


if __name__ == "__main__":
    print_packages_and_licenses()

Example Output

+---------------------------+--------------------------------------------------------------+
|          Package          |                           License                            |
+---------------------------+--------------------------------------------------------------+
|       appdirs 1.4.3       |                             MIT                              |
|     argon2-cffi 16.3.0    |                             MIT                              |
|        boto3 1.4.4        |                      Apache License 2.0                      |
|      botocore 1.5.21      |                      Apache License 2.0                      |
|        cffi 1.10.0        |                             MIT                              |
|       colorama 0.3.9      |                             BSD                              |
|      decorator 4.0.11     |                       new BSD License                        |
|        Django 1.11        |                             BSD                              |
|  django-debug-toolbar 1.7 |                             BSD                              |
|    django-environ 0.4.3   |                         MIT License                          |
|   django-storages 1.5.2   |                             BSD                              |
|    django-uuslug 1.1.8    |                             BSD                              |
| djangorestframework 3.6.2 |                             BSD                              |
|      docutils 0.13.1      | public domain, Python, 2-Clause BSD, GPL 3 (see COPYING.txt) |
|     EasyProcess 0.2.3     |                             BSD                              |
|       ipython 6.0.0       |                             BSD                              |
|   ipython-genutils 0.2.0  |                             BSD                              |
|        jedi 0.10.2        |                             MIT                              |
|       jmespath 0.9.1      |                             MIT                              |
|       packaging 16.8      |              BSD or Apache License, Version 2.0              |
|     pickleshare 0.7.4     |                             MIT                              |
|         pip 9.0.1         |                             MIT                              |
|     prettytable 0.7.2     |                        BSD (3 clause)                        |
|   prompt-toolkit 1.0.14   |                           UNKNOWN                            |
|       psycopg2 2.6.2      |                 LGPL with exceptions or ZPL                  |
|       pycparser 2.17      |                             BSD                              |
|       Pygments 2.2.0      |                         BSD License                          |
|      pyparsing 2.2.0      |                         MIT License                          |
|   python-dateutil 2.6.0   |                        Simplified BSD                        |
|    python-slugify 1.2.4   |                             MIT                              |
|        pytz 2017.2        |                             MIT                              |
|   PyVirtualDisplay 0.2.1  |                             BSD                              |
|     s3transfer 0.1.10     |                      Apache License 2.0                      |
|       selenium 3.0.2      |                           UNKNOWN                            |
|     setuptools 35.0.2     |                           UNKNOWN                            |
|    simplegeneric 0.8.1    |                           ZPL 2.1                            |
|         six 1.10.0        |                             MIT                              |
|       sqlparse 0.2.3      |                             BSD                              |
|      traitlets 4.3.2      |                             BSD                              |
|      Unidecode 0.4.20     |                             GPL                              |
|       wcwidth 0.1.7       |                             MIT                              |
|       wheel 0.30.0a0      |                             MIT                              |
|  win-unicode-console 0.5  |                             MIT                              |
+---------------------------+--------------------------------------------------------------+
Spangle answered 20/5, 2017 at 19:51 Comment(3)
The try/except there is important: worked better than the accepted answer. Seems like METADATA and PKG-INFO both must be checked!Irksome
up vote from my side also. Which type of exception is it? I would like to specialize my exceptionLona
I suppose it would be one of (KeyError, IOError), as found in some of the other answers.Spangle
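For the question about which exception to catch, here is a small sketch of the fallback with explicit exception types. It assumes, as the comment above suggests, that a missing metadata file surfaces as KeyError or IOError (an alias of OSError on Python 3). The helper read_metadata_lines and the FakeDist stand-in are hypothetical names used only so the example runs without any installed packages.

```python
def read_metadata_lines(dist, names=("METADATA", "PKG-INFO")):
    """Return the lines of the first metadata file that `dist` can provide."""
    for name in names:
        try:
            # list() forces the read now, in case a provider reads lazily
            return list(dist.get_metadata_lines(name))
        except (KeyError, OSError):  # IOError is an alias of OSError on Python 3
            continue
    return []


class FakeDist:
    """Hypothetical stand-in for a pkg_resources distribution (demo only)."""

    def __init__(self, files):
        self._files = files

    def get_metadata_lines(self, name):
        if name not in self._files:
            raise KeyError(name)
        return iter(self._files[name].splitlines())


egg = FakeDist({"PKG-INFO": "Name: demo\nLicense: MIT"})
print(read_metadata_lines(egg))  # prints ['Name: demo', 'License: MIT']
```

Catching only these two types means unrelated errors (for example a typo inside the loop) still surface instead of being silently swallowed by a bare except.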
D
18

Here is a way to do this with yolk3k (Command-line tool for querying PyPI and Python packages installed on your system.)

pip install yolk3k

yolk -l -f license
#-l lists all installed packages
#-f Show specific metadata fields (In this case, License) 
Duwe answered 3/11, 2014 at 16:15 Comment(1)
Doesn't look like that project is maintained any more. Here's a fork which is being actively developed: pypi.python.org/pypi/yolk3kMethod
S
14

1. Choice

pip-licenses PyPI package.


2. Relevance

This answer was current as of March 2018. It may have become outdated since then.


3. Argumentation

  1. simple installation: pip install pip-licenses,
  2. more features and options than yolk3k,
  3. actively maintained.

4. Demonstration

Example output:

D:\KristinitaPelican>pipenv run pip-licenses --with-system --order=license --format-markdown
| Name                | Version   | License                                                      |
|---------------------|-----------|--------------------------------------------------------------|
| requests            | 2.18.4    | Apache 2.0                                                   |
| actdiag             | 0.5.4     | Apache License 2.0                                           |
| blockdiag           | 1.5.3     | Apache License 2.0                                           |
| nwdiag              | 1.0.4     | Apache License 2.0                                           |
| seqdiag             | 0.9.5     | Apache License 2.0                                           |
| Jinja2              | 2.10      | BSD                                                          |
| MarkupSafe          | 1.0       | BSD                                                          |
| license-info        | 0.8.7     | BSD                                                          |
| pip-review          | 1.0       | BSD                                                          |
| pylicense           | 1         | BSD                                                          |
| PTable              | 0.9.2     | BSD (3 clause)                                               |
| webcolors           | 1.8.1     | BSD 3-Clause                                                 |
| Markdown            | 2.6.11    | BSD License                                                  |
| Pygments            | 2.2.0     | BSD License                                                  |
| yolk3k              | 0.9       | BSD License                                                  |
| packaging           | 17.1      | BSD or Apache License, Version 2.0                           |
| idna                | 2.6       | BSD-like                                                     |
| markdown-newtab     | 0.2.0     | CC0                                                          |
| pyembed             | 1.3.3     | Copyright © 2013 Matt Thomson                                |
| pyembed-markdown    | 1.1.0     | Copyright © 2013 Matt Thomson                                |
| python-dateutil     | 2.7.2     | Dual License                                                 |
| Unidecode           | 1.0.22    | GPL                                                          |
| chardet             | 3.0.4     | LGPL                                                         |
| beautifulsoup4      | 4.6.0     | MIT                                                          |
| funcparserlib       | 0.3.6     | MIT                                                          |
| gevent              | 1.2.2     | MIT                                                          |
| markdown-blockdiag  | 0.7.0     | MIT                                                          |
| pip                 | 9.0.1     | MIT                                                          |
| pkgtools            | 0.7.3     | MIT                                                          |
| pytz                | 2018.3    | MIT                                                          |
| six                 | 1.11.0    | MIT                                                          |
| urllib3             | 1.22      | MIT                                                          |
| wheel               | 0.30.0    | MIT                                                          |
| blinker             | 1.4       | MIT License                                                  |
| greenlet            | 0.4.13    | MIT License                                                  |
| pip-licenses        | 1.7.0     | MIT License                                                  |
| pymdown-extensions  | 4.9.2     | MIT License                                                  |
| pyparsing           | 2.2.0     | MIT License                                                  |
| certifi             | 2018.1.18 | MPL-2.0                                                      |
| markdown-downheader | 1.1.0     | Simplified BSD License                                       |
| Pillow              | 5.0.0     | Standard PIL License                                         |
| feedgenerator       | 1.9       | UNKNOWN                                                      |
| license-lister      | 0.1.1     | UNKNOWN                                                      |
| md-environ          | 0.1.0     | UNKNOWN                                                      |
| mdx-cite            | 1.0       | UNKNOWN                                                      |
| mdx-customspanclass | 1.1.1     | UNKNOWN                                                      |
| pelican             | 3.7.1     | UNKNOWN                                                      |
| setuptools          | 38.5.1    | UNKNOWN                                                      |
| docutils            | 0.14      | public domain, Python, 2-Clause BSD, GPL 3 (see COPYING.txt) |

5. External link

Syllepsis answered 31/3, 2018 at 5:39 Comment(0)
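For an audit it is often handy to group packages by license rather than read a flat table. The sketch below assumes the report is available as JSON, a list of records with "Name" and "License" keys. To my knowledge recent pip-licenses releases can produce this with --format=json, but treat both the flag and the key names as assumptions and check them against your installed version. The function name group_by_license and the sample data are illustrative only.

```python
import json
from collections import defaultdict


def group_by_license(report_json):
    """Group package names by license from a JSON list of package records."""
    groups = defaultdict(list)
    for record in json.loads(report_json):
        # "Name" and "License" are assumed key names; adjust to the real report
        groups[record.get("License", "UNKNOWN")].append(record.get("Name", "?"))
    return {license_name: sorted(names) for license_name, names in groups.items()}


# Hypothetical report, shaped like the table above
sample_report = json.dumps([
    {"Name": "six", "Version": "1.11.0", "License": "MIT"},
    {"Name": "Jinja2", "Version": "2.10", "License": "BSD"},
    {"Name": "pytz", "Version": "2018.3", "License": "MIT"},
])
print(group_by_license(sample_report))
```

Note that license strings are free text, so "MIT" and "MIT License" end up in separate groups unless you normalize them first.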
E
4

According to the output of pip show -v, the license information for a package can appear in two places: the License field and the license classifiers.

Here are some examples:

$ pip show django -v | grep -i license
License: BSD
  License :: OSI Approved :: BSD License

$ pip show setuptools -v | grep -i license
License: UNKNOWN
  License :: OSI Approved :: MIT License

$ pip show python-dateutil -v | grep -i license
License: Dual License
  License :: OSI Approved :: BSD License
  License :: OSI Approved :: Apache Software License

$ pip show ipdb -v | grep -i license
License: BSD

The code below returns a tuple that contains all licenses listed for a package, using pkg_resources from setuptools:

from itertools import chain, compress
from pkg_resources import get_distribution


def filters(line):
    return compress(
        (line[9:], line[39:]),
        (line.startswith('License:'), line.startswith('Classifier: License')),
    )


def get_pkg_license(pkg):
    distribution = get_distribution(pkg)
    try:
        lines = distribution.get_metadata_lines('METADATA')
    except OSError:
        lines = distribution.get_metadata_lines('PKG-INFO')
    return tuple(chain.from_iterable(map(filters, lines)))

Here are the results:

>>> tuple(get_pkg_license(get_distribution('django')))
('BSD', 'BSD License')

>>> tuple(get_pkg_license(get_distribution('setuptools')))
('UNKNOWN', 'MIT License')

>>> tuple(get_pkg_license(get_distribution('python-dateutil')))
('Dual License', 'BSD License', 'Apache Software License')

>>> tuple(get_pkg_license(get_distribution('ipdb')))
('BSD',)

Finally, to get the licenses of all installed packages:

>>> import pkg_resources
>>> {
...     p.project_name: get_pkg_license(p)
...     for p in pkg_resources.working_set
... }
Egeria answered 18/4, 2019 at 21:33 Comment(0)
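The same two sources (the License field and the license classifiers) can also be read with the standard-library importlib.metadata module, which avoids the setuptools dependency. This is a sketch under the assumption of Python 3.8 or newer. The function names and the sample metadata are illustrative, and the sample is parsed with email.message_from_string only so the example produces the same output on any machine.

```python
from email import message_from_string
from importlib.metadata import distributions

CLASSIFIER_PREFIX = "License :: "


def licenses_from_metadata(metadata):
    """Collect the License field and any license classifiers from one package."""
    found = []
    license_field = metadata.get("License")
    if license_field:
        found.append(license_field)
    for classifier in metadata.get_all("Classifier") or []:
        if classifier.startswith(CLASSIFIER_PREFIX):
            # Keep only the last segment, e.g. "BSD License"
            found.append(classifier.split(" :: ")[-1])
    return tuple(found)


def all_licenses():
    """Map each installed distribution name to its license information."""
    result = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs without a readable name
            result[name] = licenses_from_metadata(dist.metadata)
    return result


# Hypothetical metadata mirroring the python-dateutil example above
sample = message_from_string(
    "Name: python-dateutil\n"
    "License: Dual License\n"
    "Classifier: License :: OSI Approved :: BSD License\n"
    "Classifier: License :: OSI Approved :: Apache Software License\n"
    "Classifier: Programming Language :: Python\n"
)
print(licenses_from_metadata(sample))
# prints ('Dual License', 'BSD License', 'Apache Software License')
```

Calling all_licenses() returns a dict for the current environment. Splitting on " :: " rather than slicing at a fixed offset also handles license classifiers that are not under "OSI Approved".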
P
3

A slightly better version for those running Jupyter. It uses only packages that ship with Anaconda by default, so nothing extra needs to be installed.

import pkg_resources
import pandas as pd

def get_pkg_license(pkg):
    try:
        lines = pkg.get_metadata_lines('METADATA')
    except (KeyError, IOError):
        # No METADATA file available; fall back to PKG-INFO
        lines = pkg.get_metadata_lines('PKG-INFO')

    for line in lines:
        if line.startswith('License:'):
            return line[9:]
    return '(Licence not found)'

def packages_and_licenses():
    # Build one row per installed distribution: name, version, license
    table = []
    for pkg in sorted(pkg_resources.working_set, key=lambda x: str(x).lower()):
        table.append([pkg.project_name, pkg.version, get_pkg_license(pkg)])
    return pd.DataFrame(table, columns=['Package', 'Version', 'License'])

# In a notebook, the returned DataFrame is rendered as a table
packages_and_licenses()
Possess answered 13/3, 2019 at 21:6 Comment(0)
H
2

With pip:

pip show django | grep License

If you want to get the PyPI classifier for the license, use the verbose option:

pip show -v django | grep 'License ::'

Heavyarmed answered 8/2, 2017 at 15:27 Comment(0)
P
1

I found several ideas from the answers and comments for this question to be relevant and wrote a short script for generating the license information for the applicable virtualenv:

import pkg_resources
import copy

def get_packages_info():
    KEY_MAP = {
        "Name": 'name',
        "Version": 'version',
        "License": 'license',
    }
    empty_info = {}
    for key, name in KEY_MAP.iteritems():
        empty_info[name] = ""

    packages = pkg_resources.working_set.by_key
    infos = []
    for pkg_name, pkg in packages.iteritems():
        info = copy.deepcopy(empty_info)
        try:
            lines = pkg.get_metadata_lines('METADATA')
        except (KeyError, IOError):
            lines = pkg.get_metadata_lines('PKG-INFO')

        for line in lines:
            try:
                key, value = line.split(': ', 1)
                if KEY_MAP.has_key(key):
                    info[KEY_MAP[key]] = value
            except ValueError:
                pass

        infos += [info]

    return "name,version,license\n%s" % "\n".join(['"%s","%s","%s"' % (info['name'], info['version'], info['license']) for info in sorted(infos, key=(lambda item: item['name'].lower()))])
Pleven answered 19/7, 2016 at 7:47 Comment(0)
K
1

Based on the answer provided by @garromark and tweaked for Python 3, I use this on the command line:

import pkg_resources
import copy

def get_packages_info():
    KEY_MAP = {
        "Name": 'name',
        "Version": 'version',
        "License": 'license',
    }
    empty_info = {}
    for key, name in KEY_MAP.items():
        empty_info[name] = ""

    packages = pkg_resources.working_set.by_key
    infos = []
    for pkg_name, pkg in packages.items():
        info = copy.deepcopy(empty_info)
        try:
            lines = pkg.get_metadata_lines('METADATA')
        except (KeyError, IOError):
            lines = pkg.get_metadata_lines('PKG-INFO')

        for line in lines:
            try:
                key, value = line.split(': ', 1)
                if key in KEY_MAP:
                    info[KEY_MAP[key]] = value
            except ValueError:
                pass

        infos += [info]

    return "name,version,license\n%s" % "\n".join(['"%s","%s","%s"' % (info['name'], info['version'], info['license']) for info in sorted(infos, key=(lambda item: item['name'].lower()))])

print(get_packages_info())
Kenzi answered 26/4, 2017 at 20:40 Comment(0)
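One caveat with building the CSV by string formatting is that a license string containing a double quote produces malformed output. A minimal sketch of the same report using the standard csv module is shown below. The function name packages_to_csv and the sample rows are made up for illustration, and the input is assumed to be the same list of dicts with 'name', 'version' and 'license' keys that the script above builds.

```python
import csv
import io


def packages_to_csv(infos):
    """Render a list of {'name', 'version', 'license'} dicts as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["name", "version", "license"],
        quoting=csv.QUOTE_ALL,  # quote every field, like the original output
        lineterminator="\n",
    )
    writer.writeheader()
    for info in sorted(infos, key=lambda item: item["name"].lower()):
        writer.writerow(info)
    return buffer.getvalue()


# Hypothetical rows, for illustration only
rows = [
    {"name": "pytz", "version": "2017.2", "license": "MIT"},
    {"name": "Demo", "version": "1.0", "license": 'The "Demo" License, v1'},
]
print(packages_to_csv(rows))
```

The csv module doubles embedded quotes, so the second row is written as "The ""Demo"" License, v1" and still parses correctly in a spreadsheet.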
L
0

Another option is to use Brian Dailey's Python Package License Checker.

git clone https://github.com/briandailey/python-packages-license-check.git
cd python-packages-license-check
... activate your chosen virtualenv ...
./check.py
Lambrecht answered 7/9, 2017 at 16:7 Comment(0)
A
0

The accepted answer didn't work for me: many of those libraries raised exceptions.

So I used a little brute force and parsed the output of pip show instead:

import re
import subprocess

def get_pkg_license_use_show(pkgname):
    """
    Given a package reference (as from requirements.txt),
    return license listed in package metadata.
    NOTE: This function does no error checking and is for
    demonstration purposes only.
    """
    # check_output returns bytes on Python 3, so decode before splitting
    out = subprocess.check_output(["pip", 'show', pkgname]).decode()
    pattern = re.compile(r"License: (.*)")
    license_line = [i for i in out.splitlines() if i.startswith('License')]
    match = pattern.match(license_line[0])
    license = match.group(1)
    return license
Adelinaadelind answered 19/9, 2017 at 0:22 Comment(0)
A
0

I found liccheck to be the best option. It shows all licenses in a dependency tree, so you immediately know where each license comes from. It also lets you allow or forbid specific licenses, and this can be configured in pyproject.toml.

pip install liccheck
liccheck
Apostolic answered 2/12, 2022 at 9:6 Comment(0)
