JSON API for PyPI - how to list packages?
E

7

40

There is a JSON API for PyPI which allows getting data for packages:

http://pypi.python.org/pypi/<package_name>/json
http://pypi.python.org/pypi/<package_name>/<version>/json

However, is it possible to get a list of all PyPI packages (or, for example, recent ones) with a GET call?

Esoteric answered 28/1, 2014 at 23:18 Comment(4)
Is Index of Packages the webpage you are looking for?Kellen
@Kellen No, it's not json. It has the data I need, but has some overhead for getting and parsing it.Esoteric
True, it's not json. I thought you were looking for a list of all packages.Kellen
Any way to search PyPI by a package prefix or fragment (e.g. lxm -> lxml, lxml-wrapper, ...) via the simple / JSON APIs? The XML-RPC API offers a search, but apparently it is being deprecated :(Mcbryde
B
31

The easiest way to do this is to use the simple index at PyPI (https://pypi.org/simple/), which lists all packages with little overhead. You can then request the JSON for each package individually with a GET request to the URLs mentioned in your question.
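
A minimal sketch of that two-step approach, using only the Python standard library. The names SimpleIndexParser, parse_simple_index and json_url are illustrative and not part of any PyPI client. It assumes the simple index is an HTML page with one anchor per package and that https://pypi.org is the host to query.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class SimpleIndexParser(HTMLParser):
    """Collect the text of every <a> element in a simple-index page."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        # Only keep non-empty text that appears inside an anchor.
        if self._in_anchor and data.strip():
            self.names.append(data.strip())


def parse_simple_index(html_text):
    """Return the package names found in the HTML of the simple index."""
    parser = SimpleIndexParser()
    parser.feed(html_text)
    parser.close()
    return parser.names


def json_url(package_name, base="https://pypi.org/pypi"):
    """Build the per-package JSON URL from the question (base is an assumption)."""
    return f"{base}/{quote(package_name)}/json"
```

To use it, download https://pypi.org/simple/ (for example with urllib.request.urlopen), pass the decoded text to parse_simple_index, and then GET json_url(name) for each name of interest. This costs one request per package, so it is worth filtering the list first.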

Brenda answered 25/5, 2014 at 21:40 Comment(1)
Thanks! (Before I was parsing Index of Packages, with its overhead for data transfer.)Esoteric
O
18

I know that you asked for a way to do this from the JSON API, but you can use the XML-RPC API to get this info very easily, without having to parse HTML.

try:
    import xmlrpclib  # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3

client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
# get a list of package names
packages = client.list_packages()
Olivaolivaceous answered 11/6, 2015 at 17:51 Comment(6)
Since 2017-04, the top of that page says: "The XMLRPC interface for PyPI is considered legacy and should not be used."Sargassum
This worked for me - python version 3.6.6 - Date 1/17/2019.Massengale
For package releases you can use - client.package_releasesMassengale
As of a few days ago this mode has been disabled by pypi.orgSurratt
This still seems to be working as of 5/11/2021.Olivaolivaceous
slow. 4x slower than https://pypi.org/simple/Upanchor
B
8

As of PEP 691, you can now grab this through the Simple API if you request a JSON response.

curl --header 'Accept: application/vnd.pypi.simple.v1+json' https://pypi.org/simple/ | jq
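
A rough Python equivalent of the curl call, as a sketch using only the standard library. The function names are illustrative. It assumes the response follows the PEP 691 layout, with a top-level "projects" list whose entries each carry a "name" key.

```python
import json
import urllib.request

SIMPLE_INDEX_URL = "https://pypi.org/simple/"
PEP691_JSON = "application/vnd.pypi.simple.v1+json"


def build_index_request(url=SIMPLE_INDEX_URL):
    """Create a GET request that asks the Simple API for a JSON response."""
    return urllib.request.Request(url, headers={"Accept": PEP691_JSON})


def project_names(payload):
    """Extract the project names from a decoded PEP 691 index document."""
    return [project["name"] for project in payload["projects"]]


def fetch_project_names(url=SIMPLE_INDEX_URL):
    """Download the index and return all project names (needs network access)."""
    with urllib.request.urlopen(build_index_request(url)) as response:
        return project_names(json.load(response))
```

Calling fetch_project_names() downloads the whole index in a single request, so there is no HTML to parse and no per-package round trip.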
Brittle answered 26/10, 2022 at 7:4 Comment(0)
C
4

I tried this answer, but it does not work on Python 3.6.

I found a solution that parses the HTML with the lxml package, which you first have to install with pip:

pip install lxml

Then try the following snippet:

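from lxml import html
import requests

# Download the simple index, an HTML page with one link per package
response = requests.get("https://pypi.org/simple/")
response.raise_for_status()

tree = html.fromstring(response.content)

# The text of each <a> element is a package name
package_list = tree.xpath('//a/text()')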
Carafe answered 19/7, 2018 at 10:27 Comment(3)
I would rather use defusedxml for externally-fetched XML filesBagehot
@AlexanderShishenko How would you do that?Lavinalavine
I would rather use for match in re.finditer(r'"/simple/([^/]+)/"', html) to parse this simple htmlUpanchor
G
4

NOTE: To make tasks like this simple, I've implemented my own Python module. It can be installed using pip:

pip install jk_pypiorgapi

The module is very simple to use. After instantiating an object representing the API interface you can make use of it:

import jk_pypiorgapi

api = jk_pypiorgapi.PyPiOrgAPI()
n = len(api.listAllPackages())
print("Number of packages on pypi.org:", n)

This module also provides capabilities for downloading information about specific packages as provided by pypi.org:

import jk_pypiorgapi
import jk_json

api = jk_pypiorgapi.PyPiOrgAPI()
jData = api.getPackageInfoJSON("jk_pypiorgapi")
jk_json.prettyPrint(jData)

This feature might be helpful as well.

Gifted answered 17/3, 2021 at 10:33 Comment(7)
Thanks for this! I had to install some undeclared dependencies to get it working: pypine, jk_cmdoutputparsinghelper, invoke, jk_version. I took a look at your pretty printer as well. Very nice!Levulose
Thanks for the comment, I will provide an update soon. BTW: PyPine is a new project I'm working on right now: a build and data processing framework in Python that will be open source soon. But jk_pypiorgapi should not have a dependency on that. I'll look into that soon. If you encounter any issues, please file a bug report on GitHub. Thanks!Gifted
@RogerDahl Fixed. You might want to update. However, I'm a bit confused: There should be no requirement for jk_cmdoutputparsinghelper and jk_version as both modules are not used by jk_pypiorgapi. If you want to help please check this again after installing the update and if this dependency still exists please file a bug report on the GitHub repo page. Thank you!Gifted
Just do a fresh venv, install jk_pypiorgapi, and try the snippets. You should get the missing deps that I did. I was actually trying to find a way to query PyPI for dependency information, and later found out that the information is not exposed by the current PyPI API.Levulose
@RogerDahl: It's not as easy as you think. As I am building system tools as well, I've installed many self-written packages at system level. I would need to wipe half of the system to test this. Therefore I just had a look at all packages directly: in this case there are not that many files, so this is quite easy, and I eliminated the dependency on pypine. The other dependencies should not be used anyway, neither by jk_pypiorgapi nor by its dependencies. However, I will set up a special testing machine soon to eliminate any future inconveniences regarding dependencies.Gifted
You should avoid modifying your system Python -- it's very fragile and causes the type of problems you mentioned. Check out pyenv -- it's a wonderful tool that gives you full control over your Python environments.Levulose
using https://pypi.org/simple/ under the hood (source)Upanchor
A
1

This is now possible entirely within requests. The requested content type (the MIME type for JSON) just needs to go into the dictionary of headers. requests can even decode the JSON into a dict for you:

import requests

package_name = "requests"  # example value; use any package name on PyPI

r = requests.get(f'https://pypi.org/pypi/{package_name}/json', headers={'Accept': 'application/json'})

info = r.json()['info']
print(f"requested package name = {package_name}, stored name: {info['name']}, author: {info['author']}, version: {info['version']}, license: {info['license']}")
Aldora answered 29/5, 2023 at 11:16 Comment(0)
L
-3

Here's a Bash one-liner:

curl -sG -H 'Host: pypi.org' -H 'Accept: application/json' https://pypi.org/pypi/numpy/json | awk -F "description\":\"" '{ print $2 }' | cut -d ',' -f 1

# NumPy is a general-purpose array-processing package designed to...
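
Splitting JSON text with awk and cut breaks as soon as the field order or the description text changes. Here is a hedged alternative sketch in Python that works on the decoded JSON instead. The function name describe_package is hypothetical. It assumes the payload has an "info" object with "name", "summary" and "description" keys.

```python
def describe_package(payload):
    """Return the name, one-line summary, and first description line."""
    info = payload["info"]
    # "description" may be missing or empty, so fall back to an empty string.
    description_lines = (info.get("description") or "").strip().splitlines()
    return {
        "name": info["name"],
        "summary": info.get("summary"),
        "first_description_line": description_lines[0] if description_lines else "",
    }
```

The payload is the decoded body of https://pypi.org/pypi/numpy/json, for example json.loads(text) or the result of r.json() when using requests.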
Lavinalavine answered 26/11, 2018 at 19:40 Comment(0)
