How can I use Docker Registry HTTP API V2 to obtain a list of all repositories in Docker Hub?

An external organization that I work with has given me access to a private (auth token protected) docker registry, and eventually I would like to be able to query this registry, using docker's HTTP API V2, in order to obtain a list of all the repositories and/or images available in the registry.

But before I do that, I'd first like to get some basic practice with constructing these types of API queries on a public registry such as Docker Hub. So I've gone ahead and registered myself with a username and password on Docker Hub, and also consulted the API V2 documentation, which states that one may request an API version check as:

GET /v2/

or request a list of repositories as:

GET /v2/_catalog

Using curl, together with the username and password that I used in order to register my Docker Hub account, I attempt to construct a GET request at the command line:

stachyra> curl -u stachyra:<my_password> -X GET https://index.docker.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
stachyra> curl -u stachyra:<my_password> -X GET https://index.docker.io/v2/_catalog
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"registry","Class":"","Name":"catalog","Action":"*"}]}]}

where of course, in place of <my_password>, I substituted my actual account password.

The response that I had been expecting from this query was a giant json message, listing thousands of repository names, but instead it appears that the API is rejecting my Docker Hub credentials.

Question 1: Do I even have the correct URL (index.docker.io) for the docker hub registry? (I made this assumption in the first place based upon the status information returned by the command line tool docker info, so I have good reason to think it's correct.)

Question 2: Assuming I have the correct URL for the registry service itself, why does my query return an "UNAUTHORIZED" error code? My account credentials work just fine when I attempt to login via the web at hub.docker.com, so what's the difference between the two cases?

Cincinnati answered 17/5, 2019 at 20:23 Comment(0)
C
21

Do I even have the correct URL

  • "Docker" here refers to a protocol (the Docker Registry HTTP API V2), while "DockerHub" is a product that implements that protocol but is not limited to it. The same API is also implemented by other providers, such as:
    • GitLab (registry.gitlab.com)
    • GitHub CR (ghcr.io)
    • GCP GCR (gcr.io)
    • AWS ECR (public.ecr.aws & <account_id>.dkr.ecr.<region>.amazonaws.com)
    • Azure ACR (<registry_name>.azurecr.io)
  • index.docker.io hosts DockerHub's implementation of the Docker Registry API.
  • hub.docker.com hosts the richer, DockerHub-specific APIs.
  • NOTE: DockerHub implements the generic Docker HTTP API V2, but it doesn't implement the _catalog endpoint from that API.

why does my query return an "UNAUTHORIZED" error code?

To use the Docker V2 API on DockerHub, a JWT auth token has to be obtained from https://auth.docker.io/token for the scope being accessed (a repository plus an action such as pull), and that token has to be sent as a Bearer token in the calls to index.docker.io.

When we hit a DockerHub API without a token, for example https://index.docker.io/v2/library/alpine/tags/list, it returns 401 along with information about the missing pre-flight auth call. That information is in the www-authenticate response header of the failed request.

e.g.: www-authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:library/alpine:pull",error="invalid_token"

This means we need to explicitly call the following API to obtain the auth token.

https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/alpine:pull
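As a rough illustration of how the header maps to that URL, here is a minimal Python sketch. The function names parse_bearer_challenge and token_url are made up for this example, and the regex assumes every parameter is a simple key="value" pair as in the header shown above.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header: str) -> dict:
    """Split a 'Bearer k="v",k2="v2"' challenge into a dict of its parameters."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"not a Bearer challenge: {scheme!r}")
    # Assumption: values are always double-quoted and contain no escaped quotes.
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_url(challenge: dict) -> str:
    """Build the token request URL from the realm, service and scope parameters."""
    query = {k: challenge[k] for k in ("service", "scope") if k in challenge}
    # Keep ':' and '/' unescaped so the URL reads like the one in the answer.
    return challenge["realm"] + "?" + urlencode(query, safe=":/")
```

Feeding it the www-authenticate value shown above should give back the same token URL as the one written out by hand.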

The https://auth.docker.io/token endpoint works without any auth for public repos. To access a private repo, we need to add HTTP basic auth to the request.

https://<username>:<password>@auth.docker.io/token?service=registry.docker.io&scope=repository:<repo>:pull
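Put together, the two-step flow (get a token, then call the registry with it) might look like the following Python sketch. It uses only the standard library. The helper names are invented for this example, and it assumes the token endpoint returns JSON with a "token" field and that tags/list returns JSON with a "tags" field; credentials are sent as a Basic Authorization header rather than embedded in the URL.

```python
import base64
import json
import urllib.request


def http_get_json(url, headers=None):
    """GET a URL and decode the JSON body (raises on HTTP errors such as 401)."""
    req = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def basic_auth_header(username, password):
    """Build an HTTP basic auth header for the token request."""
    raw = f"{username}:{password}".encode()
    return {"Authorization": "Basic " + base64.b64encode(raw).decode()}


def list_tags(repo, username=None, password=None, get_json=http_get_json):
    """Fetch a pull-scoped token for repo, then list its tags with that token."""
    url = ("https://auth.docker.io/token"
           f"?service=registry.docker.io&scope=repository:{repo}:pull")
    # Anonymous token requests are enough for public repos.
    auth = basic_auth_header(username, password) if username else {}
    token = get_json(url, auth)["token"]  # assumed response field
    tags_url = f"https://index.docker.io/v2/{repo}/tags/list"
    return get_json(tags_url, {"Authorization": f"Bearer {token}"})["tags"]
```

For a public image, list_tags("library/alpine") should be enough; for a private repo, pass username and password as well. The get_json parameter only exists so the HTTP layer can be swapped out, for example in tests.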

NOTE: auth.docker.io will generate a token even if the request is not valid (invalid credentials, scope, or anything else). To validate the token, we can parse the JWT (e.g. with jwt.io) and check the access field in the payload; it should contain the requested scope.
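The same check can be done locally instead of pasting the token into a website. The sketch below only decodes the payload and does not verify the signature. The function names are made up for this example, and it assumes the access field is a list of objects with type, name and actions keys.

```python
import base64
import json


def jwt_payload(token: str) -> dict:
    """Decode the payload (middle) segment of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def granted_actions(token: str, repo: str) -> list:
    """Return the actions granted on repo, or [] if the scope was not granted."""
    # Assumed claim layout: {"access": [{"type", "name", "actions"}, ...]}
    for entry in jwt_payload(token).get("access", []):
        if entry.get("type") == "repository" and entry.get("name") == repo:
            return entry.get("actions", [])
    return []
```

If granted_actions(token, "<repo>") comes back empty, the token was issued but does not actually carry the scope that was asked for.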

Congruous answered 4/8, 2021 at 16:11 Comment(1)
By using a GET request with the correct parameters you can get access? How can it be mitigated? - Loquitur
B
11

Here is an example program to read repositories from a registry. I used it as a learning aid with Docker Hub.

#!/bin/bash

set -e

# set username and password
UNAME="username"
UPASS="password"

# get token to be able to talk to Docker Hub
TOKEN=$(curl -s -H "Content-Type: application/json" -X POST -d '{"username": "'${UNAME}'", "password": "'${UPASS}'"}' https://hub.docker.com/v2/users/login/ | jq -r .token)

# get list of repos for that user account
REPO_LIST=$(curl -s -H "Authorization: JWT ${TOKEN}" \
  "https://hub.docker.com/v2/repositories/${UNAME}/?page_size=10000" | jq -r '.results|.[]|.name')

# build a list of all images & tags
for i in ${REPO_LIST}
do
  # get tags for repo
  IMAGE_TAGS=$(curl -s -H "Authorization: JWT ${TOKEN}" \
    "https://hub.docker.com/v2/repositories/${UNAME}/${i}/tags/?page_size=10000" | jq -r '.results|.[]|.name')

  # build a list of images from tags
  for j in ${IMAGE_TAGS}
  do
    # add each tag to list
    FULL_IMAGE_LIST="${FULL_IMAGE_LIST} ${UNAME}/${i}:${j}"
  done
done

# output list of all docker images
for i in ${FULL_IMAGE_LIST}
do
  echo ${i}
done

(This comes from an article on the Docker site that describes how to use the API.)

In essence...

  • get a token
  • pass the token as a header Authorization: JWT <token> with any API calls you make
  • the API call you want to use to list repositories is https://hub.docker.com/v2/repositories/<username>/
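Rather than relying on one very large page_size, the repository listing can also be walked page by page. The Python sketch below follows the "next" link that the Hub API includes in each page of results. The function names and the page_size of 100 are assumptions of this example, not something taken from the Docker article.

```python
import json
import urllib.request


def http_get_json(url, headers=None):
    """GET a URL and decode the JSON body (raises on HTTP errors)."""
    req = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_repositories(namespace, token=None, get_json=http_get_json):
    """Yield repository names for a user or org, following 'next' links."""
    headers = {"Authorization": f"JWT {token}"} if token else {}
    url = f"https://hub.docker.com/v2/repositories/{namespace}/?page_size=100"
    while url:
        page = get_json(url, headers)
        for repo in page["results"]:
            yield repo["name"]
        # 'next' is null (None) on the last page, which ends the loop.
        url = page.get("next")
```

Something like list(iter_repositories("<username>", token)) would then collect every name. The get_json parameter is only there so the HTTP call can be replaced when testing.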
Barone answered 5/3, 2020 at 15:37 Comment(3)
Great explanation, thank you! I noticed that by default it only has access to public repositories (curl -s -H "Authorization: TOKEN" https://hub.docker.com/v2/repositories/USERNAME/ and curl -s -H "Authorization: TOKEN" https://hub.docker.com/v2/repositories/USERNAME/?is_private=true output the same result). Unfortunately, with the ?is_private=true parameter nothing is output: do you know if it's possible? Regards! - Congregational
Sorry, I don't think that program is very robust; it's just an example. It only lists repositories that have tags (which you may not have if you're just testing). I have just tried it and it returns all repos by default. If you add ?is_private=true then you only get private ones, and similarly only public ones if you pass false. Add some prints to the program so it outputs the values of i (repo) and j (tag) and you should see what you expect. - Barone
Thanks for your quick reply! Unfortunately, in my case using this URL without this parameter or with the value false works (my only public repo is output), but when using this parameter with the value true none of my private repos are output; I just see {"count": 0, "next": null, "previous": null, "results": []}. I will keep digging and let you know if I find something. Have a great day! - Congregational
E
3

This site says we cannot :(

Dockerhub hosts a mix of public and private repositories, but does not expose a catalog endpoint to programmatically list them.
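For registries that do expose the catalog endpoint (so not Docker Hub), listing could look roughly like the Python sketch below. It assumes the standard Registry V2 behaviour where GET /v2/_catalog accepts n and last query parameters and returns JSON with a "repositories" list. The function names are invented for this example, and the headers argument is a placeholder for whatever auth scheme the registry in question requires.

```python
import json
import urllib.request
from urllib.parse import urlencode


def http_get_json(url, headers=None):
    """GET a URL and decode the JSON body (raises on HTTP errors)."""
    req = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_catalog(registry, headers=None, page_size=100, get_json=http_get_json):
    """Yield repository names from <registry>/v2/_catalog, page by page."""
    last = None
    while True:
        params = {"n": page_size}
        if last is not None:
            # Resume after the last repository name seen on the previous page.
            params["last"] = last
        page = get_json(f"{registry}/v2/_catalog?{urlencode(params)}", headers)
        repos = page.get("repositories") or []
        if not repos:
            return  # an empty page means the catalog is exhausted
        yield from repos
        last = repos[-1]
```

Usage would be along the lines of list(iter_catalog("https://registry.example.com", {"Authorization": "Bearer <token>"})), where registry.example.com is a hypothetical host. This version stops on the first empty page instead of reading the Link response header, which costs one extra request but keeps the sketch short.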

Ecker answered 22/8, 2019 at 7:27 Comment(3)
What about non-Hub registries? - Arpeggio
The answer here says it's possible in Registry V2: #31251856 - Arpeggio
I was able to do it with proper V2, on non-Hub registries. - Ecker
I
0

I have modified https://mcmap.net/q/601217/-how-can-i-use-docker-registry-http-api-v2-to-obtain-a-list-of-all-repositories-in-docker-hub so I can list the images of any other Docker Hub user or org:

#!/bin/bash

set -e

# User to search for
UNAME=${1}


# Put your own Docker Hub token here.
# You can use the pass command or the 1Password CLI to store the PAT
TOKEN=dckr_pat_XXXXXXXXXXXXXXXXXXXXXXXx


# get list of namespaces accessible by user (not in use right now)
#NAMESPACES=$(curl -s -H "Authorization: JWT ${TOKEN}" https://hub.docker.com/v2/repositories/namespaces/ | jq -r '.namespaces|.[]')

# get list of repos for that user account
REPO_LIST=$(curl -s -H "Authorization: JWT ${TOKEN}" https://hub.docker.com/v2/repositories/${UNAME}/?page_size=10000 | jq -r '.results|.[]|.name')

# build a list of all images & tags
for i in ${REPO_LIST}
do
  # get tags for repo
  IMAGE_TAGS=$(curl -s -H "Authorization: JWT ${TOKEN}" https://hub.docker.com/v2/repositories/${UNAME}/${i}/tags/?page_size=10000 | jq -r '.results|.[]|.name')

  # build a list of images from tags
  for j in ${IMAGE_TAGS}
  do
    # add each tag to list
    FULL_IMAGE_LIST="${FULL_IMAGE_LIST} ${UNAME}/${i}:${j}"
  done
done

# output list of all docker images
for i in ${FULL_IMAGE_LIST}
do
  echo ${i}
done

Sample output:

gitlab/gitlab-ce:latest
gitlab/gitlab-ce:nightly
gitlab/gitlab-ce:15.5.9-ce.0
gitlab/gitlab-ce:15.6.6-ce.0
gitlab/gitlab-ce:rc
gitlab/gitlab-ce:15.7.5-ce.0
gitlab/gitlab-ce:15.7.3-ce.0
gitlab/gitlab-ce:15.5.7-ce.0
gitlab/gitlab-ce:15.6.4-ce.0
gitlab/gitlab-ce:15.7.2-ce.0
gitlab/gitlab-ce:15.7.1-ce.0
gitlab/gitlab-ce:15.7.0-ce.0
gitlab/gitlab-ce:15.6.3-ce.0
gitlab/gitlab-ce:15.5.6-ce.0
gitlab/gitlab-ce:15.6.2-ce.0
gitlab/gitlab-ce:15.4.6-ce.0
gitlab/gitlab-ce:15.5.5-ce.0
.....
Ignatzia answered 21/1, 2023 at 9:20 Comment(1)
I have also modified it to match a requested docker image: gist.github.com/omerfsen/fc2b2b32d4c91dddbaf391aeb385acc9 - Ignatzia
M
0

Here's Python code to do the very same. This can access both your organization's and your own private repos.

Side note: I have another bunch of code that can access manifests, but only on private/public USER repos, not organization-level repos. Does anyone know why that is?

import requests

docker_username = ""
docker_password = ""
docker_organization = ""


auth_url = "https://hub.docker.com/v2/users/login/"
auth_data = {
    "username": docker_username,
    "password": docker_password
}
auth_response = requests.post(auth_url, json=auth_data)
auth_response.raise_for_status()
docker_hub_token = auth_response.json()["token"]

repositories_list = f"https://hub.docker.com/v2/repositories/{docker_username}/?page_size=100"
# repositories_list = f"https://hub.docker.com/v2/repositories/{docker_organization}/?page_size=100"
repos_headers = {
    "Authorization": f"JWT {docker_hub_token}"
}
repos_response = requests.get(repositories_list, headers=repos_headers)
repository_list = repos_response.json()["results"]
for repo in repository_list:
    namespace = repo["namespace"]
    repo_name = repo["name"]
    combined_name = f"{namespace}/{repo_name}"
    print(combined_name)
Marashio answered 18/5, 2023 at 19:6 Comment(1)
I added the import requests because your code relies on that package. See requests.readthedocs.io/en/latest - Winter
