How to download a single file from a Git repository using Python
R

5

7

I want to download a single file from my Git repository using Python.

Currently I am using the GitPython library. Cloning works fine with the code below, but I don't want to download the entire repository.

import os
from git import Repo
git_url = '[email protected]:/home2/git/stack.git'
repo_dir = '/root/gitrepo/'
if __name__ == "__main__":
    Repo.clone_from(git_url, repo_dir, branch='master', bare=True)
    print("OK")
Rotund answered 9/7, 2018 at 6:12 Comment(3)
What kind of file? Which OS? What is the path of the file? - Deach
Go with git archive --remote. - Cutlery
@ShashankSingh: any C or C++ source file, on Windows. Path: master/code/repo/ - Rotund
T
4

Don't think of a Git repo as a collection of files, but a collection of snapshots. Git doesn't allow you to select what files you download, but allows you to select how many snapshots you download:

git clone [email protected]:/home2/git/stack.git

will download all snapshots for all files, while

git clone --depth 1 [email protected]:/home2/git/stack.git

will only download the latest snapshot of all files. You will still download all files, but at least leave out all of their history.

Of these files you can simply select the one you want, and delete the rest:

import os
import git
import shutil
import tempfile

# Create temporary dir
t = tempfile.mkdtemp()
# Clone into temporary dir
git.Repo.clone_from('[email protected]:/home2/git/stack.git', t, branch='master', depth=1)
# Copy desired file from temporary dir
shutil.move(os.path.join(t, 'setup.py'), '.')
# Remove temporary dir
shutil.rmtree(t)
Tamworth answered 9/7, 2018 at 8:3 Comment(5)
It's a collection of snapshots, not changesets. In one sense this does not matter, but in others it does, and since Git lets the implementation show (shine?) through, it matters when using Git. - Oof
OK, I have changed the wording. - Tamworth
Is there any git command for downloading a single file without a script? - Rotund
No. Even the command git archive --remote (which isn't available in gitpython) requires un-tarring the output. - Tamworth
It's true that Git won't let you do this, but GitHub and Bitbucket do. See https://mcmap.net/q/13274/-download-single-files-from-github - Part
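As a rough illustration of the git archive --remote route mentioned in the comments above, the sketch below asks the remote for a tar archive containing one path and unpacks it in memory with the standard tarfile module. The helper names read_file_from_tar and fetch_single_file are made up for this example, and it assumes the server permits git archive --remote at all, which many hosted providers do not.

```python
import io
import subprocess
import tarfile


def read_file_from_tar(tar_bytes: bytes, file_path: str) -> bytes:
    """Return the contents of file_path from an in-memory tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        # extractfile raises KeyError for a missing member and
        # returns None for members that are not regular files
        member = archive.extractfile(file_path)
        if member is None:
            raise FileNotFoundError(f"{file_path} is not a regular file")
        return member.read()


def fetch_single_file(remote: str, ref: str, file_path: str) -> bytes:
    """Hypothetical helper: request a tar of one path at ref from the remote."""
    result = subprocess.run(
        ["git", "archive", f"--remote={remote}", ref, file_path],
        check=True,           # raise CalledProcessError if git fails
        capture_output=True,  # the tar stream arrives on stdout
    )
    return read_file_from_tar(result.stdout, file_path)
```

Calling fetch_single_file with the repository URL, 'master', and the path of the file would return the file's bytes, which can then be written to disk with open(..., 'wb'). Nothing is cloned, but git must be installed and the remote must accept the request.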
A
3

You can also use the subprocess module in Python:

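import subprocess

args = ['git', 'clone', '--depth=1', '[email protected]:/home2/git/stack.git']
# Capture both streams. git writes status messages such as "Cloning into ..."
# to stderr even on success, so check the exit code rather than stderr.
res = subprocess.run(args, capture_output=True, text=True)

if res.returncode == 0:
    print(res.stdout)
else:
    print(res.stderr)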

However, your main problem remains.

Git does not support downloading parts of the repository. You have to download all of it. But you should be able to do this with GitHub. Reference

Amusement answered 9/7, 2018 at 9:10 Comment(0)
F
1

You can use this function to download the content of a single file from a specific branch of a GitHub repository. Apart from the standard library, this code uses only the requests library.

import base64

import requests


def download_single_file(
    repo_owner: str,
    repo_name: str,
    access_token: str,
    file_path: str,
    branch: str = "main",
    destination_path: str = None,
):
    if destination_path is None:
        destination_path = "./" + file_path

    url = f"https://api.github.com/repos/{repo_owner}/{repo_name}/contents/{file_path}?ref={branch}"

    # Set the headers with the media type and the access token
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {access_token}",
    }

    # Send a GET request to the API endpoint
    response = requests.get(url, headers=headers)

    # Check if the request was successful
    if response.status_code == 200:
        # Get the content data from the response
        content_data = response.json()

        # Extract the content and decode it from base64
        content_base64 = content_data.get("content")
        content_bytes = base64.b64decode(content_base64)

        # Save the raw bytes to the local destination path
        # (binary mode avoids corrupting files that are not UTF-8 text)
        with open(destination_path, "wb") as file:
            file.write(content_bytes)

        print("File downloaded successfully.")
    else:
        print(
            f"Request failed with status {response.status_code}. "
            "Check the repository owner, repository name, file path, branch, and access token."
        )
Frenetic answered 17/5, 2023 at 11:20 Comment(0)
I
0

You need to request the raw version of the file! You can get it from raw.githubusercontent.com (formerly raw.github.com).

Induct answered 9/7, 2018 at 6:20 Comment(2)
I guess he never said GitHub. - Deach
My bad then, I thought it was GitHub. - Induct
D
0

I don't want to flag this as a direct duplicate, since it does not fully reflect the scope of this question, but part of what Lucifer said in his answer seems to be the way to go, according to this SO post. In short, Git does not allow for a partial download, but certain providers (like GitHub) do, via raw content.
That being said, Python provides quite a number of libraries for downloading files, the best-known being urllib.request.
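To make the raw-content idea concrete, here is a minimal sketch using only urllib from the standard library. It assumes the repository is public and hosted on GitHub (which the question does not actually state), and the function names raw_github_url and download_raw_file are invented for this example.

```python
import urllib.request
from urllib.parse import quote


def raw_github_url(owner: str, repo: str, branch: str, file_path: str) -> str:
    """Build the raw-content URL for one file in a public GitHub repository."""
    # quote() keeps "/" by default, so nested paths stay intact
    return (
        "https://raw.githubusercontent.com/"
        f"{quote(owner)}/{quote(repo)}/{quote(branch)}/{quote(file_path)}"
    )


def download_raw_file(
    owner: str, repo: str, branch: str, file_path: str, destination: str
) -> None:
    """Download one file and write it to destination as raw bytes."""
    url = raw_github_url(owner, repo, branch, file_path)
    # urlopen raises urllib.error.HTTPError for responses such as 404
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        out.write(response.read())
```

For a private repository this will not work as written, since no credentials are sent. In that case an authenticated API request is the more likely route.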

Dias answered 9/7, 2018 at 6:41 Comment(0)
