How to do a shallow clone using GitPython

I am trying to do a shallow/partial clone of a repository using GitPython.

Here is the git CLI command:

$ git clone -v --filter=tree:0 --filter=blob:none --sparse git@gitlab.com:gitlab-org/gitlab-docs.git ./Projects/
Cloning into './Projects'...
remote: Enumerating objects: 4145, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (64/64), done.
remote: Total 4145 (delta 7), reused 64 (delta 7), pack-reused 4074
Receiving objects: 100% (4145/4145), 1.30 MiB | 2.89 MiB/s, done.
Resolving deltas: 100% (424/424), done.
remote: Enumerating objects: 57, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 57 (delta 0), reused 5 (delta 0), pack-reused 43
Receiving objects: 100% (57/57), 10.41 KiB | 5.20 MiB/s, done.
remote: Enumerating objects: 31, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 31 (delta 0), reused 3 (delta 0), pack-reused 19
Receiving objects: 100% (31/31), 182.70 KiB | 2.31 MiB/s, done.
Updating files: 100% (31/31), done.

I'm trying to run the same command via Python using the GitPython package.

Code:

from git import Repo


print('cloning ....')
repo = Repo.clone_from(
    'git@gitlab.com:gitlab-org/gitlab-docs.git',
    './Projects/',
    filter='{tree:0,blob:none}',
    sparse=True
)
print(repo)

Output:

% python test.py 
cloning ....
Traceback (most recent call last):
  File "/partial_clone_fetch/test.py", line 5, in <module>
    repo = Repo.clone_from(
  File "/partial_clone_fetch/.env/lib/python3.9/site-packages/git/repo/base.py", line 1083, in clone_from
    return cls._clone(git, url, to_path, GitCmdObjectDB, progress, multi_options, **kwargs)
  File "/partial_clone_fetch/.env/lib/python3.9/site-packages/git/repo/base.py", line 1021, in _clone
    finalize_process(proc, stderr=stderr)
  File "/partial_clone_fetch/.env/lib/python3.9/site-packages/git/util.py", line 369, in finalize_process
    proc.wait(**kwargs)
  File "/partial_clone_fetch/.env/lib/python3.9/site-packages/git/cmd.py", line 450, in wait
    raise GitCommandError(remove_password_if_present(self.args), status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git clone -v --filter={tree:0,blob:none} --sparse git@gitlab.com:gitlab-org/gitlab-docs.git ./Projects/
  stderr: 'fatal: invalid filter-spec '{tree:0,blob:none}'

I don't know how to pass multiple filters in GitPython, and I couldn't find documentation that covers it.

Mezoff answered 15/6, 2021 at 10:6

Filter specs can be passed as a list. GitPython then emits one --filter option per entry, which matches the CLI command in the question. Here is the solution to the issue:

from git import Repo

print('cloning ....')
repo = Repo.clone_from(
    'git@gitlab.com:gitlab-org/gitlab-docs.git',
    './Projects/',
    filter=['tree:0', 'blob:none'],
    sparse=True
)
print(repo)
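
To see why the list works where the single string did not, here is a rough sketch of how keyword arguments end up as command-line options. The function name kwargs_to_flags is hypothetical and the logic is a simplified approximation for illustration, not GitPython's actual implementation. It reproduces the failing --filter={tree:0,blob:none} from the traceback and the repeated --filter options from the CLI command.

```python
def kwargs_to_flags(**kwargs):
    """Roughly mimic how keyword arguments become long git options.

    Simplified illustration only; not GitPython's actual implementation.
    """
    flags = []
    for name, value in sorted(kwargs.items()):
        option = "--" + name.replace("_", "-")
        if value is True:
            # Boolean switch, e.g. --sparse
            flags.append(option)
        elif value is False or value is None:
            # Option is left out entirely
            continue
        elif isinstance(value, (list, tuple)):
            # One option per entry, e.g. --filter=tree:0 --filter=blob:none
            flags.extend(f"{option}={item}" for item in value)
        else:
            # A single value is passed through verbatim
            flags.append(f"{option}={value}")
    return flags


# The string from the question is forwarded as one (invalid) filter spec.
print(kwargs_to_flags(filter="{tree:0,blob:none}", sparse=True))

# The list yields one --filter option per spec.
print(kwargs_to_flags(filter=["tree:0", "blob:none"], sparse=True))
```

If you prefer to spell the options out yourself, clone_from also takes a multi_options argument (it appears in the traceback in the question), so passing multi_options=['--filter=tree:0', '--filter=blob:none', '--sparse'] should be an equivalent way to get the same command line.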
Mezoff answered 15/6, 2021 at 10:42

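The following worked for me. It makes a shallow clone that fetches only the latest commit, without the rest of the Git history:

from git import Repo
Repo.clone_from(clone_url, destination, depth=1)  # passed to git as --depth=1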
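
Note that depth and filter do different things: depth truncates the commit history (a shallow clone), while the filter specs from the question leave out trees and blobs until they are needed (a partial clone). A minimal sketch that keeps the two apart is below. The helper names clone_options and clone are hypothetical, and GitPython is imported lazily so the option-building part can be tried without it installed.

```python
def clone_options(depth=None, filters=None, sparse=False):
    """Build keyword arguments for Repo.clone_from (hypothetical helper)."""
    options = {}
    if depth is not None:
        # Shallow clone: keep only the most recent `depth` commits
        options["depth"] = depth
    if filters:
        # Partial clone: one --filter option per spec
        options["filter"] = list(filters)
    if sparse:
        # Start with a sparse checkout
        options["sparse"] = True
    return options


def clone(url, destination, **kwargs):
    """Clone `url` into `destination` using the options built above."""
    # Imported here so clone_options stays usable without GitPython
    from git import Repo
    return Repo.clone_from(url, destination, **clone_options(**kwargs))
```

For example, clone(clone_url, destination, depth=1) would correspond to the call above, and clone(clone_url, destination, filters=['tree:0', 'blob:none'], sparse=True) to the partial clone from the question.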
Oxheart answered 6/3, 2023 at 7:43
