Converting git repository to shallow?
git
E

7

96

How can I convert an already cloned git repository to a shallow repository?

The git repository is downloaded through a script outside of my control so I cannot do a shallow clone.

The reason for doing this is to save disk space. (Yes, I'm really short on disk space so even though a shallow repository doesn't save much, it is needed.)

I already tried

git repack -a -d -f --depth=1

But that actually made the repository larger.

Essa answered 15/1, 2011 at 8:57 Comment(6)
stackoverflow.com/questions/1398919/… could help. What gives a git gc after your repack?Miramirabeau
huitseeker: Thanks for bringing it up. I am aware of the limitations and I am okay with it. I need access to the latest commit, or ideally couple of commits, but that's it.Essa
VonC: I'm doing a gc --aggressive right now. I should gain some from it, but if possible I would also like to drop objects I don't need.Essa
I just came across progit.org/2010/03/17/replace.html which suggests an alternate, potentially simpler, process involving git commit-tree.Ji
The --depth parameter in git repack is unrelated to shallowing: it is the depth in the deltification algorithm: --depth=1 means we want a deltification of 1, which is smaller than the default of 50, so there is less compression.Luxury
i made a git-shallow-maker to copy all local branches to a new local repo. this will copy only the needed commits, so the new repo is shallowGround
I
90

First, you may need to remove tags (as they prevent GC of tagged commits), like:

git tag -d $(git tag -l)

Then, this worked for me:

git pull --depth 1
git gc --prune=all

This still leaves the reflog lying around, which, like the tags, references additional commits that can use up space. Note that I would not erase the reflog unless it is really necessary: it contains the local change history used to recover from mistakes.

There are additional commands on how to erase the reflog in the comments below, and a link to a similar question with a longer answer.

If a lot of space is still in use, make sure you removed the tags; try that before removing the reflog.
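To see the effect without touching a real repository, here is a minimal sketch on a throwaway repository. The scratch directory, the file name file.txt, and the demo identity are all assumptions made up for the illustration; only the pull and gc lines come from the answer.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)

# Build a small "upstream" repository with three commits.
git init -q "$tmp/origin"
for i in 1 2 3; do
  echo "revision $i" > "$tmp/origin/file.txt"
  git -C "$tmp/origin" add file.txt
  git -C "$tmp/origin" commit -q -m "commit $i"
done

# Full clone, standing in for the repository that the script downloaded.
git clone -q "file://$tmp/origin" "$tmp/work"
before=$(git -C "$tmp/work" rev-list --count HEAD)

# The two commands from the answer.
git -C "$tmp/work" pull -q --depth 1
git -C "$tmp/work" gc -q --prune=all

after=$(git -C "$tmp/work" rev-list --count HEAD)
shallow=$(git -C "$tmp/work" rev-parse --is-shallow-repository)
echo "commits before: $before, after: $after, shallow: $shallow"
```

git rev-parse --is-shallow-repository needs a reasonably recent git (2.15 or later); on older versions, checking for the file .git/shallow gives the same information.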

Iredale answered 6/11, 2016 at 18:4 Comment(13)
Hmm for me this gives "fatal: git fetch-pack: expected shallow list" on the pull.Renfrew
@BenFarmer Well that's no good! As shallow support has been slowly developing, this probably only works on recent versions of git. What version do you have?Iredale
hmm, seems to be 2.7.0.rc3. I'll see if a newer one is available in my repos and try that...Renfrew
I've run the above commands. The repo is indeed shallow now (git log shows only one commit and git branch shows just one branch). But the .git folder still occupies 2.5 GB. The same repo cloned with --depth 1 occupies about 1 GB. Any advice how to cut down the disk usage?Huba
@Huba you're right. See the answer I posted, I think it saves space.Ximenes
@Dzmitry, I've updated the commands in the answer to add --prune=all to garbage collection. This immediately deletes the extra objects for me.Iredale
got a couple downvotes - please either edit or comment what could changeIredale
From my experiments git pull --depth=1 keeps non-HEAD tags, which are not removed by git gc --prune=all. I had to use git tag -d $(git tag -l) to properly garbage collect those refs.Gamb
I found the above commands were not enough, and that I had to do this as well: git reflog expire --expire=all --all as recommended in stackoverflow.com/questions/38171899/… Also, the git tag command above is also needed too.Indisputable
having used the reflog a lot, i'm skeptical to recommend clearing it without letting people know that it contains their action history for recovery. the direct link to the other answer is https://mcmap.net/q/13382/-how-to-reduce-the-depth-of-an-existing-git-cloneIredale
Even combining all the commands listed here, pull depth 1, deleting the tags, killing the reflog, and then finally doing the gc, I did not get space savings in my .git directory.Bradytelic
That's too bad. If you're on a unix terminal you can use a command like du -h --max-depth=2 .git to make sure it is object or pack files using the space up, and not something else.Iredale
--prune=all should be --prune=now, see: spinics.net/lists/git/msg354409.htmlTyronetyrosinase
T
18

You can convert a git repo to a shallow one in place along these lines:

git show-ref -s HEAD > .git/shallow
git reflog expire --expire=0
git prune
git prune-packed

Make sure to make a backup, since this is a destructive operation. Also keep in mind that neither cloning nor fetching from a shallow repo is supported! To really remove all the history, you also need to remove all references to previous commits before pruning.
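A minimal sketch of the same idea on a scratch repository, so nothing of value can be lost. Two deviations from the commands above are my own assumptions: git rev-parse HEAD stands in for git show-ref -s HEAD (which can print nothing when no ref is literally named HEAD), and the reflog is expired with --expire=now --all as suggested in the comments. The directory and file names are invented for the demo.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
git init -q "$repo"
for i in 1 2 3; do
  echo "revision $i" > "$repo/file.txt"
  git -C "$repo" add file.txt
  git -C "$repo" commit -q -m "commit $i"
done

# Remember the oldest commit so we can check later whether it was pruned.
old=$(git -C "$repo" rev-parse HEAD~2)

# Mark the current tip as the shallow boundary; its parents are now ignored.
git -C "$repo" rev-parse HEAD > "$repo/.git/shallow"

# Drop reflog entries that still reference the old history, then prune.
git -C "$repo" reflog expire --expire=now --all
git -C "$repo" prune

after=$(git -C "$repo" rev-list --count HEAD)
if git -C "$repo" cat-file -e "$old" 2>/dev/null; then
  old_present=yes
else
  old_present=no
fi
echo "commits visible: $after, oldest commit still on disk: $old_present"
```

In a real repository, other branches, tags, and entries in .git/packed-refs would keep old commits alive, so this only shrinks things once those are gone as well.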

Tremayne answered 29/10, 2011 at 9:2 Comment(9)
Actually this doesn't seem to do anything.Valentijn
hendry: Most likely you have not removed other references pointing to HEAD's history. Try removing all other branches and tags before attempting these steps.Tremayne
For submodules, you might need to resolve the .git file to the git dir (git rev-parse --git-dir). Also, you could use git describe --always HEAD~5 instead of show-ref -s HEAD to keep the latest commits. Then there is also git fetch --unshallow in the meantime to unshallow a clone.Vassar
In order to remove all references, add --all to the reflog command: git reflog expire --expire=now --allIncurious
Note that git prune performs git prune-packed already. Also note that if you want all branches stored, they must all have their tips listed in .git/shallow. This command worked for me, but I don't know if it will work all the time: find .git/refs -type f | xargs cat | sort -u > .git/shallowIredale
git describe --always output gives me a bad shallow line errorIredale
You might need to remove unneeded refs from .git/packed-refs before git prune.Stretch
This needs git gc --prune=all as well.Sochor
This broke my repo with errors like "cannot read " followed by some hash (sorry details not retained by my terminal) , in the end I used the nuclear option, re-checkout the repo and replace the broken .git subdirectory with the new one.Merchantable
L
14

Convert to shallow since a specific date:

git pull --shallow-since=YYYY-mm-dd
git gc --prune=all

Also works:

git fetch --shallow-since=YYYY-mm-dd
git gc --prune=all
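
As an illustration of how the cut-off date behaves, here is a small sketch on a throwaway repository. The commit dates, the cut-off 2020-06-01, and the demo identity are assumptions chosen for the example; --shallow-since needs a git version that supports it (2.11 or later, as far as I know).

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/origin"

# Three empty commits with explicit dates: one before the cut-off, two after.
for d in 2019-01-01 2021-01-01 2022-01-01; do
  GIT_AUTHOR_DATE="${d}T12:00:00" GIT_COMMITTER_DATE="${d}T12:00:00" \
    git -C "$tmp/origin" commit -q --allow-empty -m "commit on $d"
done

# Full clone first, then cut history at the chosen date.
git clone -q "file://$tmp/origin" "$tmp/work"
before=$(git -C "$tmp/work" rev-list --count HEAD)

git -C "$tmp/work" fetch -q --shallow-since=2020-06-01
git -C "$tmp/work" gc -q --prune=all

after=$(git -C "$tmp/work" rev-list --count HEAD)
echo "commits before: $before, after: $after"
```

The cut-off is compared against the committer date, so rebased or cherry-picked commits are judged by when they were last committed, not when they were authored.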
Lowe answered 26/3, 2019 at 8:5 Comment(2)
Thanks! For me git fetch --depth 1; git gc --aggressive --prune=all worked as well. Doing so seemed to be equivalent of doing a shallow clone with: git clone --depth 1Pearce
I get "You are not currently on a branch. Please specify which branch you want to rebase against." Trippet
X
11

Create shallow clone of a local repo:

git clone --depth 1 file:///full/path/to/original/dir destination

Note that the source "address" must be a file:// URL; that's important, because with a plain local path git ignores --depth and makes a full copy. Also, git will treat that file:// address as the remote ("origin"), so you'll need to update the new repository to point at the correct remote.
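
A short sketch of the difference between the two address forms, followed by repointing the remote. The scratch paths and the URL https://example.com/project.git are placeholders I made up; substitute the real upstream URL.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/origin"
for i in 1 2 3; do
  git -C "$tmp/origin" commit -q --allow-empty -m "commit $i"
done

# Plain path: git warns that --depth is ignored and copies the full history.
git clone -q --depth 1 "$tmp/origin" "$tmp/plain"
plain_count=$(git -C "$tmp/plain" rev-list --count HEAD)

# file:// URL: git uses the normal transport, so --depth is honoured.
git clone -q --depth 1 "file://$tmp/origin" "$tmp/shallow"
url_count=$(git -C "$tmp/shallow" rev-list --count HEAD)

# Point "origin" back at the real upstream (placeholder URL, an assumption).
git -C "$tmp/shallow" remote set-url origin "https://example.com/project.git"
origin_url=$(git -C "$tmp/shallow" remote get-url origin)

echo "plain path: $plain_count commits, file:// URL: $url_count commit(s)"
echo "origin now points at: $origin_url"
```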

Ximenes answered 24/9, 2017 at 11:31 Comment(2)
This did the trick for me. In our CI setup, I wanted to clone out the full repo in order to apply patches from another branch, and then shrink the directory as much as possible since it would be TAR'ed and stored.Siltstone
@Siltstone If you want to shrink the directory as much as possible, without needing any commit history in a CI scenario, you could just nuke the whole .git directory, right?Announcement
T
9

Combining the answer from @fuzzyTew with the comments on that answer:

git pull --depth 1
git tag -d $(git tag -l)
git reflog expire --expire=all --all
git gc --prune=all

Want to save space by running this across your entire disk? - Then run this fd command:

fd -HIFt d '.git' -x bash -c 'pushd "$0" && ( git pull --depth 1; git tag -d $(git tag -l); git reflog expire --expire=all --all; git gc --prune=all ) && popd' {//}

Or with just regular find:

find -type d -name '.git' -exec bash -c 'pushd "${0%/*}" && ( git pull --depth 1; git tag -d $(git tag -l); git reflog expire --expire=all --all; git gc --prune=all ) && popd' {} \;
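
Because the loops above delete tags and expire reflogs in every repository they find, it may be worth listing the candidates first. A minimal dry-run sketch; the helper name list_repos and the demo directory layout are my own inventions, not part of the answer.

```shell
set -e

# Print the working-tree root of every repository below the given directory.
# -prune stops find from descending into the .git directories themselves.
list_repos() {
  find "$1" -type d -name .git -prune -print | sort | while IFS= read -r gitdir; do
    printf '%s\n' "${gitdir%/.git}"
  done
}

# Demo layout: two fake repositories and one ordinary directory.
root=$(mktemp -d)
mkdir -p "$root/a/.git" "$root/b/c/.git" "$root/not-a-repo"

found=$(list_repos "$root")
count=$(printf '%s\n' "$found" | wc -l | tr -d ' ')
printf '%s\n' "$found"
echo "repositories found: $count"
```

Running list_repos on your real top-level directory and reviewing the output before the destructive loop gives you a chance to exclude repositories where you still need tags or the reflog.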
Trimly answered 30/6, 2020 at 4:31 Comment(4)
This still does not work for me to save space in my .git with git 2.28.0.Bradytelic
Hey! it works! Just be reminded that the reflog expire is dangerous if you actively working on the repo. But since I just want to browse the latest code, it is perfect!Embonpoint
It is better to use `git pull --depth=1 --no-tags`.Witmer
--prune=all should be --prune=now, see: spinics.net/lists/git/msg354409.htmlTyronetyrosinase
M
3

Note that a shallow repo (like one with git clone --depth 1 as a way to convert an existing repo to a shallow one) can fail on git repack.

See commit 5dcfbf5, commit 2588f6e, commit 328a435 (24 Oct 2018) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit ea100b6, 06 Nov 2018)

repack -ad: prune the list of shallow commits

git repack can drop unreachable commits without further warning, making the corresponding entries in .git/shallow invalid, which causes serious problems when deepening the branches.

One scenario where unreachable commits are dropped by git repack is when a git fetch --prune (or even a git fetch when a ref was force-pushed in the meantime) can make a commit unreachable that was reachable before.

Therefore it is not safe to assume that a git repack -adlf will keep unreachable commits alone (under the assumption that they had not been packed in the first place, which is an assumption at least some of Git's code seems to make).

This is particularly important to keep in mind when looking at the .git/shallow file: if any commits listed in that file become unreachable, it is not a problem, but if they go missing, it is a problem.
One symptom of this problem is that a deepening fetch may now fail with:

fatal: error in object: unshallow <commit-hash>

To avoid this problem, let's prune the shallow list in git repack when the -d option is passed, unless -A is passed, too (which would force the now-unreachable objects to be turned into loose objects instead of being deleted).
Additionally, we also need to take --keep-reachable and --unpack-unreachable=<date> into account.

Note: an alternative solution discussed during the review of this patch was to teach git fetch to simply ignore entries in .git/shallow if the corresponding commits do not exist locally.
A quick test, however, revealed that the .git/shallow file is written during a shallow clone, in which case the commits do not exist, either, but the "shallow" line does need to be sent.
Therefore, this approach would be a lot more finicky than the approach presented by this patch.

Miramirabeau answered 11/11, 2018 at 2:4 Comment(0)
M
0

This may seem like cheating, but check out a shallow copy of the repo and then replace the .git subdirectory of your existing copy of the repo with the .git from the new one.

You will lose any local custom git state like stash or local branches, but I'm guessing you want to purge that stuff too.
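
A rough sketch of that swap on throwaway repositories. It assumes the existing working tree is at the same commit as the remote's tip; the scratch paths and the demo identity are made up for the example.

```shell
set -e
# Throwaway identity so commits work without any global git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/origin"
for i in 1 2 3; do
  echo "revision $i" > "$tmp/origin/file.txt"
  git -C "$tmp/origin" add file.txt
  git -C "$tmp/origin" commit -q -m "commit $i"
done

# The existing full clone whose .git directory we want to shrink.
git clone -q "file://$tmp/origin" "$tmp/work"
before=$(git -C "$tmp/work" rev-list --count HEAD)

# Make a fresh shallow clone from the same remote.
url=$(git -C "$tmp/work" remote get-url origin)
git clone -q --depth 1 "$url" "$tmp/fresh"

# Swap the metadata: the working files in "work" are left untouched.
rm -rf "$tmp/work/.git"
mv "$tmp/fresh/.git" "$tmp/work/.git"

after=$(git -C "$tmp/work" rev-list --count HEAD)
dirty=$(git -C "$tmp/work" status --porcelain)
echo "commits before: $before, after: $after"
echo "uncommitted changes after swap: ${dirty:-none}"
```

If the working tree had been on a different commit or had local edits, git status would report those differences after the swap instead of showing a clean tree.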

Merchantable answered 28/2 at 0:30 Comment(0)
