A way to keep a shallow git clone just minimally up to date?

My aim is to be able to both build recent versions of, and contribute to, a project that has a long and voluminous history - and to do this without using local storage to duplicate lots of historic branches and history going back a decade and more (which I can always look up in the web UI of the project's central repository anyway, if I ever need to, which I probably won't).

I seemed to get lucky with my first try:

git clone --depth 40 -b master http://github.com/who/what.git/ what

That gave me a tidy local clone of 'what' that only had the 'master' branch, and enough commits to cover the most recent two tagged releases.
I could then do 'git checkout latest-release-tag' and build the latest release. Yippee!

As I sort of expected, I needed to make a patch. Everything went swimmingly: 'git checkout -b my-patch-branch', make my changes, commit, and I was able to push my-patch-branch back to a clone on github so the project could pull it. Easy!
I think I got lucky there because from what I read, e.g., here, I would not have been able to do that before git 1.9.
But the installed version turned out to be 1.9, so I got away with it.

Now the next obvious thing I'd like to do is fetch from the remote and pick up the most recent activity on master (including the upstream merge of my patch, so I won't need that branch any more). I tried 'git fetch --dry-run upstream' and watched in horror as it ticked off endless megabytes of download and then gave me a list of new tags going back to the age of mastodons. I'm glad I said --dry-run!

I was really hoping it would just pick up the dozen or so new commits on 'master' since the HEAD of my clone, and then maybe I'd have a depth 52 clone instead of 40, but that's kind of what I want ... start with a useful amount of recent history before I became involved, then just track and grow from that point, and be able to build, branch, and push patches. It seems so close.

Is there any simple way to make git do what I'm trying to do? Is what I'm trying to do unreasonable?

Edit: a bit more information.
(1) the upstream is actually ahead of me by closer to a hundred commits; my estimate of a dozen was pulled out of the air.
(2) it turns out the original 40 commits that I got with my clone were all single-parent commits. A bunch of the later ones that I'm trying to fetch are merge commits with a second parent in some branch my clone didn't include. Could those be causing git to pull in all their ancient history because the earliest commit in my clone isn't a common ancestor?
Is there a way to tell it I don't want that?

More new information:
(1) it occurred to me that I was using the http protocol earlier, which doesn't actually interact with a git process on the server so it has no opportunity to tailor the download size.
However, when I retried using git-over-ssh, I still got a huge fetch.
Then
(2) manually, like an animal, I clicked through the merge commits shown in github's 'network' display, found the ones involving branches that began before my shallow cut, added their parent-2 SHAs to my .git/shallow file, and then tried 'git fetch' over ssh again. That worked great and downloaded a tiny pack file that could just fast-forward my local master branch. I think this is exactly the operation I'd like git to be able to do automatically, but I haven't found a way to do it. Manually, it's pretty tedious. :)

Presumptuous answered 21/9, 2014 at 0:34 Comment(4)
Did you pass --depth to fetch? – Pusey
No. Should I? What value should I pass? If I originally cloned at depth 40 and there are 100 new commits upstream, do I ask for depth 140? Does that mean I need a script that somehow scans the upstream and counts commits before running fetch? – Presumptuous
I've never used shallow repos, but the docs seem to indicate that is what you want. Not clear how you would know what to pass though. Did you compare the shallow repo size versus a normal clone? It could be you aren't really saving all that much to make it worth it. – Pusey
Did you find a solution? I have a similar challenge and SO is so far just as quiet. – Swung

What value should I pass? If I originally cloned at depth 40 and there are 100 new commits upstream, do I ask for depth 140?

Git 2.11 (Q4 2016) adds "git fetch --deepen=<N>", which increases the depth relative to your current shallow boundary, so if your fetch brings in 100 new commits, you can grow the history to a depth of 140.
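For the "do I ask for depth 140?" part, it may help to see that --depth is counted from the remote tip, so re-fetching with the same small depth after upstream has moved should not require counting commits. The following is only a sketch in a throwaway sandbox: the repository names (upstream, shallow), the add_commits helper and the demo identity are all made up for illustration, and a file:// URL is used so that git goes through its normal transport.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)   # throwaway sandbox; every name below is hypothetical

# Helper: add N empty commits to the repository in the current directory.
add_commits() {
    n=$1
    while [ "$n" -gt 0 ]; do
        git -c user.name=demo -c user.email=demo@example.com \
            commit -q --allow-empty -m "demo commit"
        n=$((n - 1))
    done
}

# A stand-in "upstream" with 10 commits.
git init -q "$work/upstream"
(cd "$work/upstream" && add_commits 10)

# Shallow clone of the 3 most recent commits.
git clone -q --depth 3 "file://$work/upstream" "$work/shallow"

# Upstream moves ahead by 5 commits after the clone was made.
(cd "$work/upstream" && add_commits 5)

# Re-fetch with the same depth: it is measured from the new remote tip.
cd "$work/shallow"
git fetch -q --no-tags --depth=3 origin
tracked=$(git rev-list --count FETCH_HEAD)
echo "commits reachable from the fetched tip: $tracked"
```

Note that when upstream has moved by more than the requested depth, the newly fetched window does not connect to the commits you already had, so the local branch cannot simply be fast-forwarded; it would have to be reset or re-pointed at the fetched tip.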

See commit cccf74e, commit 079aa97, commit 2997178, commit cdc3727, commit 859e5df, commit a45a260, commit 269a7a8, commit 41da711, commit 6d43a0c, commit 994c2aa, commit 508ea88, commit 569e554, commit 3d9ff4d, commit 79891cb, commit 1dd73e2, commit 0d789a5, commit 45a3e52, commit 3f0f662, commit 7fcbd37, commit 6e414e3 (12 Jun 2016) by Nguyễn Thái Ngọc Duy (pclouds).
Helped-by: Duy Nguyen (pclouds), Eric Sunshine (sunshineco), and Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit a460ea4, 10 Oct 2016)

In particular, commit cccf74e:

fetch, upload-pack: --deepen=N extends shallow boundary by N commits

In git fetch, --depth argument is always relative with the latest remote refs.
This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper.
It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth".
Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not.

This patch fixes that.
A new argument --deepen=<N> will add <N> more (*) parent commits to the current history regardless of where remote refs are.

(*) We could even support --deepen=<N> where <N> is negative.
In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result).
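As a rough illustration of what the commit message describes, the sketch below builds a small local repository, makes a shallow clone of it, and then extends the shallow boundary with --deepen. It assumes Git 2.11 or later; the directory names and the demo identity are hypothetical, and the file:// URL is there because a plain local path would ignore --depth.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)   # throwaway sandbox; every name below is hypothetical
cd "$work"

# A stand-in "upstream" with 10 linear commits.
git init -q upstream
cd upstream
for i in 1 2 3 4 5 6 7 8 9 10; do
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "commit $i"
done
cd ..

# Shallow clone holding only the 3 most recent commits.
git clone -q --depth 3 "file://$work/upstream" shallow
cd shallow
before=$(git rev-list --count HEAD)

# Push the shallow boundary back by 2 more commits,
# independent of where the remote refs currently point.
git fetch -q --deepen=2 origin
after=$(git rev-list --count HEAD)
echo "before=$before after=$after"
```

On a linear history like this one, the count should go from 3 to 5; with merges in the history the number of commits gained per level can be larger.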


Before Git 2.27 (Q2 2020), "git pull" shared many options with the underlying "git fetch", but some of them were not documented, and some of those that would make sense to pass down were not passed down.

With Git 2.27 or later, that means you can deepen your repo history and update your current branch all in one command:

git pull --deepen=x

See commit 13ac5ed, commit f05558f (28 Mar 2020) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit 9f471e4, 22 Apr 2020)

pull: pass documented fetch options on

Reported-by: 天几
Signed-off-by: René Scharfe

The fetch options --deepen, --negotiation-tip, --server-option, --shallow-exclude, and --shallow-since are documented for git pull as well, but are not actually accepted by that command.

Pass them on to make the code match its documentation.

Meghannmegiddo answered 12/10, 2016 at 11:13 Comment(0)

Maybe you could try fetch + rename branch.

For example, say you've cloned a repo with --depth 1 and you want to update the master branch to the newest content while keeping it a shallow copy.

# fetch newest master into master-tmp with --depth 1
git fetch --no-tags --depth 1 origin master:master-tmp

# switch to master-tmp
git checkout master-tmp

# force rename current branch (master-tmp) into master branch
git branch -M master
Hart answered 3/12, 2021 at 10:45 Comment(0)
