How to fetch enough commits to do a merge in a shallow clone
T

3

30

What I'm trying to do: test pull requests from GitHub. I want to locally merge a pull request into master and run some tests on the result. Since the repository is huge, I do a shallow clone.

To be able to do the merge, I fetch more and more commits (git fetch with increasing --depth) until I have the merge base between master and the pull request.

However, it doesn't work every time. It looks like I need not only the merge base, but also every commit in the master..merge_base range. I'm not sure how to fetch those, though.

So, the question is: how do I fetch enough history to do the merge?

Torietorii answered 21/11, 2014 at 11:4 Comment(0)
P
19

If you have the history from when feature branched from master but don't want the full history of master, you can estimate a branching date and use:

git fetch --shallow-since=<date> origin master

It's hard to use any other form of git fetch to do what you want (query the remote for the merge-base) because git fetch fetches refs. There might not be a ref that you're looking for.
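
As a rough sketch of the date-based approach, the fetch can be wrapped so that it also reports whether the guessed date was early enough. The helper name fetch_since and its arguments are made up for illustration, and it assumes the clone tracks both branches under origin (for example, it was cloned with --no-single-branch). If the cutoff is too recent, git merge-base prints nothing and the function returns non-zero, so you can retry with an earlier date.

```shell
#!/bin/sh
# fetch_since DATE BASE_REF HEAD_REF  (hypothetical helper, not a git command)
# Extends the shallow history back to DATE for all tracked branches of origin,
# then prints the merge base of the two refs if one is now reachable.
fetch_since() {
    since=$1
    base_ref=$2
    head_ref=$3
    git fetch -q --shallow-since="$since" origin || return 1
    # Exits non-zero (and prints nothing) if the date was not early enough.
    git merge-base "$base_ref" "$head_ref"
}
```

For example: fetch_since 2019-06-01 origin/master origin/feature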

You can automate the digging using the following script.

# Deepen both branches one commit at a time until they share an ancestor.
while [ -z "$(git merge-base master feature 2>/dev/null)" ]; do
    git fetch -q --deepen=1 origin master feature
done
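
A slightly more defensive variant, as a sketch: the loop above never terminates if the two refs have no common ancestor at all or if a ref name is wrong, so this version caps the number of rounds and deepens in larger steps. The function name deepen_until_merge_base, the default step of 100 and the cap of 50 rounds are arbitrary choices for illustration. It assumes a bare `git fetch origin` updates both refs, which is not the case for a default single-branch shallow clone.

```shell
#!/bin/sh
# deepen_until_merge_base BASE_REF HEAD_REF [STEP] [MAX_ROUNDS]
# Hypothetical helper: deepens the shallow clone until the two refs share a
# merge base, then prints that commit. Gives up after MAX_ROUNDS fetches.
deepen_until_merge_base() {
    base_ref=$1
    head_ref=$2
    step=${3:-100}        # commits to add per fetch (needs Git 2.11+ for --deepen)
    max_rounds=${4:-50}   # safety cap so unrelated histories cannot loop forever
    round=0
    until git merge-base "$base_ref" "$head_ref" >/dev/null 2>&1; do
        if [ "$round" -ge "$max_rounds" ]; then
            echo "no merge base found after $round rounds" >&2
            return 1
        fi
        git fetch -q --deepen="$step" origin || return 1
        round=$((round + 1))
    done
    git merge-base "$base_ref" "$head_ref"
}
```

For example: deepen_until_merge_base origin/master origin/feature 100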
Pent answered 13/5, 2019 at 13:19 Comment(5)
Doing so many fetches, each only one level deeper, seems awfully inefficient because of network and git packing overhead. Maybe use something like --deepen=100 instead. - Summerwood
@Summerwood completely agree. Tune as desired. - Pent
The second command just loops forever when I try it: ``` usage: git merge-base [-a | --all] <commit> <commit>... or: git merge-base [-a | --all] --octopus <commit>... or: git merge-base --independent <commit>... or: git merge-base --is-ancestor <commit> <commit> or: git merge-base --fork-point <ref> [<commit>] -a, --all output all common ancestors --octopus find ancestors for a single n-way merge --independent list revs not reachable from others <more but character limits> ``` - Granville
Note that --deepen requires Git 2.11+ (from 2016). - Squeeze
git fetch --shallow-since doesn't seem to work reliably with merged Git history. It often seems to fetch the wrong branch of history, and then you don't have the commit you requested. - Ambi
P
5

I recently hit this issue and found a way that saved a ton of time: get the merge-base commit from the API rather than downloading enough of the repository to calculate it. See https://docs.github.com/en/rest/commits/commits#compare-two-commits

The API call amounts to:

gh api \
  repos/my_github_user/my_repo/compare/main...my_pr_branch \
  | jq -r '.merge_base_commit.sha'

Or with curl, if you prefer:

curl -s -H "Authorization: token $GITHUB_TOKEN" \
  https://api.github.com/repos/my_github_user/my_repo/compare/main...my_pr_branch \
  | jq -r '.merge_base_commit.sha'

Here's more of what my GitHub workflow / action looks like:

      - name: Fetch merge base SHA from API
        run: |
          # Query the compare endpoint directly; eval is not needed here.
          my_merge_base=$(gh api "repos/github/github/compare/${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}" | jq -r '.merge_base_commit.sha')
          echo "$my_merge_base"
          echo "MY_MERGE_BASE_SHA=$my_merge_base" >> $GITHUB_ENV

      - name: Fetch merge base SHA
        run: |
          echo $MY_MERGE_BASE_SHA
          git fetch \
            --no-tags \
            --prune \
            --progress \
            --no-recurse-submodules \
            --depth=1 \
            origin $MY_MERGE_BASE_SHA

      - name: Checkout merge base SHA
        run: |
          echo $MY_MERGE_BASE_SHA
          git checkout \
            --force \
            $MY_MERGE_BASE_SHA

Basically, once I have the merge base SHA checked out, I can build artifacts and compare them with the artifacts I built on the PR branch.

If you need the commits between the merge base and the PR, you'd have to shallow fetch them too, but at least with the merge base you know your end point.

Note: it is best to use the SHA, not the ref, in the API call that fetches the merge base, because the ref can change what it points to (someone merging main into their PR branch, for example), resulting in a race condition.
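
One possible way to fetch those in-between commits, as a sketch rather than something taken from my workflow: the compare response also carries ahead_by and behind_by counts, and a depth of count + 1 on each side should be enough to cover every commit on that side plus the merge base, since no path from the tip to the merge base can be longer than the number of commits on that side. The helper name fetch_commit_range is made up, and the repository and branch names are the same placeholders as above.

```shell
#!/bin/sh
# fetch_commit_range REF COUNT  (hypothetical helper, not part of git or gh)
# COUNT is how many commits REF has beyond the merge base, for example the
# ahead_by (PR head) or behind_by (base branch) field of the compare response.
fetch_commit_range() {
    ref=$1
    count=$2
    # The tip counts as depth 1, so COUNT commits plus the merge base
    # need a depth of COUNT + 1.
    git fetch -q --no-tags --depth="$((count + 1))" origin "$ref"
}

# Example wiring (assumes gh and jq are installed and authenticated):
#   cmp=$(gh api repos/my_github_user/my_repo/compare/main...my_pr_branch)
#   fetch_commit_range my_pr_branch "$(printf '%s' "$cmp" | jq -r '.ahead_by')"
#   fetch_commit_range main         "$(printf '%s' "$cmp" | jq -r '.behind_by')"
```

The same race-condition caveat applies: if a branch moves between the API call and the fetch, the counts are stale, so passing SHAs instead of branch names is safer where the server allows fetching by SHA.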

Propaedeutic answered 7/12, 2022 at 1:58 Comment(3)
If you are using GitLab, you get the merge base commit SHA in either $CI_MERGE_REQUEST_TARGET_BRANCH_SHA or (if mirroring an external repo such as GitHub) in $CI_EXTERNAL_PULL_REQUEST_TARGET_BRANCH_SHA. You can then fetch this commit, again using depth 1. - Ordinand
"If you need the commits between the merge-base and the PR, you'd have to shallow fetch them too" - you may be able to avoid this using git replace. For example, if you want to use nx affected... to determine changed packages, it is enough to (1) use git replace --graft <headSHA> <baseSHA> to create a new temporary commit with the contents of headSHA but with baseSHA as its parent; (2) use newHeadSHA=$(cat .git/refs/replace/<headSHA>) to get the SHA of the new commit; (3) use nx affected --base=<baseSHA> --head=<newHeadSHA> to get changes. - Ordinand
Also see github.com/orgs/community/discussions/39880 as to why you can't rely on github.event.pull_request.base.sha here. :-/ - Amato
E
3

What you need (I think), in a catch-22 way, is 'git describe --all --first-parent' to tell you the depth of the given commit from the appropriate branch. However, I'm not sure how to get that from GitHub before you do your shallow fetch ;-)

Entasis answered 21/11, 2014 at 12:59 Comment(2)
Even in that case, the depth of a commit is computed using the shortest path from master to the commit, but I (seem to) need all the commits between master and the merge_base, which may be deeper than merge_base. - Torietorii
Ah, yes, I hadn't thought of the full traversal to the merge_base (rather than the shortest path). If you have a merge loop with lots of small commits, that depth could be quite large. Unfortunately the shallow fetch is only limited by a counted depth, rather than any other scheme. And I'm still not sure how to determine the depth anyway! - Entasis
