Why does "git clone" not take a refspec?

It appears that a lot of folks have taken to replacing git clone with the combination git init && git fetch. This seems rather silly, and unfortunately tools like Jenkins won't do that for you. So why does git clone not take a refspec, the same way git fetch does?

Specifically, if you want to run a Gerrit-triggered build task on Jenkins, you need to ensure that the workspace already exists; otherwise Jenkins will fail to check out the revision containing the Gerrit change. This is because Gerrit uses a ref path that is not among those fetched by git clone.

Oppugnant answered 17/12, 2013 at 21:3 Comment(2)
The answer probably depends on why people choose the git init && git fetch combo over git clone. – Nitty
Explained use case in the question. – Oppugnant

Please specify which refname Gerrit uses that is missing after the clone. And would simply running git clone --origin <gerrit-suitable-origin-name> solve the problem?

Now for the long version. Your question is probably two questions combined. Why is it advantageous to git init, git remote add and git fetch, and why isn’t there a way to conveniently filter for refspecs when you initially clone the repository?

refspecs - After clone initialises a repo, its default behaviour is to add a remote section to the repository's .git/config that defines the fetch refspec:

[remote "origin"]
    url = ssh://host/your.git
    fetch = +refs/heads/*:refs/remotes/origin/*
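
To see this default for yourself, here is a minimal sketch that clones a throwaway local repository and prints the refspec Git recorded. The temporary paths and the "demo" identity are made up for illustration; nothing here touches a real remote.

```shell
set -eu

# Scratch area for the demo; removed again when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A stand-in "remote" with a single empty commit (hypothetical demo identity).
git init -q "$work/upstream"
git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# A plain clone, with no extra options.
git clone -q "$work/upstream" "$work/clone"

# Read back the fetch refspec that clone wrote into .git/config.
default_refspec=$(git -C "$work/clone" config --get-all remote.origin.fetch)
echo "$default_refspec"
# prints: +refs/heads/*:refs/remotes/origin/*
```

Note that the pattern only covers refs/heads/*, so refs living elsewhere on the remote are not part of a default clone.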

These are good and sane defaults, and the supplied refspec is intended to fetch every branch from the remote. If you need to change the refspecs you can do it manually by just editing the file. For example,

[remote "origin"]
    url = ssh://host/your.git
    fetch = +refs/heads/atari:refs/remotes/origin/atari
    fetch = +refs/heads/vertigo:refs/remotes/origin/vertigo

After the edit, fetching involves only the atari and vertigo branches from the origin remote; the commonly present master, for example, and all other branches that might exist on the remote are ignored. This is of course similar to supplying the refspecs to git fetch on the command line.
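
If hand-editing the config feels fragile, one way to end up with the same two fetch lines is git remote set-branches. The following is only a rough sketch against a throwaway local repository; the extra branch called "unrelated" and all paths are invented for the demo.

```shell
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in remote with three branches besides the default one.
git init -q "$work/upstream"
up() { git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com "$@"; }
up commit -q --allow-empty -m "initial commit"
up branch atari
up branch vertigo
up branch unrelated   # hypothetical branch we do NOT want to fetch

# Fresh repository whose origin should only track atari and vertigo.
git init -q "$work/narrow"
git -C "$work/narrow" remote add origin "$work/upstream"

# Replaces the default refs/heads/* refspec with one entry per listed branch.
git -C "$work/narrow" remote set-branches origin atari vertigo
git -C "$work/narrow" config --get-all remote.origin.fetch

git -C "$work/narrow" fetch -q origin

# List the remote-tracking refs that the fetch created.
fetched=$(git -C "$work/narrow" for-each-ref --format='%(refname)' refs/remotes/origin)
echo "$fetched"
```

The remote-tracking refs listed at the end should include atari and vertigo but not the unrelated branch.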

Overall it just isn’t necessary, and I reckon it isn’t a clean design, to support multiple initial refspecs upfront on the git clone command line for the sole purpose of placing them in .git/config. You can even script much the same by running git clone and then sed on the .git/config file. It would also be problematic to decide which initial branch to check out when cloning, given the many possible refspecs.

init over clone - Leaving aside the more advanced git clone setups, such as leveraging --reference, shallow clones with --depth, or creating bare repos, the differences between init-add-fetch and clone are quite minor for your everyday flow.

Plain clone just copies an existing repository and sets up “origin” as the remote pointing at the source. This comes with some minor annoyances - the “origin” remote is forced on you, remote-tracking branches are created, an initial branch is set up, and HEAD is checked out. If, however, you start with git init you are slightly more in control. You can manually add remotes and fetch specific branches without checking anything out.
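
For comparison, the init-add-fetch route might look like the sketch below. It assumes a throwaway local repository as the remote; the remote name "gerrit" and the branch name are placeholders, not anything the question specified.

```shell
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in remote with one extra branch.
git init -q "$work/upstream"
up() { git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com "$@"; }
up commit -q --allow-empty -m "initial commit"
up branch vertigo

# Start empty, pick our own remote name, and fetch exactly one branch.
git init -q "$work/manual"
git -C "$work/manual" remote add gerrit "$work/upstream"
git -C "$work/manual" fetch -q gerrit '+refs/heads/vertigo:refs/remotes/gerrit/vertigo'

# No local branch was created and nothing was checked out by the fetch.
local_branches=$(git -C "$work/manual" for-each-ref refs/heads)
remote_refs=$(git -C "$work/manual" for-each-ref --format='%(refname)' refs/remotes)
echo "local branches: ${local_branches:-<none>}"
echo "remote-tracking refs: $remote_refs"
```

The trade-off is visible here: three commands instead of one, but the remote name, the fetched refs and the checkout are all left up to you.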

Note though that many aspects of git clone behaviour can be controlled by command line switches – so maybe the developers who prefer git init just aren’t aware of them? There isn’t enough information in the question to decide. Compared to the alternatives, git clone saves you some typing, staves off the thermal death of the universe and sets up sane upstream defaults - such as having master track origin/master. I vote for clone.

Judgeship answered 19/12, 2013 at 10:18 Comment(6)
I guess this is a fair answer, but as with many things in git, it induces coders to make mistakes. I've now seen several pieces of code which use the following faulty logic: if repo-exists then git fetch else git clone fi. The correct logic needs to be if not repo-exists then git clone fi; git fetch. I think the git design induces this error. Oh, and Gerrit uses refs/changes/for/<branch>/, so it's totally non-standard, and breaks Jenkins's assumptions about cloning, which is what prompted the question. – Oppugnant
Git gives you a suite of options to choose from, and I think this is good and powerful. So I have to disagree that anything in git's ways induces devs to make mistakes. One way to deal with variability and mistakes is instituting a procedure for everyone to follow, but it's not up to git to enforce that on the users. – Judgeship
For the Gerrit part, after clone/init, would mapping fetch = +refs/changes/for/*:refs/remotes/origin/* (as outlined in my answer) help you with Jenkins? It should normalise the names. – Judgeship
The problem is that Jenkins won't run a fetch when it clones. Yes, obviously, I can fork Jenkins and make it do what I want by modifying the code (or hope that someday, someone will actually do it for me). My current workaround is to simply ensure that all the workspaces I expect Jenkins to use are pre-populated with the right clone. – Oppugnant
Supporting hard links during clone is one big advantage (and difference) over an init-fetch workflow. – Clock
A big advantage would be automatically fetching git notes. But that's really a missing git clone feature, not a missing Jenkins feature. – Viniferous
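
The "clone only if missing, then always fetch" logic from the first comment above could be sketched roughly as follows. The function name sync_workspace is hypothetical, and the remote is again a throwaway local repository rather than a real Gerrit server.

```shell
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

git init -q "$work/upstream"
up() { git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com "$@"; }
up commit -q --allow-empty -m "initial commit"
branch=$(up symbolic-ref --short HEAD)

# Hypothetical helper: clone only when the workspace is missing,
# then fetch unconditionally so an existing workspace is brought up to date.
sync_workspace() {
    url=$1
    dir=$2
    if [ ! -d "$dir/.git" ]; then
        git clone -q "$url" "$dir"
    fi
    git -C "$dir" fetch -q origin
}

sync_workspace "$work/upstream" "$work/workspace"   # first run: clone, then fetch

up commit -q --allow-empty -m "second commit"       # upstream moves on

sync_workspace "$work/upstream" "$work/workspace"   # later run: fetch only

upstream_tip=$(up rev-parse HEAD)
workspace_tip=$(git -C "$work/workspace" rev-parse "refs/remotes/origin/$branch")
echo "upstream:  $upstream_tip"
echo "workspace: $workspace_tip"
```

Because the fetch sits outside the if, the second run picks up the new upstream commit even though no clone happens.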

git clone can take a refspec via the generic --config option:

Set a configuration variable in the newly-created repository; this takes effect immediately after the repository is initialized, but before the remote history is fetched or any files checked out. [...] This makes it safe, for example, to add additional fetch refspecs to the origin remote.

Example:

 git clone -c remote.origin.fetch=+refs/changes/*:refs/remotes/origin/changes/* https://gerrit.googlesource.com/git-repo

However, I just realized this does not work as expected for me with git version 2.12.2.windows.2 as the changes ref is not really fetched when cloning. It gets properly added to .git/config, but I need to manually fetch it after the clone via

cd git-repo
git fetch

Edit: This turns out to be a known bug.
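
The clone-then-fetch workaround can be tried without a Gerrit server. The sketch below fakes a change ref in a throwaway local repository; the ref name refs/changes/01/1/1 is only a stand-in. Whether clone itself picks up the extra refspec depends on the Git version, so the explicit fetch is run either way.

```shell
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in remote with one ref outside refs/heads, mimicking a review ref.
git init -q "$work/upstream"
up() { git -C "$work/upstream" -c user.name=demo -c user.email=demo@example.com "$@"; }
up commit -q --allow-empty -m "initial commit"
up update-ref refs/changes/01/1/1 HEAD   # hypothetical change ref

# Clone with an additional fetch refspec supplied through -c.
git clone -q \
    -c 'remote.origin.fetch=+refs/changes/*:refs/remotes/origin/changes/*' \
    "$work/upstream" "$work/clone"

# Both the extra and the default refspec should now be configured.
git -C "$work/clone" config --get-all remote.origin.fetch

# Explicit fetch, in case the clone did not honour the extra refspec.
git -C "$work/clone" fetch -q origin

change_ref=$(git -C "$work/clone" for-each-ref --format='%(refname)' refs/remotes/origin/changes)
echo "$change_ref"
```

After the fetch, the change ref should show up under refs/remotes/origin/changes/ in the clone.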

Laurencelaurene answered 3/5, 2017 at 12:5 Comment(0)
