Make a shallow GIT repository less shallow

I create a shallow clone for a specified tag:

git clone --branch v0.1.3 --depth 1 file:///c/usr/sites/smc .

After this, the cloned repo contains only the tag v0.1.3 and its associated files. As I understand it (correct me if I am wrong), it does not have the history of any changes before or after that tag. Next I would like to update the clone to include v0.1.4. If I use "git fetch --unshallow", I get the full history, which I do not want. Is there a way to expand my clone to include newer history from master (like v0.1.4 and v0.1.5), but not older history (like v0.1.2)? (I see an option called --update-shallow, but I do not understand what it does or whether it is relevant.)

The goal of this is:

1) Make the initial setup of the repository on the remote server fast and small by not cloning the whole repo. (Our repo is mostly binaries: DLLs, EXEs.)

2) Make it possible to upgrade the remote repo to later versions (as given by the tag) but never earlier versions. Such an upgrade will only transfer a fraction of the repository, so it should also be fast.

NOTE: My Git version is 1.9.2.msysgit.0 on Windows 7. This includes the recent enhancements to shallow cloning. We will likely host the main repository on Linux, but the agents to which we deploy run Windows. The intent is to manage checkouts using Puppet Enterprise.

UPDATE: Tried VonC's suggestion.

$ git fetch --update-shallow origin v0.1.4
remote: Counting objects: 6, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 4 (delta 2), reused 0 (delta 0)
Unpacking objects: 100% (4/4), done.
From file:///c/usr/sites/smc
 * tag               v0.1.4     -> FETCH_HEAD

paul.chernoch@USB-XXXXXXXXX /c/usr/sites/smc-clone3 ((v0.1.3))
$ git describe
v0.1.3

paul.chernoch@USB-XXXXXXXXX /c/usr/sites/smc-clone3 ((v0.1.3))
$ git tag --list
v0.1.3

While the command seemed to do something, I do not see the tag v0.1.4 in my target repo. However, if I use the --tags option, I get all the tags, but also all the history! Also, I do not understand the notation "FETCH_HEAD" in the output of the git fetch command.

UPDATE: Further research shows that this SO question is after a similar goal: git shallow clone to specific tag

Archoplasm answered 18/6, 2014 at 20:20 Comment(2)
update-shallow is mentioned in github.com/git/git/commit/… and github.com/git/git/commit/…. Did you try a git fetch --update-shallow origin v0.1.4? – Reld
@VonC: Thanks for the links. My source repo is not shallow while my target is shallow, so based on your information, it looks like --update-shallow is not what I want, but I will try it anyways. – Archoplasm

I had a similar question and only found this one afterwards. The trick is to pass --depth to the fetch command and to end it with the full refspec for the tag: refs/tags/v0.1.4:refs/tags/v0.1.4, or tag v0.1.4 for short.

Git shallow fetch of a new tag

git fetch --depth 1 origin tag v0.1.4
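
To try this without touching a real repository, here is a minimal sketch that builds a throwaway origin with three tags, makes a shallow clone at v0.1.3, and then fetches only v0.1.4 at depth 1. The scratch directory, the directory names `origin` and `clone`, the file `version.txt`, and the demo identity are all assumptions made up for illustration; only the clone and fetch commands come from this thread.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area; nothing here refers to a real project.
work=$(mktemp -d)
cd "$work"

# Build a small "origin" repository with three tagged releases.
git init -q origin
cd origin
git config user.email "demo@example.com"
git config user.name "Demo"
for v in v0.1.2 v0.1.3 v0.1.4; do
    echo "$v" > version.txt
    git add version.txt
    git commit -q -m "release $v"
    git tag "$v"
done
cd ..

# Shallow clone pinned to v0.1.3. The file:// URL makes --depth take effect
# (a plain local path would ignore it).
git clone -q --branch v0.1.3 --depth 1 "file://$work/origin" clone 2>/dev/null
cd clone

# Fetch only the newer tag, again at depth 1. The "tag" keyword creates
# refs/tags/v0.1.4 locally instead of only recording FETCH_HEAD.
git fetch -q --depth 1 origin tag v0.1.4
git checkout -q v0.1.4

tags=$(git tag --list | tr '\n' ' ')
commits=$(git rev-list --count HEAD)
echo "tags: $tags"
echo "commits reachable from HEAD: $commits"
```

This also explains the output in the question: `git fetch origin v0.1.4` without the `tag` keyword (or an explicit destination in the refspec) stores the fetched object only in FETCH_HEAD, so `git tag --list` shows nothing new.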
Trahan answered 29/10, 2014 at 14:16 Comment(0)

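By default, when fetching from a shallow repository, git fetch refuses refs that require updating .git/shallow. The --update-shallow option updates .git/shallow and accepts such refs.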

But that will work better with Git 2.27 (Q2 2020), which fixes an in-core inconsistency after fetching into a shallow repository that broke the code writing out the commit-graph.

That will also influence git push.

See commit 37b9dca (23 Apr 2020), and commit 8a8da49 (24 Apr 2020) by Taylor Blau (ttaylorr).
(Merged by Junio C Hamano -- gitster -- in commit 2b4ff3d, 01 May 2020)

shallow.c: use '{commit,rollback}_shallow_file'

Helped-by: Jonathan Tan
Helped-by: Junio C Hamano
Signed-off-by: Taylor Blau
Reviewed-by: Jonathan Tan

In bd0b42aed3 ("fetch-pack: do not take shallow lock unnecessarily", 2019-01-10, Git v2.21.0-rc0 -- merge listed in batch #4), the author noted that 'is_repository_shallow' produces visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.

This is a problem for e.g., fetching with '--update-shallow' in a shallow repository with 'fetch.writeCommitGraph' enabled, since the update to '.git/shallow' will cause Git to think that the repository isn't shallow when it is, thereby circumventing the commit-graph compatibility check.

This causes problems in shallow repositories with at least shallow refs that have at least one ancestor (since the client won't have those objects, and therefore can't take the reachability closure over commits when writing a commit-graph).

Address this by introducing thin wrappers over 'commit_lock_file' and 'rollback_lock_file' for use specifically when the lock is held over '.git/shallow'.
These wrappers (appropriately called 'commit_shallow_file' and 'rollback_shallow_file') call into their respective functions in 'lockfile.h', but additionally reset validity checks used by the shallow machinery.

Replace each instance of 'commit_lock_file' and 'rollback_lock_file' with 'commit_shallow_file' and 'rollback_shallow_file' when the lock being held is over the '.git/shallow' file.

As a result, 'prune_shallow' can now only be called once (since 'check_shallow_file_for_update' will die after calling 'reset_repository_shallow'). But, this is OK since we only call 'prune_shallow' at most once per process.


Warning: before Git 2.28 (Q3 2020), "fetch.writeCommitGraph" was enabled whenever "feature.experimental" was set, but it was found to be a bit too risky, even for bold folks, in its current shape.

The configuration has been ejected, at least for now, from the "experimental" feature set.

See commit b5651a2 (06 Jul 2020) by Jonathan Nieder (jrn).
(Merged by Junio C Hamano -- gitster -- in commit 9850823, 09 Jul 2020)

experimental: default to fetch.writeCommitGraph=false

Reported-by: Jay Conrod
Helped-by: Taylor Blau
Signed-off-by: Jonathan Nieder

The fetch.writeCommitGraph feature makes fetches write out a commit graph file for the newly downloaded pack on fetch.

This improves the performance of various commands that would perform a revision walk and eventually ought to be the default for everyone.

To prepare for that future, it's enabled by default for users that set feature.experimental=true to experience such future defaults.

Alas, for --unshallow fetches from a shallow clone it runs into a snag: by the time Git has fetched the new objects and is writing a commit graph, it has performed a revision walk and r->parsed_objects contains information about the shallow boundary from before the fetch.

The commit graph writing code is careful to avoid writing a commit graph file in shallow repositories, but the new state is not shallow, and the result is that from that point on, commands like "git log" make use of a newly written commit graph file representing a fictional history with the old shallow boundary.

We could fix this by making the commit graph writing code more careful to avoid writing a commit graph that could have used any grafts or shallow state, but it is possible that there are other pieces of mutated state that fetch's commit graph writing code may be relying on.

So disable it in the feature.experimental configuration.

Google developers have been running in this configuration (by setting fetch.writeCommitGraph=false in the system config) to work around this bug since it was discovered in April.

Once the fix lands, we'll enable fetch.writeCommitGraph=true again to give it some early testing before rolling out to a wider audience.

In other words:

  • this patch only affects behavior with feature.experimental=true
  • it makes feature.experimental match the configuration Google has been using for the last few months, meaning it would leave users in a better tested state than without it
  • this should improve testing for other features guarded by feature.experimental, by making feature.experimental safer to use

Plus, still with Git 2.28 (Q3 2020), when "fetch.writeCommitGraph" configuration is set in a shallow repository and a fetch moves the shallow boundary, we wrote out broken commit-graph files that do not match the reality, which has been corrected.

See commit ce16364 (08 Jul 2020) by Taylor Blau (ttaylorr).
(Merged by Junio C Hamano -- gitster -- in commit 24ecfdf, 09 Jul 2020)

commit.c: don't persist substituted parents when unshallowing

Helped-by: Derrick Stolee
Helped-by: Jonathan Nieder
Reported-by: Jay Conrod
Reviewed-by: Jonathan Nieder
Signed-off-by: Taylor Blau

Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file', 2020-04-22), Git knows how to reset stat-validity checks for the $GIT_DIR/shallow file, allowing it to change between a shallow and non-shallow state in the same process (e.g., in the case of 'git fetch --unshallow').

However, when $GIT_DIR/shallow changes, Git does not alter or remove any grafts (nor substituted parents) in memory.

This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph set to true. Ordinarily in a shallow repository (and before 37b9dcabfc, even in this case), commit_graph_compatible() would return false, indicating that the repository should not be used to write a commit-graphs (since commit-graph files cannot represent a shallow history). But since 37b9dcabfc, in an --unshallow operation that check succeeds.

Thus even though the repository isn't shallow any longer (that is, we have all of the objects), the in-core representation of those objects still has munged parents at the shallow boundaries.

When the commit-graph write proceeds, we use the incorrect parentage, producing wrong results.

There are two ways for a user to work around this: either (1) set 'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after unshallowing.

One way to fix this would be to reset the parsed object pool entirely (flushing the cache and thus preventing subsequent reads from modifying their parents) after unshallowing. That would produce a problem when callers have a now-stale reference to the old pool, and so this patch implements a different approach. Instead, attach a new bit to the pool, 'substituted_parent', which indicates if the repository ever stored a commit which had its parents modified (i.e., the shallow boundary prior to unshallowing).

This bit needs to be sticky because all reads subsequent to modifying a commit's parents are unreliable when unshallowing. Modify the check in 'commit_graph_compatible' to take this bit into account, and correctly avoid generating commit-graphs in this case, thus solving the bug.
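
For Git versions without this fix, the two workarounds named in the commit message above can be sketched as follows. The scratch repository and the demo identity are hypothetical and exist only so the commands have something to act on; in a real clone you would run just the `git config` line, or the two `rm` lines after unshallowing.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch repository, for illustration only.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
echo demo > file.txt
git add file.txt
git commit -q -m "initial commit"

# Workaround 1: stop fetch from writing a commit-graph in this repository.
git config fetch.writeCommitGraph false

# Simulate a commit-graph that has already been written.
git commit-graph write --reachable 2>/dev/null

# Workaround 2: drop the commit-graph, covering both the single-file
# layout and the split "commit-graphs" chain directory.
info_dir=$(git rev-parse --git-path objects/info)
rm -f "$info_dir/commit-graph"
rm -rf "$info_dir/commit-graphs"

setting=$(git config --get fetch.writeCommitGraph)
echo "fetch.writeCommitGraph=$setting"
```

Removing the files is safe in the sense that the commit-graph is only a cache: Git falls back to reading commit objects directly and can regenerate the file later with `git commit-graph write`.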

Reld answered 2/5, 2020 at 19:33 Comment(0)