git worktrees vs "clone --reference"

What are the pros and cons of using git worktrees versus maintaining multiple clones made with the --reference flag? The main scenario I am considering is a developer who needs to keep several checkouts on disk for old releases (release/1.0, release/2.0, release/3.0), because switching branches in a single git repo and rebuilding would be costly.

Using worktrees, the developer could have a single clone of the repo and create each old release as a worktree of it: cd /opt/main/, then git worktree add /opt/old_release_1 release/1.0. Using reference clones, the developer maintains a main clone somewhere and creates a separate clone for each old release: cd /opt/old_release_1, then git clone --reference /opt/main/.git ssh://[email protected]/myrepo.git.
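As a rough sketch, the two setups could be reproduced like this. The temporary directory, the local "upstream" repository standing in for the real remote, and the directory names are all assumptions made for illustration; they are not the real /opt paths or the real URL.

```shell
#!/bin/sh
# Sketch only: every path here is a throwaway stand-in (assumption).
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Stand-in for the remote repository, with one release branch.
git init -q "$tmp/upstream"
git -C "$tmp/upstream" symbolic-ref HEAD refs/heads/main
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"
git -C "$tmp/upstream" branch release/1.0

# The main clone (plays the role of /opt/main).
git clone -q "file://$tmp/upstream" "$tmp/main"

# Option 1: a worktree of the main clone, on a local release/1.0 branch.
git -C "$tmp/main" worktree add -b release/1.0 \
    "$tmp/old_release_1_wt" origin/release/1.0 >/dev/null 2>&1

# Option 2: a separate clone that borrows objects from the main clone.
git clone -q --reference "$tmp/main/.git" --branch release/1.0 \
    "file://$tmp/upstream" "$tmp/old_release_1_clone"

wt_branch=$(git -C "$tmp/old_release_1_wt" rev-parse --abbrev-ref HEAD)
clone_branch=$(git -C "$tmp/old_release_1_clone" rev-parse --abbrev-ref HEAD)
echo "worktree is on: $wt_branch"
echo "reference clone is on: $clone_branch"
```

Either way, each release ends up with its own directory and its own build output.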

It seems like they can both accomplish the same goal. Are there benefits to one over the other in terms of speed, disk space... other things?

Searle answered 17/1, 2018 at 18:44 Comment(0)

Both approaches have a few issues that matter, but using git worktree is probably going to be your best bet.

  • A clone made with --reference local-path but without --dissociate (let's call it AD, for after-dependency clone) uses objects from local-path. By "objects", I mean literal Git objects (stored loosely and/or in pack files). The other Git repository (the one in local-path) has no idea that AD is using these.

    Let's call the base clone BC. Now, suppose something happens in BC so that an object is no longer needed, such as deleting a branch name or a remote-tracking name. At this point, a git gc run in BC may garbage-collect and delete the object.

    If you now switch to the AD clone and run various Git operations, they may fail due to the removed object. The problem is that the older BC clone has no idea that the newer AD clone depends on it.

    Note that AD has, embedded in it, the path name of BC. If you move BC you must edit the .git/objects/info/alternates file in AD.

  • A work-tree made with git worktree add also uses objects from the original clone. Let's still call the original clone BC, with the added work-trees just called Wb. There are two key differences from the BC/AD setup above, plus one constraint that follows from them:

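    • Each new work-tree Wb literally uses the .git directory from BC: Wb itself contains only a .git file pointing back into BC, and Wb's private state (such as its HEAD and index) is kept under BC's .git/worktrees/ directory.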

    • The BC repository records the path of each Wb, so it knows about each Wb. You won't have the problem of objects disappearing unexpectedly.

    • However, since BC records each Wb and all the branch names actually live inside BC itself, there's a constraint imposed: whatever branch is checked out in BC cannot be checked out in any Wb. Moreover, Wb1 must be "on" (as in git status says on branch ...) a different branch than Wb2, and so on. (You can be in "detached HEAD" mode, i.e., not on any branch at all, in any or all of BC and each Wb.)

    Since BC records each Wb path (and vice versa), if you want to move BC or any Wb, you must adjust the recorded paths. (Newer versions of Git provide git worktree move and git worktree repair for this.)
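
The bookkeeping described in both bullets can be inspected directly. The following is a minimal sketch, assuming throwaway paths under a temporary directory and reusing BC as the clone source for AD just to keep the example short. The names BC, AD and Wb follow the answer; Wb2 and Wb3 are made up for the demonstration.

```shell
#!/bin/sh
# Sketch only: all paths are throwaway stand-ins (assumption).
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# BC: base clone with one commit and a release branch.
git init -q "$tmp/BC"
git -C "$tmp/BC" symbolic-ref HEAD refs/heads/main
git -C "$tmp/BC" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"
git -C "$tmp/BC" branch release/1.0

# AD: borrows objects from BC. Only AD records the dependency.
git clone -q --reference "$tmp/BC" "file://$tmp/BC" "$tmp/AD"
alternates=$(cat "$tmp/AD/.git/objects/info/alternates")
echo "AD borrows objects from: $alternates"

# Wb: a work-tree of BC. The link is recorded on both sides.
git -C "$tmp/BC" worktree add "$tmp/Wb" release/1.0 >/dev/null 2>&1
wb_gitfile=$(cat "$tmp/Wb/.git")
bc_record=$(cat "$tmp/BC/.git/worktrees/Wb/gitdir")
echo "Wb points at BC via: $wb_gitfile"
echo "BC records Wb as:    $bc_record"

# The same branch cannot be checked out in a second work-tree.
if git -C "$tmp/BC" worktree add "$tmp/Wb2" release/1.0 >/dev/null 2>&1; then
    second_add=allowed
else
    second_add=refused
fi
echo "second checkout of release/1.0: $second_add"

# Detached HEAD on the same commit is fine.
git -C "$tmp/BC" worktree add --detach "$tmp/Wb3" release/1.0 >/dev/null 2>&1
wb3_head=$(git -C "$tmp/Wb3" rev-parse --abbrev-ref HEAD)
```

Running git worktree list in BC shows every work-tree it knows about, which is the information a git gc in BC relies on and which a plain --reference clone never gives it.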

Opt answered 17/1, 2018 at 21:54 Comment(4)
I think what you have written in your first bullet point regarding invalidating references in AD and creating problems there applies to --shared, but not to --reference. - Trolly
@Sunday: I've done this in practice (back in 2016 or so) and those problems do actually occur with --reference. Using --dissociate fixes them, at the expense of a lot more disk space being used. - Opt
Yes you are right, it applies to both, sorry. I checked the git clone reference again and the paragraph about --reference also refers to the warning note that is present in the paragraph about --shared. - Trolly
@Sunday: I ended up using --dissociate in the code I left for the company. Not sure they ended up installing it, even though it cut a full build from 2 hours to 15 minutes. :-) - Opt
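
For readers who want to see the --dissociate trade-off from these comments, here is a minimal sketch. It assumes throwaway paths and a local "upstream" repository standing in for the real remote; the name AD_dissociated is made up for the example.

```shell
#!/bin/sh
# Sketch only: all paths are throwaway stand-ins (assumption).
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Stand-in remote and a base clone BC of it.
git init -q "$tmp/upstream"
git -C "$tmp/upstream" symbolic-ref HEAD refs/heads/main
git -C "$tmp/upstream" -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"
git clone -q "file://$tmp/upstream" "$tmp/BC"

# Same clone twice: once still tied to BC, once dissociated from it.
git clone -q --reference "$tmp/BC" "file://$tmp/upstream" "$tmp/AD"
git clone -q --reference "$tmp/BC" --dissociate \
    "file://$tmp/upstream" "$tmp/AD_dissociated"

# The alternates file is what ties a clone to BC's object store.
if [ -f "$tmp/AD/.git/objects/info/alternates" ]; then
    plain=present
else
    plain=absent
fi
if [ -f "$tmp/AD_dissociated/.git/objects/info/alternates" ]; then
    dissociated=present
else
    dissociated=absent
fi
echo "alternates in AD: $plain, in AD_dissociated: $dissociated"

# Simulate BC going away; the dissociated clone has its own objects.
rm -rf "$tmp/BC"
if git -C "$tmp/AD_dissociated" fsck >/dev/null 2>&1; then
    survives=yes
else
    survives=no
fi
echo "AD_dissociated intact after BC is removed: $survives"
```

The plain AD clone, by contrast, may no longer be able to read its own history once BC is gone, which is the failure mode the answer describes.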
