Suppose I have the following project tree:
src
data
doc
I'd like to keep all the folders in a Git repository, published to GitLab. But I don't want to track data and doc together with src.
So I use the following strategy:
git remote add origin ADDRESS
git submodule add -b data ADDRESS data
git submodule add -b doc ADDRESS doc
It actually works fine, except when I try to replicate the repository with:
git clone --recursive ADDRESS
all objects get transmitted 3 times, because the root repository and the data and doc submodules each contain:
- origin/master
- origin/data
- origin/doc
Is there an easy way to avoid this? Just to clarify what I'd like:
- the master repository should only fetch origin/master, not the other two.
- the data submodule should only fetch origin/data.
- the doc submodule should only fetch origin/doc.
This would be easy to achieve with 3 separate repositories, but that's too cumbersome, since I apply this approach to multiple projects.
UPDATE
Using git worktree, as suggested in this answer, allows me to achieve what I want manually.
But now, instead of the automatic approach (which consumes 4x bandwidth):
git clone --recursive git@foo:foo/bar.git
I have to do:
git clone git@foo:foo/bar.git
cd bar
git worktree add data origin/data
git worktree add src/notebooks origin/notebooks
git worktree add doc origin/doc
git worktree add reports origin/reports
I could automate this process with some scripts, since the .gitmodules file already contains the complete info:
[submodule "data"]
	path = data
	url = git@foo:foo/bar.git
	branch = data
[submodule "src/notebooks"]
	path = src/notebooks
	url = git@foo:foo/bar.git
	branch = notebooks
[submodule "doc"]
	path = doc
	url = git@foo:foo/bar.git
	branch = doc
[submodule "reports"]
	path = reports
	url = git@foo:foo/bar.git
	branch = reports
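For illustration, the kind of script I have in mind could look like the minimal sketch below. It is not a built-in Git feature: add_worktrees and the DRY_RUN variable are hypothetical names of my own. It assumes the remote is called origin and that every entry in .gitmodules has a branch key (entries without one are skipped). It relies only on git config -f to read the file and on git worktree add.

```shell
#!/bin/sh
# Sketch only: create one worktree per .gitmodules entry instead of
# cloning each submodule separately.
# Assumptions: run from the top of a fresh clone, remote named "origin".

# Hypothetical helper (not part of Git).
# $1: path to the modules file (defaults to .gitmodules).
# Set DRY_RUN to a non-empty value to print the commands instead of running them.
add_worktrees() {
    modules_file="${1:-.gitmodules}"

    # Each output line looks like: submodule.<name>.path <path>
    git config -f "$modules_file" --get-regexp '^submodule\..*\.path$' |
    while read -r key path; do
        # Strip the "submodule." prefix and ".path" suffix to get the entry name.
        name="${key#submodule.}"
        name="${name%.path}"

        # Skip entries that do not declare a branch.
        branch="$(git config -f "$modules_file" --get "submodule.$name.branch")" || continue

        # With DRY_RUN set, this expands to "echo git worktree add ...".
        ${DRY_RUN:+echo} git worktree add "$path" "origin/$branch" </dev/null
    done
}
```

After git clone git@foo:foo/bar.git and cd bar, calling add_worktrees should run the same four git worktree add commands I currently type by hand; with DRY_RUN=1 set first, it only prints them.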
I wonder if there already is some standard git script or flag that handles this?