I had much the same problem as you: a large generic utility library as a submodule, and many projects depending on it. I did not want to create a separate checkout for every instance of the utility library.
The solution suggested by jthill works fine, but it only solves the first half of the problem, namely how to keep git happy.
What was missing is how to keep your build system happy, which expects actual files to work with and does not care about gitlink references.
But if you combine his idea with a symlink, you get what you want!
In order to implement this, let's start with the projects from your example:
/home/projects/project1
/home/projects/project2
/home/projects/library_XYZ
assuming both project1 and project2 have library_XYZ already added as a submodule, and that currently all three projects contain a full checkout of library_XYZ.
In order to replace the full checkouts of the library submodules with a symlink to the library's shared checkout, do this:
sharedproject="/home/projects/library_XYZ"
superproject="/home/projects/project1"
submodule="library_XYZ"
cd "$superproject"
(cd -- "$submodule" && git status) # Verify that no uncommitted changes exist!
(cd -- "$submodule" && git push -- "$sharedproject") # Save any local-only commits
git submodule deinit -- "$submodule" # Get rid of submodule's check-out
rm -rf .git/modules/"$submodule" # as well as of its local repository
mkdir -p .submods
git mv -- "$submodule" .submods/
echo "gitdir: $sharedproject/.git" > ".submods/$submodule/.git"
ln -s -- "$sharedproject" "$submodule"
echo "/$submodule" >> .gitignore
and then repeat the same steps for /home/projects/project2 as $superproject.
And here is an explanation of what has been done:
The central step is "git submodule deinit", which removes the submodule checkout and leaves library_XYZ behind as an empty directory. Be sure to commit any changes before you do this, because it will remove the checkout!
That is why, before the deinit, we first save any commits local to the check-out which have not yet been pushed to the shared project, using "git push" to /home/projects/library_XYZ.
If this does not work because you did not set up a remote or refspec for that, you can do this:
(saved_from=$(basename -- "$superproject"); \
cd -- "$submodule" \
&& git push -- "$sharedproject" \
"refs/heads/*:refs/remotes/$saved_from/*")
This will save backups of all branches of the submodule's local repository as remote branches in /home/projects/library_XYZ. The basename of the $superproject directory will be used as the name of the remote, i.e. project1 or project2 in our example.
Of course, no remote of that name actually exists in /home/projects/library_XYZ, but the saved branches will show up as if it did when "git branch -r" is executed there.
As a safeguard, the refspec in the above command does not start with a "+", so the "git push" cannot accidentally overwrite any branch which already happens to exist in /home/projects/library_XYZ.
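To see what such a backup push left behind, the saved branches can be listed in the shared repository. This is only a minimal sketch, assuming git 1.8.5 or later (for the -C option); the function name list_saved_branches is made up for illustration and is not part of git.

```shell
#!/bin/sh
# list_saved_branches SHAREDPROJECT SAVED_FROM
# Hypothetical helper: prints the refs stored under
# refs/remotes/SAVED_FROM/ in the shared repository, one per line,
# in the same short form that "git branch -r" would show.
list_saved_branches() {
    git -C "$1" for-each-ref --format='%(refname:short)' "refs/remotes/$2"
}
```

For the example above it would be called as: list_saved_branches /home/projects/library_XYZ project1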
Next, .git/modules/library_XYZ is removed in order to save space. We can do this because we no longer need "git submodule init" or "git submodule update": we share both the check-out and the .git directory of /home/projects/library_XYZ with the submodule, avoiding a local copy of both.
Then we let git rename the empty submodule directory to ".submods/library_XYZ", a (hidden) directory the files in the projects will never use directly.
Next we apply jthill's partial solution to the problem and create a gitlink file in .submods/library_XYZ, which makes git see /home/projects/library_XYZ as the working tree and git repo of the submodule.
And now comes the new thing: We create a symlink with the relative name "library_XYZ" which points to /home/projects/library_XYZ. This symlink will not be put under version control, so we add it to the .gitignore file.
All the build files in project1 and project2 will use the library_XYZ symlink as if it were a normal subdirectory, but actually find the files from the working tree in /home/projects/library_XYZ there.
No-one except git actually uses .submods/library_XYZ!
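The resulting layout can be sanity-checked without involving git at all, because it consists only of one gitlink file and one symlink. The following is a rough sketch, assuming a readlink utility is available; check_shared_layout is a hypothetical name, not an existing tool.

```shell
#!/bin/sh
# check_shared_layout SUPERPROJECT SUBMODULE SHAREDPROJECT
# Hypothetical helper: returns 0 if the layout looks as described,
# otherwise prints a reason to stderr and returns 1.
check_shared_layout() {
    gitlink="$1/.submods/$2/.git"
    link="$1/$2"
    # The gitlink must be a plain file naming the shared repository.
    if test ! -f "$gitlink"; then
        echo "no gitlink file at $gitlink" >&2
        return 1
    fi
    if test "$(cat "$gitlink")" != "gitdir: $3/.git"; then
        echo "$gitlink does not point at $3/.git" >&2
        return 1
    fi
    # The name used by the build files must be a symlink to the
    # shared check-out.
    if test ! -L "$link" || test "$(readlink "$link")" != "$3"; then
        echo "$link is not a symlink to $3" >&2
        return 1
    fi
}
```

Example call: check_shared_layout /home/projects/project1 library_XYZ /home/projects/library_XYZ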
However, as the symlink ./library_XYZ is not versioned, it won't be created when checking out project1 or project2. We therefore need to take care it will be created automatically when missing.
This should be done by the build infrastructure of project1/project2 with a command equivalent to the following shell commands:
test -e library_XYZ || ln -s .submods/library_XYZ library_XYZ
The "||" form exits successfully whether or not library_XYZ already exists, which matters because make aborts a rule whose command fails.
For instance, if project1 is built using a Makefile and contains the following target rule for updating the subproject
library_XYZ/libsharedutils.a:
	cd library_XYZ && $(MAKE) libsharedutils.a
then we insert the line from above as the first line of the rule's action:
library_XYZ/libsharedutils.a:
	test -e library_XYZ || ln -s .submods/library_XYZ library_XYZ
	cd library_XYZ && $(MAKE) libsharedutils.a
If your project is using some other build system you can usually do the same thing by creating a custom rule for creating the library_XYZ subdirectory.
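For such a custom rule, the logic could be wrapped in a small shell function that the build system calls. This is just a sketch under the assumption that the build tool can run shell commands from the top-level directory of the superproject; ensure_submodule_link is a hypothetical name. It returns success both when it creates the link and when there is nothing to do, since most build tools abort on a non-zero exit status.

```shell
#!/bin/sh
# ensure_submodule_link NAME
# Hypothetical helper, to be run from the top-level directory of the
# superproject. Creates NAME as a symlink to .submods/NAME unless
# something usable already exists under that name.
ensure_submodule_link() {
    # An existing directory or a working symlink (possibly a manual
    # override pointing at a shared check-out) is left untouched.
    if test -e "$1"; then
        return 0
    fi
    # Missing or dangling: (re)create the default link.
    ln -snf -- ".submods/$1" "$1"
}
```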
If your project contains only scripts or documents and does not use any kind of build system at all, you can add a script which the user can run for creating the "missing directories" (actually: symlinks) as follows:
(n=create_missing_dirs.sh && cat > "$n" << 'EOF' && chmod +x -- "$n")
#! /bin/sh
for dir in .submods/*
do
sym=${dir#*/}
if test -d "$dir" && test ! -e "$sym"
then
echo "Creating $sym"
ln -snf -- "$dir" "$sym"
fi
done
EOF
This will create symlinks to all submodule check-outs in .submods, but only if they don't exist yet or if they are broken.
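The "or if they are broken" part follows from the fact that "test -e" follows symlinks, so a dangling symlink counts as non-existent. A small sketch to illustrate the three cases; link_state is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# link_state PATH
# Hypothetical helper: classifies PATH the same way the script above
# effectively does.
link_state() {
    if test -e "$1"; then
        echo ok        # exists (directly or via a working symlink)
    elif test -L "$1"; then
        echo broken    # a symlink whose target does not exist
    else
        echo missing   # nothing there at all
    fi
}
```

Only the "ok" case makes the script skip an entry; in the other two cases "ln -snf" creates or replaces the symlink.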
So much for the transformation from a conventional submodule layout to the new layout, which allows sharing.
Once you have that layout committed, check out the superproject somewhere, go to its top-level directory, and do the following in order to enable sharing:
sharedproject="/home/projects/library_XYZ"
submodule="library_XYZ"
ln -sn -- "$sharedproject" "$submodule"
echo "gitdir: $sharedproject/.git" > ".submods/$submodule/.git"
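If you do this for several check-outs, the two steps can be wrapped in a function with a few safety checks. This is a sketch only; enable_sharing is a hypothetical name, and it assumes that the gitlink should name "$sharedproject/.git", as in the conversion steps further up, and that a fresh check-out contains the submodule path as an empty directory.

```shell
#!/bin/sh
# enable_sharing SHAREDPROJECT SUBMODULE
# Hypothetical wrapper; run it from the top-level directory of a
# fresh check-out of the superproject.
enable_sharing() {
    # Refuse to continue if the shared check-out has no repository.
    if test ! -e "$1/.git"; then
        echo "$1 does not look like a git check-out" >&2
        return 1
    fi
    # The submodule path must already exist inside .submods.
    if test ! -d ".submods/$2"; then
        echo ".submods/$2 is missing; is this the top-level directory?" >&2
        return 1
    fi
    # Symlink for the build files, gitlink file for git.
    ln -sn -- "$1" "$2" &&
    echo "gitdir: $1/.git" > ".submods/$2/.git"
}
```

Example call: enable_sharing /home/projects/library_XYZ library_XYZ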
I hope you get the point: The library_XYZ subdirectory used by project1 and project2 is an unversioned symlink rather than corresponding to the submodule path as defined in ".gitmodules".
The symlink will be created automatically by the build infrastructure itself and will then point to .submods/library_XYZ, but only, and this is important, if the symlink does not already exist.
This allows one to create the symlink manually instead of letting the build system create it, so it can also be made to point to a single shared check-out rather than to .submods/library_XYZ.
That way you can use a shared check-out on your machine if you want.
But if another person does nothing special and just checks out project1 and does a normal "git submodule update --init .submods/library_XYZ" (the submodule path is now .submods/library_XYZ), things will work the same without a shared check-out.
No changes to the checked-out build files necessary in either case!
In other words, a check-out of project1 and project2 will work out of the box as usual; no special instructions need to be followed by other people using your repo.
But by manually creating the gitlink file and the library_XYZ symlink before the build system has a chance to create the symlink, you can locally "override" the symlink and enforce a shared checkout of the library.
And there is even another advantage: As it turns out, you don't need to mess with "git submodule init" or "git submodule update" at all if you use the above solution: It just works without!
This is because "git submodule init" is only necessary as a preparation for "git submodule update". But you won't need the latter because the library is already checked out somewhere else and already has its own .git directory there. So there is nothing to do for "git submodule update", and we don't need it.
As a side effect of no longer using "git submodule update", no .git/modules subdirectory will be required either. Nor is there any need to set alternates (--reference option) for the submodules.
Also, you don't need any remotes for pushing/pulling /home/projects/library_XYZ in /home/projects/project1 and /home/projects/project2 any more. So you can remove the remote for accessing library_XYZ from project1 and project2.
A win-win situation!
The only obvious disadvantage of this solution is that it requires symlinks to work.
That means it won't be possible to check out project1 on, say, a VFAT filesystem.
But then, who does that?
And even when doing so, projects like http://sourceforge.net/projects/posixovl may still be able to work around any symlink limitation of a filesystem.
Finally, some advice for Windows users here:
Symbolic links have been available since Windows Vista via the mklink command, but creating them requires special privileges.
But with the "junction" command from Sysinternals, directory junctions, which behave much like symlinks to directories, could already be created back in Windows XP times.
In addition, you have the option to use Cygwin, which can (AFAIK) emulate symlinks even without support from the OS.
project1 depends on a different commit from library_XYZ than the current version of project2. If they're shared, things are going to get a bit ... confused. Better to just keep the two separate submodule instances and learn the associated workflow... – Kneepad