Update note
For most Windows developers struggling with symlinks in git and with sharing a repo with *nix systems, this topic is a solved problem once you update your understanding of the Windows mklink command a bit and turn on Developer Mode.
See this more modern answer before digging into the following discussion of deep git hacks.
Older systems:
I was asking this exact same question a while back (not here, just in general), and ended up coming up with a very similar solution to OP's proposition.
I'll post the solution I ended up using.
But first I'll provide direct answers to OP's 3 questions:
Q: "What, if any, downsides do you see to this approach?"
A: There are indeed a few downsides to the proposed solution, mainly regarding an increased potential for repository pollution, or accidentally adding duplicate files while they're in their "Windows symlink" states. (More on this under "limitations" below.)
Q: "Is this post-checkout script even implementable? i.e. can I recursively find out the dummy "symlink" files git creates?"
A: Yes, a post-checkout script is implementable! Maybe not as a literal post-git checkout step, but the solution below has met my needs well enough that a literal post-checkout script wasn't necessary.
Q: "Has anybody already worked on such a script?"
A: Yes!
The Solution:
Our developers are in much the same situation as OP's: a mixture of Windows and Unix-like hosts, repositories and submodules with many git symlinks, and no native support (yet) in the release version of MsysGit for intelligently handling these symlinks on Windows hosts.
Thanks to Josh Lee for pointing out that git commits symlinks with the special filemode 120000. With this information it's possible to add a few git aliases that allow for the creation and manipulation of git symlinks on Windows hosts.
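Because symlinks are recorded with mode 120000, they can be picked out of the index listing. Here is a minimal sketch of that filter, assuming the usual git ls-files -s output format (mode, object name and stage, then a tab and the path). The function name symlink_paths is hypothetical and is not one of the aliases defined in this answer.

```shell
# Hypothetical helper: read `git ls-files -s` output on stdin and print
# only the paths whose mode is 120000 (git symlinks).
symlink_paths() {
  grep -E '^120000 ' | cut -f2
}

# Typical use inside a repository:
#   git ls-files -s | symlink_paths
```

This is the same grep-and-cut pipeline the aliases use when they are called without arguments.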
Creating git symlinks on Windows
git config --global alias.add-symlink '!'"$(cat <<'ETX'
__git_add_symlink() {
  if [ $# -ne 2 ] || [ "$1" = "-h" ]; then
    printf '%b\n' \
        'usage: git add-symlink <source_file_or_dir> <target_symlink>\n' \
        'Create a symlink in a git repository on a Windows host.\n' \
        'Note: source MUST be a path relative to the location of target'
    [ "$1" = "-h" ] && return 0 || return 2
  fi
  source_file_or_dir=${1#./}
  source_file_or_dir=${source_file_or_dir%/}
  target_symlink=${2#./}
  target_symlink=${target_symlink%/}
  target_symlink="${GIT_PREFIX}${target_symlink}"
  target_symlink=${target_symlink%/.}
  : "${target_symlink:=.}"
  if [ -d "$target_symlink" ]; then
    target_symlink="${target_symlink%/}/${source_file_or_dir##*/}"
  fi
  case "$target_symlink" in
    (*/*) target_dir=${target_symlink%/*} ;;
    (*) target_dir=$GIT_PREFIX ;;
  esac
  target_dir=$(cd "$target_dir" && pwd)
  if [ ! -e "${target_dir}/${source_file_or_dir}" ]; then
    printf 'error: git-add-symlink: %s: No such file or directory\n' \
        "${target_dir}/${source_file_or_dir}" >&2
    printf '(Source MUST be a path relative to the location of target!)\n' >&2
    return 2
  fi
  git update-index --add --cacheinfo 120000 \
      "$(printf '%s' "$source_file_or_dir" | git hash-object -w --stdin)" \
      "${target_symlink}" \
    && git checkout -- "$target_symlink" \
    && printf '%s -> %s\n' "${target_symlink#$GIT_PREFIX}" "$source_file_or_dir" \
    || return $?
}
__git_add_symlink
ETX
)"
Usage: git add-symlink <source_file_or_dir> <target_symlink>, where the argument corresponding to the source file or directory must take the form of a path relative to the target symlink. You can use this alias the same way you would normally use ln.
E.g., the repository tree:
dir/
dir/foo/
dir/foo/bar/
dir/foo/bar/baz (file containing "I am baz")
dir/foo/bar/lnk_file (symlink to ../../../file)
file (file containing "I am file")
lnk_bar (symlink to dir/foo/bar/)
Can be created on Windows as follows:
git init
mkdir -p dir/foo/bar/
echo "I am baz" > dir/foo/bar/baz
echo "I am file" > file
git add -A
git commit -m "Add files"
git add-symlink ../../../file dir/foo/bar/lnk_file
git add-symlink dir/foo/bar/ lnk_bar
git commit -m "Add symlinks"
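The text stored in a git symlink is resolved relative to the directory that contains the link, which is why lnk_file above is given ../../../file rather than file. Here is a small sketch of that resolution, using the same case split on the link path as the aliases in this answer. The function name resolve_symlink_source is hypothetical.

```shell
# Hypothetical helper: given a link path and the text stored in the link,
# print the source path as seen from the repository root.
resolve_symlink_source() {
  link=$1
  stored=$2
  case "$link" in
    (*/*) linkdir=${link%/*} ;;  # link lives in a subdirectory
    (*)   linkdir=. ;;           # link lives at the top level
  esac
  printf '%s\n' "${linkdir}/${stored}"
}

resolve_symlink_source dir/foo/bar/lnk_file ../../../file  # dir/foo/bar/../../../file
resolve_symlink_source lnk_bar dir/foo/bar/                # ./dir/foo/bar/
```

The result is not normalized. It only has to be a path that the shell's -f and -d tests can follow.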
Replacing git symlinks with NTFS hardlinks+junctions
git config --global alias.rm-symlinks '!'"$(cat <<'ETX'
__git_rm_symlinks() {
  case "$1" in (-h)
    printf 'usage: git rm-symlinks [symlink] [symlink] [...]\n'
    return 0
  esac
  ppid=$$
  case $# in
    (0) git ls-files -s | grep -E '^120000' | cut -f2 ;;
    (*) printf '%s\n' "$@" ;;
  esac | while IFS= read -r symlink; do
    case "$symlink" in
      (*/*) symdir=${symlink%/*} ;;
      (*) symdir=. ;;
    esac
    git checkout -- "$symlink"
    src="${symdir}/$(cat "$symlink")"
    posix_to_dos_sed='s_^/\([A-Za-z]\)_\1:_;s_/_\\\\_g'
    doslnk=$(printf '%s\n' "$symlink" | sed "$posix_to_dos_sed")
    dossrc=$(printf '%s\n' "$src" | sed "$posix_to_dos_sed")
    if [ -f "$src" ]; then
      rm -f "$symlink"
      cmd //C mklink //H "$doslnk" "$dossrc"
    elif [ -d "$src" ]; then
      rm -f "$symlink"
      cmd //C mklink //J "$doslnk" "$dossrc"
    else
      printf 'error: git-rm-symlink: Not a valid source\n' >&2
      printf '%s =/=> %s (%s =/=> %s)...\n' \
          "$symlink" "$src" "$doslnk" "$dossrc" >&2
      false
    fi || printf 'ESC[%d]: %d\n' "$ppid" "$?"
    git update-index --assume-unchanged "$symlink"
  done | awk '
    BEGIN { status_code = 0 }
    /^ESC\['"$ppid"'\]: / { status_code = $2 ; next }
    { print }
    END { exit status_code }
  '
}
__git_rm_symlinks
ETX
)"
git config --global alias.rm-symlink '!git rm-symlinks' # for back-compat.
Usage:
git rm-symlinks [symlink] [symlink] [...]
This alias can replace git symlinks one at a time or all at once. Symlinks will be replaced with NTFS hardlinks (in the case of files) or NTFS junctions (in the case of directories). The benefit of using hardlinks+junctions over "true" NTFS symlinks is that creating them does not require elevated UAC permissions.
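The choice between the two link types comes down to a file-or-directory test on the source. Here is a stripped-down sketch of just that decision, with the mklink call itself left out so it can be tried on any host. The function name mklink_flag_for is hypothetical.

```shell
# Hypothetical helper: print the mklink switch that suits the given source.
# //H (hardlink) for regular files, //J (junction) for directories.
# The doubled slash stops MSYS from rewriting the switch as a POSIX path.
mklink_flag_for() {
  if [ -f "$1" ]; then
    printf '%s\n' '//H'
  elif [ -d "$1" ]; then
    printf '%s\n' '//J'
  else
    printf 'error: %s: not a valid source\n' "$1" >&2
    return 1
  fi
}
```

In the alias, the equivalent branches call cmd //C mklink with the chosen switch. Anything that is neither a file nor a directory is reported as an error.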
To remove symlinks from submodules, just use git's built-in support for iterating over them:
git submodule foreach --recursive git rm-symlinks
But, for every drastic action like this, a reversal is nice to have...
Restoring git symlinks on Windows
git config --global alias.checkout-symlinks '!'"$(cat <<'ETX'
__git_checkout_symlinks() {
  case "$1" in (-h)
    printf 'usage: git checkout-symlinks [symlink] [symlink] [...]\n'
    return 0
  esac
  case $# in
    (0) git ls-files -s | grep -E '^120000' | cut -f2 ;;
    (*) printf '%s\n' "$@" ;;
  esac | while IFS= read -r symlink; do
    git update-index --no-assume-unchanged "$symlink"
    rmdir "$symlink" >/dev/null 2>&1
    git checkout -- "$symlink"
    printf 'Restored git symlink: %s -> %s\n' "$symlink" "$(cat "$symlink")"
  done
}
__git_checkout_symlinks
ETX
)"
git config --global alias.co-symlinks '!git checkout-symlinks'
Usage: git checkout-symlinks [symlink] [symlink] [...], which undoes git rm-symlinks, effectively restoring the repository to its natural state (except for your changes, which should stay intact).
And for submodules:
git submodule foreach --recursive git checkout-symlinks
Limitations:
Directories/files/symlinks with spaces in their paths should work. But tabs or newlines? YMMV… (By this I mean: don’t do that, because it will not work.)
If you or others forget to run git checkout-symlinks before doing something with potentially wide-sweeping consequences like git add -A, the local repository could end up in a polluted state.
Using our "example repo" from before:
echo "I am nuthafile" > dir/foo/bar/nuthafile
echo "Updating file" >> file
git add -A
git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: dir/foo/bar/nuthafile
# modified: file
# deleted: lnk_bar # POLLUTION
# new file: lnk_bar/baz # POLLUTION
# new file: lnk_bar/lnk_file # POLLUTION
# new file: lnk_bar/nuthafile # POLLUTION
#
Whoops...
For this reason, it's best to have Windows users run these aliases immediately before and after building a project, rather than after checkout or before pushing. But each situation is different. These aliases have been useful enough for me that a true post-checkout solution hasn't been necessary.
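One way to reduce the risk is to check for converted symlinks before staging anything. git rm-symlinks marks each converted path with --assume-unchanged, and git ls-files -v prints a lowercase status letter for such paths, so a filter along these lines could serve as a guard. This is a sketch only: the function name assume_unchanged_paths is hypothetical, and it will also report any paths you have marked assume-unchanged yourself for other reasons.

```shell
# Hypothetical helper: read `git ls-files -v` output on stdin and print the
# paths whose status letter is lowercase (marked assume-unchanged).
assume_unchanged_paths() {
  sed -n 's/^[[:lower:]] //p'
}

# Possible guard before a sweeping `git add -A`:
#   if [ -n "$(git ls-files -v | assume_unchanged_paths)" ]; then
#     echo 'Converted symlinks present; run git checkout-symlinks first.' >&2
#   fi
```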
References:
http://git-scm.com/book/en/Git-Internals-Git-Objects
http://technet.microsoft.com/en-us/library/cc753194
Last Update: 2019-03-13
- POSIX compliance (well, except for those mklink calls, of course): no more Bashisms!
- Directories and files with spaces in them are supported.
- Zero and non-zero exit status codes (for communicating success/failure of the requested command, respectively) are now properly preserved/returned.
- The add-symlink alias now works more like ln(1) and can be used from any directory in the repository, not just the repository’s root directory.
- The rm-symlink alias (singular) has been superseded by the rm-symlinks alias (plural), which now accepts multiple arguments (or no arguments at all, which finds all of the symlinks throughout the repository, as before) for selectively transforming git symlinks into NTFS hardlinks+junctions.
- The checkout-symlinks alias has also been updated to accept multiple arguments (or none at all, == everything) for selective reversal of the aforementioned transformations.
Final Note: While I did test loading and running these aliases using Bash 3.2 (and even 3.1) for those who may still be stuck on such ancient versions for any number of reasons, be aware that versions as old as these are notorious for their parser bugs. If you experience issues while trying to install any of these aliases, the first thing you should look into is upgrading your shell (for Bash, check the version with CTRL+X, CTRL+V). Alternatively, if you’re trying to install them by pasting them into your terminal emulator, you may have more luck pasting them into a file and sourcing it instead, e.g. as
. ./git-win-symlinks.sh