Is there a way to still use symbolic link for .gitignore? And why it's not supported anymore as a symbolic link?

I have a central .gitignore on my laptop and for each project that I create, I create a symbolic link to that central file so that I can keep a uniform policy across all of my projects.

All of my projects are like each other (technology-wise) and it makes sense to have a central .gitignore to reduce the burden of maintenance.

However, recently I see this message:

warning: unable to access '.gitignore': Too many levels of symbolic links

And as I searched, it seems that from git 2.3 upwards they have decided to not support the symbolic link.

I have two questions. First, is there a way to force git to support symbolic links for .gitignore? And why on Earth do they not support it anymore? Does it not make sense to reuse code? Is half of linux not reused through symbolic links?

Antimere answered 23/6, 2022 at 11:40 Comment(10)
You mean git 2.32 upwards, do you not?Topmast
@Topmast I did not remember the exact version. Mine now is 2.34.1. But does it make any difference?Antimere
Well, my answer was generated by finding the new text added for 2.32, without looking all the way back to 2.3.Topmast
shameless self promotion : https://mcmap.net/q/54869/-when-would-you-use-git-info-exclude-instead-of-gitignore-to-exclude-filesBottali
@LeGEC, I didn't understand what you mean. Are you suggesting that I use other ways to exclude stuff from git?Antimere
@Bigboy : yes, if you have a central gitignore file which you want to include in all your projects, you can set a global core.excludesfile /path/to/my.gitignore config option to have git always "include" it. This one will not be shared by people who clone your repository, but I think your symlink wasn't shared either.Bottali
@Bigboy : if you only need it in some of your repositories, you can set that option repo per repo instead (without the --global flag)Bottali
The whole problem with this approach, whether it's done with LeGEC's solution or a symlink, is that your git ignore rules are not going to be shared with your collaborators. If I clone your repo and might contribute to it, I'd like to get your git ignore rules too. I use global git ignore rules for my own personalized quirks (my editor drops *.swp files all over, so that's an example in my global git ignore). But if I have twenty Node projects, they will all have all the Node-relevant stuff committed in their .gitignore files, because that's not just for me.Contempt
@joanis, two notes. One is that maybe somebody wants to work alone. Git should not force its ideology upon developers. That's such a stupid decision they have made. Second is convention over configuration. Maybe a team chooses to have an internal convention of a global .gitignore file somewhere that is pulled into each project somehow. Again, git should not impose its stupid ideas.Antimere
@Bigboy Well, I have to agree you make good points. And I understand the frustration of having a convention that's been working fine suddenly stop working. Hopefully LeGEC's solution will work for you.Contempt
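
The core.excludesFile approach suggested in the comments above can be tried out as follows. This is only a minimal sketch: the temporary directory, the file name central.gitignore, and the sample patterns are assumptions made up for illustration, and the setting is applied to one throwaway repository rather than globally.

```shell
# Throwaway location so the demo does not touch any real configuration.
demo_dir=$(mktemp -d)
central_ignore="$demo_dir/central.gitignore"   # hypothetical path for the shared rules
printf '%s\n' '*.log' 'node_modules/' > "$central_ignore"

git init -q "$demo_dir/project"
cd "$demo_dir/project" || exit 1

# Per-repository setting; use `git config --global` to apply it to every repository.
git config core.excludesFile "$central_ignore"

# check-ignore prints the path when some ignore rule matches it.
ignored=$(git check-ignore debug.log)
echo "$ignored"
```

Because the rules live outside the repository, they are not committed and not shared with anyone who clones it, which matches what the symlink setup did.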

First, is there a way to force git to support symbolic links for .gitignore?

No.

And why on Earth do they not support it anymore?

The gitattributes documentation now (as of Git 2.32) says this near the end:

NOTES

Git does not follow symbolic links when accessing a .gitattributes file in the working tree. This keeps behavior consistent when the file is accessed from the index or a tree versus from the filesystem.

While I'm not 100% sold on the reasoning here myself, it does make sense. (It seems to me that Git could just stuff the content of the .gitattributes file into the index and hence into the commits, although this would mean that on checkout it would destroy the symlink.)

Optional further reading / background

First, let's describe what a "symbolic link" is in the first place. To do this we must define what a file is (which is a pretty big job, so we'll give it only very light coverage): A file is a named entity, typically found in a file system (a systematic collection of files), that stores data for later retrieval. Being a named entity, a file has a name: for instance, README.txt, Makefile, and .gitconfig are all file names. Different OSes place different constraints on file names (e.g., Windows refuses to store a colon : character in a file name or create any file named aux with or without a suffix, so that you cannot have a C or C++ include file named aux.h or aux.hpp). Git itself places very few constraints on file names: they can contain almost any character except an ASCII NUL (b'\0' in Python, \0 in C, etc.), and forward slashes / are slightly special, but other than that a name character is just a name character and there are very few restrictions.1

On most real OSes, files can have "types". The exact mechanisms here rapidly become OS-specific and can get very complicated,2 though traditional Unix-like hierarchical file systems just have a few types: "directory", "file", "block or character device", "symbolic link", and the like. Symbolic links are in fact one of these types.

A symbolic link is a type of file in which the file's content is another file name. This file name, on a Unix-like file system, can be absolute (/home/john/somefile, /Users/torek/somefile) or relative (./somefile, ../../somefile). On these systems, opening a symbolic link results in opening the file whose name is provided by the symbolic link's content. To read the content of the symbolic link—that is, to find out what file name the link contains—we use a different operation: readlink instead of open, for instance. Modern Unix systems also have an O_NOFOLLOW flag that can be used to forbid the open system call from following the link.3
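
The difference between reading a link's own content and reading through it can be seen with a short experiment. This is a sketch only; the file names target.txt and link.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo "real data" > target.txt       # an ordinary file
ln -s target.txt link.txt           # a symlink whose content is the name "target.txt"

link_content=$(readlink link.txt)   # reads the link itself, without following it
file_content=$(cat link.txt)        # opening the link follows it to target.txt

echo "readlink: $link_content"
echo "cat:      $file_content"
```

The first value is the stored file name and the second is the data of the file that name refers to.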

The way Git stores a symlink is as a special mode object in a commit: ordinary files are either mode 100644, meaning a non-executable file, or mode 100755, meaning an executable file. A symbolic link is stored as mode 120000 and Git stores the target name, found by calling readlink, as the content.4


1The one peculiar restriction is that you're not allowed to store anything named .git, in any mix of upper and/or lower case. This .git restriction actually applies to "name components", which are the parts between forward slashes. Due to Windows being Windows, Git-on-Windows will turn backward slashes into forward ones as necessary, and then places the restriction on the components.

2Traditional OSes from the 1960s through 1980s, for instance, may impose things called access methods based in part on file types. Unix simplified things a lot here.

3This is sometimes important for various security aspects. The details are beyond the scope of this article.

4These odd mode values correspond closely to the struct stat st_mode field in a Unix/Linux stat system call. That's because when Linus Torvalds first wrote the initial versions of Git, he was dealing with it—at least in part—as a kind of file system. The ability to store full Unix file modes (9 bits of rwxrwxrwx flags) was left in, and initially Git actually stored group write permissions, but this turned out to be a mistake and was removed before the first public release. The 100000 part is S_IFREG, "Stat: Inode Format REGular file". The 120000 found in a Git symbolic link is S_IFLNK, or "Stat: Inode Format symbolic LiNK". We also have mode 040000 for directories from S_IFDIR, which should now be obvious. However, Git can't store a mode 040000 entry in its index / staging-area, for no particularly good reason, which leads to the problem described in How can I add a blank directory to a Git repository?


In other words, a symbolic link means "use another file"

Wherever a symbolic link is found, it means read or write some other file. So if README.txt is a symbolic link reading /tmp/fooledyou, any attempt to read README.txt actually reads /tmp/fooledyou instead; any attempt to write README.txt actually writes to /tmp/fooledyou.

Consider, though, when this redirection (from README.txt to /tmp/fooledyou) occurs. It doesn't happen at the time you make the symbolic link itself. You could have created this README.txt link last year. The redirection occurs when I go to read README.txt. So if you've changed /tmp/fooledyou since you created README.txt, I get the current version, not the old one.
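
This late resolution is easy to observe. The sketch below reuses the names README.txt and fooledyou from the text, but places both in a temporary directory (an assumption made so the demo is harmless).

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo "old content" > fooledyou
ln -s fooledyou README.txt        # the link is created first

echo "new content" > fooledyou    # the target is changed afterwards

seen_now=$(cat README.txt)        # resolution happens here, at read time
echo "$seen_now"
```

The read returns whatever the target holds at that moment, not what it held when the link was made.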

That, of course, is precisely why you wanted the symbolic link in the first place:

All of my projects are like each other (technology-wise) and it makes sense to have a central .gitignore to reduce the burden of maintenance.

In other words, you wanted to have one .gitignore that is not version controlled and that always reflects what should be ignored based on what you have learned up until right now, whenever "right now" happens to be.

This is the opposite of Git's normal purpose, which is to store a full snapshot of what your project looked like "back then", whenever "back then" was: the time at which you made a git commit snapshot.

My suggested possibility above is that when you run:

git add .gitignore

to update Git's idea of what should go in the .gitignore file that goes in the next commit, Git could follow the .gitignore indirection at that time, read the contents of the target of the symbolic link, and prepare that to be committed. You'd then make the commit—the snapshot and metadata—such that if, next year, you extract this particular historical commit, you'll get the historical snapshot, including the historical .gitignore.

The drawback to this is that by extracting the historical .gitignore, you "break the link": .gitignore is no longer a symbolic link at all. Instead, it is now an ordinary file, containing the historical snapshot. There's no way to get the link back except to remove the ordinary file and create a new symbolic link.

Before Git version 2.32, Git would notice when .gitignore was a symbolic link and would store, in its index / staging-area, the fact that .gitignore was a symlink (mode 120000), use the readlink system call to find the target of the symlink, and store that in the commit. Running git commit then makes a snapshot that, when extracted, creates .gitignore as a (new) symbolic link: the existing file-or-symlink is removed, and the new one is installed instead. It redirects, in the usual symlink fashion, to the saved (committed) historical location, even if that's wrong now.

As of Git version 2.32, Git will still store a symbolic link .gitignore file:

$ mkdir z; cd z
$ ../git --version
git version 2.36.1.363.g9c897eef06
$ ../git init
[messages snipped; branch renamed to main, also snipped]
$ echo testing > README
$ ln -s foo .gitignore
$ git add README .gitignore
$ git commit -m initial
[main (root-commit) 08c6626] initial
 2 files changed, 2 insertions(+)
 create mode 120000 .gitignore
 create mode 100644 README
$ ../git ls-tree HEAD
120000 blob 19102815663d23f8b75a47e7a01965dcdc96468c    .gitignore
100644 blob 038d718da6a1ebbc6a7780a96ed75a70cc2ad6e2    README

The same reasoning (that a Git commit, once it's made and stuffed into a repository, may contain a symbolic link that is no longer valid or correct) explains why Git 2.32 also refuses to follow symbolic links for .gitattributes and .mailmap files. Note that commands like git archive generally use the commit's version of .gitattributes to control archive substitutions, so a symbolic link stored in the repository is useless unless the target of the symbolic link is somehow correct. The repository and its commits get shipped around from one machine to another, but the targets of any committed symlinks in many cases don't.

Topmast answered 23/6, 2022 at 12:8 Comment(6)
I understood neither their explanation and reasoning nor yours. Can you explain it in more lay terms? Thank you.Antimere
Git doesn't store symlinks, so a symlink was only valid for one particular working directory, not the repository in general. .gitignore is intended to be committed.Interloper
@Bigboy: I added a much longer section of optional reading.Topmast
@torek, wow!!! you're a true hero man. Who spends that much time to explain details in such a good and fluent manner? Thank you so much. However, I still believe that it's a choice of teams and developers and should not be forced upon developers. Git is a tool, not our creator. It should let US choose what we want to do and how we want to do it. Let's say I'm willing to lose that history, but gain consistency and peace of mind and DRY.Antimere
For the "willing to lose that history, but gain consistency" approach, see LeGEC's comment. The core.excludesFile you set won't be recorded in commits (so there's no history at all) but your exclusions will work for you.Topmast
"The repository and its commits get shipped around [...] but the targets of any committed symlinks in many cases don't." My symlinks all point into a submodule of the repository, including .gitignore as it makes sense to do so for this project. Maybe the git Devs will provide a way? I guess I'll just copy it... I think not optimal and sloppy in this context...Nianiabi

Under Linux, you can always make .gitignore a hard link (ln ...) rather than a symbolic link (ln -s ...). That way, the link is seen as a real, ordinary file. This worked for me.

Inchon answered 15/8, 2023 at 14:29 Comment(1)
That works until you change the file. Either your editor or Git itself will break the hardlink.Macintyre
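
The breakage described in the comment above can be reproduced without Git. This is a sketch under the assumption that the editor saves by writing a new file and renaming it into place, which many editors do; the file names are hypothetical.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo "*.log" > central-ignore
ln central-ignore project-ignore        # hard link: two names for one file

# Simulate a save that writes a new file and renames it over the old name.
echo "*.tmp" > project-ignore.new
mv project-ignore.new project-ignore

central_now=$(cat central-ignore)       # still the old content
project_now=$(cat project-ignore)       # the new content, now a separate file

echo "central: $central_now"
echo "project: $project_now"
```

After the rename the two names no longer refer to the same file, so later edits to the central file stop reaching the project.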
