Force LF eol in git repo and working copy

I have a git repository hosted on github. Many of the files were initially developed on Windows, and I wasn't too careful about line endings. When I performed the initial commit, I also didn't have any git configuration in place to enforce correct line endings. The upshot is that I have a number of files with CRLF line endings in my github repository.

I'm now developing partially on Linux, and I'd like to clean up the line endings. How can I ensure the files are stored correctly with LF on github, and have LF in my working copy?

I've set up a .gitattributes file containing text eol=LF; is that correct? With that committed and pushed, can I just rm my local repo and re-clone from github to get the desired effect?

Guilt answered 2/4, 2012 at 13:2 Comment(4)
possible duplicate of git replacing LF with CRLF - Dyann
See also Cross platform development using Git (EOL issue) - Dyann
Neither of those is quite what I'm asking. I'm the only developer, and I'm quite willing to set up all my machines the same. I have an existing repo with some CRLF files already committed to it, and a couple of clones on different machines. How can I update the repo, and each working copy, so that there are LF everywhere? - Guilt
Have you looked at this GitHub guide? - Soembawa

Without a bit of information about what files are in your repository (pure source code, images, executables, ...), it's a bit hard to answer the question :)

Besides this, I'll assume that you want LF line endings in your working directory because you want to make sure that text files have LF line endings in your .git repository whether you work on Windows or Linux. Better safe than sorry, indeed.

However, there's a better alternative: Benefit from LF line endings in your Linux workdir, CRLF line endings in your Windows workdir AND LF line endings in your repository.

As you're partially working on Linux and Windows, make sure core.eol is set to native and core.autocrlf is set to true.

Then, replace the content of your .gitattributes file with the following

* text=auto

This will let Git handle the automagic line endings conversion for you, on commits and checkouts. Binary files won't be altered; files detected as text will have their line endings converted on the fly.
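To see that conversion for yourself, here is a minimal sketch run in a throwaway repository, so nothing existing is touched. The file name sample.txt and the temporary directory are only illustrative assumptions, and git ls-files --eol needs Git 2.8 or later.

```shell
# Throwaway repository (hypothetical paths); nothing here touches an existing checkout.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config core.autocrlf false          # keep the demo independent of any global setting
printf '* text=auto\n' > .gitattributes
printf 'one\r\ntwo\r\n' > sample.txt    # text file written with CRLF endings
git add .gitattributes sample.txt       # Git may warn that CRLF will be replaced by LF

# Columns: line endings in the index (i/), in the working tree (w/), attributes, path.
eol_info=$(git ls-files --eol sample.txt)
echo "$eol_info"
```

The i/ column should report lf (what will be committed) while the w/ column still reports crlf (the file on disk has not been rewritten yet).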

However, as you know the content of your repository, you may give Git a hand and help it distinguish text files from binary files.

For example, if you work on a C-based image processing project, replace the content of your .gitattributes file with the following

* text=auto
*.txt text
*.c text
*.h text
*.jpg binary

This will make sure files whose extension is c, h, or txt will be stored with LF line endings in your repo and will have native line endings in the working directory. JPEG files won't be touched. All of the others will benefit from the same automagic filtering as seen above.
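If you want to check how these patterns resolve for a given path, git check-attr can report it. This is only a sketch in a throwaway repository; the paths main.c, notes.md and photo.jpg are made-up examples and do not need to exist.

```shell
# Throwaway repository (hypothetical paths) holding only the .gitattributes from above.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
cat > .gitattributes <<'EOF'
* text=auto
*.txt text
*.c text
*.h text
*.jpg binary
EOF

# The paths do not need to exist; check-attr only evaluates the patterns.
attrs=$(git check-attr text -- main.c notes.md photo.jpg)
echo "$attrs"
```

main.c should come back as "set" (always text), notes.md as "auto" (left to Git's detection), and photo.jpg as "unset" (the binary macro turns text off).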

To get a deeper understanding of the inner details of all this, I suggest you dive into this very good post "Mind the end of your line" from Tim Clem, a Githubber.

As a real world example, you can also peek at this commit where those changes to a .gitattributes file are demonstrated.

UPDATE to the answer considering the following comment

I actually don't want CRLF in my Windows directories, because my Linux environment is actually a VirtualBox sharing the Windows directory

Makes sense. Thanks for the clarification. In this specific context, the .gitattributes file by itself won't be enough.

Run the following commands against your repository

$ git config core.eol lf
$ git config core.autocrlf input

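As your repository is shared between your Linux and Windows environments, this will update the local config file for both environments. core.eol is meant to make sure text files bear LF line endings on checkouts. core.autocrlf input will ensure any CRLF in text files (resulting from a copy/paste operation, for instance) is converted to LF in your repository, with no conversion on checkout. Note that, according to the git-config documentation, core.eol is ignored when core.autocrlf is set to true or input, so core.autocrlf is the setting that takes effect here.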

Optionally, you can help Git distinguish what is a text file by creating a .gitattributes file containing something similar to the following:

# Autodetect text files
* text=auto

# ...Unless the name matches the following
# overriding patterns

# Definitely text files
*.txt text
*.c text
*.h text

# Ensure those won't be messed up with
*.jpg binary
*.data binary

If you decide to create a .gitattributes file, commit it.

Lastly, ensure git status mentions "nothing to commit (working directory clean)", then perform the following operation

$ git checkout-index --force --all

This will recreate your files in your working directory, taking into account your config changes and the .gitattributes file and replacing any potential overlooked CRLF in your text files.

Once this is done, every text file in your working directory WILL bear LF line endings and git status should still consider the workdir as clean.
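One way to double-check the result is to ask Git which tracked files still have CRLF (or mixed) endings on disk. The helper name list_crlf_files and the demo file names below are hypothetical, and git ls-files --eol needs Git 2.8 or later, so take this as a sketch rather than a definitive check.

```shell
# Hypothetical helper: list tracked files whose working-tree copy has CRLF or mixed endings.
# Prints nothing when every checked-out text file uses LF.
list_crlf_files() {
    git ls-files --eol | grep -E 'w/(crlf|mixed)[[:space:]]' || true
}

# Demo in a throwaway repository (hypothetical file names).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config core.autocrlf false
printf 'a\r\nb\r\n' > windows.txt   # CRLF endings
printf 'a\nb\n' > unix.txt          # LF endings
git add windows.txt unix.txt

remaining=$(list_crlf_files)
echo "$remaining"
```

In a real repository you would only need the function: run list_crlf_files from the repository root after the checkout-index step, and an empty result means no tracked file was left with CRLF.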

Snow answered 2/4, 2012 at 14:5 Comment(16)
I actually don't want CRLF in my Windows directories, because my Linux environment is actually a VirtualBox sharing the Windows directory; and while Notepad++ etc. can handle LF-only on Windows, vi is less happy with CRLF. Do I just want to change it so that core.autocrlf is false (or input)? - Guilt
Excellent answer. A quick note for anyone else using this setup: The line "* text=auto" should be the first line in your .gitattributes file so that subsequent lines can override that setting. - Khachaturian
Thanks so much for this post, you have saved me countless Visual Studio menu clicks since it has no default line ending setting override and no batch file line ending modification method. - Dismay
git checkout-index: * is not in the cache < I get that when I try the checkout-index command. Also how does one do this for submodules recursively? - Sapro
@Sapro Depending on your shell, git checkout-index --force --all may work better. The second point looks a bit off topic regarding the original question. How about asking a dedicated question? - Snow
I don't understand why .gitattributes can't handle the case of sharing a working copy between Linux and Windows. Can we not set text and eol=lf to achieve the same result as described in your answer via core.eol and core.autocrlf? - Iminourea
What if you want the truth and nothing but the truth, i.e. no fancy magic conversions, leave it as what it is. - Sapro
@Snow I have a very similar scenario, but in my case, Windows is running on Parallels on my Mac. The repo is on the Windows OS, but shared with the Mac. If I'm on Windows and do a git status, I'll see no file changes; if I'm on my Mac and do git status, almost every file will show up as modified, and then if I go back to Windows, it will show the files as changed. What would the configs for Mac, for Windows, and the .gitattributes look like? - Microphone
For checkout-index, why not -a instead of *? - Cryptanalysis
@Cryptanalysis See one of the comments above where --all is proposed. I can't remember exactly why (or what version of Git for Windows I was using), but that didn't work for me at the time. However, nowadays, using --all makes more sense. Fixed. Thanks for this. - Snow
git checkout-index --force --all does nothing for me. What works is the list of commands in the GitHub instructions for dealing with this issue. - Wilber
This doesn't work though (in the first case). When I go to add a new file the diff shows a bunch of new lines ending in ^M, which is CRLF. - Seizure
This link is working for me help.github.com/articles/dealing-with-line-endings/…. - Traverse
@Snow Actually eol doesn't have any effect when autocrlf is set to true or input: git-scm.com/docs/git-config#Documentation/… - Banff
This answer is old. You want @koppor's answer. Single line in .gitattributes: * text=auto eol=lf - Dread
Why not * text=auto eol=lf instead of * text=auto? - Estey

Starting with Git 2.10 (released 2016-09-03), it is not necessary to enumerate each text file separately. Git 2.10 fixed the behavior of text=auto together with eol=lf. Source.

.gitattributes file in the root of your Git repository:

* text=auto eol=lf

Add and commit it.

Afterwards, you can do the following two steps to normalize all files:

git rm --cached -r .  # Remove every file from git's index.
git reset --hard      # Rewrite git's index to pick up all the new line endings.

Source: Answer by kenorb.
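On Git 2.16 or later, git add --renormalize . is another way to re-stage already committed files through the new attribute. Below is a minimal sketch in a throwaway repository; the file name legacy.txt, the commit message and the identity settings are assumptions made up for the demo.

```shell
# Throwaway repository (hypothetical paths and identity).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config core.autocrlf false
git config user.name "Example"
git config user.email "example@example.invalid"

# Commit a CRLF file before any attributes exist, so CRLF lands in the repository.
printf 'one\r\ntwo\r\n' > legacy.txt
git add legacy.txt
git commit -q -m "Add file with CRLF endings"
before=$(git ls-files --eol legacy.txt)

# Introduce the attribute, then re-stage every tracked file through the new rules.
printf '* text=auto eol=lf\n' > .gitattributes
git add .gitattributes
git add --renormalize .
after=$(git ls-files --eol legacy.txt)

echo "before: $before"
echo "after:  $after"
```

The i/ column should switch from crlf to lf once the file has been renormalized; committing at that point records the LF version. The working-tree copy is not rewritten by this step.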

Warning

In some cases, files might be corrupted. This is a very rare case and none of the reporters made a reproducer available.

Git checks the first 8000 bytes of the file for a NUL character to determine if the file is binary. Some binary file formats can contain 8000 bytes without a single NUL byte (eight consecutive zero bits) and should be marked as binary in the .gitattributes file to avoid that issue.

Based on this answer and this answer. Kudos to DecimalTurn for the explanation
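To spot files that such a check would treat as text, the rule as described above can be approximated in the shell. The helper name looks_binary is hypothetical, and this only mirrors the NUL-byte rule from this answer; Git's own detection may take other factors into account.

```shell
# Hypothetical helper: succeeds (exit status 0) when a NUL byte occurs
# within the first 8000 bytes of the given file.
looks_binary() {
    # Byte count of the first 8000 bytes, with and without NUL bytes.
    total=$(head -c 8000 "$1" | wc -c | tr -d ' ')
    without_nul=$(head -c 8000 "$1" | tr -d '\000' | wc -c | tr -d ' ')
    [ "$total" -ne "$without_nul" ]
}

# Classify every file passed on the command line.
for f in "$@"; do
    if looks_binary "$f"; then
        echo "binary: $f"
    else
        echo "text:   $f"
    fi
done
```

Any file reported as text that is really binary is a candidate for an explicit binary line in .gitattributes.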

Valles answered 9/2, 2017 at 11:47 Comment(9)
Git 2.10 was released on 3rd September 2016. - Bluenose
I ran this and it bricked all my non-text files - Symphonist
You can explicitly set binary mode to certain files. - I wonder why the auto detection is (still?!) broken on some files - Valles
@Symphonist could you specify which file type they were? Based on this answer and this answer, Git checks the first 8000 bytes of the file for a NUL "Character" to determine if the file is binary. Some binary file formats are more likely to contain 8000 bytes without 8 consecutive zero bits and should be marked as binary in the .gitattributes file to avoid that issue. - Selfdrive
It doesn't work for me. IDEA still shows a difference in the line separator. - Clarkia
This WILL break every single image file you have checked into your repo. DO NOT use this code! - Tangency
@Tangency - My images worked fine. What release of git did you do this with? And did you have your .gitattributes set correctly as listed above? I know there's nothing worse than hearing "It works for me" but I'm not getting broken images with this. Can you share more please? - Mal
@JohnRocha, why risk irrecoverably breaking new images just to save having to list a few file extensions? The text=auto feature seems to be a big case of just because you can, doesn't mean you should... - Tangency
@yeerk, I don't see it as a matter of risk for breaking, I see it as a matter of using the appropriate syntax. I've seen a lot of folks make hard rules because "doing 'action-X' burned me this one time" and it wasn't actually 'action-X' that caused the problem. Hence the reason I was asking about the rest of the circumstances. - Mal

To force LF line endings for all text files, you can create a .gitattributes file at the top level of your repository with the following lines (change as desired):

# Ensure all C and PHP files use LF.
*.c         eol=lf
*.php       eol=lf

which ensures that all files that Git considers to be text files have normalized (LF) line endings in the repository (normally the core.eol configuration controls which line ending you get by default).

Based on the new attribute settings, any text files containing CRLFs should be normalized by Git. If this doesn't happen automatically, you can refresh the repository manually after changing line endings, re-scanning and committing the working directory with the following steps (given a clean working directory):

$ echo "* text=auto" >> .gitattributes
$ rm .git/index     # Remove the index to force Git to
$ git reset         # re-scan the working directory
$ git status        # Show files that will be normalized
$ git add -u
$ git add .gitattributes
$ git commit -m "Introduce end-of-line normalization"

or as per GitHub docs:

git add . -u
git commit -m "Saving files before refreshing line endings"
git rm --cached -r . # Remove every file from Git's index.
git reset --hard # Rewrite the Git index to pick up all the new line endings.
git add . # Add all your changed files back, and prepare them for a commit.
git commit -m "Normalize all the line endings" # Commit the changes to your repository.

See also: @Charles Bailey post.

In addition, if you would like to exclude any files from being treated as text, unset their text attribute, e.g.

manual.pdf      -text

Or mark it explicitly as binary:

# Denote all files that are truly binary and should not be modified.
*.png binary
*.jpg binary

For a more advanced Git normalization file, see the .gitattributes in Drupal core:

# Drupal git normalization
# @see https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html
# @see https://www.drupal.org/node/1542048

# Normally these settings would be done with macro attributes for improved
# readability and easier maintenance. However macros can only be defined at the
# repository root directory. Drupal avoids making any assumptions about where it
# is installed.

# Define text file attributes.
# - Treat them as text.
# - Ensure no CRLF line-endings, neither on checkout nor on checkin.
# - Detect whitespace errors.
#   - Exposed by default in `git diff --color` on the CLI.
#   - Validate with `git diff --check`.
#   - Deny applying with `git apply --whitespace=error-all`.
#   - Fix automatically with `git apply --whitespace=fix`.

*.config  text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.css     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.dist    text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.engine  text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.html    text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=html
*.inc     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.install text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.js      text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.json    text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.lock    text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.map     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.md      text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.module  text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.php     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.po      text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.profile text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.script  text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.sh      text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.sql     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.svg     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.theme   text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
*.twig    text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.txt     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.xml     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
*.yml     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2

# Define binary file attributes.
# - Do not treat them as text.
# - Include binary diff in patches instead of "binary files differ."
*.eot     -text diff
*.exe     -text diff
*.gif     -text diff
*.gz      -text diff
*.ico     -text diff
*.jpeg    -text diff
*.jpg     -text diff
*.otf     -text diff
*.phar    -text diff
*.png     -text diff
*.svgz    -text diff
*.ttf     -text diff
*.woff    -text diff
*.woff2   -text diff

Arrington answered 15/1, 2016 at 11:37 Comment(7)
1. text=auto is misleading. You can't use text=auto and eol together. Setting eol disables automatic detection of text files. This is why you have to specify all those file types. If auto was enabled, you wouldn't need all of that. 2. You don't need text and eol=lf. eol=lf effectively sets text. - So
2nd what @So said, this config is currently wrong and dangerous if you don't explicitly mark all binary files. - Galoot
I've read that * text=auto eol=lf the first text=auto is overridden by eol=lf. Where did you find this feature? Here's my source: #29435656 - Sapro
Removed * text=auto eol=lf from the example, since it was removed from Drupal as well. Consider removing comments as well. - Arrington
How to reset git index for subdirectory - Specialist
Related: git replacing LF with CRLF. - Arrington
It's important to note that what @So said is no longer true and it always was a bug - not an intended behavior. - Banff

I was cloning the Chromium depot_tools to my Mac and all files in the working copy ended with CRLF. I found this script, which fixed my problem.

cd <your repo>
# config the local repo to use LF
git config core.eol lf
git config core.autocrlf input

# Copy files from the index to the working tree
git checkout-index --force --all

# If the above line doesn't work, delete all cached files and reset.
git rm --cached -r .
git reset --hard

Myrtlemyrvyn answered 19/2, 2022 at 5:3 Comment(0)