How line ending conversions work with git core.autocrlf between different operating systems

I've read a lot of different questions and answers on Stack Overflow as well as git documentation on how the core.autocrlf setting works.

This is my understanding from what I've read:

Unix and Mac OS X clients use LF line endings (classic Mac OS, before OS X, used CR).
Windows clients use CRLF line endings.

When core.autocrlf is set to true on the client, the git repository always stores files with LF line endings. On clients that use non-LF line endings (i.e. Windows), line endings are converted back and forth on checkout and commit, no matter which line endings the files have on the client (this disagrees with Tim Clem's definition - see update below).

Here is a matrix that tries to document the same for the 'input' and 'false' settings of core.autocrlf with question marks where I'm not sure of line ending conversion behavior.

My questions are:

  1. What should the question marks be?
  2. Is this matrix correct for the "non-question marks"?

I'll update the question marks from the answers as consensus appears to be formed.

                       core.autocrlf value
            true            input              false
----------------------------------------------------------
commit   |  convert           ?                  ?
new      |  to LF      (convert to LF?)   (no conversion?)

commit   |  convert to        ?                 no 
existing |  LF         (convert to LF?)     conversion

checkout |  convert to        ?                 no
existing |  CRLF       (no conversion?)     conversion

I'm not really looking for opinions on the pros and cons of the various settings. I'm just looking for data which makes it clear how to expect git to operate with each of the three settings.

--

Update 04/17/2012: After reading the article by Tim Clem linked by JJD in the comments, I have modified some of the "unknown" values in the table above, and changed the "checkout existing | true" cell to "convert to CRLF" instead of "convert to client". Here are the definitions he gives, which are clearer than anything I've seen elsewhere:

core.autocrlf = false

This is the default, but most people are encouraged to change this immediately. The result of using false is that Git doesn’t ever mess with line endings on your file. You can check in files with LF or CRLF or CR or some random mix of those three and Git does not care. This can make diffs harder to read and merges more difficult. Most people working in a Unix/Linux world use this value because they don’t have CRLF problems and they don’t need Git to be doing extra work whenever files are written to the object database or written out into the working directory.

core.autocrlf = true

This means that Git will process all text files and make sure that CRLF is replaced with LF when writing that file to the object database and turn all LF back into CRLF when writing out into the working directory. This is the recommended setting on Windows because it ensures that your repository can be used on other platforms while retaining CRLF in your working directory.

core.autocrlf = input

This means that Git will process all text files and make sure that CRLF is replaced with LF when writing that file to the object database. It will not, however, do the reverse. When you read files back out of the object database and write them into the working directory they will still have LFs to denote the end of line. This setting is generally used on Unix/Linux/OS X to prevent CRLFs from getting written into the repository. The idea being that if you pasted code from a web browser and accidentally got CRLFs into one of your files, Git would make sure they were replaced with LFs when you wrote to the object database.
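The three definitions above can be sketched as a pair of byte-level functions. This is only a toy model in Python of the behavior Tim Clem describes, not git's implementation; the function names are made up for illustration, and the "collapse CRLF first" step on checkout is an assumption added so that a blob that already contains CRLF does not end up with CR CR LF.

```python
def to_repository(data: bytes, autocrlf: str) -> bytes:
    """Model of what gets written to the object database."""
    if autocrlf in ("true", "input"):
        # Both settings replace CRLF with LF on the way in.
        return data.replace(b"\r\n", b"\n")
    # "false": the bytes are stored exactly as they are.
    return data


def to_working_tree(blob: bytes, autocrlf: str) -> bytes:
    """Model of what gets written out to the working directory."""
    if autocrlf == "true":
        # Assumption: collapse any existing CRLF first, then expand every LF,
        # so an existing CRLF is not turned into CR CR LF.
        return blob.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    # "input" and "false": the blob is written out unchanged.
    return blob
```

For example, `to_working_tree(to_repository(b"a\r\nb\r\n", "input"), "input")` gives back LF-only content, which is the one-way behavior described for input.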

Tim's article is excellent. The only thing I can think of that is missing is that he assumes the repository is in LF format, which is not necessarily true, especially for Windows-only projects.

Comparing Tim's article to the highest voted answer to date by jmlane shows perfect agreement on the true and input settings and disagreement on the false setting.

Everest answered 8/7, 2010 at 18:39 Comment(8)
Keeping autocrlf to false seems so much easier ;) #2333924 - Criticism
@VonC: I've read that and I think I understand it, but I don't necessarily get to make the choice. I work with git repositories that I don't control who require that I set the value in a certain way. - Everest
@Michael: and, depending on the Git server version, the rules about eol and autocrlf are about to change in the upcoming 1.7.2! See article.gmane.org/gmane.linux.kernel/1007412 - Criticism
Wouldn't it be nice if Windows normalised to LF too? Mac used to be CR (prior v10) but is now normalised to LF. - Dutcher
I need to add a link to the great article of Timothy Clem - please read all of Mind the End of Your Line. - Baker
Scenario: I'm a split Linux/Windows developer. I only use text editors that can recognize both types of line endings (e.g. vim, eclipse). I only need (want) to work with files ending in LF. I currently have core.autocrlf=input set in my global git config. Am I good to go? Will I ever have a conflict? - Papilla
It also seems to be here - Homogeny
Related: Disable git EOL Conversions - Backwards
C
159

The best explanation of how core.autocrlf works is found on the gitattributes man page, in the text attribute section.

This is how core.autocrlf appears to work currently (or at least since v1.7.2 from what I am aware):

  • core.autocrlf = true
    1. Text files checked out from the repository that have only LF characters are normalized to CRLF in your working tree; files that contain CRLF in the repository will not be touched.
    2. Text files that have only LF characters in the repository are normalized from CRLF to LF when committed back to the repository. Files that contain CRLF in the repository will be committed untouched.
  • core.autocrlf = input
    1. Text files checked out from the repository will keep their original EOL characters in your working tree.
    2. Text files in your working tree with CRLF characters are normalized to LF when committed back to the repository.
  • core.autocrlf = false
    1. core.eol dictates EOL characters in the text files of your working tree.
    2. core.eol = native by default, which means working tree EOLs will depend on where git is running: CRLF on a Windows machine, or LF in *nix.
    3. Repository gitattributes settings determine EOL character normalization for commits to the repository (default is normalization to LF characters).

I've only just recently researched this issue and I also find the situation to be very convoluted. The core.eol setting definitely helped clarify how EOL characters are handled by git.

Chamorro answered 13/12, 2010 at 2:45 Comment(6)
For autocrlf=true, shouldn't it be the following? Text files that have only CRLF EOL characters in the repository, are normalized from CRLF to LF when committed back to the repository. Files that contain LF in the repository will be committed untouched. - Hakluyt
For me, even if autocrlf=false git was converting the EOL to CRLF. After reading this answer I realized that my .gitattributes file had text=auto set which was causing the trouble. - Roxi
For core.autocrlf = false, if I don't have a gitattributes file does it mean that there will be no normalization? Or does it mean it will use the default normalization? - Fortuitism
Shouldn't the .gitattributes file take precedence over the core.autocrlf setting? - Jenifer
It should be noted that after changing this setting, it helps to run git rm --cached -r . and then git reset --hard in order to rewrite all files in the working tree. You will lose uncommitted changes by doing this! - Pledget
@Roxi got it right; as @Chamorro states, when autocrlf=false, "core.eol dictates EOL characters in the text files of your working tree…". In fact, even the least meddlesome git config, which is autocrlf=input, still entails (again, thanks to core.eol) "[CRLF → LF] when committed…", meaning you can't simply add a new CRLF file into the repo as-is. The real culprit is what's designating various files for such abuse in the first place…, namely the .gitattributes entry * text[=auto]. If you want git to leave your files alone, start by disabling that: * -text. - Garonne
G
81

The issue of EOLs in mixed-platform projects has been making my life miserable for a long time. The problems usually arise when there are already files with different and mixed EOLs in the repo. This means that:

  1. The repo may have different files with different EOLs
  2. Some files in the repo may have mixed EOL, e.g. a combination of CRLF and LF in the same file.

How this happens is not the issue here, but it does happen.
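To find out which of these two situations a given file is in, its line endings can be counted directly. The following is a minimal sketch in Python, assuming the file content has been read in binary mode; `count_eols` and `classify` are hypothetical helper names, not part of git.

```python
import re
from collections import Counter

# CRLF is listed first so that its CR and LF are not counted again
# as a lone CR followed by a lone LF.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}


def count_eols(data: bytes) -> Counter:
    """Count each kind of line ending found in the raw bytes."""
    return Counter(_NAMES[m.group()] for m in _EOL.finditer(data))


def classify(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'CR', 'none', or a 'mixed (...)' label."""
    kinds = sorted(count_eols(data))
    if not kinds:
        return "none"
    if len(kinds) == 1:
        return kinds[0]
    return "mixed (" + "+".join(kinds) + ")"
```

A file on disk could be checked with something like `classify(pathlib.Path(name).read_bytes())`; reading as bytes matters, because text mode may translate line endings before they can be counted.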

I ran some conversion tests on Windows for the various modes and their combinations.
Here is what I got, in a slightly modified table:

                 | Resulting conversion when       | Resulting conversion when 
                 | committing files with various   | checking out FROM repo - 
                 | EOLs INTO repo and              | with mixed files in it and
                 |  core.autocrlf value:           | core.autocrlf value:           
--------------------------------------------------------------------------------
File             | true       | input      | false | true       | input | false
--------------------------------------------------------------------------------
Windows-CRLF     | CRLF -> LF | CRLF -> LF | as-is | as-is      | as-is | as-is
Unix -LF         | as-is      | as-is      | as-is | LF -> CRLF | as-is | as-is
Mac  -CR         | as-is      | as-is      | as-is | as-is      | as-is | as-is
Mixed-CRLF+LF    | as-is      | as-is      | as-is | as-is      | as-is | as-is
Mixed-CRLF+LF+CR | as-is      | as-is      | as-is | as-is      | as-is | as-is

As you can see, there are 2 cases when conversion happens on commit (3 left columns). In the rest of the cases the files are committed as-is.

Upon checkout (3 right columns), conversion happens in only one case, when:

  1. core.autocrlf is true and
  2. the file in the repo has the LF EOL.

Most surprising for me, and I suspect the cause of many EOL problems, is that there is no configuration in which mixed EOLs like CRLF+LF get normalized.

Note also that "old" Mac EOLs (CR only) never get converted either.
This means that if a badly written EOL conversion script tries to convert a mixed-ending file with CRLFs+LFs by just converting LFs to CRLFs, it will leave the file in a mixed mode, with "lonely" CRs wherever a CRLF was converted to CRCRLF.
Git will then not convert anything, even in true mode, and EOL havoc continues. This actually happened to me and messed up my files really badly, since some editors and compilers (e.g. VS2010) don't like Mac EOLs.

I guess the only way to really handle these problems is to occasionally normalize the whole repo by checking out all the files in input or false mode, running a proper normalization and re-committing the changed files (if any). On Windows, presumably resume working with core.autocrlf true.
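To make the difference between a badly written conversion and a "proper normalization" concrete, here is a small Python sketch. Both function names are made up for illustration; the second one simply treats CRLF, a lone CR, and a lone LF each as one line ending, which is an assumption about what "proper" should mean here.

```python
import re


def naive_lf_to_crlf(data: bytes) -> bytes:
    # The bug described above: every LF gets a CR put in front of it,
    # including the LF that is already part of a CRLF pair.
    return data.replace(b"\n", b"\r\n")


# CRLF must come first in the alternation so it is matched as one unit.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")


def normalize_eols(data: bytes, eol: bytes = b"\n") -> bytes:
    # Rewrite every line ending, whatever its style, to the requested one.
    # A function is used as the replacement so the bytes are inserted verbatim.
    return _ANY_EOL.sub(lambda match: eol, data)
```

Running `naive_lf_to_crlf` on a CRLF+LF file produces the CR CR LF sequences mentioned above. Note that `normalize_eols` cannot fully repair a file that is already damaged that way: the stray CR counts as a line ending of its own, so an extra empty line appears.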

Gombroon answered 26/12, 2012 at 11:23 Comment(1)
Excellent answer, but one sentence I cannot agree with is "On Windows, presumably resume working with core.autocrlf true." I personally believe that input should always be used. - Chavira
E
80

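How a given core.autocrlf value behaves does not depend on the operating system,
but the configured default can differ: Git for Windows installations commonly set it to true, while Git's built-in default, which is what you usually get on Linux, is false.
I explored the three possible values for the checkout and commit cases.

Here is the resulting table: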

╔═══════════════╦══════════════╦══════════════╦══════════════╗
║ core.autocrlf ║     false    ║     input    ║     true     ║
╠═══════════════╬══════════════╬══════════════╬══════════════╣
║               ║ LF   => LF   ║ LF   => LF   ║ LF   => CRLF ║
║ git checkout  ║ CR   => CR   ║ CR   => CR   ║ CR   => CR   ║
║               ║ CRLF => CRLF ║ CRLF => CRLF ║ CRLF => CRLF ║
╠═══════════════╬══════════════╬══════════════╬══════════════╣
║               ║ LF   => LF   ║ LF   => LF   ║ LF   => LF   ║
║ git commit    ║ CR   => CR   ║ CR   => CR   ║ CR   => CR   ║
║               ║ CRLF => CRLF ║ CRLF => LF   ║ CRLF => LF   ║
╚═══════════════╩══════════════╩══════════════╩══════════════╝
Ethnarch answered 22/12, 2016 at 11:41 Comment(2)
Short summary in words: Files with CR alone are never touched. false never touches line endings. true always commits as LF and checks out as CRLF. And input always commits as LF and checks out as-is. - Blowhard
@Ethnarch it does depend on the operating system -> when autocrlf = false and eol=native. - Olette
C
41

Things are about to change on the "eol conversion" front, with the upcoming Git 1.7.2:

A new config setting core.eol is being added/evolved:

This is a replacement for the 'Add "core.eol" config variable' commit that's currently in pu (the last one in my series).
Instead of implying that "core.autocrlf=true" is a replacement for "* text=auto", it makes explicit the fact that autocrlf is only for users who want to work with CRLFs in their working directory on a repository that doesn't have text file normalization.
When it is enabled, "core.eol" is ignored.

Introduce a new configuration variable, "core.eol", that allows the user to set which line endings to use for end-of-line-normalized files in the working directory.
It defaults to "native", which means CRLF on Windows and LF everywhere else. Note that "core.autocrlf" overrides core.eol.
This means that:

[core]
  autocrlf = true

puts CRLFs in the working directory even if core.eol is set to "lf".

core.eol:

Sets the line ending type to use in the working directory for files that have the text property set.
Alternatives are 'lf', 'crlf' and 'native', which uses the platform's native line ending.
The default value is native.


Other evolutions are being considered:

For 1.8, I would consider making core.autocrlf just turn on normalization and leave the working directory line ending decision to core.eol, but that will break people's setups.


git 2.8 (March 2016) improves the way core.autocrlf influences the eol:

See commit 817a0c7 (23 Feb 2016), commit 6e336a5, commit df747b8 (10 Feb 2016), and commit 4b4024f, commit bb211b4, commit 92cce13, commit 320d39c (05 Feb 2016) by Torsten Bögershausen (tboegi).
(Merged by Junio C Hamano -- gitster -- in commit c6b94eb, 26 Feb 2016)

convert.c: refactor crlf_action

Refactor the determination and usage of crlf_action.
Today, when no "crlf" attribute is set on a file, crlf_action is set to CRLF_GUESS. Use CRLF_UNDEFINED instead, and search for "text" or "eol" as before.

Replace the old CRLF_GUESS usage:

CRLF_GUESS && core.autocrlf=true -> CRLF_AUTO_CRLF
CRLF_GUESS && core.autocrlf=false -> CRLF_BINARY
CRLF_GUESS && core.autocrlf=input -> CRLF_AUTO_INPUT

Make more clear, what is what, by defining:

- CRLF_UNDEFINED : No attributes set. Temporarily used, until core.autocrlf
                   and core.eol are evaluated and one of CRLF_BINARY,
                   CRLF_AUTO_INPUT or CRLF_AUTO_CRLF is selected
- CRLF_BINARY    : No processing of line endings.
- CRLF_TEXT      : attribute "text" is set, line endings are processed.
- CRLF_TEXT_INPUT: attribute "input" or "eol=lf" is set. This implies text.
- CRLF_TEXT_CRLF : attribute "eol=crlf" is set. This implies text.
- CRLF_AUTO      : attribute "auto" is set.
- CRLF_AUTO_INPUT: core.autocrlf=input (no attributes)
- CRLF_AUTO_CRLF : core.autocrlf=true  (no attributes)
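
The replacement of CRLF_GUESS listed above amounts to a small lookup. The sketch below restates it in Python purely as an illustration; git's real code is C in convert.c, and the class and function names here are invented. Only the three mappings quoted in the commit message are modeled, so the role of core.eol is left out.

```python
from enum import Enum


class CrlfAction(Enum):
    # Only the actions involved in the no-attributes case are listed.
    UNDEFINED = "CRLF_UNDEFINED"
    BINARY = "CRLF_BINARY"
    AUTO_INPUT = "CRLF_AUTO_INPUT"
    AUTO_CRLF = "CRLF_AUTO_CRLF"


# Mapping taken from the three "CRLF_GUESS && core.autocrlf=..." lines.
_BY_AUTOCRLF = {
    "true": CrlfAction.AUTO_CRLF,
    "false": CrlfAction.BINARY,
    "input": CrlfAction.AUTO_INPUT,
}


def resolve_undefined(autocrlf: str) -> CrlfAction:
    """Pick the final action for a path with no text/eol attributes."""
    if autocrlf not in _BY_AUTOCRLF:
        raise ValueError(f"unknown core.autocrlf value: {autocrlf!r}")
    return _BY_AUTOCRLF[autocrlf]
```

Read this way, core.autocrlf=false with no attributes means the file is treated like CRLF_BINARY, i.e. no line ending processing at all.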

As torek adds in the comments:

all these translations (any EOL conversion from eol= or autocrlf settings, and "clean" filters) are run when files move from work-tree to index, i.e., during git add rather than at git commit time.
(Note that git commit -a or --only or --include do add files to the index at that time, though.)

For more on that, see "What is difference between autocrlf and eol".

Criticism answered 9/7, 2010 at 4:21 Comment(15)
This, unfortunately, doesn't add clarity for me. It seems like they are saying there are problems with the current implementation (it's not clear what those problems are) and they are increasing the complexity in an effort to solve those unspecified problems. In my opinion, the core.autocrlf setting is already overly complex and under-documented and that situation appears to be getting worse. Thanks again for the heads up. - Everest
This does not seem like a satisfactory solution, and seems to have the same problems as core.autocrlf. My preference would be if git would never automatically modify anything, but it would warn the user who wants to add or commit the wrong line endings. So you would need a commandline option to allow "git add" to add the "wrong" line endings. (probably git add is the better place to check this than git commit) - Planetesimal
This would force the respective user to change their editor settings and really take care of the problem, while still allowing the "wrong" line endings to be left in files from 3rd parties, or in files that are already checked into the repository. - Planetesimal
@Planetesimal again, I agree. But core.eol is about "automatically modifying" only what you explicitly declare in a .gitattributes file. This is different from core.autocrlf which applies to any file in the repo. It is a declarative process. - Criticism
Ah, ok. So if I understand correctly, if .gitattributes is empty, core.autocrlf will assume some default file types to always convert, whereas core.eol will assume nothing. But once you create a .gitattributes, then both core.eol and core.autocrlf will behave the same? - Planetesimal
And yes you agree, but I still wanted to say this here because it seems a more relevant place to say it than in the other question. - Planetesimal
@donquixote: I realize this is quite old but I only read your comment now. In fact, all these translations (any EOL conversion from eol= or autocrlf settings, and "clean" filters) are run when files move from work-tree to index, i.e., during git add rather than at git commit time. (Note that git commit -a or --only or --include do add files to the index at that time, though.) For what it's worth, you and I and Linus Torvalds all hate the idea of a VCS ever modifying what's being committed. But there are all those Windows users... :-) - Playback
@Playback Thank you for the comment. I have included it in the answer for more visibility. - Criticism
That's a great post. Thanks for all the in-depth clarification. However, I have one question that still remains unanswered (at least to me). I want to enforce autocrlf=input through the gitattributes file, how can I do that? - Eachelle
@Eachelle You would need a core.eol directive, associated to '*' (for all the files) See git-scm.com/docs/gitattributes#gitattributes-Settostringvaluelf: eol=lf if you are on Linux. - Criticism
@Criticism if my understanding is correct, core.eol implies that every file associated with that filter would have its line endings converted to lf in my working directory. I don't want that. What I want is to checkout as is and commit with lf which is what core.autocrlf=input does if my understanding is correct again. - Eachelle
@Eachelle I agree. That means core.eol might not be the right tool. - Criticism
@Criticism which brings us back to my original question. Is there a way to enforce core.autocrlf=input from within the .gitattributes in order to propagate the policy throughout my colleagues? - Eachelle
@Eachelle Good question; could you ask it as an independent one, since I am not sure of the answer based on what is documented above? - Criticism
@Criticism Thanks for the immediate responses. I just did post an independent question here: #42668496 - Eachelle
R
10

Here is my understanding of it so far, in case it helps someone.

core.autocrlf=true and core.safecrlf = true

You have a repository where all the line endings are the same, but you work on different platforms. Git will make sure your line endings are converted to the default for your platform. Why does this matter? Let's say you create a new file. The text editor on your platform will use its default line endings. When you check it in, if you don't have core.autocrlf set to true, you've introduced a line ending inconsistency for someone on a platform that defaults to a different line ending. I always set safecrlf too because I would like to know that the CRLF conversion is reversible. With these two settings, git is modifying your files, but it verifies that the modifications are reversible.

core.autocrlf=false

You have a repository that already has mixed line endings checked in, and fixing the incorrect line endings could break other things. It's best not to tell git to convert line endings in this case, because doing so would exacerbate the very problem the setting was designed to solve: hard-to-read diffs and painful merges. With this setting, git doesn't modify your files.

core.autocrlf=input

I don't use this, because its purpose is to cover the case where you created a file with CRLF line endings on a platform that defaults to LF line endings. I prefer instead to make my text editor always save new files with the platform's line ending defaults.

Romo answered 20/1, 2014 at 20:27 Comment(0)
F
6

No, the @jmlane answer is wrong.

For Checkin (git add, git commit):

  1. if text property is Set, or set to the value 'auto': the conversion happens even if the file has already been committed with 'CRLF'
  2. if text property is Unset: nothing happens (the same goes for Checkout)
  3. if text property is Unspecified: conversion depends on core.autocrlf
    1. if autocrlf = input or autocrlf = true: the conversion only happens when the file in the repository is 'LF'; if it is already 'CRLF', nothing happens
    2. if autocrlf = false: nothing happens

For Checkout:

  1. if text property is Unset: nothing happens
  2. if text property is Set, or set to the value 'auto': it depends on core.autocrlf and core.eol
    1. core.autocrlf = input: nothing happens
    2. core.autocrlf = true: the conversion only happens when the file in the repository is 'LF', 'LF' -> 'CRLF'
    3. core.autocrlf = false: the conversion only happens when the file in the repository is 'LF', 'LF' -> core.eol
  3. if text property is Unspecified: it depends on core.autocrlf
    1. core.autocrlf = input: the same as 2.1
    2. core.autocrlf = true: the same as 2.2
    3. core.autocrlf = false: nothing happens; core.eol has no effect when text property is Unspecified

Default behavior

So the default behavior is text property Unspecified and core.autocrlf = false:

  1. for checkin, nothing happens
  2. for checkout, nothing happens

Conclusions

  1. if text property is set, checkin behavior depends on the property itself, not on autocrlf
  2. autocrlf and core.eol govern checkout behavior, and autocrlf takes precedence over core.eol
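
The rules in this answer can be written down as two small decision functions. This is only a Python restatement of the lists above, under the assumption that they are read literally; the function names, the string labels, and the `repo_has_crlf` flag are invented for the sketch and do not correspond to anything in git.

```python
def checkin_action(text_attr: str, autocrlf: str, repo_has_crlf: bool) -> str:
    """text_attr is 'set', 'auto', 'unset' or 'unspecified'."""
    if text_attr in ("set", "auto"):
        # Converted even if the file was already committed with CRLF.
        return "CRLF -> LF"
    if text_attr == "unset":
        return "none"
    # Unspecified: falls back to core.autocrlf.
    if autocrlf in ("input", "true") and not repo_has_crlf:
        return "CRLF -> LF"
    return "none"


def checkout_action(text_attr: str, autocrlf: str, repo_has_crlf: bool) -> str:
    """Same text_attr values as above; describes what happens to the blob."""
    if text_attr == "unset" or autocrlf == "input":
        return "none"
    if repo_has_crlf:
        # Conversion only happens when the file in the repository is LF.
        return "none"
    if autocrlf == "true":
        return "LF -> CRLF"
    # autocrlf = false: core.eol only matters when the text property is set.
    if text_attr in ("set", "auto"):
        return "LF -> core.eol"
    return "none"
```

With `text_attr="unspecified"` and `autocrlf="false"`, both functions return "none", which matches the default behavior stated above.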
Flown answered 8/3, 2018 at 2:55 Comment(0)
B
3

I did some tests on both Linux and Windows, using a test file containing lines ending in LF and also lines ending in CRLF.
The file is committed, removed, and then checked out. The value of core.autocrlf is set before the commit and again before the checkout. The results are below.

commit core.autocrlf false, remove, checkout core.autocrlf false: LF=>LF   CRLF=>CRLF
commit core.autocrlf false, remove, checkout core.autocrlf input: LF=>LF   CRLF=>CRLF
commit core.autocrlf false, remove, checkout core.autocrlf true : LF=>LF   CRLF=>CRLF
commit core.autocrlf input, remove, checkout core.autocrlf false: LF=>LF   CRLF=>LF
commit core.autocrlf input, remove, checkout core.autocrlf input: LF=>LF   CRLF=>LF
commit core.autocrlf input, remove, checkout core.autocrlf true : LF=>CRLF CRLF=>CRLF
commit core.autocrlf true , remove, checkout core.autocrlf false: LF=>LF   CRLF=>LF
commit core.autocrlf true , remove, checkout core.autocrlf input: LF=>LF   CRLF=>LF
commit core.autocrlf true , remove, checkout core.autocrlf true : LF=>CRLF CRLF=>CRLF
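
The nine results above are consistent with a very small model, sketched here in Python. It does not run git; it just encodes two rules that would explain the table. The rule that a blob already containing a CR is left alone on checkout is an assumption inferred from the third row, and all names are hypothetical.

```python
# Line 1 ends in LF, line 2 ends in CRLF, like the test file described above.
SAMPLE = b"first\nsecond\r\n"


def commit(data: bytes, autocrlf: str) -> bytes:
    # input and true both store CRLF as LF; false stores the bytes as-is.
    if autocrlf in ("input", "true"):
        return data.replace(b"\r\n", b"\n")
    return data


def checkout(blob: bytes, autocrlf: str) -> bytes:
    # Assumption: only a blob without any CR is expanded to CRLF.
    if autocrlf == "true" and b"\r" not in blob:
        return blob.replace(b"\n", b"\r\n")
    return blob


def ending(line: bytes) -> str:
    return "CRLF" if line.endswith(b"\r\n") else "LF"


def round_trip(commit_mode: str, checkout_mode: str) -> str:
    """Commit SAMPLE, check it out again, and report each line's ending."""
    result = checkout(commit(SAMPLE, commit_mode), checkout_mode)
    first, second = result.splitlines(keepends=True)
    return f"LF=>{ending(first)} CRLF=>{ending(second)}"
```

Calling `round_trip` for every pair of modes gives the same right-hand column as the table, which suggests the outcome is decided by what ended up in the repository at commit time plus the checkout-time setting.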
Barry answered 23/2, 2017 at 8:51 Comment(0)
W
2

The statement core.autocrlf=true leading to CRLF -> LF on commit is all wrong! It's not all that simple, as you'll see...

The docs say the setting corresponds to... “text=auto in .gitattributes and core.eol being set to crlf in git config” ...meaning what exactly?

Meaning that, if a file does not have a .gitattributes text attribute set, and if core.autocrlf is true, it now depends on whether the file being committed is new (in that case, yes, it will get normalized to LF in the git repo database), or whether it is an existing file that you edited and are now committing (in which case NOTHING will happen... unless you run git add --renormalize . in which case it will get normalized in the git repo database).

You see... the whole mechanism only happens to a file for which a .gitattributes has not placed a variant of the text attribute: text, -text, text=auto.

So what you should really be looking at is using .gitattributes with a default setting on all your files, being either:

* -text
# followed by specialization

which will default all (except specializations) to as-is and override core.autocrlf completely, or using a default of:

*  text=auto
# followed by specialization

meaning that all files (except specializations) that git auto-detects as non-binary (text), and which have LF in the git database [see note 1], will get CRLF whenever:
    • core.autocrlf is true, or
    • core.eol is crlf, or
    • core.eol is native (default) and you're on a Windows platform.
In all other cases, you get LF.
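
The three conditions above can be condensed into one function. This is a sketch of the rule exactly as stated in this answer, written in Python for illustration; `worktree_eol` and its parameters are hypothetical names, and the platform is passed in explicitly rather than detected.

```python
def worktree_eol(autocrlf: str, core_eol: str = "native",
                 on_windows: bool = False) -> str:
    """Line ending that a text=auto file with LF in the repo gets on checkout."""
    if autocrlf == "true":
        return "CRLF"
    if core_eol == "crlf":
        return "CRLF"
    if core_eol == "native" and on_windows:
        return "CRLF"
    # In all other cases, you get LF.
    return "LF"
```

To use the current platform, something like `worktree_eol("false", "native", on_windows=(os.name == "nt"))` would do, after importing `os`.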

What specializations do I mean? For example having .bat files be CRLF and .sh files be LF via either:

*.sh           text eol=lf

# *.bat
*.[bB][aA][tT] text eol=crlf

or

# *.sh are committed correctly as-is (LF)
*.sh           -text

# *.bat are committed correctly as-is (CRLF)
*.[bB][aA][tT] -text
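
The `*.[bB][aA][tT]` pattern is spelled that way so that it matches .bat, .BAT, .Bat and so on. The Python sketch below illustrates the idea with `fnmatchcase` from the standard library. It is only an approximation: gitattributes uses its own gitignore-style matching, and the rule list and function name here are made up for the example.

```python
from fnmatch import fnmatchcase

# Patterns and eol values copied from the first specialization example.
RULES = [
    ("*.sh", "lf"),
    ("*.[bB][aA][tT]", "crlf"),
]


def eol_for(filename: str):
    """Return the eol value of the last matching rule, or None."""
    result = None
    for pattern, eol in RULES:
        # Case-sensitive match; the character classes cover case variants.
        if fnmatchcase(filename, pattern):
            result = eol  # a later matching line overrides an earlier one
    return result
```

Here `eol_for("run.BAT")` and `eol_for("run.bat")` both give "crlf", while `eol_for("Build.SH")` gives None, because the plain `*.sh` pattern is matched case-sensitively in this sketch.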

So yes... it's all not so simple.


[note 1]:
This will be the case for all files matching the text=auto attribute (i.e. not having some other specialization), since I assume your repo was properly normalized when the .gitattributes file was created.

Wroth answered 28/3, 2021 at 20:45 Comment(0)
