Why does git treat some cpp files as binary?
Here's the output of git log:

* 5a831fdb34f05edd62321d1193a96b8f96486d69      HEAD (HEAD, origin/work, work)
|  LIB/xxx.cpp                        |  Bin 592994 -> 593572 bytes
|  LIB/xxx.h                          |    5 +++++
|  LIB/bbb/xxx.h                      |    9 +++++++++
|  LIB/aaa/xxx.cpp                    |  Bin 321534 -> 321536 bytes
|  LIB/aaa/yyy.cpp                    |   31 +++++++------------------------
|  tests/aaa/xxx.cpp                  |   29 +++++++++++++++++++++++++++++
|  tests/test_xxx.vcproj              |    4 ++++
|  7 files changed, 54 insertions(+), 24 deletions(-)

Why is it treating some files as binary and others not? This causes serious problems, since git also refuses to merge them automatically, so pretty much every merge/rebase/pull becomes a pain.

Here's the repo config:

[core]
  repositoryformatversion = 0
  filemode = false
  bare = false
  logallrefupdates = true
  symlinks = false
  ignorecase = true
  hideDotFiles = dotGitOnly
[remote "origin"]
  fetch = +refs/heads/*:refs/remotes/origin/*
  url = https://xxx/project.git
[branch "master"]
  remote = origin
  merge = refs/heads/master
[branch "work"]
  remote = origin
  merge = refs/heads/work
[svn-remote "svn"]
  url = xxxx
  fetch = :refs/remotes/git-svn

Also, core.autocrlf = false in the main .gitconfig.

Edit: I set core.autocrlf to true as suggested in the comments, but this doesn't seem to affect the next merge I'm attempting (maybe it's too late to change autocrlf now, or is it unrelated to the problem?):

> git merge work
warning: Cannot merge binary files: LIB/xxx.cpp (HEAD vs. work)

warning: Cannot merge binary files: LIB/aaa/xxx.cpp (HEAD vs. work)

Auto-merging LIB/xxx.cpp
CONFLICT (content): Merge conflict in LIB/xxx.cpp
Auto-merging LIB/xxx.h
Auto-merging LIB/aaa/xxx.cpp
CONFLICT (content): Merge conflict in LIB/aaa/xxx.cpp
Automatic merge failed; fix conflicts and then commit the result.

Also, git now insists on changing line endings in a couple of files (which I do not want).

Kisor answered 19/1, 2011 at 13:15 Comment(0)
C
10

Try adding the following line to your $repo/.git/info/attributes:

*.cpp crlf diff

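The diff attribute tells git to treat matching files as text when diffing, even if their content looks binary. You can specify gitattributes per-repo, per-user and per-system.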


Basic check-list

• Does your file actually use CRLF or LF line endings?
👉 If CRLF, set core.autocrlf to true (at least for this repo).


• Does the file contain funny non-ASCII characters: umlauts, diacritics, emoji, kanji, the copyright sign ©, invisible esoteric spaces, etc.?
👉 If yes, make sure everything is encoded in UTF-8. Fussing with surrogate pairs isn't fun.


• Does the file content start with a UTF-8 BOM?
👉 Remove it now; a BOM serves no purpose in UTF-8.


• Does the file content start with a UTF-16 BOM?
👉 Too bad; I've got no good advice for you at this point; sorry. Contact your system vendor.
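
The checklist above can be run as a quick script. This is only a minimal sketch, not git's own code: it assumes git's built-in heuristic classifies content as binary when a NUL byte appears near the start (the first 8000 bytes, as far as I know), and the names inspect_bytes and inspect_file are made up for illustration.

```python
import codecs

# Assumption: git looks for a NUL byte in roughly the first 8000 bytes
# when deciding whether content is binary.
FIRST_FEW_BYTES = 8000


def inspect_bytes(data: bytes) -> dict:
    """Report the checklist properties for raw file content."""
    if data.startswith(codecs.BOM_UTF8):
        bom = "utf-8"
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        bom = "utf-16"
    else:
        bom = None
    return {
        "bom": bom,                                     # leading byte order mark, if any
        "has_nul": b"\0" in data[:FIRST_FEW_BYTES],     # likely to be seen as binary
        "has_crlf": b"\r\n" in data,                    # Windows line endings present
        "non_ascii": any(b > 0x7F for b in data),       # bytes outside plain ASCII
    }


def inspect_file(path: str) -> dict:
    """Read a file in binary mode (no newline translation) and inspect it."""
    with open(path, "rb") as f:
        return inspect_bytes(f.read())
```

Calling inspect_file on one of the files shown as Bin (for example LIB/xxx.cpp) and on one that diffs normally should show which property differs. A has_nul of True together with a utf-16 BOM would point at the encoding rather than at line endings.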

Carport answered 19/1, 2011 at 13:41 Comment(2)
It might be an encoding issue too. Git works best with UTF-8; if the file is encoded in something like UTF-16, Git will think it's binary, no matter what you set in .gitattributes. - Cheryl
It's not really clear to me why it has to be set explicitly, though. I would guess you have a copyright notice somewhere at the top of the file that contains non-ASCII characters. - Lots
C
5

Git is treating some files as binary because they have the wrong file encoding. It should work fine if you convert those files to UTF-8 (or to the same encoding as the files that are handled normally). To change the file encoding, use Notepad++ or any other tool.
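
If you'd rather script the conversion than open each file in an editor, something like the following might do. It is a sketch under the assumption that the affected files are UTF-16 (which the question does not confirm); to_utf8 and convert_file are hypothetical helper names, and the source encoding is a parameter you would have to adjust.

```python
def to_utf8(data: bytes, source_encoding: str = "utf-16") -> bytes:
    """Decode bytes from source_encoding and re-encode as UTF-8 without a BOM."""
    text = data.decode(source_encoding)
    # Some codecs (e.g. "utf-16-le") leave the BOM in the decoded text; drop it.
    if text.startswith("\ufeff"):
        text = text[1:]
    return text.encode("utf-8")


def convert_file(path: str, source_encoding: str = "utf-16") -> None:
    """Rewrite a file in place as UTF-8.

    Files are read and written in binary mode, so existing line endings
    (CRLF or LF) are preserved exactly as they were.
    """
    with open(path, "rb") as f:
        converted = to_utf8(f.read(), source_encoding)
    with open(path, "wb") as f:
        f.write(converted)
```

Because the conversion never touches "\r\n" sequences, it should not introduce the line-ending changes the question wants to avoid. Commit the converted files on both branches before merging, otherwise one side of the merge will still look binary.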

Collimator answered 29/3, 2018 at 9:47 Comment(0)
