Is git only good for text files/source code? [duplicate]

A very noob question.

Is it also good to use git for my whole project (as kind of synchronization between different storages, version control is not the main point), including images, pdfs, Word documents, probably even some exe-files?

How does it track changes in pdfs, images, exe-files, if at all? If it simply stores the whole changed file whenever there is some difference from the HEAD version, the repository can become quite large after a few commits. Or does it still manage to save only the incremental changes in files other than text files?

The bottom line: is git good (or at least acceptable) for synchronization of big projects? For me it will be enough if it's not worse than Dropbox etc. (in terms of end result, GUI is not an issue).

Calliecalligraphy answered 3/12, 2014 at 9:51 Comment(0)
O
8

Git can see that you changed your non-text files, but you won't be able to get the best of git in that case. With text files you can see what is the actual difference between different versions / commits.
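To illustrate why text and non-text files are treated differently: by default Git decides whether to show a line-by-line diff by checking whether a NUL byte occurs near the start of the content. The sketch below approximates that check; the constant and function names are my own, and the 8000-byte window is an assumption about Git's default behaviour (attributes in .gitattributes can override it).

```python
# Rough approximation of Git's default "is this binary?" heuristic.
# FIRST_FEW_BYTES and looks_binary are illustrative names, not Git's API.
FIRST_FEW_BYTES = 8000  # assumed size of the window Git inspects


def looks_binary(data: bytes) -> bool:
    """Return True if a NUL byte occurs within the first few bytes."""
    return b"\0" in data[:FIRST_FEW_BYTES]


if __name__ == "__main__":
    # Plain text contains no NUL bytes, so it would be diffed line by line.
    print(looks_binary(b"plain text\n"))
    # A PNG header contains NUL bytes, so it would be reported as binary.
    print(looks_binary(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))
```

Files flagged this way are still versioned like any other file; only the diff display changes.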

That being said, you could try this solution for image diffs in git. I am sure there is software to display differences between the other file types you may need, wherever checking for differences makes sense.

Compared to Dropbox, git should be better, because you can write commit messages that say what was done in each particular change, and you can create feature branches. It is a bit more complicated, though, because of its purpose, namely keeping track of source code differences between versions.

EDIT:

A̶n̶d̶ ̶n̶o̶,̶ ̶G̶i̶t̶ ̶d̶o̶e̶s̶ ̶n̶o̶t̶ ̶s̶a̶v̶e̶ ̶i̶n̶c̶r̶e̶m̶e̶n̶t̶a̶l̶ ̶c̶h̶a̶n̶g̶e̶s̶ ̶f̶o̶r̶ ̶n̶o̶n̶-̶t̶e̶x̶t̶ ̶f̶i̶l̶e̶s̶,̶ ̶b̶u̶t̶ ̶n̶e̶i̶t̶h̶e̶r̶ ̶d̶o̶e̶s̶ ̶d̶r̶o̶p̶b̶o̶x̶,̶ ̶a̶s̶ ̶f̶a̶r̶ ̶a̶s̶ ̶I̶ ̶k̶n̶o̶w̶.̶

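It looks like git stores every file, text or not, as a sequence of bytes. Each commit records the whole file, but when git packs the repository it can store similar versions as deltas, so it does not necessarily keep a full copy of every version. Already-compressed formats such as PNG or PDF tend to delta poorly, though. Any good difftool like meld or Beyond Compare should be able to show the difference between two images; for instance, I was able to see the differences between two png images with Beyond Compare.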

It also seems to do a good job with PDF files, but, as with exe files, you should not track those file types with version control. Instead of PDFs, keep track of their source code, for instance LaTeX files (which are plain text). Due to their nature, compiled files such as exe files are not suited for version control: even if you edit the bytes of the file directly, you won't accomplish much, because you are supposed to edit the source code.

Outbreed answered 3/12, 2014 at 10:3 Comment(0)
B
8

Binary files fall into one of these categories:

  1. Binary files that can be reproduced from the source code. There is no point in storing and keeping track of them. You don't typically edit a .exe file to make changes. Just be sure to store all the build scripts needed to reproduce the build, and add the binaries to .gitignore.

  2. Binary files that can be edited and compared. For example, office files. There are some workarounds, such as converting them to text as shown here. Some Git IDEs might allow an external tool to produce the diffs.

  3. Binary files that can be edited but are hard to compare. How would you represent the diff of two videos? Possible but hard. Depending on the size, I would add the files to Git. You still get most of the benefits of Git, like keeping track of different versions, knowing when a file has changed, etc. The price you pay is a bigger repository. Comparing will require human eyes opening the files anyway.

  4. Binary files that won't typically be edited and are used as inputs. For example, a .jar file used as a dependency. In this case you need the metadata describing what this binary object was and how to get it. You can try systems like Maven, where you track dependencies by storing the pom.xml and add the binaries to .gitignore. Other files can be tracked by a manual Dependencies.txt file (This version needs My.Lib1.jar version 10.32.3 ...). You will need the discipline to update the file with every change. It will help you know what the binary changes were in each version.
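For the fourth category, the manual dependency file can be generated rather than typed by hand. This is only a minimal sketch under my own assumptions: the function names, the "*.jar" pattern and the "hash, two spaces, file name" line layout are illustrative choices, not something Git or Maven prescribes.

```python
# Hypothetical helper: list the binaries in a directory together with their
# SHA-256 hashes, so the small text listing can be committed instead of the
# binaries themselves.
import hashlib
import sys
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large binaries are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(lib_dir: Path, pattern: str = "*.jar") -> List[str]:
    """Return one 'hash  name' line per matching file, sorted by path."""
    return [f"{sha256_of(p)}  {p.name}" for p in sorted(lib_dir.glob(pattern))]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python manifest.py path/to/libs > manifest.txt
    print("\n".join(build_manifest(Path(sys.argv[1]))))
```

Committing the resulting text file gives a readable diff whenever a dependency is swapped, while the binaries themselves stay in .gitignore.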

Buyse answered 3/12, 2014 at 10:40 Comment(0)
K
4

Git can be used for big projects, but you shouldn't check in generated files (like pdf, exe, etc). Add a .gitignore file (google for details) that lists the files git should ignore.

If you want to include Word files (or similar), which are binary files but aren't generated, there are ways to tell git how to "diff" such files. This means that you tell git how it can compare two Word files and decide how to merge two different Word files. Again, google will be your friend for finding out the details of how to do this.
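One way to do the "diff" part is a textconv filter: a program that receives a file name and prints a text version, which git then diffs. The sketch below is a minimal, assumed converter for .docx files (a zip archive containing word/document.xml); the script name docx2txt.py and the driver name docx are hypothetical. It would be wired up with a line such as `*.docx diff=docx` in .gitattributes and `git config diff.docx.textconv "python3 docx2txt.py"`.

```python
# Hypothetical textconv filter for .docx files: prints the document's
# paragraphs as plain text so that git can show a readable diff.
import sys
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source) -> str:
    """Return the text of a .docx file (path or file object), one paragraph per line."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join the pieces back together.
        paragraphs.append("".join(node.text or "" for node in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)


if __name__ == "__main__" and len(sys.argv) > 1:
    # git passes the path of the file to convert as the single argument.
    print(docx_to_text(sys.argv[1]))
```

This only changes what git diff displays; the .docx itself is still stored as-is, and merging two edited Word files is a separate problem that a textconv filter does not solve.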

Kat answered 3/12, 2014 at 9:55 Comment(0)
L
3

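If you change a file, git does not store just the change (the difference) at commit time; it stores the whole file again as a new object. For example, if you change a single line of a 2 MB file, git stores the whole file again with the new change, so before any compression the repository holds about 4 MB of content. In practice each object is zlib-compressed, and git gc can later pack similar versions as deltas, so the growth on disk is often smaller.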
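A small sketch of why any change produces a new stored object: git names a file's content by hashing a short header plus the full bytes, so changing one line yields a different object id. The function names are my own, and the sketch assumes the default SHA-1 object format and zlib's default compression level.

```python
# Illustration of how git identifies and stores file content ("blobs").
# Assumes the default SHA-1 object format; function names are illustrative.
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Return the object id git assigns to a file with this content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_size(content: bytes) -> int:
    """Approximate the on-disk size of the zlib-compressed loose object."""
    header = b"blob %d\0" % len(content)
    return len(zlib.compress(header + content))


if __name__ == "__main__":
    old = b"line\n" * 400_000        # roughly 2 MB of text
    new = old + b"one more line\n"   # the same file with one line added
    # The two versions get different ids, so both are stored as objects.
    print(git_blob_id(old) != git_blob_id(new))
    # Raw size versus compressed size of a single version.
    print(len(old), loose_object_size(old))
```

The ids should match what `git hash-object <file>` prints for the same bytes. How much the compressed size shrinks depends heavily on the content: repetitive text compresses well, while formats that are already compressed (images, most PDFs) barely shrink.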

Literality answered 3/12, 2014 at 10:0 Comment(0)
