What to put under version control?
S

8

20

Almost every IDE creates lots of files that have nothing to do with the application being developed. They are generated and maintained by the IDE so that it knows how to build the application, where the version control repository is, and so on.

Should those files be kept under version control along with the files that really do belong to the application (source code, the application's configuration files, ...)?

The thing is: in some IDEs, if you create a new project and then import it into the version control repository using the version control client or commands embedded in the IDE, all those files are sent to the repository. And I'm not sure that's right: what if two different developers working on the same project want to use two different IDEs?


I want to keep this question agnostic, avoiding references to any particular IDE, programming language or version control system. So this question is not exactly the same as these:

Speechless answered 10/12, 2009 at 12:57 Comment(1)
See also #116621 - Encomiast
H
19

Rules of thumb:

  1. Include everything which has an influence on the build result (compiler options, file encodings, ASCII/binary settings, etc.)
  2. Include everything needed to open the project from a clean checkout and compile/run/test/debug/deploy it without any further manual intervention
  3. Don't include files which contain absolute paths
  4. Avoid including personal preferences (tab size, colors, window positions)

Follow the rules in this order.
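
A rough sketch of how the rules could be applied mechanically, reading "in this order" as "earlier rules win". That reading, the regular expression, the function names and the "include by default" fallback are all assumptions made for illustration, not part of the answer. The path check is a heuristic and will miss some absolute paths.

```python
import re

# Hypothetical heuristic: a Unix-style system/home prefix or a Windows drive
# letter, preceded by the start of the line or a typical delimiter.
ABSOLUTE_PATH = re.compile(
    r"(?:^|[\s\"'=>])(?:/(?:home|Users|usr|opt|var)/|[A-Za-z]:[\\/])"
)


def contains_absolute_path(text: str) -> bool:
    """Return True if any line of text looks like it embeds an absolute path."""
    return any(ABSOLUTE_PATH.search(line) for line in text.splitlines())


def should_version(affects_build: bool, needed_to_open: bool,
                   text: str, personal_prefs: bool) -> bool:
    """Apply the four rules of thumb, with earlier rules taking precedence."""
    if affects_build or needed_to_open:   # rules 1 and 2
        return True
    if contains_absolute_path(text):      # rule 3
        return False
    if personal_prefs:                    # rule 4
        return False
    return True                           # assumed default: keep the file
```

The two booleans for rules 1 and 2 still have to come from a human; only rule 3 lends itself to an automatic check.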

[Update] There is always the question of what should happen with generated code. As a rule of thumb, I put it under version control. As always, take this rule with a grain of salt.

My reasons:

Versioning generated code seems like a waste of time. It's generated, right? I can get it back at the push of a button!

Really?

If you had to bite the bullet and generate the exact same version of some previous release without fail, how much effort would it be? When generating code, you not only have to get all the input files right, you also have to turn back time for the code generator itself. Can you do that? Always? As easily as you could check out a certain version of the generated code if you had put it under version control?

And even if you could, could you ever be sure that you didn't miss something?

So on one hand, putting generated code under version control makes sense since it makes it dead easy to do what a VCS is meant for: going back in time.

Also it makes it easy to see the differences. Code generators are buggy, too. If I fix a bug and have 150'000 files generated, it helps a lot when I can compare them to the previous version to see that a) the bug is gone and b) nothing else changed unexpectedly. It's the unexpected part which you should worry about. If you don't, let me know and I'll make sure you never work for my company ever :-)
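
As an illustration of that comparison, here is a minimal sketch using Python's standard `difflib`. The function name `generated_diff` and the file label are made up for the example. In practice the VCS's own diff does this job once the generated files are checked in.

```python
import difflib


def generated_diff(previous: str, current: str,
                   name: str = "generated.txt") -> list[str]:
    """Return unified-diff lines between two versions of a generated file."""
    return list(difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile=f"{name} (previous)",
        tofile=f"{name} (current)",
        lineterm="",
    ))


if __name__ == "__main__":
    old = "a = 1\nb = 2\n"
    new = "a = 1\nb = 3\n"
    # Only the intended change should show up; anything else is "unexpected".
    print("\n".join(generated_diff(old, new)))
```

An empty result means the generator run changed nothing, which is exactly what you want to see for the files the bug fix was not supposed to touch.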

The major pain point of code generators is stability. It won't do if your code generator spits out a random mess of bytes every time you run it (well, unless you don't care about quality). Code generators need to be stable and deterministic. You run them twice with the same input and the output must be identical down to the least significant bit.

So if you can't check in generated code because every run of the generator creates spurious differences, then your code generator has a bug. Fix it. Sort the code when you have to. Use hash maps that preserve order. Do everything necessary to make the output non-random. Just like you do everywhere else in your code.
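
A small sketch of the "sort the code" advice, using a toy generator invented for this example (`generate_constants` and `digest` are hypothetical names). In Python, the iteration order of a set of strings can differ between interpreter runs because string hashing is randomized by default, so emitting in set order would produce exactly the kind of spurious diff described above. Sorting removes that. Python's `dict` preserves insertion order, which covers the "hash maps that preserve order" case.

```python
import hashlib


def generate_constants(names: set[str]) -> str:
    """Emit one constant per name, sorted so the output is stable across runs."""
    lines = [f'{name.upper()} = "{name}"' for name in sorted(names)]
    return "\n".join(lines) + "\n"


def digest(text: str) -> str:
    """Hash the generated text so two runs can be compared cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    first = generate_constants({"beta", "alpha", "gamma"})
    second = generate_constants({"gamma", "beta", "alpha"})
    # Same input set, built in a different order: the digests must match.
    print(digest(first) == digest(second))
```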

Generated code that I might not put under version control would be documentation. Documentation is somewhat of a soft target. It doesn't matter as much when I regenerate the wrong version of the docs (say, it has a few typos more or less). But for releases, I might do that anyway so I can see the differences between releases. Might be useful, for example, to make sure the release notes are complete.

I also don't check in JAR files. I have full control over the whole build, full confidence that I can get back any version of the sources in a minute, and I know that I have everything necessary to build it without any further manual intervention. So why would I need the executables? Again, it might make sense to put them into a special release repo, but then you're better off keeping a copy of the last three years on your company's web server for download. Think: comparing binaries is hard and doesn't tell you much.

Hereof answered 10/12, 2009 at 13:8 Comment(5)
I'd add two rules: "Avoid anything that may be generated from other included files" and "Avoid anything that changes frequently and automatically". - Cris
Rules 1 and 2 are totally wrong. They apply to the distribution tarball, but NOT to the VCS. - Churchyard
I actually agree with 1 and 2 100%. It's very important to be able to get a copy of the source, type "ant run" or somesuch, and be up and going. I'll even include the source or binary of a library I depend on. I'm totally happy with this approach; it's far better than requiring all developers to go download and install something every time there's a new dependency. - Demoiselle
Versioning generated code is part of a good SQL database development workflow. First you create a live database using whatever tools you like. Then you use a tool to generate a CREATE script that describes each object. To version the database structure, all the scripts go into version control. In app development terms, it's kind of like starting with a binary and decompiling it to get the source code. Weird, eh? - Proton
@Ian I would call that a usable DB development workflow, but not a good one. Why not start with the scripts instead of with the live DB? Certainly that's what I do. - Working
C
7

I think it's best to put anything under version control that helps developers get started quickly, ignoring anything that may be auto-generated by an IDE or build tools (e.g. Maven's Eclipse plugin generates .project and .classpath, so there is no need to check these in). Especially avoid files that change often, that contain nothing but user preferences, or that conflict between IDEs (e.g. another IDE that uses .project just like Eclipse does).

For Eclipse users, I find it especially handy to add the code style settings (.settings/org.eclipse.jdt.core.prefs, with auto formatting on save turned on) to get consistently formatted code.

Cris answered 10/12, 2009 at 13:3 Comment(0)
O
4

Everything that can be automatically generated from the source and configuration files should not be under version control! It only causes problems and limitations (like the one you stated: different programmers using two different sets of project files).

This is true not only for IDE "junk files" but also for intermediate files (like .pyc in Python, .o in C, etc.).
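
Most version control systems have their own ignore mechanism for this, so the following is only a tool-agnostic sketch of the idea. The pattern list contains just the two extensions mentioned above, and the function names are invented for the example.

```python
from fnmatch import fnmatch

# Only the two examples from the text; a real project would extend this list.
GENERATED_PATTERNS = ["*.pyc", "*.o"]


def is_intermediate(filename: str) -> bool:
    """Return True if the file name matches a known intermediate-file pattern."""
    return any(fnmatch(filename, pattern) for pattern in GENERATED_PATTERNS)


def files_to_version(filenames: list[str]) -> list[str]:
    """Keep only files that are not derived from other files."""
    return [name for name in filenames if not is_intermediate(name)]


if __name__ == "__main__":
    print(files_to_version(["main.c", "main.o", "app.py", "app.pyc"]))
```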

Outstretch answered 10/12, 2009 at 13:4 Comment(0)
T
3

This is where build automation and build files come in.

For example, both developers can still build the project (they will obviously need the same build software), yet each of them can use a different IDE.

As for the 'junk' that gets generated, I tend to ignore most of it. I know this is meant to be language agnostic, but consider Visual Studio. It generates user files (user settings etc.). These should not be under source control.

On the other hand, project files (used by the build process) most certainly should. I should add that if you are on a team and have all agreed on an IDE, then checking in IDE-specific files is fine, provided they are global, not user-specific, and actually needed.

Those other questions do a good job of explaining what should and shouldn't be checked into source control, so I won't repeat them.

Teucer answered 10/12, 2009 at 13:1 Comment(0)
M
2

Anything that would be devastating if it were lost should be under version control.

Marienthal answered 10/12, 2009 at 13:0 Comment(1)
"...should be under version control" you say? :) - Speechless
K
2

In my opinion it depends on the project and environment. In a company environment where everybody is using the same IDE, it can make sense to add the IDE files to the repository. This depends a bit on the IDE, though, as some include absolute paths in their files.

For a project which is developed in different environments it doesn't make sense and will be a pain in the long run, as the project files aren't maintained by all developers and make it harder to find "relevant" things.

Krilov answered 10/12, 2009 at 13:2 Comment(2)
As you write, some IDEs include absolute paths, so it's a real headache to put those files in. Plus, what's the benefit? If the 'junk files' get lost, you can regenerate them, and I don't remember a single time that I needed to revert to a previous version of a project configuration. - Outstretch
Depending on the language and IDE you're using, these files include build instructions, build configurations, ... which might be worth sharing among developers so that they use similar settings. - Krilov
W
0

In my opinion, anything needed to build the project (code, make files, media, databases with required program info, etc.) should be in the repository. I realise that this is controversial, especially for media and database files, but to me, if you can't branch and then hit build, the source control's not doing its job. This goes double for distributed systems with cheap branch creation and merging.

Anything else? Store it somewhere different. Developers should choose their own working environment as much as possible.

Windowpane answered 10/12, 2009 at 13:9 Comment(0)
R
0

From what I have seen of version control, most things should go into it, e.g. source code and so on. However, the problem that many VCSs run into is handling large files, typically binaries, and at times things like audio and graphics files. Therefore, my personal approach is to put the source code under version control, along with generally small graphics, and leave any binaries to other systems of management. If it is a binary that I created myself using the build system of the IDE, then it can definitely be ignored, because it is going to be regenerated on every build. For dependency libraries, well, this is where dependency package managers come in.

As for IDE-generated files (I am assuming these are ones that aren't generated during the build process, such as the solution files for Visual Studio), I think it depends on whether or not you are working alone. If you are working alone, then go ahead and add them: they will allow you to revert settings in the solution or whatever you make. The same goes for other non-solution files as well. However, if you are collaborating, then my recommendation is no: most IDE-generated files tend to be user specific, i.e. they work on your machine but not necessarily on others. Hence, you may be better off not including IDE-generated files in that case.

tl;dr you should put most things that relate to your program into version control, excluding dependencies (things like libraries, graphics and audio should be handled by some other dependency management system). As for things directly generated by the IDE, it depends on whether you are working alone or with other people.
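
The "small files in, large binaries elsewhere" split can be sketched as a simple size check. The 1,000,000-byte limit and the function name below are arbitrary assumptions for illustration. The answer does not name a threshold, and the right value depends on the VCS and the project.

```python
def split_by_size(sizes: dict[str, int],
                  limit: int = 1_000_000) -> tuple[list[str], list[str]]:
    """Split file names into (small, large) by size in bytes.

    `limit` is a hypothetical cut-off; files above it would be handed to
    some other storage or dependency management system.
    """
    small = sorted(name for name, size in sizes.items() if size <= limit)
    large = sorted(name for name, size in sizes.items() if size > limit)
    return small, large


if __name__ == "__main__":
    candidates = {"main.c": 2_000, "icon.png": 4_000, "intro.wav": 50_000_000}
    print(split_by_size(candidates))
```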

Riehl answered 20/11, 2013 at 5:44 Comment(0)
