Git and Mercurial - Compare and Contrast
For a while now I've been using subversion for my personal projects.

More and more I keep hearing great things about Git and Mercurial, and DVCS in general.

I'd like to give the whole DVCS thing a whirl, but I'm not too familiar with either option.

What are some of the differences between Mercurial and Git?

Note: I'm not trying to find out which one is "best" or even which one I should start with. I'm mainly looking for key areas where they are similar, and where they are different, because I am interested to know how they differ in terms of implementation and philosophy.

Subheading answered 21/10, 2009 at 4:46 Comment(2)
See also stackoverflow.com/questions/995636/…Excalibur
possible duplicate of What is the Difference Between Mercurial and Git?Forbore

Disclaimer: I use Git, follow Git development on git mailing list, and even contribute a bit to Git (gitweb mainly). I know Mercurial from documentation and some from discussion on #revctrl IRC channel on FreeNode.

Thanks to all the people on the #mercurial IRC channel who provided help with Mercurial for this writeup.



Summary

Here it would be nice to have some syntax for table, something like in PHPMarkdown / MultiMarkdown / Maruku extension of Markdown

  • Repository structure: Mercurial doesn't allow octopus merges (with more than two parents), nor tagging non-commit objects.
  • Tags: Mercurial uses versioned .hgtags file with special rules for per-repository tags, and has also support for local tags in .hg/localtags; in Git tags are refs residing in refs/tags/ namespace, and by default are autofollowed on fetching and require explicit pushing.
  • Branches: In Mercurial basic workflow is based on anonymous heads; Git uses lightweight named branches, and has special kind of branches (remote-tracking branches) that follow branches in remote repository.
  • Revision naming and ranges: Mercurial provides revision numbers, local to repository, and bases relative revisions (counting from tip, i.e. current branch) and revision ranges on this local numbering; Git provides a way to refer to revision relative to branch tip, and revision ranges are topological (based on graph of revisions)
  • Mercurial uses rename tracking, while Git uses rename detection to deal with file renames
  • Network: Mercurial supports SSH and HTTP "smart" protocols, and static HTTP protocol; modern Git supports SSH, HTTP and GIT "smart" protocols, and HTTP(S) "dumb" protocol. Both have support for bundles files for off-line transport.
  • Mercurial uses extensions (plugins) and established API; Git has scriptability and established formats.

There are a few things that distinguish Mercurial from Git, but there are other things that make them similar. Both projects borrow ideas from each other. For example, the hg bisect command in Mercurial (formerly the bisect extension) was inspired by the git bisect command in Git, while the idea of git bundle was inspired by hg bundle.

Repository structure, storing revisions

In Git there are four types of objects in its object database: blob objects, which contain the contents of a file; hierarchical tree objects, which store directory structure, including file names and the relevant parts of file permissions (the executable permission for files, being a symbolic link); commit objects, which contain authorship info, a pointer to a snapshot of the state of the repository at the revision represented by the commit (via the tree object of the project's top directory) and references to zero or more parent commits; and tag objects, which reference other objects and can be signed using PGP / GPG.
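To make the "content-addressed" part concrete, here is a minimal sketch of how Git derives an object's name from its type and content: the SHA-1 of a short header (`<type> <size>` followed by a NUL byte) plus the payload. The function name `git_object_id` is made up for this illustration; only the hashing scheme comes from Git.

```python
import hashlib


def git_object_id(obj_type: str, payload: bytes) -> str:
    """Return the SHA-1 object name Git would assign to this type + content."""
    # Header is "<type> <size in bytes>" followed by a single NUL byte.
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


# Same value as `echo 'hello world' | git hash-object --stdin`
print(git_object_id("blob", b"hello world\n"))
```

Two files with identical contents therefore share one blob object, regardless of their names; the names live in tree objects.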

Git uses two ways of storing objects: the loose format, where each object is stored in a separate file (those files are written once and never modified), and the packed format, where many objects are stored delta-compressed in a single file. Atomicity of operations is provided by the fact that the reference to a new object is written (atomically, using the create + rename trick) only after the object itself has been written.
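The "create + rename trick" is a general filesystem idiom, not something specific to Git. A minimal sketch, assuming a POSIX-like filesystem where renaming within one directory is atomic; `atomic_write` is a hypothetical helper and not Git's actual code:

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data to path so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory, so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The rename is the commit point: it replaces the target in one step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Writing the object first and updating the ref this way last means an interrupted operation leaves at worst an unreferenced object, never a ref pointing at a half-written one.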

Git repositories require periodic maintenance using git gc (to reduce disk space and improve performance), although nowadays Git does that automatically. (This method provides better compression of repositories.)

Mercurial (as far as I understand it) stores the history of a file in a filelog (together, I think, with extra metadata like rename tracking, and some helper information); it uses a flat structure called the manifest to store directory structure, and a structure called the changelog which stores information about changesets (revisions), including the commit message and zero, one or two parents.

Mercurial uses transaction journal to provide atomicity of operations, and relies on truncating files to clean-up after failed or interrupted operation. Revlogs are append-only.
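The journal-plus-truncate idea can be sketched as follows. This is only an illustration of the technique under the assumption of append-only files; the `Journal` class and `append` function are hypothetical and do not reflect Mercurial's on-disk format or API.

```python
import os
import tempfile


class Journal:
    """Remember each file's length before it is appended to."""

    def __init__(self) -> None:
        self.entries = []  # list of (path, size before the transaction)

    def record(self, path: str) -> None:
        size = os.path.getsize(path) if os.path.exists(path) else 0
        self.entries.append((path, size))

    def rollback(self) -> None:
        # Undo in reverse order by cutting every file back to its old length.
        for path, size in reversed(self.entries):
            with open(path, "r+b") as f:
                f.truncate(size)
        self.entries.clear()


def append(journal: Journal, path: str, data: bytes) -> None:
    """Append-only write that is first noted in the journal."""
    journal.record(path)
    with open(path, "ab") as f:
        f.write(data)


with tempfile.TemporaryDirectory() as d:
    log = os.path.join(d, "toy.log")
    with open(log, "wb") as f:
        f.write(b"rev0\n")
    journal = Journal()
    append(journal, log, b"rev1\n")
    append(journal, log, b"rev2-partial")
    journal.rollback()  # simulate cleaning up after an interrupted operation
    print(open(log, "rb").read())
```

Because data is only ever appended, restoring the recorded lengths is enough to return to the state before the failed operation.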

Looking at repository structure in Git versus in Mercurial, one can see that Git is more like object database (or a content-addressed filesystem), and Mercurial more like traditional fixed-field relational database.

Differences:
In Git the tree objects form a hierarchical structure; in Mercurial the manifest file is a flat structure. In Git a blob object stores one version of the contents of a file; in Mercurial a filelog stores the whole history of a single file (if we do not take into account any complications with renames). This means that there are areas of operation where Git would be faster than Mercurial, all other things being equal (like merges, or showing the history of a project), and areas where Mercurial would be faster than Git (like applying patches, or showing the history of a single file). This issue might not be important for the end user.

Because of the fixed-record structure of Mercurial's changelog structure, commits in Mercurial can have only up to two parents; commits in Git can have more than two parents (so called "octopus merge"). While you can (in theory) replace octopus merge by a series of two-parent merges, this might cause complications when converting between Mercurial and Git repositories.

As far as I know Mercurial doesn't have an equivalent of annotated tags (tag objects) from Git. A special case of annotated tags are signed tags (with a PGP / GPG signature); the equivalent in Mercurial can be done using GpgExtension, an extension distributed along with Mercurial. You can't tag a non-commit object in Mercurial like you can in Git, but that is not very important, I think (some git repositories use a tagged blob to distribute the public PGP key to use to verify signed tags).

References: branches and tags

In Git references (branches, remote-tracking branches and tags) reside outside the DAG of commits (as they should). References in the refs/heads/ namespace (local branches) point to commits, and are usually updated by "git commit"; they point to the tip (head) of a branch, hence the name. References in the refs/remotes/<remotename>/ namespace (remote-tracking branches) point to commits, follow branches in the remote repository <remotename>, and are updated by "git fetch" or equivalent. References in the refs/tags/ namespace (tags) usually point to commits (lightweight tags) or tag objects (annotated and signed tags), and are not meant to change.

Tags

In Mercurial you can give persistent name to revision using tag; tags are stored similarly to the ignore patterns. It means that globally visible tags are stored in revision-controlled .hgtags file in your repository. That has two consequences: first, Mercurial has to use special rules for this file to get current list of all tags and to update such file (e.g. it reads the most recently committed revision of the file, not currently checked out version); second, you have to commit changes to this file to have new tag visible to other users / other repositories (as far as I understand it).

Mercurial also supports local tags, stored in .hg/localtags, which are not visible to others (and of course are not transferable).

In Git tags are fixed (constant) named references to other objects (usually tag objects, which in turn point to commits) stored in the refs/tags/ namespace. By default, when fetching a set of revisions, Git automatically fetches the tags which point to the revisions being fetched; pushing tags has to be requested explicitly (e.g. with "git push --tags" or by naming the tag). You can also control to some extent which tags are fetched or pushed.

Git treats lightweight tags (pointing directly to commits) and annotated tags (pointing to tag objects, which contain a tag message, optionally include a PGP signature, and in turn point to a commit) slightly differently. For example, by default it considers only annotated tags when describing commits using "git describe".

Git doesn't have a strict equivalent of local tags in Mercurial. Nevertheless, git best practices recommend setting up a separate public bare repository, into which you push ready changes, and from which others clone and fetch. This means that tags (and branches) that you don't push are private to your repository. On the other hand, you can also use a namespace other than heads, remotes or tags, for example local-tags for local tags.

Personal opinion: In my opinion tags should reside outside the revision graph, as they are external to it (they are pointers into the graph of revisions). Tags should be non-versioned, but transferable. Mercurial's choice of using a mechanism similar to the one for ignoring files means that it either has to treat .hgtags specially (a file in-tree is transferable, but ordinarily it is versioned), or have tags which are local only (.hg/localtags is non-versioned, but untransferable).

Branches

In Git local branch (branch tip, or branch head) is a named reference to a commit, where one can grow new commits. Branch can also mean active line of development, i.e. all commits reachable from branch tip. Local branches reside in refs/heads/ namespace, so e.g. fully qualified name of 'master' branch is 'refs/heads/master'.

Current branch in Git (meaning checked out branch, and branch where new commit will go) is the branch which is referenced by the HEAD ref. One can have HEAD pointing directly to a commit, rather than being symbolic reference; this situation of being on an anonymous unnamed branch is called detached HEAD ("git branch" shows that you are on '(no branch)').

In Mercurial there are anonymous branches (branch heads), and one can use bookmarks (via bookmark extension). Such bookmark branches are purely local, and those names were (up to version 1.6) not transferable using Mercurial. You can use rsync or scp to copy the .hg/bookmarks file to a remote repository. You can also use hg id -r <bookmark> <url> to get the revision id of a current tip of a bookmark.

Since 1.6 bookmarks can be pushed/pulled. The BookmarksExtension page has a section on Working With Remote Repositories. There is a difference in that in Mercurial bookmark names are global, while definition of 'remote' in Git describes also mapping of branch names from the names in remote repository to the names of local remote-tracking branches; for example refs/heads/*:refs/remotes/origin/* mapping means that one can find state of 'master' branch ('refs/heads/master') in the remote repository in the 'origin/master' remote-tracking branch ('refs/remotes/origin/master').

Mercurial also has so-called named branches, where the branch name is embedded in a commit (in a changeset). Such a name is global (transferred on fetch). Those branch names are permanently recorded as part of the changeset's metadata. With modern Mercurial you can close a "named branch" and stop recording the branch name. In this mechanism tips of branches are calculated on the fly.

Mercurial's "named branches" should in my opinion be called commit labels instead, because it is what they are. There are situations where "named branch" can have multiple tips (multiple childless commits), and can also consist of several disjoint parts of graph of revisions.
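A toy model may help show why a "named branch" treated as a commit label can end up with several tips. The sketch below assumes that a head of a named branch is a changeset carrying the label that has no child carrying the same label; the history and the function `branch_heads` are made up for the illustration and are not Mercurial's API.

```python
from typing import Dict, List, Tuple

# revision number -> (branch label, parent revision numbers)
Changeset = Tuple[str, List[int]]


def branch_heads(history: Dict[int, Changeset], label: str) -> List[int]:
    """Compute on the fly the heads of all changesets carrying `label`."""
    labelled = {rev for rev, (name, _) in history.items() if name == label}
    has_labelled_child = set()
    for rev in labelled:
        for parent in history[rev][1]:
            if parent in labelled:
                has_labelled_child.add(parent)
    return sorted(labelled - has_labelled_child)


# Hypothetical history: "stable" is used in two disjoint parts of the graph.
history: Dict[int, Changeset] = {
    0: ("default", []),
    1: ("stable", [0]),
    2: ("default", [0]),
    3: ("stable", [1]),
    4: ("stable", [1]),
    5: ("stable", [2]),
}

print(branch_heads(history, "stable"))
print(branch_heads(history, "default"))
```

Here the label "stable" has three childless-within-the-label commits, and naming the branch as a revision would pick the one with the largest revision number.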

There is no equivalent of those Mercurial "embedded branches" in Git; moreover Git's philosophy is that while one can say that branch includes some commit, it doesn't mean that a commit belongs to some branch.

Note that Mercurial documentation still proposes to use separate clones (separate repositories) at least for long-lived branches (single branch per repository workflow), aka branching by cloning.

Branches in pushing

Mercurial by default pushes all heads. If you want to push a single branch (single head), you have to specify tip revision of the branch you want to push. You can specify branch tip by its revision number (local to repository), by revision identifier, by bookmark name (local to repository, doesn't get transferred), or by embedded branch name (named branch).

As far as I understand it, if you push a range of revisions that contain commits marked as being on some "named branch" in Mercurial parlance, you will have this "named branch" in the repository you push to. This means that names of such embedded branches ("named branches") are global (with respect to clones of given repository / project).

By default (subject to the push.default configuration variable), "git push" or "git push <remote>" pushes matching branches, i.e. only those local branches that already have an equivalent present in the remote repository you push into. You can use the --all option to git-push ("git push --all") to push all branches, you can use "git push <remote> <branch>" to push a given single branch, and you can use "git push <remote> HEAD" to push the current branch.

All of the above assumes that the branches to push have not been configured via remote.<remotename>.push configuration variables.

Branches in fetching

Note: here I use Git terminology where "fetch" means downloading changes from a remote repository without integrating those changes with local work. This is what "git fetch" and "hg pull" do.

If I understand it correctly, by default Mercurial fetches all heads from the remote repository, but you can fetch a single branch via "hg pull --rev <rev> <url>" or "hg pull <url>#<rev>". You can specify <rev> using a revision identifier, a "named branch" name (branch embedded in the changelog), or a bookmark name. The bookmark name however (at least currently) doesn't get transferred. All the "named branches" that the revisions you get belong to are transferred. "hg pull" stores the tips of the branches it fetched as anonymous, unnamed heads.

In Git by default (for 'origin' remote created by "git clone", and for remotes created using "git remote add") "git fetch" (or "git fetch <remote>") gets all branches from remote repository (from refs/heads/ namespace), and stores them in refs/remotes/ namespace. This means for example that branch named 'master' (full name: 'refs/heads/master') in remote 'origin' would get stored (saved) as 'origin/master' remote-tracking branch (full name: 'refs/remotes/origin/master').

You can fetch single branch in Git by using git fetch <remote> <branch> - Git would store requested branch(es) in FETCH_HEAD, which is something similar to Mercurial unnamed heads.

Those are but examples of default cases of the powerful Git refspec syntax: with refspecs you can specify and/or configure which branches you want to fetch, and where to store them. For example, the default "fetch all branches" case is represented by the '+refs/heads/*:refs/remotes/origin/*' wildcard refspec, and "fetch single branch" is shorthand for 'refs/heads/<branch>:'. Refspecs are used to map names of branches (refs) in the remote repository to local ref names. But you don't need to know (much) about refspecs to be able to work effectively with Git (thanks mainly to the "git remote" command).
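The name mapping done by a refspec can be sketched in a few lines. This is a simplified model that handles a single '*' wildcard and ignores everything else Git does with refspecs (the leading '+' is only stripped, and negative refspecs are not covered); `map_ref` is a hypothetical helper name.

```python
from typing import Optional


def map_ref(refspec: str, remote_ref: str) -> Optional[str]:
    """Map a ref name in the remote repository to a local ref name.

    Returns None if the refspec does not match, and "" if it matches but has
    an empty destination (the fetched tip is then only noted in FETCH_HEAD).
    """
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, _, dst = spec.partition(":")
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, _, suffix = src.partition("*")
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)


print(map_ref("+refs/heads/*:refs/remotes/origin/*", "refs/heads/master"))
```

With one refspec per remote, e.g. '+refs/heads/*:refs/remotes/alice/*' and '+refs/heads/*:refs/remotes/bob/*', two unrelated branches that are both called 'for-joe' end up under different local names.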

Personal opinion: I personally think that "named branches" (with branch names embedded in changeset metadata) in Mercurial are a misguided design with their global namespace, especially for a distributed version control system. For example, let's take the case where both Alice and Bob have a "named branch" named 'for-joe' in their repositories, branches which have nothing in common. In Joe's repository, however, those two branches would be mistakenly treated as a single branch. So you have to somehow come up with a convention protecting against branch name clashes. This is not a problem with Git, where in Joe's repository the 'for-joe' branch from Alice would be 'alice/for-joe', and from Bob it would be 'bob/for-joe'. See also the Separating branch name from branch identity issue raised on the Mercurial wiki.

Mercurial's "bookmark branches" currently lack in-core distribution mechanism.

Differences:
This area is one of the main differences between Mercurial and Git, as james woodyatt and Steve Losh said in their answers. Mercurial, by default, uses anonymous lightweight codelines, which in its terminology are called "heads". Git uses lightweight named branches, with injective mapping to map names of branches in remote repository to names of remote-tracking branches. Git "forces" you to name branches (well, with exception of single unnamed branch, situation called detached HEAD), but I think this works better with branch-heavy workflows such as topic branch workflow, meaning multiple branches in single repository paradigm.

Naming revisions

In Git there are many ways of naming revisions (described e.g. in git rev-parse manpage):

  • The full SHA1 object name (40-byte hexadecimal string), or a substring of such that is unique within the repository
  • A symbolic ref name, e.g. 'master' (referring to 'master' branch), or 'v1.5.0' (referring to tag), or 'origin/next' (referring to remote-tracking branch)
  • A suffix ^ to revision parameter means the first parent of a commit object, ^n means n-th parent of a merge commit. A suffix ~n to revision parameter means n-th ancestor of a commit in straight first-parent line. Those suffixes can be combined, to form revision specifier following path from a symbolic reference, e.g. 'pu~3^2~3'
  • Output of "git describe", i.e. a closest tag, optionally followed by a dash and a number of commits, followed by a dash, a 'g', and an abbreviated object name, for example 'v1.6.5.1-75-g5bf8097'.
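How the ^ and ~ suffixes walk the commit graph can be shown on a toy history. The sketch below is a simplified resolver, assuming refs and parent lists are given as plain dictionaries; the commit names (p0, s0, ...) and the function `resolve` are invented for the example and it covers only the two suffixes described above.

```python
import re
from typing import Dict, List


def resolve(spec: str, refs: Dict[str, str], parents: Dict[str, List[str]]) -> str:
    """Resolve specifiers such as 'pu~3^2~3' against a toy commit graph."""
    match = re.match(r"[^~^]+", spec)
    if match is None:
        raise ValueError(f"no ref name in {spec!r}")
    commit = refs[match.group(0)]
    for op, num in re.findall(r"([~^])(\d*)", spec[match.end():]):
        n = int(num) if num else 1
        if op == "^":
            # ^n selects the n-th parent; ^0 is the commit itself.
            if n > 0:
                commit = parents[commit][n - 1]
        else:
            # ~n follows the first parent n times.
            for _ in range(n):
                commit = parents[commit][0]
    return commit


# Hypothetical history: p3 is a merge whose second parent is s0.
parents = {
    "p0": ["p1"], "p1": ["p2"], "p2": ["p3"], "p3": ["p4", "s0"], "p4": [],
    "s0": ["s1"], "s1": ["s2"], "s2": ["s3"], "s3": [],
}
refs = {"pu": "p0"}

print(resolve("pu~3^2~3", refs, parents))
```

Reading 'pu~3^2~3' left to right: three steps along first parents from 'pu', then the second parent of that merge, then three more first-parent steps.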

There are also revision specifiers involving reflog, not mentioned here. In Git each object, be it commit, tag, tree or blob has its SHA-1 identifier; there is special syntax like e.g. 'next:Documentation' or 'next:README' to refer to tree (directory) or blob (file contents) at specified revision.

Mercurial also has many ways of naming changesets (described e.g. in hg manpage):

  • A plain integer is treated as a revision number. One needs to remember that revision numbers are local to a given repository; in another repository they can be different.
  • Negative integers are treated as sequential offsets from the tip, with -1 denoting the tip, -2 denoting the revision prior to the tip, and so forth. They are also local to repository.
  • A unique revision identifier (40-digit hexadecimal string) or its unique prefix.
  • A tag name (symbolic name associated with given revision), or a bookmark name (with extension: symbolic name associated with given head, local to repository), or a "named branch" (commit label; revision given by "named branch" is tip (childless commit) of all commits with given commit label, with largest revision number if there are more than one such tip)
  • The reserved name "tip" is a special tag that always identifies the most recent revision.
  • The reserved name "null" indicates the null revision.
  • The reserved name "." indicates the working directory parent.

Differences
As you can see by comparing the lists above, Mercurial offers revision numbers, local to the repository, while Git doesn't. On the other hand, Mercurial offers relative offsets only from 'tip' (current branch), which are local to the repository (at least without ParentrevspecExtension), while Git allows specifying any commit by following from any tip.

The most recent revision is named HEAD in Git, and "tip" in Mercurial; there is no null revision in Git. Both Mercurial and Git can have many roots (more than one parentless commit; this is usually the result of formerly separate projects joining).

See also: Many different kinds of revision specifiers article on Elijah's Blog (newren's).

Personal opinion: I think that revision numbers are overrated (at least for distributed development and/or nonlinear / branchy history). First, for a distributed version control system they have to be either local to the repository, or require treating some repository in a special way as a central numbering authority. Second, larger projects, with longer history, can have a number of revisions in the 5-digit range, so revision numbers offer only a slight advantage over revision identifiers shortened to 6-7 characters, and imply strict ordering while revisions are only partially ordered (I mean here that revisions n and n+1 don't need to be parent and child).

Revision ranges

In Git revision ranges are topological. The commonly seen A..B syntax, which for linear history means the revision range starting at A (but excluding A) and ending at B (i.e. the range is open from below), is shorthand ("syntactic sugar") for ^A B, which for history traversing commands means all commits reachable from B, excluding those reachable from A. This means that the behavior of the A..B range is entirely predictable (and quite useful) even if A is not an ancestor of B: A..B then means the range of revisions from the common ancestor of A and B (merge base) to revision B.
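The "reachable from B, excluding reachable from A" definition is easy to try on a small graph. A minimal sketch, assuming the history is given as a dictionary from commit to parent list; the commit names and the helper names `reachable` and `dotdot` are made up.

```python
from typing import Dict, List, Set


def reachable(tip: str, parents: Dict[str, List[str]]) -> Set[str]:
    """All commits reachable from tip by following parent links (tip included)."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def dotdot(a: str, b: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Model of A..B, i.e. ^A B."""
    return reachable(b, parents) - reachable(a, parents)


# Hypothetical forked history: a2 and b2 share the merge base "base".
parents = {
    "root": [], "base": ["root"],
    "a1": ["base"], "a2": ["a1"],
    "b1": ["base"], "b2": ["b1"],
}

print(sorted(dotdot("a2", "b2", parents)))
```

Even though a2 is not an ancestor of b2, the result is well defined: it is exactly the commits on the b side since the two lines diverged.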

In Mercurial revision ranges are based on range of revision numbers. Range is specified using A:B syntax, and contrary to Git range acts as a closed interval. Also range B:A is the range A:B in reverse order, which is not the case in Git (but see below note on A...B syntax). But such simplicity comes with a price: revision range A:B makes sense only if A is ancestor of B or vice versa, i.e. with linear history; otherwise (I guess that) the range is unpredictable, and the result is local to repository (because revision numbers are local to repository).

This is fixed with Mercurial 1.6, which has a new topological revision range, where 'A..B' (or 'A::B') is understood as the set of changesets that are both descendants of A and ancestors of B. This is, I guess, equivalent to '--ancestry-path A..B' in Git.
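For comparison, the "descendants of A and ancestors of B" definition can be modelled directly. The sketch below assumes both endpoints are themselves included in the set, and uses an invented graph; `ancestors` and `dag_range` are hypothetical helper names rather than anything from Mercurial.

```python
from typing import Dict, List, Set


def ancestors(rev: str, parents: Dict[str, List[str]]) -> Set[str]:
    """rev together with everything reachable through parent links."""
    seen: Set[str] = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def dag_range(a: str, b: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Changesets that are descendants of a and ancestors of b (inclusive)."""
    return {rev for rev in ancestors(b, parents) if a in ancestors(rev, parents)}


# Hypothetical history with a side line "z" that does not descend from "a".
parents = {
    "r": [], "a": ["r"], "x": ["a"], "y": ["a"], "z": ["r"],
    "m": ["x", "z"], "b": ["m", "y"],
}

print(sorted(dag_range("a", "b", parents)))
```

Unlike a range of local revision numbers, this set depends only on the shape of the graph, so it is the same in every clone; and if A is not an ancestor of B the set is simply empty.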

Git also has notation A...B for symmetric difference of revisions; it means A B --not $(git merge-base A B), which means all commits reachable from either A or B, but excluding all commits reachable from both of them (reachable from common ancestors).
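The symmetric difference can be modelled the same way: take everything reachable from either side and drop what is reachable from both. A minimal sketch on an invented graph; `reachable_from` and `three_dots` are hypothetical names.

```python
from typing import Dict, List, Set


def reachable_from(tip: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Commits reachable from tip through parent links, tip included."""
    seen: Set[str] = set()
    todo = [tip]
    while todo:
        commit = todo.pop()
        if commit not in seen:
            seen.add(commit)
            todo.extend(parents[commit])
    return seen


def three_dots(a: str, b: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Model of A...B: reachable from A or B, but not from both."""
    return reachable_from(a, parents) ^ reachable_from(b, parents)


# Hypothetical forked history with merge base "base".
parents = {
    "root": [], "base": ["root"],
    "a1": ["base"], "a2": ["a1"],
    "b1": ["base"], "b2": ["b1"],
}

print(sorted(three_dots("a2", "b2", parents)))
```

In this model A...B and B...A give the same set, which is why this notation is the closer counterpart of Mercurial's order-insensitive A:B / B:A pair.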

Renames

Mercurial uses rename tracking to deal with file renames. This means that the information about the fact that a file was renamed is saved at the commit time; in Mercurial this information is saved in the "enhanced diff" form in filelog (file revlog) metadata. The consequence of this is that you have to use hg rename / hg mv... or you need to remember to run hg addremove to do similarity based rename detection.

Git is unique among version control systems in that it uses rename detection to deal with file renames. This means that the fact that a file was renamed is detected at the time it is needed: when doing a merge, or when showing a diff (if requested / configured). This has the advantage that the rename detection algorithm can be improved, and is not frozen at the time of commit.
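The general idea of detecting renames after the fact can be sketched by pairing deleted and added paths whose contents are similar enough. This is only an illustration: it uses Python's difflib as a stand-in similarity measure rather than Git's own algorithm, and `detect_renames` together with the sample file names is made up. The 50% threshold mirrors Git's default similarity index.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def detect_renames(
    old: Dict[str, str], new: Dict[str, str], threshold: float = 0.5
) -> List[Tuple[str, str, float]]:
    """Pair each deleted path with the most similar added path, if any."""
    deleted = {path: text for path, text in old.items() if path not in new}
    added = {path: text for path, text in new.items() if path not in old}
    renames = []
    for old_path, old_text in sorted(deleted.items()):
        best_path, best_score = None, 0.0
        for new_path, new_text in sorted(added.items()):
            score = SequenceMatcher(
                None, old_text.splitlines(), new_text.splitlines()
            ).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames.append((old_path, best_path, best_score))
            del added[best_path]  # each added file explains at most one rename
    return renames


# Two snapshots (path -> content); nothing about the rename was recorded.
old_tree = {"a.txt": "one\ntwo\nthree\nfour\n", "keep.txt": "k\n"}
new_tree = {"b.txt": "one\ntwo\nthree\nfive\n", "keep.txt": "k\n", "c.txt": "unrelated\n"}

print(detect_renames(old_tree, new_tree))
```

Since the pairing is computed from the two snapshots alone, a better similarity measure can be applied later to history that was committed long ago.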

Both Git and Mercurial require using --follow option to follow renames when showing history of a single file. Both can follow renames when showing line-wise history of a file in git blame / hg annotate.

In Git the git blame command is able to follow code movement, including moving (or copying) code from one file to another, even if the code movement is not part of a whole-file rename. As far as I know this feature is unique to Git (at the time of writing, October 2009).

Network protocols

Both Mercurial and Git have support for fetching from and pushing to repositories on the same filesystem, where repository URL is just a filesystem path to repository. Both also have support for fetching from bundle files.

Mercurial supports fetching and pushing via SSH and via HTTP protocols. For SSH one needs an accessible shell account on the destination machine and a copy of hg installed / available. For HTTP access, hg-serve or the Mercurial CGI script needs to be running, and Mercurial needs to be installed on the server machine.

Git supports two kinds of protocols used to access remote repository:

  • "smart" protocols, which include access via SSH and via the custom git:// protocol (by git-daemon), require having git installed on the server. The exchange in those protocols consists of the client and server negotiating about what objects they have in common, and then generating and sending a packfile. Modern Git includes support for a "smart" HTTP protocol.
  • "dumb" protocols, which include HTTP and FTP (only for fetching), and HTTPS (for pushing via WebDAV), do not require git installed on the server, but they do require that the repository contains extra information generated by git update-server-info (usually run from a hook). The exchange consists of the client walking the commit chain and downloading loose objects and packfiles as needed. The downside is that it downloads more than strictly required (e.g. in the corner case when there is only a single packfile it would get downloaded whole even when fetching only a few revisions), and that it can require many connections to finish.

Extending: scriptability vs extensions (plugins)

Mercurial is implemented in Python, with some core code written in C for performance. It provides API for writing extensions (plugins) as a way of adding extra features. Some of functionality, like "bookmark branches" or signing revisions, is provided in extensions distributed with Mercurial and requires turning it on.

Git is implemented in C, Perl and shell scripts. Git provides many low level commands (plumbing) suitable to use in scripts. The usual way of introducing new feature is to write it as Perl or shell script, and when user interface stabilizes rewrite it in C for performance, portability, and in the case of shell script avoiding corner cases (this procedure is called builtinification).

Git relies and is built around [repository] formats and [network] protocols. Instead of language bindings there are (partial or complete) reimplementations of Git in other languages (some of those are partially reimplementations, and partially wrappers around git commands): JGit (Java, used by EGit, Eclipse Git Plugin), Grit (Ruby), Dulwich (Python), git# (C#).



Megasporangium answered 21/10, 2009 at 10:23 Comment(30)
What could be added is that hg tries very hard to discourage history rewriting (it can only be done with extensions: mq, histedit, rebase), while git does it out-of-the-box (and it looks like part of the community even encourages it).Larondalarosa
I'm not sure I get your point about the one version vs. whole history thing. It's an implementation detail: git addresses content with a unique key (sha hash), while hg addresses content with a tuple (filename, sha hash). Then how those revs are stored is a different problem (a packed git repo will store all the revs in a single file, the hg revs can be stored in Google's bigtable or with a scheme similar to git (one file per-rev), etc.)Larondalarosa
I think "rewriting history" is unnecessarily negative sounding. What I encourage in git is people to consider the history they publish. Other people need to consume that history. Nobody (not even you) is interested in all of your "oops, forgot a file" commits. Nor does anyone care about the series of inbound merges you went through while you were tracking an upstream branch while working on a new feature. That kind of stuff makes history (and related tools) much harder to understand and provides no value.Huntress
I rewrite the history all the time with hg (using mq), I just wanted to point that Matt wanted, while designing hg, to not make it easy to do that. It's somewhat reserved to advanced user (needs an extension, etc.).Larondalarosa
@Jakub: named branches are something that doesn't exist in git. It's simply a field in the cset description (and that is part of the history, so it is immutable unless you changes hashes, etc.). Something like git branches are bookmarks ("named heads") but they aren't currently remote transferrable (you don't import the remote bookmarks when pulling). stevelosh.com/blog/entry/2009/8/30/… explains it very well.Larondalarosa
"As far as I know Mercurial doesn't have equivalent of annotated tags (tag objects) from Git, which special case are signed tags (with PGP / GPG)" -- Mercurial comes with an extension for signing changesets: mercurial.selenic.com/wiki/GpgExtensionAct
"Mercurial originally supported only one branch per repository workflow, and it shows." Uh, no. Mercurial didn't support named branches originally, but you've always been able to have as many anonymous branches as your heart desires in a single repo. Contrast that with git, which makes anonymous branching a huge pain. You pretty much have to think of a name for every little branch if you want to get anything done (and avoid having your work garbage collected).Act
@tonfa: I'll try to come up with description of difference between "named heads", Mercurial "floating tags" / "bookmarks" and Git "branch tips"; as I said I am Git user, and do not know much about Mercurial.Bushhammer
@Steve Losh: Thanks for mentioning GpgExtension for signing changesets (but I guess it is not exact equivalent of signed tags in Git, is it?). I am planning to write a bit more about anonymous heads in Mercurial (which are required to be able to pull then merge); OTOH I think that named branches are better than unnamed ones. BTW in modern Git you can be on an unnamed branch (called "detached HEAD" in Git). Also in Git you can delete branches after you finished working with them.Bushhammer
@Larondalarosa I was quite a heavy mq user when I used hg a lot. I can see that it was possible, but it's harder to safely use mq than it is to safely use rebase, etc... (since everything is undoable by default).Huntress
@Steve I don't see why you feel that git requires naming branches, but hg does not. Is that because you can't easily have multiple anonymous heads? To be honest, I don't see a big fuss in putting a name on a branch when I intend to have more than one.Huntress
@Dustin: Some quick testing with git 1.6.1: If I make three commits, then use git log --all I see all 3. Great. Now I git co the first commit and make a new commit without thinking up a name for a branch. git log --all doesn't include this new commit, because no ref points to it. It's a commit, but it's not in the log -- not great. Let's hope I never checkout something else or I'm going to have fun with grep trying to find the hash to get back to it. I'm pretty sure git will GC it as well after a while if I forget to tag it. Definitely not great.Act
@Dustin: And no, it's not usually too hard to think up a branch name (for long-running branches, it's actually a very good idea). But for branches that will only take a couple of minutes it's probably not necessary, and git's "you will now stop thinking about your code so you can think of a name for this 3-commit branch and you will like it because that's the way we do things here" attitude is pretty annoying.Act
@Steve Losh: You still have this commit in git log (or git log HEAD); for some reason --all doesn't include HEAD ref. And even if you checkout some other branch, there is always HEAD reflog (for 30 days). Also in Git you can name your branch e.g. 'tmp' or 'feature-a' or 'subsystem-b' (if you don't like using detached HEAD for some reason), and then delete it if this doesn't work (if it is there to stay you need to think some name anyway to later refer to it).Bushhammer
Jakub Narębski You still have it in git log until you check out something else, then it's gone unless you wrote down the revision hash or want to find it in .git/objects manually. The fact that it never appears in git log --all is kind of amusing. I would think log --all should always show an equal or greater number of revisions than log. And sure, you can name it a throwaway name and delete it later, but that's still doing unnecessary work just to please git. I don't like using tools that make me work around them, even if it's not a lot of work.Act
Sorry about abusing the comment system to hold a discussion. @Steve Losh You seem to have missed mentioning the reflog: even if you check out something else, you still have the commit in "git reflog" output; there is no need to write down the revision hash or create a branch or a tag. Besides, I don't see how having to name a branch (a name which you can change later, without any trace of the old name) is more work than e.g. trying to come up with a good commit message. As to git log --all: it means all references in the refs/ namespace, and HEAD is outside it - thanks for noticing this.Bushhammer
@Jakub Narębski I didn't know about git reflog, and yes, it will contain the hash, but you're still going to want to use git reflog | grep "message" or something if you want to find a specific commit. To me, reading that output is too much effort compared to hg heads. And sure, branch name effort <= commit message effort, but that doesn't really matter because naming a branch doesn't excuse you from writing a commit message: you need to do both. And yes, git log --all works how you say, but intuitively it seems wrong that using --all can result in less output.Act
@Steve Losh: Actually git log --all includes HEAD since git version 1.6.1.3 or later.Bushhammer
@Steve Losh your preference to 'hg heads' is personal ; many people would rather have identifiers in order to identify things (i.e. tmp-cleanup-x). However, you could implement a script that creates a 'head-n' branch, and another one (e.g. git heads) that shows those branches if that suits you. That's irrelevant to this discussion.Torpedoman
@Dustin: That's actually what rewriting history is for: rewriting the "oops" commits so that they are merged with the correct one. It's never about changing something big (except special cases like sensitive files submitted...).Carolanncarole
From @Steve Losh's post A Git User’s Guide to Mercurial Queues it sounds like mq with multiple queues offers slightly more flexible functionality than git does in terms of rewriting. If that is the case, I think it is worth adding to the post.Gratia
@Faheem: There are third party patch queue management interfaces for Git too: StGit, Guild, TopGit.Bushhammer
@Jakub: Ok, I see. You didn't talk about patch queues because they are not part of the main system, I suppose?Gratia
@Faheem: Yes, this is why I didn't talk about MQ (Mercurial Queues) extension (though I think it is built-in extension).Bushhammer
Mercurial also offers quite efficient dumb HTTP. See for example draketo.de/proj/hgsiteArawak
@SteveLosh: you seem to think having lots of anonymous branches in Mercurial is a good thing, but to me it seems horrible. How do you tell them all apart? And you seem to think naming branches in Git is some enormous difficulty, but if you have a purpose for creating the branch then you have a ready-made name. If you have no purpose, then don't branch. I fail to see how Mercurial offers any benefit here. I see only pain and confusion.Freeliving
I have no idea when the most recent present-giving celebratory event for you would've been, but at the same time, much like the little fix here, some things now isn't too late for. Enjoy~Alsatia
@Freeliving Unlike git, mercurial actually tracks diffs rather than commits. This means that mercurial actually tracks history, and thus does not need branch identifiers to identify what branch you are on. You can see what anonymous branch you are on by looking at the branch topology and the modified file list as well as by the commit comment; in git, the branch topology and the modified file list are not available, except perhaps by reconstruction, so you more often need a branch name to supplement the commit comment.Hoag
To continue my comment, I've used mercurial extensively and never felt a need for branch names except when the name needed to be communicated across the team, such as for release branches for the project. In those cases, it's actually a big plus that the name is the same for everyone.Hoag
@WarrenDew Saying "the branch topology and the modified file list are not available, except perhaps by reconstruction" makes no sense to me. There are tons of options to git log, so you can see the list of changed files with or without diffs and pretty anything I could think of. Options like -p, --name-status, --graph should cover everything you listed. And yes, the diffs get computed on the fly, and no, the parents are stored in each commit. It's slower than showing stored diffs, but it produces 100 diff lines per millisecond (time -p git log -p | wc -l 1868268 real 1.89).Labrie
H
58

I think you can get a feel for how these systems are similar and how they differ by watching these two videos:

Linus Torvalds on Git (http://www.youtube.com/watch?v=4XpnKHJAok8)
Bryan O'Sullivan on Mercurial (http://www.youtube.com/watch?v=JExtkqzEoHY)

Both are very similar in design but very different in implementation.

I use Mercurial. As far as I understand Git, one major difference is that it tracks the contents of files rather than the files themselves. Linus says that if you move a function from one file to another, Git will tell you the history of that single function across the move.
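One way to picture "tracks contents, not files" is how Git names a file's contents. The sketch below is a minimal illustration in Python, assuming a SHA-1 repository: a blob's id is the hash of a short header plus the bytes, and the path plays no part in it. The file names are hypothetical, and this only shows object naming; it does not reproduce how Git follows a function across files.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id Git gives a blob with these bytes (SHA-1 repositories)."""
    # Git hashes "blob <size>\0" followed by the raw content; no path is involved.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical working tree, used only for illustration.
working_tree = {
    "src/util.c": b"hello\n",
    "lib/helpers.c": b"hello\n",  # same bytes stored under a different path
    "README": b"",
}

ids = {path: git_blob_id(data) for path, data in working_tree.items()}
for path, oid in ids.items():
    print(oid, path)
```

The two paths with identical bytes get the same id, so a file that is moved or copied without changes is the same object as far as the store is concerned.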

They also say that Git is slower over HTTP, but it has its own network protocol and server.

Git works better as an SVN thick client than Mercurial. You can pull and push against an SVN server. This functionality is still under development in Mercurial.

Both Mercurial and Git have very nice web hosting solutions available (BitBucket and GitHub), but Google Code supports Mercurial only. By the way, they have a very detailed comparison of Mercurial and Git they did for deciding which one to support (http://code.google.com/p/support/wiki/DVCSAnalysis). It has a lot of good info.

Hayton answered 21/10, 2009 at 5:4 Comment(6)
I'd recommend reading all of the comments on that google code page. The information does feel somewhat biased and doesn't match my experience well. I like hg, and used it extensively for a year or so. I use git almost exclusively now. There are things I need to accomplish that git makes easy and hg makes nearly impossible (though some like to call this by means of "complication.") Basic git is as easy as base hg.Huntress
Dustin, maybe list some of those "git easy, hg not so much" cases?Strove
@knittl no it doesn't. Mainly because it would be a pain for them to deploy it since git lacks a smart http protocol (most of Google front-ends are http-based).Larondalarosa
@tonfa: Smart HTTP protocol for Git is currently being developed (as in: there are patches on git mailing list, and they are in 'pu' = proposed updates branch in git.git repository).Bushhammer
@Jakub is the protocol described somewhere? We might do major changes in hg protocol so it might be helpful to see what has been done elsewhere.Larondalarosa
@tonfa: Search for "smart HTTP" (or "Return of smart HTTP") in the git mailing list archives. For example, the latest protocol description is here: thread.gmane.org/gmane.comp.version-control.git/129732/…Bushhammer
A
27

I use both quite regularly. The major functional difference is in the way Git and Mercurial name branches within repositories. With Mercurial, branch names are cloned and pulled along with their changesets. When you add changes to a new branch in Mercurial and push to another repository, the branch name is pushed at the same time. So, branch names are more-or-less global in Mercurial, and you have to use the Bookmark extension to have local-only lightweight names (if you want them; Mercurial, by default, uses anonymous lightweight codelines, which in its terminology are called "heads"). In Git, branch names and their injective mapping to remote branches are stored locally and you must manage them explicitly, which means knowing how to do that. This is pretty much where Git gets its reputation for being harder to learn and use than Mercurial.
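The difference can be sketched with a toy model in Python. Everything here (class names, repository names, ids) is hypothetical and greatly simplified: the Mercurial-like changeset carries its branch name with it, while the Git-like commit carries none, and branch names live in per-repository tables that a fetch only mirrors under a remote prefix.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class HgChangeset:
    node: str
    parent: Optional[str]
    branch: str = "default"  # the branch name is recorded in the changeset


@dataclass
class HgRepo:
    changesets: Dict[str, HgChangeset] = field(default_factory=dict)

    def commit(self, node, parent, branch="default"):
        self.changesets[node] = HgChangeset(node, parent, branch)

    def pull(self, other):
        # Changesets arrive with their branch names attached.
        self.changesets.update(other.changesets)

    def branch_names(self):
        return sorted({c.branch for c in self.changesets.values()})


@dataclass(frozen=True)
class GitCommit:
    oid: str
    parent: Optional[str]  # no branch name is stored in the commit


@dataclass
class GitRepo:
    commits: Dict[str, GitCommit] = field(default_factory=dict)
    branches: Dict[str, str] = field(default_factory=dict)         # local name -> commit id
    remote_tracking: Dict[str, str] = field(default_factory=dict)  # "remote/name" -> commit id

    def commit(self, oid, parent, branch):
        self.commits[oid] = GitCommit(oid, parent)
        self.branches[branch] = oid  # a branch is just a movable pointer

    def fetch(self, other, remote_name):
        # Commits are copied; the other side's branches are only mirrored
        # under the remote's prefix, and local branches stay untouched.
        self.commits.update(other.commits)
        for name, oid in other.branches.items():
            self.remote_tracking[f"{remote_name}/{name}"] = oid


# Hypothetical repositories and identifiers.
hg_alice, hg_bob = HgRepo(), HgRepo()
hg_alice.commit("a1", None)
hg_alice.commit("a2", "a1", branch="feature-x")
hg_bob.pull(hg_alice)

git_alice, git_bob = GitRepo(), GitRepo()
git_alice.commit("c1", None, "master")
git_alice.commit("c2", "c1", "feature-x")
git_bob.fetch(git_alice, "origin")

print(hg_bob.branch_names())
print(sorted(git_bob.branches))
print(sorted(git_bob.remote_tracking))
```

In this model the puller on the Mercurial side sees the branch name immediately, while on the Git side a local branch still has to be created and mapped by hand, which is the bookkeeping described above.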

As others will note here, there are lots and lots of minor differences. The thing with the branches is the big differentiator.

Arzola answered 21/10, 2009 at 5:26 Comment(1)
See also this post for a good explanation about the four kinds of branches in Mercurial: stevelosh.com/blog/entry/2009/8/30/…Limner
F
12

Everything I read said that Mercurial is easier, and I still believe that is the general opinion of the internet community. But when I started working with both Git and Mercurial (I began with Mercurial through TortoiseHg), I found Git relatively simpler to adapt to from the command line, mainly because the Git commands seemed appropriately named to me and are fewer in number. Mercurial has a differently named command for each distinct job, while Git commands can be multipurpose depending on the situation (e.g., checkout). Git was harder back then, but now the difference is hardly substantial. YMMV.

With a good GUI client like TortoiseHg, it is true that Mercurial was much easier to work with, and I did not have to remember the slightly confusing commands. I'm not going into detail about how every command for the same action differs, but here are two comprehensive lists: the first from Mercurial's own site and the second from wikivs.

╔═════════════════════════════╦════════════════════════════════════════════════════════════════════════════════════════════════╗
║           Git               ║                Mercurial                                                                       ║
╠═════════════════════════════╬════════════════════════════════════════════════════════════════════════════════════════════════╣
║ git pull                    ║ hg pull -u                                                                                     ║
║ git fetch                   ║ hg pull                                                                                        ║
║ git reset --hard            ║ hg up -C                                                                                       ║
║ git revert <commit>         ║ hg backout <cset>                                                                              ║
║ git add <new_file>          ║ hg add <new_file> (Only equivalent when <new_file> is not tracked.)                            ║
║ git add <file>              ║ Not necessary in Mercurial.                                                                    ║
║ git add -i                  ║ hg record                                                                                      ║
║ git commit -a               ║ hg commit                                                                                      ║
║ git commit --amend          ║ hg commit --amend                                                                              ║
║ git blame                   ║ hg blame or hg annotate                                                                        ║
║ git blame -C                ║ (closest equivalent): hg grep --all                                                            ║
║ git bisect                  ║ hg bisect                                                                                      ║
║ git rebase --interactive    ║ hg histedit <base cset> (Requires the HisteditExtension.)                                      ║
║ git stash                   ║ hg shelve (Requires the ShelveExtension or the AtticExtension.)                                ║
║ git merge                   ║ hg merge                                                                                       ║
║ git cherry-pick <commit>    ║ hg graft <cset>                                                                                ║
║ git rebase <upstream>       ║ hg rebase -d <cset> (Requires the RebaseExtension.)                                            ║
║ git format-patch <commits>  ║ hg email -r <csets> (Requires the PatchbombExtension.)                                         ║
║   and git send-email        ║                                                                                                ║
║ git am <mbox>               ║ hg mimport -m <mbox> (Requires the MboxExtension and the MqExtension. Imports patches to mq.)  ║
║ git checkout HEAD           ║ hg update                                                                                      ║
║ git log -n                  ║ hg log --limit n                                                                               ║
║ git push                    ║ hg push                                                                                        ║
╚═════════════════════════════╩════════════════════════════════════════════════════════════════════════════════════════════════╝

Git saves a record of every version of committed files internally, while Hg saves just the changesets, which can have a smaller footprint. Git makes it easier to change history compared to Hg, but then again it's a hate-it-or-love-it feature. I like Hg for the former and Git for the latter.
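The storage trade-off can be pictured with a toy comparison in Python between keeping every version in full and keeping one full version plus line-level deltas. This is only a sketch under simplifying assumptions, with hypothetical helper names, and it is not either tool's real on-disk format: Git's packfiles also delta-compress objects, and Mercurial's revlog stores full versions periodically alongside its deltas.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as copies of line ranges from `old` plus inserted lines."""
    ops = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))         # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("insert", new[j1:j2]))   # store only the new lines
        # "delete" needs no entry: the deleted lines are simply not copied
    return ops


def apply_delta(old, delta):
    """Rebuild the newer version from the older one and a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


# Three hypothetical versions of one file, as lists of lines.
versions = [
    ["import sys", "print('hi')"],
    ["import sys", "print('hi')", "sys.exit(0)"],
    ["import sys", "print('hello')", "sys.exit(0)"],
]

# Snapshot store: every version kept whole.
snapshot_store = [list(v) for v in versions]

# Delta store: first version whole, then one delta per later version.
delta_store = [list(versions[0])] + [
    make_delta(prev, cur) for prev, cur in zip(versions, versions[1:])
]


def checkout_from_deltas(store, index):
    """Replay deltas from the base version up to the requested one."""
    lines = list(store[0])
    for delta in store[1:index + 1]:
        lines = apply_delta(lines, delta)
    return lines


restored = [checkout_from_deltas(delta_store, i) for i in range(len(versions))]
print(restored == snapshot_store)
```

The snapshot store answers any checkout with a single lookup, while the delta store has to replay a chain; that is the usual space-versus-time trade-off between the two approaches.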

What I miss in Hg is the submodule feature of Git. Hg has subrepos but that's not exactly Git submodule.

The ecosystem around the two can also influence one's choice: Git is clearly the more popular of the two (but that's a trivial point), Git has GitHub while Mercurial has BitBucket, and Mercurial has TortoiseHg, for which I haven't seen an equally good equivalent for Git.

Each has its advantages and disadvantages; with either of them you're not going to lose.

Forbore answered 26/2, 2013 at 12:25 Comment(0)
K
11

Mercurial is written almost entirely in Python. Git's core is written in C (and should be faster than Mercurial's), while its tools are written in sh, Perl, and Tcl and rely on standard GNU utilities. Git therefore has to bring all these utilities and interpreters along on systems that don't include them (e.g. Windows).

Both can work with SVN, although AFAIK SVN support is broken for Git on Windows (maybe I am just unlucky/lame, who knows). There are also extensions that allow Git and Mercurial to interoperate.

Mercurial has nice Visual Studio integration. Last time I checked, the plugin for Git was working but extremely slow.

Their basic command sets are very similar (init, clone, add, status, commit, push, pull, etc.), so the basic workflow will be the same. There are also TortoiseSVN-like clients for both.

Extensions for Mercurial can be written in python (no surprise!) and for git they can be written in any executable form (executable binary, shell script etc). Some extensions are crazy powerful, like git bisect.

Knepper answered 21/10, 2009 at 5:13 Comment(6)
Mercurial core is written in C too FYI (but it's probably a smaller core than git).Larondalarosa
I use git-svn on Windows without any trouble. That's using Cygwin (the only right way to use git on Windows if you ask me). Can't speak for msysgit.Horsehair
@Dan Moulding: Yes, I've experienced problems with msysgit. Maybe I need to give the cygwin port a try (I had some poor experience using cygwin earlier, so I avoided it). Thanks for the advice!Knepper
I personally dislike cygwin's intrusion into the registry to store user data. It's a PITA to get it to run off USB key and keep a local c:\ drive copy synchronized for when I want to run faster than my USB key can go. :-/Ics
I use the Git plugin for Visual Studio mentioned above, and the performance of the current version is good. It shells out to the command-line tools to do the work, so I don't think that it will significantly lose performance on large projects.Whim
@Chris Kaminski, subst your actual installation place to "W:\" and install Cygwin to W:\. This will make it relocatable.Nadinenadir
P
11

If you need good Windows support, you might prefer Mercurial. TortoiseHg (a Windows Explorer plugin) manages to offer a simple-to-use graphical interface to a rather complex tool. As stated here, you will also have a Visual Studio plugin. However, the last time I tried, the SVN interface didn't work that well on Windows.

If you don't mind the command line interface, I would recommend Git. Not for a technical reason but for a strategic one. The adoption rate of Git is much higher. Just see how many famous open source projects are switching from cvs/svn to Mercurial and how many are switching to Git. See how many code/project hosting providers you can find with Git support compared to Mercurial hosting.

Prerogative answered 21/10, 2009 at 23:50 Comment(2)
There is also TortoiseGit, if you don't like using the command line. (But it requires msysgit to be installed.)Admeasure
Our company ended up choosing git because of its great support on Windows - check out Git Extensions. I'm biased because I'm now a contributor, but I wasn't when we started using it.Milliemillieme
H
8

Check out Scott Chacon's post from a while back.

I think git has a reputation for being "more complicated", though in my experience it's not more complicated than it needs to be. IMO, the git model is way easier to understand (tags point to commits; commits point to trees and to zero or more parent commits; trees contain blobs and other trees... done).
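That model is small enough to write down. Here is a rough sketch in Python with hypothetical class names; real Git links objects by hash rather than by direct reference, and an annotated tag carries more metadata than shown here.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Blob:
    data: bytes  # file contents only, no name


@dataclass(frozen=True)
class Tree:
    # (name, object) pairs; an entry is either a file (Blob) or a subdirectory (Tree)
    entries: Tuple[Tuple[str, Union["Tree", Blob]], ...]


@dataclass(frozen=True)
class Commit:
    tree: Tree
    parents: Tuple["Commit", ...]  # zero for a root commit, two or more for a merge
    message: str


@dataclass(frozen=True)
class Tag:
    name: str
    target: Commit


def list_paths(tree, prefix=""):
    """Walk a tree recursively and return the full paths of all blobs."""
    paths = []
    for name, obj in tree.entries:
        if isinstance(obj, Tree):
            paths.extend(list_paths(obj, prefix + name + "/"))
        else:
            paths.append(prefix + name)
    return paths


# Hypothetical two-commit history with a tag on the second commit.
readme = Blob(b"hello\n")
main_c = Blob(b"int main(void) { return 0; }\n")
tree1 = Tree((("README", readme),))
tree2 = Tree((("README", readme), ("src", Tree((("main.c", main_c),)))))
root = Commit(tree1, (), "initial commit")
second = Commit(tree2, (root,), "add src/main.c")
v1 = Tag("v1.0", second)

paths = list_paths(v1.target.tree)
print(paths)
```

Note that the unchanged README blob is shared by both trees, which is how unchanged files cost nothing extra from one commit to the next in this model.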

It's not just my experience that git is not more confusing than mercurial. I'd recommend again reading this blog post from Scott Chacon on the matter.

Huntress answered 21/10, 2009 at 5:51 Comment(5)
Mercurial model is actually almost identical: changelog points to manifest point to file revisions/blob... done. If you were comparing the on-disk format you probably didn't account for the packs file which are more tricky to explain than the simple revlog format from hg.Larondalarosa
Well, that simplified model ignores tagging which is considerably clunkier in practice in hg (though I do argue that git tag is a bit confusing because it doesn't create a tag object by default). The on-disk format was particularly expensive for both projects that had a history of a lot of filenames.Huntress
I don't think the model ignores tagging: tagging is trivial in Mercurial -- as you know, it's just a file which gives names to SHA-1 hashes. There's not guesswork as to how tags flow around in the system: they move along with pushes and pulls. And if there is a tag conflict, well then it's also trivial to solve it: you solve it like any other conflict. After all, it's just a line in a text file. I think the simplicity of this model is a very nice feature.Limner
Dustin: Yeah, users are often confused by the fact that you cannot see the 1.0 tag in .hgtags when you've checked out revision 1.0. However, you don't need to look inside .hgtags, and you'll find that hg tags still lists all tags. Furthermore, this behavior is a simple consequence of storing tags in a version controlled file -- again the model is easy to grasp and very predictable.Limner
Martin Geisler I'd argue that the rules for tags in Mercurial, required because it uses a version-controlled file for transport, with a layer of special rules to make tags non-versioned, are anything but easy to grasp.Bushhammer
H
8

I've used Git for a little over a year at my present job, and prior to that, used Mercurial for a little over a year at my previous job. I'm going to provide an evaluation from a user's perspective.

First, both are distributed version control systems. Distributed version control systems require a change in mindset from traditional version control systems, but actually work much better in many ways once one understands them. For this reason, I consider both Git and Mercurial much superior to Subversion, Perforce, etc. The difference between distributed version control systems and traditional version control systems is much larger than the difference between Git and Mercurial.

However, there are also significant differences between Git and Mercurial that make each better suited to its own subset of use cases.

Mercurial is simpler to learn. I got to the point where I rarely had to refer to documentation or notes after a few weeks of using Mercurial; I still have to refer to my notes regularly with Git, even after using it for a year. Git is considerably more complicated.

This is partly because Mercurial is just plain cleaner. You rarely have to branch manually in Mercurial; Mercurial will create an anonymous branch automatically for you if and when you need it. Mercurial nomenclature is more intuitive; you don't have to worry about the difference between "fetch" and "pull" as you do with Git. Mercurial is a bit less buggy. There are file name case sensitivity issues that used to cause problems when pushing projects across platforms with both Git and Mercurial; these were fixed in Mercurial some time ago, while they hadn't been fixed in Git last I checked. You can tell Mercurial about file renames; with Git, if it doesn't detect the rename automatically (a very hit-or-miss proposition in my experience), the rename can't be tracked at all.

The other reason for Git's additional complication, however, is that much of it is needed to support additional features and power. Yes, it's more complicated to handle branching in Git - but on the other hand, once you have the branches, it's not too difficult to do things with those branches that are virtually impossible in Mercurial. Rebasing branches is one of these things: you can move your branch so that its base, instead of being the state of the trunk when you branched, is the state of the trunk now; this greatly simplifies version history when there are many people working on the same code base, since each of the pushes to trunk can be made to appear sequential, rather than intertwined. Similarly, it's much easier to collapse multiple commits on your branch into a single commit, which can again help in keeping the version control history clean: ideally, all the work on a feature can appear as a single commit in trunk, replacing all the minor commits and subbranches that the developer may have made while developing the feature.
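To make rebasing and collapsing concrete, here is a toy model in Python. It is a sketch under heavy assumptions and does not reflect Git's internals: each commit stores whole-file replacements, conflicts are ignored entirely, and all names (`Commit`, `rebase`, `squash`, the file names) are hypothetical.

```python
class Commit:
    def __init__(self, parent, changes, message):
        self.parent = parent      # previous commit, or None for the root
        self.changes = changes    # {path: new content} introduced by this commit
        self.message = message


def log(tip):
    """Commit messages from tip back to the root, newest first."""
    messages = []
    while tip is not None:
        messages.append(tip.message)
        tip = tip.parent
    return messages


def snapshot(tip):
    """Files as seen at `tip`: replay every commit's changes from the root."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = tip.parent
    files = {}
    for commit in reversed(chain):
        files.update(commit.changes)
    return files


def commits_since(base, tip):
    """Commits reachable from `tip` but not from `base`, oldest first."""
    ancestors = set()
    while base is not None:
        ancestors.add(base)
        base = base.parent
    found = []
    while tip is not None and tip not in ancestors:
        found.append(tip)
        tip = tip.parent
    return list(reversed(found))


def rebase(tip, onto):
    """Replay the branch's commits, one by one, on top of `onto`."""
    new_tip = onto
    for commit in commits_since(onto, tip):
        new_tip = Commit(new_tip, commit.changes, commit.message)
    return new_tip


def squash(tip, onto, message):
    """Collapse the branch's commits into a single commit on top of `onto`."""
    combined = {}
    for commit in commits_since(onto, tip):
        combined.update(commit.changes)
    return Commit(onto, combined, message)


# Hypothetical history: a feature branch forked from root, then trunk moved on.
root = Commit(None, {"app.py": "v1"}, "root")
f1 = Commit(root, {"feature.py": "draft"}, "F1")
f2 = Commit(f1, {"feature.py": "done"}, "F2")
t1 = Commit(root, {"app.py": "v2"}, "T1")

rebased = rebase(f2, t1)
squashed = squash(f2, t1, "add feature")

print(log(rebased))
print(log(squashed))
```

After the rebase the feature commits sit on top of the latest trunk commit in a straight line, and after the squash the whole feature appears as one commit; the original `f1` and `f2` objects are left untouched, which mirrors the fact that rewriting history creates new commits rather than editing old ones.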

Ultimately I think the choice between Mercurial and Git should depend on how large your version control projects are, measured in terms of the number of people working on them simultaneously. If you have a group of a dozen or more working on a single monolithic web application, for example, Git's more powerful branch management tools will make it a much better fit for your project. On the other hand, if your team is developing a heterogeneous distributed system, with only one or two developers working on any one component at any one time, using a Mercurial repository for each of the component projects will allow development to proceed more smoothly with less repository management overhead.

Bottom line: if you have a big team developing a single huge application, use Git; if your individual applications are small, with any scale coming from the number rather than the size of such applications, use Mercurial.

Hoag answered 11/5, 2014 at 5:1 Comment(0)
M
4

One difference totally unrelated to the DVCSs themselves:

Git seems to be very popular with C developers. Git is the de facto version control system for the Linux kernel, and this may be the reason why it is so popular with C developers. This is especially true for those who have the luxury of only working in the Linux/Unix world.

Java developers seem to favor Mercurial over Git. There are possibly two reasons for that: One is that a number of very large Java projects are hosted on Mercurial, including the JDK itself. Another is that the structure and clean documentation of Mercurial appeals to people coming from the Java camp whereas such people find Git inconsistent wrt command naming and lacking in documentation. I'm not saying that is actually true, I'm saying people have got used to something from their usual habitat and then they tend to choose DVCS from that.

Python developers almost exclusively favor Mercurial, I would assume. There's actually no rational reason for that other than the fact that Mercurial is based on Python. (I use Mercurial too and I really don't understand why people make a fuss about the implementation language of the DVCS. I don't understand a word of Python and if it wasn't for the fact that it is listed somewhere that it is based on Python then I wouldn't have known).

I don't think you can say that one DVCS fits a language better than another, so you shouldn't choose from that. But in reality people choose (partly) based on which DVCS they get most exposed to as part of their community.

(nope, I don't have usage statistics to back up my claims above .. it is all based on my own subjectivity)

Mckown answered 28/5, 2013 at 11:21 Comment(0)
