Why doesn't Mercurial need a "recursive merge strategy"?

Asked 2/5, 2011 at 10:12 Answered 9/4, 2019 at 21:50

AFAIK git's default merge strategy is "recursive", which means that when more than one "common ancestor" ends up being a "good candidate", git merges them and creates a new "virtual common ancestor" for the contributors. This helps in situations where files were already merged: it avoids merging them again or coming up with incorrect merge contributors.

My question is: if Mercurial doesn't use "recursive", how does it handle the same situation?

Thanks

Ocasio answered 2/5, 2011 at 10:12 Comment(9)

How is this different from just obtaining the last common snapshot as the common ancestor? – Neuburger 2/5, 2011 at 10:15

@lasse: this is different as it produces different merge resolutions, that are hopefully better – Uthrop 2/5, 2011 at 10:39

I haven't actually tried it, but I overheard last weekend at the Mercurial sprint that recursive can create mismerges as well (and it has happened in the past). – Arellano 2/5, 2011 at 11:28

What I meant was; Mercurial already avoids re-merging things it has already merged by finding the last common ancestor of the two branches being merged. So if you already have 2 heads + 1 common ancestor, what would having extra ancestors give you? Presumably, the changes of those other ancestors are already present in the common ancestor. – Neuburger 2/5, 2011 at 12:42

@lasse v. karlsen: recursive merge is supposed to avoid repeating the same merge over and over in case of criss-cross merges. – Arellano 2/5, 2011 at 12:43

I have tried to find a conclusive example of how a criss-cross merge looks, ie. a real file. Could you give me one? Forgive me for being dense, but I'm still learning all the nuances of DVCS', and I'd like to learn about this scenario, how to avoid it, detect it, and/or resolve it. – Neuburger 2/5, 2011 at 19:57

I have been a lead dev on an internal source control system for Oracle. So we have been solving this problem and it is one of the unsolvable problems in single merge. The recursive merge idea is only possible if every one of the merges you need to do is an automatic merge. But in certain trees, to do a cross-merge of a branch, you have to do N 3-way merges. If even one of those is not a good candidate for automatic merge, you end up with errors. The way git does it, like so many other things, is wrong and leads to missing code chunks in certain edge cases. – Apollo 8/10, 2016 at 4:58

@JiriKlouda Do you still think that Git handles this situation not very well? Can you give examples of Git showing where it fails to handle this problem? Thanks – Ichthyosaur 21/12, 2021 at 11:51

Git cannot handle the situation any better because it does not have data structures that would record information needed to compute which version is the correct base for merge. Further git has a mechanism which can completely remove the correct base version for merge from its version history making it impossible to identify it as it does not even exist. Situation with versioning systems is not even a tiny bit better than when I made my comment. There is zero development in this area on theoretical level. And that has a direct cost in billions of dollars to Fortune 500. – Apollo 21/12, 2021 at 19:7

Most version control systems do not know how to handle a situation where there are multiple base versions for a merge. The merge equation is

Result = Destination + SumOf(I=1..N)(Source(I) - Base(I))

In most cases N=1 and you get a classic merge with source, destination and base versions, which a typical 3-way merge tool can handle. Even in this simple case, though, many source control systems lack a correct algorithm for finding the base version. To find it, you need to trace back through the version tree, following the merge arrows, until the two sides meet at a common ancestor. But sometimes that common ancestor is too far back to fit the equation above with N=1, and in that case you need to find multiple common ancestors for multiple partial merges.

An example would be a case where a branch is merged down and up multiple times, and then we try to cross-merge the changes from this branch to another branch. In such a case N > 1, but lower than the number of merge-downs on the source branch.

This is one of the hardest things to do in branch merging and I don't know of a source control system that actually does it correctly.

Apollo answered 17/5, 2011 at 10:9 Comment(1)

Several years have passed since your answer. In your view, which source control system handles this problem best today, and why? Thanks. – Ichthyosaur 21/12, 2021 at 11:49

The original author of Mercurial wrote about why he didn't use a recursive merge strategy (link). Basically, the answer is:

For the cases where ancestor ambiguity is the most interesting [...] recursive merges don't help at all. So I don't think they warrant the extra complexity

But the full answer is really interesting to read, so I encourage you to do so. I'll copy it here in case it disappears:

> Does Mercurial support a recursive merge strategy like git? It is used
> in situations when a merge has two "common" ancestors (also known as a
> criss-cross merge).
>
> According to http://codicesoftware.blogspot.com/2011/09/merge-recursive-strategy.html
> Mercurial does not support it, but I wanted to ask to make sure that
> nothing has changed.

Indeed. But you shouldn't judge the situation from this blog post as
it's not coherent.

In particular, the example given under "Why merge recursive is better –
a step by step example" doesn't appear to be a recursive merge situation
at all! Notice the key difference in topology as compared with the
initial diagrams: no criss-crossing merges leading up to the merge. Some
kind of bait and switch happening here.

In the example itself, Git will choose the same (single) ancestor in a
merge between nodes 5 and 4 as Mercurial would, 0. And thus both give
the result 'bcdE'. So we've learned precisely nothing about recursive
merge and how it compares to Mercurial from this example. The claim that
Mercurial chooses the "deepest" ancestor: also wrong and nonsensical.
The deepest ancestor is the root.

This seems to be yet another instance of "Git is incomprehensible,
therefore Git is magic, therefore Git magically works better" logic at
work.

Let's _actually_ work his original example diagram which has the
criss-crossing merges (which I guess he copied from someone who knew
what they were talking about). I'm going to ignore the blogger's
nonsensical use of arrows that point the wrong way for branch merges and
thus add cycles into the "directed acyclic graph". Here history flows
from left to right, thus the edges are right to left:

a---b-d-f---?
 \   \ /   / 
  \   X   /
   \ / \ /
    c-e-g

Let's make up a simple set of changes to go with that picture. Again,
think of each character as a line:

a = "a"
b = "a1"
c = "1a"
d = "a2"
e = "2a"
f = merge of d and c = "1a2" 
g = merge of e and b = "2a1"

When we merge f and g, our greatest common ancestor is either b or c. So
we've got the following cases:

b: we had a1 originally, and are looking at 1a2 and 2a1. So we have a
conflict at the start, but can simply choose 2 for the end as only one
side touched the end.

c: we had 1a originally, and are looking at 1a2 and 2a1. So we have a
conflict at the end, but can simply choose 2 for the start as only one
side touched the start.

Mercurial will choose whichever one of these it finds first, so we have
one conflict to resolve. It definitely does not choose 'a' as the
ancestor, which would give two conflicts.
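
To make the ancestor ambiguity concrete, here is a minimal Python sketch that encodes the graph drawn above as parent links and lists the common ancestors of f and g that are not themselves ancestors of another common ancestor. The names `PARENTS`, `ancestors` and `merge_base_candidates` are made up for this illustration, and the brute-force search is an assumption chosen for readability; it is not how Mercurial or Git actually search for ancestors.

```python
# Parent links for the criss-cross graph drawn above (child -> parents).
# Illustrative only: names and the brute-force approach are assumptions.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b"],
    "e": ["c"],
    "f": ["d", "c"],  # f = merge of d and c
    "g": ["e", "b"],  # g = merge of e and b
}


def ancestors(node, parents=PARENTS):
    """Return the node itself plus everything reachable through parent links."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_base_candidates(left, right, parents=PARENTS):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(left, parents) & ancestors(right, parents)
    return {
        node
        for node in common
        if not any(
            other != node and node in ancestors(other, parents)
            for other in common
        )
    }


print(sorted(merge_base_candidates("f", "g")))  # ['b', 'c']
```

Both b and c come out as equally good candidates, while a is dropped because it is an ancestor of both of them. That is the ambiguity the text describes.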

Now what a recursive merge would do would be merging b and c first,
giving us "1a1". So now when we merge, we don't have conflicts at the
front or the back.
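
The cases above can be replayed with a toy model. This is only a sketch under a strong simplifying assumption: each version is a (start, middle, end) tuple and the merge is done slot by slot, which is much cruder than a real line-based merge tool. `merge3` and `CONFLICT` are hypothetical names, not part of Mercurial or Git.

```python
CONFLICT = "?"  # marker for a slot both sides changed in different ways


def merge3(base, left, right):
    """Slot-wise three-way merge of (start, middle, end) tuples.

    Returns the merged tuple and the number of conflicting slots.
    """
    merged = []
    conflicts = 0
    for base_slot, left_slot, right_slot in zip(base, left, right):
        if left_slot == right_slot:
            merged.append(left_slot)   # both sides agree
        elif left_slot == base_slot:
            merged.append(right_slot)  # only the right side changed
        elif right_slot == base_slot:
            merged.append(left_slot)   # only the left side changed
        else:
            merged.append(CONFLICT)    # both changed, differently
            conflicts += 1
    return tuple(merged), conflicts


# Versions from the example; an empty string means "no line there".
a = ("", "a", "")
b = ("", "a", "1")
c = ("1", "a", "")
f = ("1", "a", "2")
g = ("2", "a", "1")

# Single real ancestor, as in the cases above.
for name, base in [("a", a), ("b", b), ("c", c)]:
    result, conflicts = merge3(base, f, g)
    print(name, "".join(result), conflicts)

# Recursive-style: merge b and c first (base a), then use the
# result as a virtual ancestor for merging f and g.
virtual, _ = merge3(a, b, c)
result, conflicts = merge3(virtual, f, g)
print("virtual", "".join(virtual), "".join(result), conflicts)
```

In this model, base b conflicts only at the start, base c only at the end, base a at both, and the virtual ancestor "1a1" yields "2a2" with no conflicts, matching the walkthrough.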

So yay, in this simplest of examples, it's a win. But cases where this
actually matters aren't terribly common (let's call it 1% to be
generous) and cases where it actually automatically solves the problem
for you seamlessly are actually less than half of THOSE cases.

Instead, if you've got conflicts in your recursive merge, now you've
made the whole situation more confusing. Take your blog post as Exhibit
A that most people don't understand recursive merge at all which means
when a merge goes wrong, not only do you need an expert to diagnose it,
you need an expert to tell you who the 'experts' even are.

We talk about recursive merge occasionally. But as it happens, for the
cases where ancestor ambiguity is the most interesting (merging with
backouts, exec bit changes), recursive merges don't help at all. So I
don't think they warrant the extra complexity.

Lemke answered 9/4, 2019 at 21:50 Comment(0)
