How does the patience algorithm differ from the default git diff
algorithm, and when would I want to use it?
You can read a post from Bram Cohen, the author of the patience diff algorithm, but I found this blog post to summarize the patience diff algorithm very well:
Patience Diff, instead, focuses its energy on the low-frequency high-content lines which serve as markers or signatures of important content in the text. It is still an LCS-based diff at its core, but with an important difference, as it only considers the longest common subsequence of the signature lines:
Find all lines which occur exactly once on both sides, then do longest common subsequence on those lines, matching them up.
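To make that step concrete, here is a minimal Python sketch of the anchor-finding pass only. It is not git's implementation: the function name `unique_common_anchors` and the sample lines are made up for illustration. A full patience diff would then recurse into the regions between anchors and fall back to an ordinary diff where no unique lines remain.

```python
from bisect import bisect_left
from collections import Counter


def unique_common_anchors(old, new):
    """Return (old_index, new_index) pairs for the anchor lines.

    Anchors are lines that occur exactly once in `old` and exactly once
    in `new`, restricted to the longest set that keeps the same relative
    order on both sides.
    """
    old_counts = Counter(old)
    new_counts = Counter(new)

    # Position in `new` of every line that is unique on both sides.
    new_pos = {
        line: j
        for j, line in enumerate(new)
        if new_counts[line] == 1 and old_counts[line] == 1
    }

    # Candidate pairs, listed in `old` order.
    pairs = [(i, new_pos[line]) for i, line in enumerate(old) if line in new_pos]

    # Longest increasing subsequence of the new-side indices, found with
    # the patience-sorting technique: tails[k] is the index into `pairs`
    # of the smallest possible last element of an increasing run of
    # length k + 1.
    tails = []
    tail_values = []  # new-side index of each tail, kept sorted for bisect
    prev = [None] * len(pairs)
    for idx, (_, j) in enumerate(pairs):
        k = bisect_left(tail_values, j)
        if k > 0:
            prev[idx] = tails[k - 1]
        if k == len(tails):
            tails.append(idx)
            tail_values.append(j)
        else:
            tails[k] = idx
            tail_values[k] = j

    # Walk the back-pointers from the end of the longest run.
    result = []
    idx = tails[-1] if tails else None
    while idx is not None:
        result.append(pairs[idx])
        idx = prev[idx]
    result.reverse()
    return result


old = ["{", "alpha", "}", "{", "beta", "}"]
new = ["{", "alpha", "}", "{", "gamma", "}", "{", "beta", "}"]
print(unique_common_anchors(old, new))
# [(1, 1), (4, 7)]
```

The braces never become anchors because they repeat on both sides; only `alpha` and `beta` do, and that is what keeps the match from sliding onto the wrong pair of brackets.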
When should you use patience diff? According to Bram, patience diff is good for this situation:
The really bad cases are ones where two versions have diverged dramatically and the developer isn't being careful to keep patch sizes under control. Under those circumstances a diff algorithm can occasionally become 'misaligned' in that it matches long sections of curly brackets together, but it winds up correlating the curly brackets of functions in one version with the curly brackets of the next later function in the other version. This situation is very ugly, and can result in a totally unusable conflict file in the situation where you need such things to be presented coherently the most.
You can also use it for merges (worked really well here for some XML conflicts):
git merge --strategy-option=patience ...
git config --global diff.algorithm patience – Cambogia
git merge -X patience. – Doggone
git merge -Xpatience – Renita
The patience diff algorithm is a slower diff algorithm that shows better results in some cases.
Suppose you have the following file checked in to git:
.foo1 {
    margin: 0;
}

.bar {
    margin: 0;
}
Now we reorder the sections and add a new line:
.bar {
    margin: 0;
}

.foo1 {
    margin: 0;
    color: green;
}
The default diff algorithm claims that the section headings have changed:
$ git diff --diff-algorithm=myers
diff --git a/example.css b/example.css
index 7f1bd1e..6a64c6f 100755
--- a/example.css
+++ b/example.css
@@ -1,7 +1,8 @@
-.foo1 {
+.bar {
     margin: 0;
 }
 
-.bar {
+.foo1 {
     margin: 0;
+    color: green;
 }
Whereas patience diff shows a result that is arguably more intuitive:
$ git diff --diff-algorithm=patience
diff --git a/example.css b/example.css
index 7f1bd1e..6a64c6f 100755
--- a/example.css
+++ b/example.css
@@ -1,7 +1,8 @@
-.foo1 {
-    margin: 0;
-}
-
 .bar {
     margin: 0;
 }
+
+.foo1 {
+    margin: 0;
+    color: green;
+}
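As a rough illustration of why the two outputs differ, the following Python sketch lists the lines of the example that occur exactly once in both versions. The helper name `unique_on_both_sides` is hypothetical, and the file contents are reconstructed from the diff above (including the blank line between the two rules), so treat them as an assumption.

```python
from collections import Counter

# Old and new versions of example.css, reconstructed from the diff.
old = [".foo1 {", "    margin: 0;", "}", "", ".bar {", "    margin: 0;", "}"]
new = [
    ".bar {", "    margin: 0;", "}", "",
    ".foo1 {", "    margin: 0;", "    color: green;", "}",
]


def unique_on_both_sides(a, b):
    """Lines of `a` that occur exactly once in `a` and exactly once in `b`."""
    count_a, count_b = Counter(a), Counter(b)
    return [line for line in a if count_a[line] == 1 and count_b[line] == 1]


print(unique_on_both_sides(old, new))
# ['.foo1 {', '', '.bar {']
```

The repeated `margin: 0;` and `}` lines are not candidates, so the selectors are what the match is built around. Because the candidates appear in reverse order in the new file, only one of them can be kept in place, which fits the output above, where `.bar {` stays and the whole `.foo1` rule is shown as moved.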
There's a good discussion of subjective diff quality here, and git 2.11 is exploring diff heuristics further.
Note that the patience diff algorithm still has some known pathological cases.
--histogram parameter which "...extends the patience algorithm to 'support low-occurrence common elements'" (git-scm.com/docs/git-diff.html) – Pachysandra