How to make Google Diff Match Patch prefer changes at the end of a string?

I am using the diff_main method of Google's DiffMatchPatch library to get diffs which I then use in my app. Consider this case:

Old string:

Tracker.Dependency.prototype.changed = function () {
   for (var id in this._dependentsById)
     this._dependentsById[id]._compute();
};

New string:

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c;
};

The addition diff I get is:

;
};

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c

Whereas it would seem that for human consumption a more reasonable diff would be:

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c;
};

Is there any way I can make DiffMatchPatch produce the second result rather than the first?

You can see an example here: https://jsfiddle.net/puje78vL/1/

Hyperesthesia answered 3/2, 2016 at 18:6 Comment(2)
Want to share a fiddle? - Transduction
Sorry everyone for not updating this sooner with further details. I was very ill and didn't get to my laptop until now. - Hyperesthesia

I have created a JSFiddle based on the library author's example page (assuming that you want the JavaScript version, based on the question tag).

Using this code gives me the result you expect:

var dmp = new diff_match_patch();

function launch() {
  var text1 = document.getElementById('text1').value;
  var text2 = document.getElementById('text2').value;

  var d = dmp.diff_main(text1, text2);
  var ds = dmp.diff_prettyHtml(d);

  document.getElementById('outputdiv').innerHTML = ds;
}

You can also look at the console and see the raw answer (the arrays) where you can also see that diff_main is returning what you are expecting. Are you doing something different? If so please share your code.
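
For reading the raw arrays in the console, here is a minimal sketch. It assumes the usual shape of a diff_main result, an array of [operation, text] pairs where -1 means delete, 0 means equal and 1 means insert. The helper name describeDiffs and the sample data are made up for illustration and are not part of the library. Printing each text with JSON.stringify makes trailing newlines visible, which matters for this question.

```javascript
// Operation codes found in the first slot of each diff tuple.
var OP_NAMES = { '-1': 'delete', '0': 'equal', '1': 'insert' };

// Hypothetical helper (not part of the library):
// turn diff tuples into one readable line each.
function describeDiffs(diffs) {
  return diffs.map(function (d) {
    // JSON.stringify shows a newline as \n instead of breaking the line.
    return OP_NAMES[String(d[0])] + ': ' + JSON.stringify(d[1]);
  });
}

// Hand-written sample in the same shape as a diff_main result.
var sample = [[0, 'return c'], [-1, '!'], [1, ';\n};']];
console.log(describeDiffs(sample).join('\n'));
```

With a real result you would call describeDiffs(d) on the array returned by dmp.diff_main(text1, text2).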

New Info

Now that you have provided the full text I can give you a better answer: the result you are seeing is correct. It is simply how the algorithm works.

I'll try to explain to you what is going on and how to fix this. Let's take a look at the final part of each text:

Text 1

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};

Text 2

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c;
};

Note the following:

  1. The final }; of the changed function in Text 1 has no newline after it.
  2. The final }; of the changed function in Text 2 has a newline after it.
  3. The final }; of the autorun function in Text 2 has no newline after it.

So the diff algorithm matches 1 with 3, leaving 2 as part of the added text. This is why you are getting that output.
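
Both outputs describe the same change, so one can be turned into the other after the fact. The sketch below does not call the library. It assumes the diff has the shape diff_main returns ([operation, text] pairs, 0 for equal and 1 for insert), and the helper name slideInsertionsRight is hypothetical. It moves an insertion that sits between two equalities to the right for as long as the first character of the insertion equals the first character of the following equality, which leaves the rebuilt text unchanged. The function texts are shortened for the example.

```javascript
// Operation codes as used in diff_match_patch diff tuples.
var DIFF_INSERT = 1, DIFF_EQUAL = 0;

// Hypothetical helper (not part of the library): slide each insertion that
// is surrounded by two equalities as far to the right as possible.
function slideInsertionsRight(diffs) {
  // Copy the tuples so the caller's array is not modified.
  var out = diffs.map(function (d) { return [d[0], d[1]]; });
  for (var i = 1; i < out.length - 1; i++) {
    if (out[i][0] !== DIFF_INSERT ||
        out[i - 1][0] !== DIFF_EQUAL ||
        out[i + 1][0] !== DIFF_EQUAL) {
      continue;
    }
    var before = out[i - 1][1], edit = out[i][1], after = out[i + 1][1];
    // When the insertion and the following equality start with the same
    // character, that character can move from the front of the insertion
    // to the preceding equality, and the equality's copy can move onto the
    // end of the insertion, without changing the resulting text.
    while (after.length > 0 && edit.charAt(0) === after.charAt(0)) {
      before += edit.charAt(0);
      edit = edit.substring(1) + after.charAt(0);
      after = after.substring(1);
    }
    out[i - 1][1] = before;
    out[i][1] = edit;
    out[i + 1][1] = after;
  }
  // Drop equalities that became empty after the shift.
  return out.filter(function (d) { return d[1].length > 0; });
}

// Shortened version of the diff from the question (match 1 with 3).
var raw = [
  [0, 'x._compute()'],
  [1, ';\n};\n\nTracker.autorun = function (f) {\n  return c'],
  [0, ';\n};']
];
var shifted = slideInsertionsRight(raw);
console.log(JSON.stringify(shifted[1][1]));
```

After the shift the insertion starts at the blank line before Tracker.autorun and ends with the closing };, which is the "match 1 with 2" reading. This sketch only handles pure insertions. The library also provides diff_cleanupSemanticLossless(diffs), which shifts edits toward line and word boundaries and may be worth trying on the result of diff_main; whether it gives the desired output for the full text has not been verified here.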

To get the desired output you need to match 1 with 2. This means adding a newline at the end of Text 1, as you can see in your updated JSFiddle:

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};[PRESS ENTER HERE TO ADD NEW LINE]

Note that if you use just this text, the algorithm gives the expected result (as I showed in my original answer). The confusion only starts once more text is added; I am not sure why.

Newton answered 5/2, 2016 at 21:16 Comment(7)
Yeah, when I made a fiddle with such an isolated test case, it worked fine. I have to look into the code a bit more, but I have been quite ill the last few days :( - Hyperesthesia
Then you will need to provide your code, environment, etc. Otherwise there is no way we can help any more. - Haley
I added a fiddle that demonstrates the example. - Hyperesthesia
Thanks for the research. It's a little unsatisfactory to tell my users to "add new lines to the previous revision of your code". Sounds about as good as saying "it's not a bug, it's a feature". - Hyperesthesia
What if you add the line to the first text programmatically? The end user won't notice. - Haley
Well, if you also add a line to text 2, then the problem shows up again. - Hyperesthesia
Maybe try another diff library, like prettydiff. - Haley
