Compare files and return only the differences using Notepad++

Notepad++ has a Compare Plugin tool for comparing text files, which operates like this:

Launch Notepad++ and open the two files you wish to run a comparison check on.

Click the “Plugins” menu,

Select “Compare” and click “Compare.”

The plugin will run a comparison check and display the two files side by side, with any differences in the text highlighted.

This is a nice feature, which I have used happily for some time. Now I am looking for a way to go further and keep only the highlighted (differing) lines, e.g. by deleting the non-highlighted ones, or vice versa, i.e. delete the highlighted lines.

Is there a straightforward way to achieve this?

Mazurka answered 28/6, 2015 at 11:4 Comment(1)
Is superuser.com/questions/562208/… what you meant? - Tressa
12

To subtract two files in Notepad++ (file1 - file2), you can follow this procedure:

  1. Recommended: if possible, sort both files and remove duplicates, especially if the files are big. To sort: Edit => Line Operations => Sort Lines Lexicographically Ascending (do it on both files)
  2. Add ---------------------------- as a footer on file1 (add at least 10 dashes). This is the marker line that separates file1 content from file2.
  3. Then copy the contents of file2 to the end of file1 (after the marker)
  4. Press Ctrl+H to open the Replace dialog
  5. Search: (?m-s)^(?:-{10,}+\R[\s\S]*+|(.*+)\R(?=(?:(?!^-{10,}$)-++|[^-]*+)*+^-{10,}+\R(?:^.*+\R)*?\1(?:\R|\z)))
     Note: set case sensitivity according to your needs.
  6. Replace with: (leave empty)
  8. Replace All

You can change the marker if file1 or file2 might contain lines equal to it. In that case you will have to adapt the regular expression accordingly.

By the way, you could even record a macro to do all the steps (add the marker, switch to file2, copy its content to file1, apply the regex) with a single button press.

Edited:

Changed the regex to add some improvements:

  • Speed related:
    • Avoid as much backtracking as possible
    • Avoid searching after the mark
  • Usability:
    • Dashes are allowed for the lines. But the separator is still ^-{10,}$
    • Works with other characters besides words

Speed comparison (new method vs old method):

Roughly 78 ms vs 1.6 seconds, so a nice improvement! That makes comparing kilobyte-sized files practical.

Still, you may want to use a dedicated program for comparing or subtracting bigger files.
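
For larger files, the same subtraction (file1 - file2) can be sketched outside Notepad++ in a few lines of Python. This is only an illustration, not part of the original procedure: the function name `subtract_lines` and the sample data are hypothetical, and lines are compared exactly (case-sensitive, no whitespace trimming).

```python
def subtract_lines(lines1, lines2):
    """Return the lines of lines1 that do not occur in lines2, in original order."""
    exclude = set(lines2)  # set lookup keeps this fast even for large inputs
    return [line for line in lines1 if line not in exclude]


# Hypothetical sample data standing in for the contents of file1 and file2.
file1 = ["apple", "banana", "cherry", "banana"]
file2 = ["banana", "date"]

result = subtract_lines(file1, file2)
print(result)  # ['apple', 'cherry']
```

To use it on real files, read each file with `open(path).read().splitlines()` and write the result back with `"\n".join(result)`.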

Wsan answered 25/2, 2020 at 15:50 Comment(4)
Added another option. Usually, if the files are not too big and I want to see a graphical comparison, I use the other method, which involves the Compare plugin. For big files, and when I just need the result, I use this method. - Wsan
Extremely slow for big files. - Officialdom
Of course it is :-)! If you want to subtract files, there are much better alternatives. If you want to do it with Notepad++, you either do this or develop your own plugin. Anyway, I have improved the regex, and it should now be much faster (78 ms vs 1.6 s in a test I did), which makes it possible to compare bigger files than before. But of course at some point it will be too slow if the file is too large. - Wsan
Amazing solution for my quick task. Thank you very much for the regex. - Tyrocidine
2

I have a dirty workaround for this. It saves some time compared to Control+C, Alt+Tab, Control+V; Control+C, Alt+Tab, Control+V; ... but it may not be worth it on big files or if the two files differ a lot. For bigger files you may prefer some other tool.

Typically this works best when comparing groups of 'words', and it does not work with content that is already indented with tabs (like source code).

So the workaround is:

  1. Optional: (depends on the content that's being compared) Sort both files (it will make the future comparison easier) To do this: Edit => Line operations => Sort Lines Lexicographically Ascending (do it on both files)
  2. Compare files with the plugin
  3. Choose one file and inspect the lines you want to keep. Add one tab character before each of those lines. Remember that you can select several lines and press Tab to indent them all at once. Alternatively, you may add tabs to the lines you want to remove instead
  4. Sort the file. The tabulated lines will come up first. So now you can copy-paste them (or copy-paste the untabulated ones)
Wsan answered 19/7, 2018 at 17:11 Comment(0)
1

If the number of differences is not large, a quicker method might be to bookmark each differing line using keyboard shortcuts. Starting from the beginning of the file, press Alt+Page Down to jump to the first difference, and then press Ctrl+F2 to bookmark it. Keep alternating between Alt+Page Down and Ctrl+F2 until you reach the last difference.

With all the differing lines bookmarked, you can use any of the operations under the "Search -> Bookmarks" menu:

  • Cut Bookmarked Lines
  • Copy Bookmarked Lines
  • Paste to (Replace) Bookmarked Lines
  • Remove Bookmarked Lines
  • Remove Unmarked Lines
Yoder answered 6/6, 2022 at 19:25 Comment(0)
-2

Move the files to a Linux box and then run the diff command: $ diff file1.txt file2.txt > file_diff.txt
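
If no Linux box is at hand, a similar "differences only" listing can be approximated with Python's standard `difflib` module. This is a rough sketch, not a drop-in replacement for `diff`: the helper name `changed_lines` and the sample data are hypothetical, and the output format ("- " for lines only in the first file, "+ " for lines only in the second) follows `difflib.ndiff` rather than the `diff` command.

```python
import difflib


def changed_lines(old, new):
    """Return only the differing lines, prefixed '- ' (only in old) or '+ ' (only in new)."""
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); drop both.
    return [line for line in difflib.ndiff(old, new) if line[:2] in ("- ", "+ ")]


# Hypothetical sample data standing in for file1.txt and file2.txt.
old = ["alpha", "beta", "gamma"]
new = ["alpha", "gamma", "delta"]

for line in changed_lines(old, new):
    print(line)
# - beta
# + delta
```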

Airminded answered 22/10, 2020 at 14:47 Comment(3)
No doubt there are plenty of alternatives for extracting differences between files, but that's not the question. - Bricker
I like this approach the best. - Smythe
This is the right answer. - Aedes
