Find duplicates and delete all in Notepad++
H

4

20

I have a list of email addresses, some of which appear more than once. I need to find every duplicated address and delete all of its occurrences, including the first one. Is this possible in Notepad++?

example:[email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected],

I need the result to look like this:

[email protected], [email protected], [email protected], [email protected], [email protected], [email protected],

How can I do this in Notepad++?

Huarache answered 11/2, 2016 at 1:15 Comment(5)
You could find and replace all of them with an empty string, thus deleting them all, and then manually write one line back in.Valdovinos
Possible duplicate of Removing duplicate rows in Notepad++Duplet
@Duplet That thread is about keeping a single copy of each duplicated row. In my case, I need to delete every copy of a row that has duplicates.Huarache
OK @Huarache wrong duplicate. Have you checked the other questions on Stack Overflow? What did your searches for [notepad++] delete duplicates reveal?Duplet
#3958850Banville
S
50

If changing the order of the lines is acceptable, you could do this:

  1. Sort the lines with Edit -> Line Operations -> Sort Lines Lexicographically Ascending
  2. Do a Find / Replace:
    • Find what: ^(.*\r?\n)\1+
    • Replace with: (nothing, leave empty)
    • Select Regular expression in the lower left
    • Click Replace All

How it works: Sorting places identical lines next to each other. The pattern ^(.*\r?\n) matches one whole line, including its line ending, and captures it in group 1. The \1+ then requires that same line to repeat one or more times immediately afterwards. Each such block of identical lines, the first occurrence included, is replaced with nothing, so lines that occur only once are left untouched.

The \r?\n handles both Windows and Unix line endings.
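
For readers who want to try the same sort-then-replace idea outside the editor, here is a minimal Python sketch. The function name drop_repeated_lines is made up for illustration, and Python's re module stands in for the Notepad++ regex engine, so treat it as an approximation of the steps rather than a description of what Notepad++ does internally.

```python
import re


def drop_repeated_lines(text: str) -> str:
    """Sort the lines, then delete every block of identical adjacent lines."""
    # Step 1: sort, so that identical lines end up next to each other.
    lines = sorted(text.splitlines())
    # Make sure every line, including the last one, ends with a newline;
    # the pattern below treats the newline as part of the line.
    sorted_text = "".join(line + "\n" for line in lines)
    # Step 2: same pattern as in the Find what field. A line followed by
    # one or more exact copies of itself is replaced with nothing.
    return re.sub(r"^(.*\r?\n)\1+", "", sorted_text, flags=re.MULTILINE)


if __name__ == "__main__":
    sample = "b\na\nb\nc\na\nd"
    print(drop_repeated_lines(sample), end="")
```

Adding the newline to every line before matching also sidesteps the case where the final line of the file has no line ending.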

Schoolhouse answered 11/2, 2016 at 21:21 Comment(2)
Hi, thanks a lot! It was very useful. However, it doesn't work if the duplicate is on the last two lines.Pace
@Pace Maybe the very last line does not end with a newline? The \r?\n means newline, so the newline character is part of the match. (Try pressing Return after the very last line, so that it ends with a newline.)Schoolhouse
L
17

You just have to use Edit -> Line Operations -> Remove Duplicate Lines.
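
To make the difference between the two kinds of "remove duplicates" concrete, here is a small Python sketch. It assumes the menu command keeps the first copy of each line, which is the behaviour the question says it does not want. Both function names are hypothetical.

```python
from collections import Counter


def keep_first_occurrence(lines):
    """Keep one copy of each line, in original order (assumed menu behaviour)."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(lines))


def drop_all_repeated(lines):
    """Keep only the lines that occur exactly once, in original order."""
    counts = Counter(lines)
    return [line for line in lines if counts[line] == 1]


if __name__ == "__main__":
    data = ["a", "b", "a", "c", "b"]
    print(keep_first_occurrence(data))
    print(drop_all_repeated(data))
```

The second function matches what the question asks for and does not require sorting the lines first.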

Lail answered 26/5, 2021 at 5:44 Comment(0)
T
4

You need the TextFX plugin. Then, just follow these instructions:

Paste the text into Notepad++ (CTRL+V). ...
Mark all the text (CTRL+A). ...
Click TextFX → Click TextFX Tools → Click Sort lines case insensitive (at column)
Duplicates and blank lines have been removed and the data has been sorted alphabetically.

Personally, I would use sort -i -u source >dest instead of Notepad++.

Tuber answered 11/2, 2016 at 1:23 Comment(0)
P
0

You could use

Click TextFX → Click TextFX Tools → Click Sort lines case insensitive (at column) Duplicates and blank lines have been removed and the data has been sorted alphabetically.

as indicated above. However, I needed to replace the duplicates with blank lines rather than remove the lines altogether, so once the text was sorted alphabetically I did the following:

REPLACE:
((^.*$)(\n))(?=\k<1>)

by

$3

This will convert:

Shorts
Shorts
Shorts
Shorts
Shorts
Shorts Two Pack
Shorts Two Pack
Signature Braces
Signature Braces
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers
Signature Cotton Trousers

to:

Shorts

Shorts Two Pack

Signature Braces










Signature Cotton Trousers

That's how I did it because I specifically needed those lines.

Persuasive answered 10/7, 2017 at 17:26 Comment(1)
I have the same need, but I can't get your code to work. I pasted your example with the shorts and trousers into Notepad++ but the regex you present above didn't find any matches. And yes, I've got "Regular expressions" selected in the Replace-dialogue window.Tini
