Removing duplicate rows in vi?

I have a text file that contains a long list of entries (one on each line). Some of these are duplicates, and I would like to know if it is possible (and if so, how) to remove any duplicates. I am interested in doing this from within vi/vim, if possible.

Smoothtongued answered 8/12, 2008 at 22:24 Comment(3)
Looks like a duplicate of #747189Mcneal
This one is 1 year old; that one is 10 months. So, other way around.Smoothtongued
@Smoothtongued consensus now is to prioritize upvote count (which you also have more of): meta.stackexchange.com/questions/147643/… And those are not duplicates, that one does not mention Vim :-)Persistence
393

If you're OK with sorting your file, you can use:

:sort u
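
For example, given a buffer containing

b
a
b
a

:sort u sorts the lines and collapses the duplicates, leaving

a
b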
Halfwitted answered 8/12, 2008 at 22:32 Comment(7)
If sorting is unacceptable, use :%!uniq to simply remove duplicate entries without sorting the file.Progressionist
once you use the command the whole file changes? how do you go back? I already saved the file by mistake ... my badHermosillo
Just use Vim's undo command: uWorldwide
@Worldwide but I already closed the file and it does not seem to remember data from the last session. (This is a downside of always closing files while saving by just pressing ZZ.)Hermosillo
@cryptic0, uniq won't work unless the duplicates are adjacent: on input a$b$a$ it does nothing.Zobe
@nilon, that's when using persistent undo comes in handy https://mcmap.net/q/144659/-using-vim-39-s-persistent-undoLachus
You can select the lines you want sorted and deduplicated first with V or something similar, then issue the command.Likeness
38

Try this:

:%s/^\(.*\)\(\n\1\)\+$/\1/

It searches for any line immediately followed by one or more copies of itself, and replaces it with a single copy.

Make a copy of your file though before you try it. It's untested.
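
The same pattern in very magic form (\v), which some find easier to read; it should behave identically:

:%s/\v^(.*)(\n\1)+$/\1/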

Withrow answered 8/12, 2008 at 22:27 Comment(7)
@hop Thanks for testing it for me. I didn't have access to vim at the time.Withrow
this highlights all the duplicate lines for me but doesn't delete them; am I missing a step here?Transude
I'm pretty sure this will also highlight a line followed by a line that has the same "prefix" but is longer.Uncritical
This is the better solution and doesn't change the line numbers as well. ThanksPhosphoroscope
The only issue with this is that if you have multiple duplicates (3 or more of the same lines), you have to run this many times until all dups are gone since this only removes them one set of dups at a time.Freeload
g/\v([^ ].*)$\n\1/d avoiding blank lines would be greatLudeman
Another drawback of this: this won't work unless your duplicate lines are already next to each other. Sorting first would be one way of ensuring they're next to each other. At that point, the other answers are probably better.Freeload
32

From command line just do:

sort file | uniq > file.new
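
sort's -u flag combines both steps:

sort -u file > file.new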
Furcula answered 11/4, 2011 at 16:31 Comment(5)
This was very handy for me for a huge file. Thanks!Sapsago
Couldn't get the accepted answer to work, as :sort u was hanging on my large file. This worked very quickly and perfectly. Thank you!Cercus
'uniq' is not recognized as an internal or external command, operable program or batch file.Uncritical
Yes -- I tried this technique on a 2.3 GB file, and it was shockingly quick.Insouciant
@Uncritical You are on windows PC? Maybe you can use cygwin.Journal
15

awk '!x[$0]++' yourfile.txt if you want to preserve the order (i.e., sorting is not acceptable). In order to invoke it from vim, :! can be used.
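
To filter the whole buffer through it in place:

:%!awk '!x[$0]++'

Here x[$0]++ evaluates to the number of times the line has been seen so far, so !x[$0]++ is true only on a line's first occurrence, and awk's default action is to print the line.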

Store answered 4/8, 2016 at 12:38 Comment(3)
This is lovely! Not needing to sort is exactly what I was looking for!Lyceum
what does it do?Zobe
This can also be done in perl if it strikes your fancy perl -nle 'print unless $seen{$_}++' yourfile.txtPollinize
6

I would combine two of the answers above:

go to head of file
sort the whole file
remove duplicate entries with uniq

1G
!Gsort
1G
!Guniq
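
Or, as a single Ex command (assuming sort and uniq are on your PATH):

:%!sort | uniq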

If you were interested in seeing how many duplicate lines were removed, use control-G before and after to check on the number of lines present in your buffer.

Magneto answered 9/12, 2008 at 1:16 Comment(1)
'uniq' is not recognized as an internal or external command, operable program or batch file.Uncritical
6
g/^\(.*\)$\n\1/d

Works for me on Windows. Lines must be sorted first though.
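
To avoid deleting a line that is merely a prefix of the next one (see the comment below), anchor the back-reference to the end of the line as well:

g/^\(.*\)$\n\1$/d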

Banausic answered 1/11, 2009 at 18:23 Comment(1)
This will delete a line that is a prefix of the line following it: aaaa followed by aaaabb deletes aaaa erroneously.Uncritical
4

Select the lines in visual-line mode (Shift+v), then :!uniq. That'll only catch duplicates which come one after another.

Lanam answered 8/12, 2008 at 22:32 Comment(2)
Just to note this will only work on computers with the uniq program installed i.e. Linux, Mac, Freebsd etcLeisurely
This will be the best answer to those who don't need sorting. And if you are windows user, consider to try Cygwin or MSYS.Southsoutheast
4

If you don't want to sort/uniq the entire file, you can select the lines you want deduplicated in visual mode and then simply run :sort u.
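
For example, press V, extend the selection over the lines, and type :sort u; Vim fills in the visual range for you, so the command line reads :'<,'>sort u.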

Habit answered 13/1, 2021 at 9:56 Comment(1)
If you know the line numbers you want sorted to unique you can prefix the starting and ending line numbers, eg. if you want to sort+unique lines 5 through 10 the command would be :5,10 sort uPollinize
1

Regarding how Uniq can be implemented in VimL, search for Uniq in a plugin I'm maintaining. You'll see various ways to implement it that were given on the Vim mailing list.

Otherwise, :sort u is indeed the way to go.

Maintop answered 9/12, 2008 at 10:5 Comment(0)
0

I would use !}uniq, but that only works if there are no blank lines (the } motion stops at the next blank line, so the filter covers only the current paragraph).

To run it over every line in the file, use :1,$!uniq (the range 1,$ is the same as %).

Jegar answered 8/12, 2008 at 22:34 Comment(0)
0
:%s/^\(.*\)\(\n\1\)\+$/\1/gec

or

:%s/^\(.*\)\(\n\1\)\+$/\1/ge

Either command removes runs of consecutive duplicate lines in a single pass, keeping one copy of each. The e flag suppresses the error when no duplicates are found, and the c flag in the first version additionally asks for confirmation before each change.

Apuleius answered 30/4, 2014 at 6:45 Comment(0)
0

This version only removes repeated lines that are contiguous, i.e., consecutive duplicates. With the given mapping, the function does not touch blank lines; if you change the :g pattern to match the start of every line (^) instead, it will also collapse duplicated blank lines.

" function to delete duplicate lines
function! DelDuplicatedLines()
    " delete copies of the current line that sit directly above it
    while getline(".") == getline(line(".") - 1)
        exec 'norm! ddk'
    endwhile
    " delete copies of the current line that sit directly below it
    while getline(".") == getline(line(".") + 1)
        exec 'norm! dd'
    endwhile
endfunction
" run on every non-blank line; change the pattern to ^ to include blank lines
nnoremap <Leader>d :g/./call DelDuplicatedLines()<CR>
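
With this in your vimrc, pressing <Leader>d (\d with the default leader) collapses every run of consecutive duplicate lines in the buffer.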
Ludeman answered 19/3, 2018 at 20:36 Comment(0)
0

An alternative method that does not use vi/vim (useful for very large files) is to use sort and uniq from the Linux command line:

sort {file-name} | uniq

(Note: uniq -u would do the opposite of what is wanted here; it prints only the lines that are never repeated, discarding every line that has a duplicate.)
Cavuoto answered 16/10, 2018 at 11:20 Comment(0)
0

This command got me a buffer without any duplicate lines, without sorting, and it shouldn't be hard to work out why it works or how it could be done better:

:%!python3.11 -c 'exec("import fileinput\nLINES = []\nfor line in fileinput.input():\n    line = line.splitlines()[0]\n    if line not in LINES:\n        print(line)\n        LINES.append(line)\n")'
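
A shorter version of the same idea, assuming python3 points at Python 3.7 or newer (where dicts preserve insertion order):

:%!python3 -c 'import sys; sys.stdout.writelines(dict.fromkeys(sys.stdin))'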
Sidewalk answered 28/12, 2023 at 3:19 Comment(0)
-1

This worked for me for both .csv and .txt files:

awk '!seen[$0]++' <filename> > <newFileName>

Explanation: the first part of the command, awk '!seen[$0]++' <filename>, prints only the first occurrence of each row; the second part, > <newFileName>, saves that output to a new file.

Stela answered 17/10, 2018 at 10:2 Comment(0)
