How can I mark/highlight duplicate lines in VI editor?

How would you go about marking all of the lines in a buffer that are exact duplicates of other lines? By marking them, I mean highlighting them or adding a character or something. I want to retain the order of the lines in the buffer.

Before:

foo
bar
foo
baz

After:

foo*
bar
foo*
baz
Syncretism answered 12/8, 2009 at 18:53 Comment(0)

As an ex one-liner:

:syn clear Repeat | g/^\(.*\)\n\ze\%(.*\n\)*\1$/exe 'syn match Repeat "^' . escape(getline('.'), '".\^$*[]') . '$"' | nohlsearch

This uses the Repeat group to highlight the repeated lines.

Breaking it down:

  • syn clear Repeat :: remove any previously found repeats
  • g/^\(.*\)\n\ze\%(.*\n\)*\1$/ :: for any line that is repeated later in the file
    • the regex
      • ^\(.*\)\n :: a full line
      • \ze :: end of match - verify the rest of the pattern, but don't consume the matched text (positive lookahead)
      • \%(.*\n\)* :: any number of full lines
      • \1$ :: a full line repeat of the matched full line
    • exe 'syn match Repeat "^' . escape(getline('.'), '".\^$*[]') . '$"' :: add full lines that match this to the Repeat syntax group
      • exe :: execute the given string as an ex command
      • getline('.') :: the contents of the current line matched by g//
      • escape(..., '".\^$*[]') :: escape the given characters with backslashes to make a legit regex
      • syn match Repeat "^...$" :: add the given string to the Repeat syntax group
  • nohlsearch :: remove highlighting from the search done for g//

Justin's non-regex method is probably faster:

function! HighlightRepeats() range
  let lineCounts = {}
  let lineNum = a:firstline
  while lineNum <= a:lastline
    let lineText = getline(lineNum)
    if lineText != ""
      let lineCounts[lineText] = (has_key(lineCounts, lineText) ? lineCounts[lineText] : 0) + 1
    endif
    let lineNum = lineNum + 1
  endwhile
  exe 'syn clear Repeat'
  for lineText in keys(lineCounts)
    if lineCounts[lineText] >= 2
      exe 'syn match Repeat "^' . escape(lineText, '".\^$*[]') . '$"'
    endif
  endfor
endfunction

command! -range=% HighlightRepeats <line1>,<line2>call HighlightRepeats()
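
If escaping the line text for :syn match is a concern, a variant of the same counting idea can highlight by line number instead. This is only a sketch and not part of the answer above: it assumes a Vim recent enough to provide matchaddpos(), borrows the built-in Search highlight group, and the names HighlightRepeatsByPos and w:repeat_match_ids are hypothetical.

```vim
" Hypothetical variant: highlight duplicate lines by line number with
" matchaddpos(), so the line text never has to be escaped as a regex.
function! HighlightRepeatsByPos() abort
  " Remove matches added by a previous run in this window.
  for l:id in get(w:, 'repeat_match_ids', [])
    silent! call matchdelete(l:id)
  endfor
  let w:repeat_match_ids = []

  " Map each non-empty line text to the list of line numbers where it occurs.
  let l:where = {}
  for l:lnum in range(1, line('$'))
    let l:text = getline(l:lnum)
    if l:text !=# ''
      let l:where[l:text] = add(get(l:where, l:text, []), l:lnum)
    endif
  endfor

  " Highlight every line whose text occurs at least twice.
  for l:lnums in values(l:where)
    if len(l:lnums) > 1
      " Pass the positions in chunks of 8, the documented per-call limit
      " in older Vim versions.
      for l:i in range(0, len(l:lnums) - 1, 8)
        call add(w:repeat_match_ids, matchaddpos('Search', l:lnums[l:i : l:i + 7]))
      endfor
    endif
  endfor
endfunction
```

Because these matches are tied to line numbers, they do not follow the text when the buffer is edited. Rerun the function to refresh them, or use :call clearmatches() to remove all matches in the window.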
Crete answered 13/8, 2009 at 8:7 Comment(8)
I can't get this to work. I've put the function in my ~/.vimrc, but when I run ":call HighlightRepeats()" I get an error: Error detected while processing function HighlightRepeats: line 10: E28: No such highlight group name: Repeat. - Overcapitalize
Daps0l: try adding hi link Repeat Statement to your ~/.vimrc. - Crete
(It's probably because your colorscheme doesn't define the Repeat highlighting group.) - Crete
This is probably a dumb question, but how do I clear the highlighting? It changes the color of the duped lines, but I can't get it back. - Michelemichelina
If anyone else has the above problem, just use :e to clear it. - Michelemichelina
What if I want to replace the duplicate lines instead of highlighting them? - Karr
I think there is a missing backslash: \%(.*\n\)* should be \%\(.*\n\)*. I cannot edit the response. - Bombast
@Bombast - nope, \%( ... \) is how to do non-capturing grouping in vi regex. - Crete

None of the answers above worked for me so this is what I do:

  1. Sort the file using :sort
  2. Execute command :g/^\(.*\)$\n\1$/p
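
Since :sort reorders the buffer, one way to keep the original order is to run these two steps on a copy. A rough sketch, assuming a scratch window is acceptable; the function name ListDuplicateLines is hypothetical, and the extra :v and :sort u steps are additions to the recipe above:

```vim
" Hypothetical helper: list each duplicated line once in a scratch window,
" leaving the original buffer and its line order untouched.
function! ListDuplicateLines() abort
  let l:lines = getline(1, '$')
  " Open a throwaway scratch buffer and copy the text into it.
  new
  setlocal buftype=nofile bufhidden=wipe noswapfile
  call setline(1, l:lines)
  " Step 1: sort, so identical lines become adjacent.
  sort
  " Step 2: same pattern as above, but delete every line that is NOT
  " followed by an identical line.
  silent! v/^\(.*\)\n\1$/delete _
  " Collapse the remaining copies to one entry per duplicated line.
  sort u
endfunction
```

If the buffer has no duplicates, the scratch window ends up with a single empty line.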
Dwarfism answered 24/2, 2015 at 8:8 Comment(5)
Thank you. I feel this is a better approach. With this we can find duplicate lines as well as customize up to the required length. - Hallah
I'm going to need a hand with the explanation. g: global command (run through each line from top to bottom in this case); ^\(.*\)$: capture the entire line; \n: and the newline character; \1$: and the previous line (to the end). This part just checks if the next line is the same as the current line. But why does my Vim highlight multiple separate occurrences, and what is the p (paste) for? - Funchal
@Ari: p is for "print". Actually you can remove that, because it is the default command. Not sure what you mean by "why does my Vim highlight multiple separate occurrences". Vim should highlight/print duplicate lines. - Sidewinder
Tried using vim -u NONE and it opens a split pane with the results. I understand what you mean by print in this context. One of my plugins shows this imgur.com/a/6Qu8Q65 with highlighted rows. On my other Vim instance it didn't show the bottom split and only highlighted rows instead. - Funchal
@Funchal my Vim is also opening a pane for the results. Not sure why. I'm unfortunately not an expert in configuring Vim. - Sidewinder
  1. :sort and save it in file1.
  2. :sort u and save it in file2.
  3. gvimdiff or tkdiff the two files.
Astatic answered 10/5, 2017 at 6:36 Comment(0)

Why not use:

V*

in normal mode.

It simply searches for all matches of the current line, thus highlighting them (if the 'hlsearch' setting is enabled, which I think is the default). Besides, you can then use

n

to navigate through the matches.

Morentz answered 17/8, 2009 at 6:0 Comment(3)
Visual mode doesn't support * by default. It's probably a function you have in your .vimrc. Something like this: xno * :<c-u>cal<SID>VisualSearch()<cr>/<cr> xno # :<c-u>cal<SID>VisualSearch()<cr>?<cr> fun! s:VisualSearch() let old = @" | norm! gvy let @/ = '\V'.substitute(escape(@", '\'), '\n', '\\n', 'g') let @" = old endf) - Unblock
Arg, the formatting messed up. Here's what I meant: pastebin.com/f2ee37c92 - Unblock
It would only match one thing at a time, whereas I'd prefer to indicate all lines that are duplicates of other lines all at once. Nice function though, seems handy. - Syncretism

Run through the list once, make a map of each string and how many times it occurs. Loop through it again, and append your * to any string that has a value of more than one in the map.
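
A minimal Vim script sketch of this two-pass idea, assuming the marker is a trailing * as in the question. The function name MarkDuplicateLines is made up for illustration, and empty lines are skipped so they are never marked:

```vim
" Hypothetical helper: append '*' to every line whose text occurs more than
" once in the current buffer, keeping the original line order.
function! MarkDuplicateLines() abort
  " First pass: count how often each non-empty line text occurs.
  let l:counts = {}
  for l:text in getline(1, '$')
    if l:text !=# ''
      let l:counts[l:text] = get(l:counts, l:text, 0) + 1
    endif
  endfor

  " Second pass: append the marker to lines that were seen more than once.
  for l:lnum in range(1, line('$'))
    let l:text = getline(l:lnum)
    if l:text !=# '' && get(l:counts, l:text, 0) > 1
      call setline(l:lnum, l:text . '*')
    endif
  endfor
endfunction
```

After sourcing it, :call MarkDuplicateLines() turns the question's sample buffer into foo*, bar, foo*, baz. A single u undoes the change.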

Organogenesis answered 12/8, 2009 at 19:0 Comment(0)

Try:

:%s:^\(.\+\)\n\1:\1*\r\1:

Hope this works.

Update: next try.

:%s:^\(.\+\)$\(\_.\+\)^\1$:\1\r\2\r\1*:
Iphigenia answered 13/8, 2009 at 4:58 Comment(1)
This will only detect adjacent duplicate lines, and will only mark the first copy, not the second. - Crete
