Using regular expressions to do mass replace in Notepad++ and Vim
N

16

32

So I've got a big text file which looks like the following:

<option value value='1' >A
<option value value='2' >B
<option value value='3' >C
<option value value='4' >D

It's several hundred lines long and I really don't want to do it manually. The expression that I'm trying to use is:

<option value='.{1,}' >

This works as intended when I run it through several online regular expression testers. I basically want to remove everything before A, B, C, etc. The issue is that when I try to use the expression in Vim and Notepad++, it can't seem to find anything.

Numeration answered 13/11, 2008 at 16:25 Comment(0)
P
20

Everything before the A, B, C, etc.

That seems so simple I must be misinterpreting you. It's just

:%s/<.*>//
Polinski answered 13/11, 2008 at 17:25 Comment(0)
S
62

In Notepad++ you don't need to use Regular Expressions for this.

Hold down Alt to select a rectangle of text across multiple rows at once. Select the chunk you want to be rid of, and press Delete.

Saury answered 16/11, 2010 at 17:20 Comment(2)
A little late maybe, but +1 for a third option that should have been mentioned two years ago but wasn't. And of course, vim also supports rectangular selections.Depalma
Definitely a big +1. Had no idea Notepad++ can do that.Gauntlet
T
30

In Notepad++ :

<option value value='1' >A
<option value value='2' >B
<option value value='3' >C
<option value value='4' >D


Find what: (.*)(>)(.)
Replace with: \3

Replace All


A
B
C
D
Terti answered 7/1, 2009 at 20:7 Comment(1)
+1 because it gave the regex syntax used in Notepad++, not Vim.Ablution
B
8

There is a very simple solution to this unless I have not understood the problem. The following regular expression:

(.*)(>)(.*)

will match the pattern specified in your post.

So, in Notepad++ you find (.*)(>)(.*) and replace it with \3.

Regular expressions are greedy by default: (.*) first matches the whole line and then backtracks until the rest of the pattern can match, so the (>) here lands on the last > in the line. The trick is to break the line into groups so that you can extract the part you want to keep. I have done exactly that here, and it works fine in Notepad++ and EditPlus 3.
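
For anyone who wants to check the greedy behaviour outside the editor, here is a minimal Python sketch. It is not Notepad++ itself; it assumes Python's re module handles this particular pattern the same way, and the sample lines and the helper name keep_after_last_gt are made up for illustration.

```python
import re

# Two sample lines in the shape shown in the question.
lines = [
    "<option value value='1' >A",
    "<option value value='2' >B",
]

# Greedy (.*) grabs the whole line first, then backtracks to the last '>'.
pattern = re.compile(r"(.*)(>)(.*)")


def keep_after_last_gt(line: str) -> str:
    # Replace the whole match with group 3, like \3 in the editor's replace box.
    return pattern.sub(r"\3", line)


cleaned = [keep_after_last_gt(line) for line in lines]
print(cleaned)
```

Because (.*) gives back only as much as it must, the text that survives is whatever follows the last > on each line.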

Breach answered 13/11, 2008 at 17:1 Comment(2)
I tried your solution with another search pattern. I have a file with lines "a = something" " b = " "c = sthelse" and wanted to get rid of "a = " and keep only the last parts, joined with a pipe symbol: "something| | sthelse". So I tried: (.*)( = )(.) -> |\3 . That gave me an empty file. What am I doing wrong?Carcinoma
Hey, not very clear from your comment. Please comment with what you have right now and what you would like to change it to. Thanks.Breach
H
7

There are two problems with your original solution. Firstly, your example text:

<option value value='1' >A

has two occurrences of the word "value". Your regex has only one. Also, you need to escape the opening brace in the quantifier of your regex or Vim will interpret it as a literal brace. This regex works:

:%s/<option value value='.\{1,}' >//g
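
The first problem can also be seen outside Vim. The Python sketch below is only an illustration: it assumes Python's re syntax (where the brace quantifier needs no backslash), and the variable names are invented.

```python
import re

sample = "<option value value='1' >A"

# Pattern from the question: "value" appears once, so it cannot match the sample.
question_pattern = re.compile(r"<option value='.{1,}' >")
# Same pattern with the second "value" added to mirror the sample text.
fixed_pattern = re.compile(r"<option value value='.{1,}' >")

no_match = question_pattern.search(sample)
stripped = fixed_pattern.sub("", sample)

print(no_match)
print(stripped)
```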
Herder answered 13/11, 2008 at 16:43 Comment(0)
H
6

This will remove the option tag and just leave the letters in vim:

:%s/<option.*>//g
Hypodermis answered 13/11, 2008 at 16:29 Comment(0)
L
4

It may help if you're less specific. Your expression there is "greedy", which may be interpreted in different ways by different programs. Try this in vim:

%s/^<[^>]+>//
Lox answered 13/11, 2008 at 16:30 Comment(2)
The '+' needs to be escaped: ^<[^>]\+> for this expression to work. I like this version better than the @Hypodermis answer that is getting voted up.Breakwater
Shoot, you're right, thanks for the correction. And thanks; I always believe simpler is better, esp. when it comes to regex.Lox
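
To illustrate why the negated class is the safer choice, here is a small Python sketch. Python's re syntax is assumed, so the + is not escaped as it would be in Vim. The second sample line, which contains an extra >, is invented for the comparison.

```python
import re

# The second line has a stray '>' in the text we want to keep (invented sample).
text = "<option value value='1' >A\n<option value value='2' >B > C"

# [^>]+ cannot cross a '>', so the match always ends at the first one.
tag_prefix = re.compile(r"^<[^>]+>", re.MULTILINE)
# Greedy .* runs to the last '>' on the line instead.
greedy = re.compile(r"<.*>")

negated_result = tag_prefix.sub("", text)
greedy_result = greedy.sub("", text)

print(negated_result.splitlines())
print(greedy_result.splitlines())
```

On the question's data both patterns give the same result; they differ only when a > appears later in the line.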
H
4

In notepad++

Search

(<option value="\w\w">)\w+">(.+)

Replace with

\1\2
Horoscopy answered 29/3, 2011 at 9:48 Comment(0)
A
3

In vim

:%s/<option value='.\{1,}' >//

or

:%s/<option value='.\+' >//

In vim regular expressions you have to escape the one-or-more symbol, capturing parentheses, the bounded number curly braces and some others.

See :help /magic to see which special characters need to be escaped (and how to change that).

Angelaangele answered 13/11, 2008 at 17:1 Comment(0)
P
2

Having the same problem (with jQuery " done..." strings), but only in Notepad++, I asked, received good friendly replies (that made me understand what I had missed), then spent the time to build a detailed step-by-step explanation, see Finding Line Beginning using Regular expression in Notepad++

Versailles, Tue 27 Apr 2010 22:53:25 +0200

Proselytism answered 21/4, 2010 at 7:43 Comment(0)
D
2

Notepad++: Search Mode = Regular expression

Find what: (.*>)(.)

Replace with: \2

Dirham answered 22/4, 2013 at 8:7 Comment(0)
A
1

This will work. I tested it in my Vim. The single quotes are the trouble.

1,$s/^<option value value=['].['] >/
Acronym answered 13/11, 2008 at 16:31 Comment(0)
O
1

Vim:

:%s/.* >//

Oshinski answered 13/11, 2008 at 16:33 Comment(0)
L
1

A little after the fact, but in case it's useful to anyone, I was able to follow one of the examples on here (by sdgfsdg) and quickly pick up Regular Expressions for Notepad++.

I had to similarly pull out some redundant data from a list of HTML select dropdown options, of the form:

<select>
  <option value="AC">saint_helena">Ascension Island</option>
  <option value="AD">andorra">Andorra</option>
  <option value="AE">united_arab_emirates">United Arab Emirates</option>
  <option value="AF">afghanistan">Afghanistan</option>
  ...
</select>

And what I really wanted was:

<select>
  <option value="AC">Ascension Island</option>
  <option value="AD">Andorra</option>
  <option value="AE">United Arab Emirates</option>
  <option value="AF">Afghanistan</option>
  ...
</select>

After some hair-pulling I realized that as of version 5.8.5 (Sep. 2010) Notepad++'s regular expressions still don't seem to allow certain loops in the expressions (unless there is another syntax). For example, the following should find even ">united_arab_emirated_emirates"> despite its additional separating underscores:

(">)([a-z]+([_]*[a-z]*)*)(">)

This query worked in most generic RegEx tools but while within Notepad++, I had to account for the maximum number of nested underscores (which unfortunately was 8) by hand, using the much uglier:

(">)([a-z]+[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*)[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*(">)

If someone knows a way to simulate a Regex loop in Notepad++'s replace feature, please let me know.


Find what: (">)([a-z]+[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*)[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*(">)


Replace with: ">


Result: 255 occurrences were replaced.
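
For comparison, outside Notepad++ the repetition can be written as a group followed by a *. The Python sketch below assumes Python's re module rather than Notepad++'s engine, uses two of the option lines from above as sample data, and the name strip_slug is hypothetical.

```python
import re

options = [
    '<option value="AC">saint_helena">Ascension Island</option>',
    '<option value="AE">united_arab_emirates">United Arab Emirates</option>',
]

# (?:_[a-z]+)* repeats "underscore plus word" any number of times,
# so there is no need to spell out the maximum count by hand.
slug = re.compile(r'">[a-z]+(?:_[a-z]+)*">')


def strip_slug(line: str) -> str:
    # Collapse '">some_slug">' down to a single '">'.
    return slug.sub('">', line)


for line in options:
    print(strip_slug(line))
```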

Lupulin answered 24/12, 2010 at 8:6 Comment(0)
S
1

Here's a nice article on Notepad++ Regular expressions
http://markantoniou.blogspot.com/2008/06/notepad-how-to-use-regular-expressions.html

Savoury answered 21/1, 2011 at 8:51 Comment(0)
H
0

Very simple, just find:

<option value value=.*?>

and click Replace.

Hilarity answered 9/5, 2016 at 6:29 Comment(0)
