A little after the fact, but in case it's useful to anyone: I was able to follow one of the examples on here (by sdgfsdg) and quickly pick up regular expressions for Notepad++.
I similarly had to pull some redundant data out of a list of HTML select dropdown options, of the form:
<select>
<option value="AC">saint_helena">Ascension Island</option>
<option value="AD">andorra">Andorra</option>
<option value="AE">united_arab_emirates">United Arab Emirates</option>
<option value="AF">afghanistan">Afghanistan</option>
...
</select>
And what I really wanted was:
<select>
<option value="AC">Ascension Island</option>
<option value="AD">Andorra</option>
<option value="AE">United Arab Emirates</option>
<option value="AF">Afghanistan</option>
...
</select>
After some hair-pulling I realized that, as of version 5.8.5 (Sep. 2010), Notepad++'s regular expressions still don't seem to allow certain loops (a repeated group) in an expression, unless there is another syntax for them. For example, the following would find even ">united_arab_emirated_emirates"> despite its additional separating underscores:
(">)([a-z]+([_]*[a-z]*)*)(">)
This pattern worked in most generic regex tools, but in Notepad++ I had to account by hand for the maximum number of underscores in a name (which unfortunately was 8), using the much uglier:
(">)([a-z]+[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*)[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*(">)
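For comparison, here is a minimal sketch of the looped pattern doing the whole job in one pass. It uses Python's re module as a stand-in for a "generic regex tool" (an assumption, not the tool used above), and the names LOOPED, sample and cleaned are made up for illustration:

```python
import re

# The looped pattern from the post, unchanged: a run of letters followed by
# any number of "underscores then letters" repetitions, wrapped in "> ... ">.
LOOPED = re.compile(r'(">)([a-z]+([_]*[a-z]*)*)(">)')

# Two of the sample lines from the dropdown above.
sample = [
    '<option value="AC">saint_helena">Ascension Island</option>',
    '<option value="AE">united_arab_emirates">United Arab Emirates</option>',
]

# Replace the whole match with a single "> to drop the redundant key.
cleaned = [LOOPED.sub('">', line) for line in sample]

for line in cleaned:
    print(line)
```

The replacement string is the same `">` used in Notepad++; only the engine differs.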
If someone knows a way to simulate a Regex loop in Notepad++'s replace feature, please let me know.
Find what: (">)([a-z]+[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*)[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*[_]*[a-z]*(">)
Replace with: ">
Result: 255 occurrences were replaced.
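One possible way around the loop, sketched here in Python rather than tested in Notepad++ 5.8.5 (so treat it as an assumption): put the underscore inside the character class, which needs no repeated group at all. The names FLAT and strip_redundant_key are hypothetical. Note that this version is slightly looser than the original pattern, since it would also accept a key that starts with an underscore.

```python
import re

# Flat alternative: one character class covering letters and underscores,
# so no group has to be repeated. Looser than the original pattern because
# it does not require the key to start with a letter.
FLAT = re.compile(r'">[a-z_]+">')


def strip_redundant_key(line: str) -> str:
    """Collapse '">some_key">' into a single '">'."""
    return FLAT.sub('">', line)


print(strip_redundant_key(
    '<option value="AF">afghanistan">Afghanistan</option>'
))
```

Lines that are already clean are left alone, because the text after `">` starts with an uppercase letter and so never matches `[a-z_]+`.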