Removing all punctuation except - and _ from a java string using RegEx

I am trying to remove all punctuation except - and _ using a method I found here, but I can only get it to work for " with the exact code as posted, which uses a negative lookahead:

(?!")\\p{Punct}

//Java example:

String string = ".\"'";
System.out.println(string.replaceAll("(?!\")\\p{Punct}", ""));

I tried:

name = name.replaceAll("(?!_-)\\p{Punct}", ""); // which just replaces all punctuation.

name = name.replaceAll("(?!\_-)\\p{Punct}", ""); // which gives an error.

Thanks.

Wrongdoer answered 26/10, 2016 at 15:47 Comment(1)
If you happen to log in, please consider accepting the answer if it worked for you. - Odele

Use a character class subtraction (and add a + quantifier to match chunks of 1 or more punctuation chars):

name = name.replaceAll("[\\p{Punct}&&[^_-]]+", "");

See the Java demo.

The pattern [\\p{Punct}&&[^_-]]+ (written here as a Java string literal) matches one or more consecutive chars from the \p{Punct} class other than _ and -.
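
As a minimal sketch of the character class intersection in use (the class name PunctuationFilter, the method name stripPunctuation, and the sample input are made up for illustration), the pattern can be compiled once and reused:

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the question.
class PunctuationFilter {
    // \p{Punct} intersected (&&) with "anything except _ and -".
    // The trailing - inside [^_-] is a literal hyphen, not a range.
    private static final Pattern UNWANTED =
            Pattern.compile("[\\p{Punct}&&[^_-]]+");

    // Removes every run of punctuation other than _ and -.
    static String stripPunctuation(String input) {
        return UNWANTED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: my_file-name v20txt
        System.out.println(stripPunctuation("my_file-name (v2.0)!.txt"));
    }
}
```

Whitespace is left alone because a space is not part of \p{Punct}. By default \p{Punct} covers ASCII punctuation only.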

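The construction you found can also be used, but you need to put - and _ into a character class: (?!_-) only blocks a match when the literal two-character sequence _- follows, whereas (?![_-]) blocks a match when the next char is either _ or -. Use .replaceAll("(?![_-])\\p{Punct}", ""), or .replaceAll("(?:(?![_-])\\p{Punct})+", "") to match whole chunks at once.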
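
To make the difference between the two lookaheads concrete, here is a small sketch (the class name LookaheadDemo, the method names, and the sample input are made up for illustration) that runs the question's first attempt next to the corrected version:

```java
// Hypothetical demo class; the names are illustrative, not from the question.
class LookaheadDemo {
    // The question's attempt: the lookahead rejects a position only when
    // the two-character sequence "_-" starts there.
    static String withSequenceLookahead(String s) {
        return s.replaceAll("(?!_-)\\p{Punct}", "");
    }

    // Corrected: the lookahead rejects a position when the next char
    // is either _ or -.
    static String withClassLookahead(String s) {
        return s.replaceAll("(?![_-])\\p{Punct}", "");
    }

    public static void main(String[] args) {
        String input = "a_b-c.d_-e";
        // prints: abcd_e  (only the _ directly followed by - survives)
        System.out.println(withSequenceLookahead(input));
        // prints: a_b-cd_-e
        System.out.println(withClassLookahead(input));
    }
}
```

The second attempt in the question, with \_ inside the string literal, fails for a different reason: \_ is not a valid escape sequence in a Java string literal, so the code does not compile.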

Kaule answered 26/10, 2016 at 15:50 Comment(3)
Nice. @Wiktor Could you recommend a good tutorial for learning regular expressions, ideally one that also explains more advanced topics? - Plague
Read Lesson: Regular Expressions first. Then, I can suggest doing all lessons at regexone.com, reading through regular-expressions.info, the regex SO tag description (with many other links to great online resources), and the community SO post called What does the regex mean. Also, rexegg.com is worth having a look at. - Odele
Thanks @WiktorStribiżew - Wrongdoer