Regular expression for replacing profanity words in a string
S

2

5

I'm trying to replace a set of words in a text string. Right now I use a loop, which does not perform well:

function clearProfanity(s) {
   var profanity = ['ass', 'bottom', 'damn', 'shit'];
   for (var i=0; i < profanity.length; i++) {
      s = s.replace(profanity[i], "###!");
   }
   return s;
}

I want something faster that also replaces each bad word with a ###! mask of the same length as the original word.

Surrounding answered 12/3, 2011 at 11:42 Comment(5)
Heh, "bottom" is a profanity?! - Lulu
+1, I love when code is funny - Cochineal
Can I suggest reading about the problems of a profanity filter before implementing your own? en.wikipedia.org/wiki/Scunthorpe_problem - Cowbind
I did, but do not intend to go that far... thanks anyway :) - Surrounding
Didn't want to post as an answer since I authored the plugin. Check out the jQuery.ProfanityFilter. It's written to allow "Opt-in" profanity filtering, and doesn't turn assassin into ******in - Clarisaclarise
P
4

See it working: http://jsfiddle.net/osher/ZnJ5S/3/

Which basically is:

var PROFANITY = ['ass','bottom','damn','shit']
  , CENZOR = ("#####################").split("").join("########")
  ;
PROFANITY  = new RegExp( "(\\W)(" + PROFANITY.join("|") + ")(\\W)","gi");

function clearProfanity(s){
    return s.replace( PROFANITY
                    , function(_,b,m,a) { 
                         return b + CENZOR.substr(0, m.length - 1) + "!" + a
                      } 
                    );
}


alert( clearProfanity("'ass','bottom','damn','shit'") );

It would be better to define PROFANITY as a string from the start or, better still, directly as a regular expression literal:

//as string
var PROFANITY = "(\\W)(ass|bottom|damn|shit)(\\W)";
PROFANITY = new RegExp(PROFANITY, "gi"); 

//as regexp
var PROFANITY = /(\W)(ass|bottom|damn|shit)(\W)/gi
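
One limitation worth noting: because the pattern requires a non-word character on both sides of the word, a listed word at the very start or end of the string is not matched, and neither is the second of two listed words separated by a single character. Below is a minimal sketch of a variant that uses `\b` instead, assuming the list holds plain ASCII words. The names `WORDS`, `escapeRegExp`, `PATTERN` and `maskProfanity` are hypothetical and not part of the answer above, and `String.prototype.repeat` assumes an ES2015 engine.

```javascript
// Hypothetical word list; same entries as in the answer above.
const WORDS = ['ass', 'bottom', 'damn', 'shit'];

// Escape regex metacharacters so list entries are matched literally.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// \b asserts a word boundary without consuming a character,
// so no surrounding delimiters need to be captured and put back.
const PATTERN = new RegExp(
  '\\b(?:' + WORDS.map(escapeRegExp).join('|') + ')\\b',
  'gi'
);

function maskProfanity(s) {
  // Replace each match with '#' characters plus a final '!', keeping the length.
  return s.replace(PATTERN, function (match) {
    return '#'.repeat(match.length - 1) + '!';
  });
}

console.log(maskProfanity('ass and damn, but not assassin or cassette'));
// ##! and ###!, but not assassin or cassette
```

Since `\b` is zero-width, adjacent listed words such as "ass ass" are both masked, while words that merely contain a listed word are left alone.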
Portal answered 12/3, 2011 at 11:45 Comment(4)
Thanks for all the explanation! - Surrounding
@Surrounding @Radagast the Brown -- So what happens to "unbiassed"? That's really a far too simple approach. The regex really needs to look for word boundaries. - Epsilon
Yeah, it is rather simple, but it's not a profanity engine; it's a simple client-side JS function that covers my behind. It's good enough for me. - Surrounding
@Epsilon is right. I added handling of word boundaries. - Portal
K
5

Here's one way to do it:

String.prototype.repeat = function(n){
    var str = '';
    while (n--){
        str+=this;
    }
    return str;
}

var re = /ass|bottom|damn|shit/gi
  , profane = 'my ass is @ the bottom of the sea, so shit \'nd damn';

alert(profane.replace(re,function(a) {return '#'.repeat(a.length)}));
//=>my ### is @ the ###### of the sea, so #### 'nd ####
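
Engines that implement ES2015 ship a native `String.prototype.repeat`, so the helper above is only needed in older environments. Here is a sketch of the same replacement relying on the built-in method. The name `maskWords` is hypothetical, and the pattern is deliberately the same boundary-free one, so substrings of longer words are masked too.

```javascript
// Same pattern as above: no word boundaries, so substrings are masked as well.
const re = /ass|bottom|damn|shit/gi;

function maskWords(text) {
  // Native String.prototype.repeat (ES2015) builds the run of '#'.
  return text.replace(re, function (match) {
    return '#'.repeat(match.length);
  });
}

console.log(maskWords("my ass is @ the bottom of the sea, so shit 'nd damn"));
// my ### is @ the ###### of the sea, so #### 'nd ####
```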

To be complete: here's a simpler way to do it, taking word boundaries into account:

var re = /\W+(ass|shit|bottom|damn)\W+/gi
      , profane = [ 'My cassette of forks is at the bottom'
                   ,'of the sea, so I will be eating my shitake'
                   ,'with a knife, which can be quite damnable'
                   ,'ambassador. So please don\'t harrass me!'
                   ,'By the way, did you see the typo'
                   ,'in "we are sleepy [ass] bears"?']
                  .join(' ')
                  .replace( re, 
                              function(a){ 
                                return a.replace(/[a-z]/gi,'#'); 
                              } 
                   );
alert(profane);
Kraut answered 12/3, 2011 at 11:52 Comment(1)
Thanks for the word-boundary handling. However, note that slicing from a string prepared in advance is better than concatenating characters in a loop for every occurrence you replace. - Portal
