Design pattern for blocking undesirable content

Last year I was working on a Christmas project which allowed customers to send emails to each other with a 256-character free-text field for their Christmas request. The project worked by searching the (very large) product database to suggest products that matched the text field, but offered a free-text option for those customers who could not find the product in question.

One obvious concern was the opportunity for customers to send rather explicit requests to some unsuspecting customer with the company's branding sitting around it.

The project did not go ahead in the end, for various reasons, the profanity aspect being one.

However, I've come back to thinking about the project and wondering what kinds of validation could be used here. I'm aware of clbuttic (the classic failure of naive substring replacement, which turns "classic" into "clbuttic"), which I know is the standard response to any question of this nature.

The solutions that I considered were:

  • Run it through something like WebPurify
  • Use MechanicalTurk
  • Write a regex pattern which looks for each word in a blacklist. A more complicated version of this would also catch plurals and past tenses of each word.
  • Build an array of suspicious words, each with a score. If the submission's total score exceeds a threshold, the validation fails (a rough sketch of these last two ideas follows this list).
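
For concreteness, here is a minimal sketch of how those last two ideas might combine on a LAMP stack. The words, scores and threshold are hypothetical placeholders, not a recommended blacklist:

```php
<?php
// Sketch of the regex + scoring ideas above. Words, scores and threshold
// are made-up placeholders.
$scoredWords = array(
    'badword' => 3,   // hypothetical severe word
    'naughty' => 1,   // hypothetical mild word
);
$threshold = 3;

function profanityScore($text, array $scoredWords)
{
    $score = 0;
    foreach ($scoredWords as $word => $points) {
        // \b word boundaries avoid the clbuttic substring problem;
        // (s|ed)? loosely covers plurals and past tenses.
        $pattern = '/\b' . preg_quote($word, '/') . '(s|ed)?\b/i';
        $score += preg_match_all($pattern, $text, $matches) * $points;
    }
    return $score;
}

if (profanityScore($_POST['request'], $scoredWords) >= $threshold) {
    // Fail validation, or hand the message to a moderation queue.
}
```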

So there are two questions:

  1. If the submission fails, how do you handle it from a UI perspective?
  2. What are the pros and cons of these solutions, or any others that you can suggest?

NB - answers like "profanity filters are evil" are irrelevant. In this semi-hypothetical situation, I haven't decided to implement a profanity filter or been given the choice of whether or not to implement one. I just have to do the best I can with my programming skills (ideally on a LAMP stack).

Flimsy answered 25/4, 2011 at 16:52 Comment(1)
+1 for referencing the standard clbuttic response =) – Repose

Have you thought about Bayesian filtering? Bayesian filters are not just for detecting spam; you can train them for a variety of text-recognition tasks. Grab a Bayesian filter, collect a bunch of request texts and start marking them as containing profanity or not. After some time (how much depends a lot on the amount and type of training data) your filter will be able to distinguish requests containing profanity from those that don't.

It's not fool-proof, but it's much, much better than simple string matching and trying to deal with clbuttic problems. You have a variety of options for Bayesian filtering in PHP.

Bogofilter

Bogofilter is a stand-alone Bayesian filter that runs on any unix-y OS. It's targeted at filtering e-mail, but you can train it on any kind of text. I have successfully used it to implement a custom comment spam filter for my own website (source). You can interface with bogofilter as you would with any other command-line application. See my source code link for an example.
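
For illustration, calling bogofilter from PHP is just a matter of piping the text to the binary and checking the exit status. The sketch below assumes `bogofilter` is on the PATH and uses its documented exit codes (0 = "spam", here meaning profane; 1 = clean; 2 = unsure), but check the man page for your installed version:

```php
<?php
// Pipe a message to bogofilter and read its verdict from the exit code.
function classifyWithBogofilter($text)
{
    $spec = array(
        0 => array('pipe', 'r'),   // stdin
        1 => array('pipe', 'w'),   // stdout
        2 => array('pipe', 'w'),   // stderr
    );
    $proc = proc_open('bogofilter', $spec, $pipes);
    if (!is_resource($proc)) {
        return 'error';
    }
    fwrite($pipes[0], $text);
    foreach ($pipes as $pipe) {
        fclose($pipe);
    }
    switch (proc_close($proc)) {
        case 0:  return 'profane';   // bogofilter says "spam"
        case 1:  return 'clean';     // bogofilter says "ham"
        default: return 'unsure';    // unsure or error: send to moderation
    }
}

// Training uses the same binary: feed labelled texts to `bogofilter -s`
// (profane/"spam") or `bogofilter -n` (clean/"ham") over stdin.
```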

Roll your own

If you like a challenge, you could implement a Bayesian filter entirely from scratch. Here's a decent article about implementing a Bayesian filter in PHP.
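
As a rough illustration of what "from scratch" involves, a deliberately tiny two-class version might look like the sketch below; it keeps everything in memory, assumes equal class priors, and only shows the counting and log-probability steps:

```php
<?php
// Minimal two-class naive Bayes sketch: "profane" vs "clean".
// Deliberately simplified: no persistence, equal class priors assumed.
class TinyBayesFilter
{
    private $counts = array('profane' => array(), 'clean' => array());
    private $totals = array('profane' => 0, 'clean' => 0);
    private $vocab  = array();

    private function tokens($text)
    {
        return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    }

    public function train($text, $class)
    {
        foreach ($this->tokens($text) as $t) {
            $this->counts[$class][$t] = isset($this->counts[$class][$t])
                ? $this->counts[$class][$t] + 1 : 1;
            $this->totals[$class]++;
            $this->vocab[$t] = true;
        }
    }

    public function isProfane($text)
    {
        $v   = max(1, count($this->vocab));
        $log = array('profane' => 0.0, 'clean' => 0.0);
        foreach ($this->tokens($text) as $t) {
            foreach (array('profane', 'clean') as $class) {
                $seen = isset($this->counts[$class][$t]) ? $this->counts[$class][$t] : 0;
                // Laplace smoothing so unseen words don't zero everything out.
                $log[$class] += log(($seen + 1) / ($this->totals[$class] + $v));
            }
        }
        return $log['profane'] > $log['clean'];
    }
}
```

You would train it on a batch of labelled requests (`$filter->train($text, 'profane')` / `'clean'`) before trusting `isProfane()`; with little or no training data it has nothing to go on.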

Existing PHP libraries

(Ab)use an existing e-mail filter

You could use a standard SpamAssassin or DSpam installation and train it to recognise profanity. Just make sure you disable the options specifically aimed at e-mail messages (e.g. parsing MIME blocks, reading headers) and enable only the options that deal with Bayesian text processing. DSpam may be easier to adapt. SpamAssassin has the advantage that you can add custom rules on top of the Bayesian filter; if you go that route, disable all the default rules and write your own instead, since the default rules are all targeted at spam e-mail detection.
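
As a rough illustration of the SpamAssassin route, a local.cf along the lines below leans on the Bayes engine and zeroes out e-mail-oriented rules. The option and rule names (use_bayes, BAYES_99, BAYES_00, MISSING_HEADERS) come from the standard distribution, but treat this as a sketch and verify against the docs for your version:

```
# local.cf sketch: rely on the Bayes engine, not the e-mail-oriented rules
use_bayes             1
bayes_auto_learn      0        # learn only from explicit sa-learn runs
required_score        5.0
score BAYES_99        5.0      # strongly "profane" according to the Bayes DB
score BAYES_00       -5.0      # strongly clean
score MISSING_HEADERS 0        # zero out e-mail-specific rules like this one
```

Training is then a matter of feeding labelled texts to `sa-learn --spam` (profane examples) and `sa-learn --ham` (clean examples).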

Rosierosily answered 27/4, 2011 at 20:52 Comment(2)
Other than installing Akismet on Wordpress, I've never touched Bayesian filtering. Do you know any filters that you'd recommend? – Flimsy
I was writing them up as you were typing your comment :-) – Rosierosily

In the past, I've used a glorified form of str_replace. Here was my rationale:

  1. Profane words could safely be replaced with silly words, conveying the original point of the message but discouraging the use of profanity (see the sketch after this list)
  2. On successful posts where filtering took place, users were shown a success message, but there was a notification that sanitization had taken place (something like, "Your post was added, potty mouth.")
  3. I never wanted the submission to fail. Posts were either posted uncensored, or censored. In your case, you might want to prevent profane posts entirely.
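
A minimal sketch of what that replacement step might look like, assuming a hand-maintained word list (the entries here are made-up placeholders) and whole-word matching to dodge the clbuttic problem:

```php
<?php
// Replace blacklisted words with silly ones and remember whether we did,
// so the UI can show the "Your post was added, potty mouth." notice.
$sillyReplacements = array(
    'badword' => 'stinky sock',   // hypothetical entries
    'naughty' => 'cheeky',
);

function sillify($text, array $replacements, &$wasFiltered)
{
    $wasFiltered = false;
    foreach ($replacements as $bad => $silly) {
        $pattern = '/\b' . preg_quote($bad, '/') . '\b/i';
        $text = preg_replace($pattern, $silly, $text, -1, $count);
        if ($count > 0) {
            $wasFiltered = true;
        }
    }
    return $text;
}
```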

For what it's worth, Apple only recently stopped banning obscene language in their free laser engravings. Perhaps they had a reasonable rationale?

Marxismleninism answered 25/4, 2011 at 17:23 Comment(4)
How did you get round the 'clbuttic' problem? Did you only replace whole words? And what about stuff like f.u.c.k or @$$? – Flimsy
I stripped punctuation, and had to hard-code extra listings for alternate spellings. – Marxismleninism
Did you strip punctuation or convert it? For example, converting all @s to a's. I tried using regex but it got very messy. – Flimsy
I just blacklisted @55 and its asinine equivalent. – Marxismleninism

What about using a few string-matching rules and putting only the messages they flag into a moderation queue?

It sounds like many requests may not use the free-text field at all, so those should go through safely.

Then only a small percentage should trip your string matches and end up in moderation. Even with a large userbase, this should keep moderation time to a minimum. You might even make obvious profanity, like the f- or n-word, an automatic fail to cut the remaining list down even more.

Make your moderation page easy to use and highlight the words that flagged each message; that should make it a quick process to scan through and clean up. Adjust as needed if people are trying to post too much garbage or if there are too many false positives.
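
A sketch of that triage, assuming two hand-maintained word lists (placeholders here) and returning the matched words so the moderation page can highlight them:

```php
<?php
// Hard-fail obvious words, queue merely suspicious ones, and keep the
// matched words so the moderation page can highlight them.
function triageMessage($text, array $autoFail, array $suspicious)
{
    $findHits = function (array $words) use ($text) {
        $hits = array();
        foreach ($words as $w) {
            if (preg_match('/\b' . preg_quote($w, '/') . '\b/i', $text)) {
                $hits[] = $w;
            }
        }
        return $hits;
    };

    if ($hits = $findHits($autoFail)) {
        return array('status' => 'rejected', 'hits' => $hits);
    }
    if ($hits = $findHits($suspicious)) {
        return array('status' => 'moderate', 'hits' => $hits);
    }
    return array('status' => 'approved', 'hits' => array());
}
```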

Or just use this strategy together with Bayesian filtering, like @Sander suggested.

Edit: A "report abuse" button will also help you find out if bad stuff is getting through, but it would involve saving sent messages for a while, which might not be ideal if the feature is going to be heavily used.

Corody answered 27/4, 2011 at 21:3 Comment(4)
The 'report abuse' button is a very good idea, since it gives a constructive feedback loop for receivers whenever the filter does fail. For the moderation queue, it still raises the question of how to recognise f.u.c.k and @$$. – Flimsy
Hmmm... maybe the best way to deal with that is to flag words that contain a high % of non-alphanumeric characters. Then the only problem would be f u c k, but you could strip all whitespace and scan again to catch those. Also thought I should mention that 'report abuse' protects your image in the eyes of the recipient. By providing it you are letting people know that the system has potential for abuse and that you are trying hard to prevent it. In other words, the second they report it to you is the second they stop blaming you, because now you're on their side as a protector. – Corody
I thought about stripping out whitespace, but that gives me a new problem - lack of word boundaries. – Flimsy
Stripping the whitespace is only for catching words like b u t t h o l e - you would do it after your other checks. To summarize: 1. Scan for obvious words and either fail or flag for moderation. 2. Scan for words with a high % of non-alphanumerics and flag for moderation if needed. 3. Strip whitespace and scan one final time for obvious words and flag for moderation. 4. Include a button to report abuse for anything that sneaks through. 5. Once active, adjust your filters if you are getting false positives or abuse reports. – Corody
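
A quick sketch of checks 2 and 3 from that summary; the 0.5 ratio and 3-character minimum are arbitrary choices, not recommendations:

```php
<?php
// Check 2: flag tokens that are mostly non-alphanumeric (e.g. "@$$").
function looksObfuscated($text, $ratio = 0.5)
{
    foreach (preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY) as $token) {
        $symbols = strlen(preg_replace('/[a-z0-9]/i', '', $token));
        if (strlen($token) >= 3 && $symbols / strlen($token) > $ratio) {
            return true;
        }
    }
    return false;
}

// Check 3: strip all whitespace, then re-run the plain word scan so
// spaced-out words like "b u t t h o l e" are caught too.
function collapseWhitespace($text)
{
    return preg_replace('/\s+/', '', $text);
}
```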
