I'd like to work on a BB Code filter for a PHP website (I'm using CakePHP, so it would be a BB Code helper). I have the following requirements:
BB Code can be nested, so something like this is valid:
[block] [block] [/block] [block] [block] [/block] [/block] [/block]
BB Code tags can have zero or more parameters.
Example:
[video: url="url", width="500", height="500"]Title[/video]
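To make the parameter format concrete, here is a rough sketch of how I imagine reading the parameters of one opening tag. The function names parseBbParams and parseOpeningTag are placeholders I made up, and the sketch assumes values are always double-quoted and never contain a `"` or a `]`.

```php
<?php
// Hypothetical helper: turn ' url="url", width="500"' into an associative array.
function parseBbParams(string $paramString): array
{
    $params = [];
    // Each parameter is name="value"; values must not contain a double quote.
    if (preg_match_all('/(\w+)\s*=\s*"([^"]*)"/', $paramString, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $params[$m[1]] = $m[2];
        }
    }
    return $params;
}

// Hypothetical helper: split one opening tag into its name and its parameters.
// Returns null when the string is not a single opening tag.
function parseOpeningTag(string $tag): ?array
{
    // [name] or [name: params]; the parameter part stops at the first ']'.
    if (!preg_match('/^\[(\w+)(?::\s*([^\]]*))?\]$/', $tag, $m)) {
        return null;
    }
    return ['name' => $m[1], 'params' => parseBbParams($m[2] ?? '')];
}
```

Matching the parameters of a single, already isolated tag this way works for me; the trouble starts when the same expression also has to find the matching closing tag.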
A BB Code tag might have multiple behaviours.
For example,
[url]text[/url]
would be transformed to [url:url="text"]text[/url],
or the video BB Code would be able to choose between YouTube, Dailymotion, etc.
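As a sketch of what I mean by "multiple behaviours", each tag could get a small handler. Both functions below are hypothetical, and the provider list is only an assumption for illustration.

```php
<?php
// Hypothetical handler: a [url] tag without a url parameter uses its content as the URL.
function normalizeUrlTag(array $params, string $content): array
{
    if (!isset($params['url'])) {
        $params['url'] = $content;
    }
    return $params;
}

// Hypothetical handler: pick a video provider from the host of the url parameter.
// Returns null when the host is missing or not in the (assumed) provider list.
function detectVideoProvider(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    $host = strtolower($host);
    $providers = [
        'youtube.com'     => 'youtube',
        'youtu.be'        => 'youtube',
        'dailymotion.com' => 'dailymotion',
    ];
    foreach ($providers as $domain => $name) {
        $suffix = '.' . $domain;
        // Accept the bare domain or any subdomain of it (e.g. www.youtube.com).
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            return $name;
        }
    }
    return null;
}
```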
I've already done something with regex, but my biggest problem was matching parameters. In fact, I got nested BB Code and BB Code with 0 parameters to work. But when I added a regex match for parameters, it didn't match nested BB Code correctly:
"\[($tag)(=.*)\"\](.*)\[\/\1\]"
(It wasn't .* but the non-greedy version, .*?)
I don't have the complete regex with me right now, but it looked something like the one above.
Is there a way to match BB Code with regex or something else?
The only thing I can think of is to use the visitor pattern and to split my text on each possible tag. This way, I would have a bit more control over my text parsing, and I could probably validate my document, so that if the input text doesn't contain valid BB Code, I could notify the user with an error before saving anything.
I would use SableCC to create my text parser.
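To show the kind of validation I have in mind, here is a minimal hand-written sketch (not SableCC output, and not a full parser). It scans the text for every known tag and checks the nesting with a stack. The function name validateBbCode and the error messages are placeholders, and it assumes a non-empty list of known tags and that parameters never contain a `]`.

```php
<?php
// Hypothetical validator: returns null when the nesting is valid,
// or an error message describing the first problem found.
function validateBbCode(string $text, array $knownTags): ?string
{
    $names = implode('|', array_map(function ($t) {
        return preg_quote($t, '/');
    }, $knownTags));

    // Group 1 is '/' for a closing tag, group 2 is the tag name.
    // Parameters, if any, are skipped; only the nesting is checked here.
    $pattern = '/\[(\/?)(' . $names . ')(?::[^\]]*)?\]/';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $stack = [];
    foreach ($matches as $m) {
        $isClosing = ($m[1] === '/');
        $name = $m[2];

        if (!$isClosing) {
            $stack[] = $name; // remember the tag until its closing tag shows up
            continue;
        }

        $open = array_pop($stack); // null when the stack is empty
        if ($open === null) {
            return "Unexpected closing tag [/$name]";
        }
        if ($open !== $name) {
            return "Expected [/$open] but found [/$name]";
        }
    }

    if (!empty($stack)) {
        return 'Unclosed tag [' . end($stack) . ']';
    }
    return null;
}
```

With something like this I could reject invalid input before saving, for example by calling validateBbCode($text, ['block', 'url', 'video']) and showing the returned message to the user when it is not null.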