List of metacharacters for MySQL regexp square brackets
Asked Answered
N

2

2

Strangely I can't seem to find anywhere a list of the characters that I can't safely use as literals within MySQL regular expression square brackets without escaping them or requiring the use of a [:character_class:] thing.

(Also the answer probably needs to be MySQL specific because MySQL regular expressions seem to be lacking compared those in Perl/PHP/Javascript etc).

Nix answered 15/11, 2011 at 18:27 Comment(0)
S
4

Almost all metacharacters (including the dot ., the +, * and ? quantifiers, the end-of-string anchor $, etc.) have no special meaning in character classes, with a few notable exceptions:

  • closing bracket ], for obvious reasons
  • caret ^, which is used to negate the character class (eg: [^ab] matches any character but a and b).
  • hyphen -, which is used to denote a range (eg: [0-9] matches any digit)

However, these can still be added without escaping if placed in strategic locations within the character class:

  • the closing bracket can be placed right after the opening bracket, eg: []a] matches ] or a.
  • the caret can be placed anywhere but after the opening bracket, eg: [a^] matches ^ or a
  • the hyphen can be placed right after the opening bracket or before the closing bracket, eg: [-a] and [a-] both match a and -.

More information can be found in the man page on POSIX regex (thanks Tomalak Geret'kal!)

Sixfooter answered 15/11, 2011 at 18:34 Comment(2)
Thanks @NullUserException. I'll use this.Nix
[citation needed]. OP fooled into taking the lazy path with (a) no proof that it's accurate, and (b) no lesson learned in finding the proper documentation. Shame.Plumy
P
0

From the documentation, right near the top:

This section summarizes, with examples, the special characters and constructs that can be used in MySQL for REGEXP operations. It does not contain all the details that can be found in Henry Spencer's regex(7) manual page. That manual page is included in MySQL source distributions, in the regex.7 file under the regex directory.

Said manpage can be found copied here (thanks, Google!). The information you're looking for is available in there.

Plumy answered 15/11, 2011 at 18:36 Comment(10)
Thanks, but thats for the special characters for regular expressions - not the ones for use within square brackets which I believe is a different set.Nix
@spiderplant0: Huh? [..] is part of regular expression syntax, and there are several paragraphs talking about the rules for it. There's no simple list, because the semantics are more complicated than that. You should read the man page I linked you to again, and spend longer than 3 minutes on it this time please.Plumy
@NullUserExceptionఠ_ఠ: The man page that it links to is the authoritative source of all information pertaining to the topic. It completely answers the question.Plumy
In the standard MYSQl documentation, there are several paragraphs talking about the rules which I read before asking this question. But as I'm not an expert I was confused and didnt appreciate that *, ? etc did not need to be escaped when used in square brackets even though they did in non-square brackets regular expression. So I was asking for a definitive list to clear things up. Hence I accepted NullUserException answer.Nix
@spiderplant0: I already explained why there is not a "list of characters", and all NullUserException has done is to re-word the authoritative text that you couldn't find. What a waste of time. >.<Plumy
@TomalakGeret'kal +1 for the docs, which made me realize the backslash does not have special meaning in POSIX character classes, unlike PCRE.Sixfooter
I would like to point out however that just a link to an external source isn't really a proper answer on SO. See: meta.stackexchange.com/questions/8231/…Sixfooter
@NullUserExceptionఠ_ఠ: I usually flag to delete any answers that are just links rather than explanations. However I felt that this was a slightly different case; the core issue here really is that the OP was unable to locate this information in the documentation (see first sentence), and the most benefit to be gained is in helping him/her to better interpret the manual. It's really much more than just a link, though I appreciate that the difference may appear subtle.Plumy
Tomalak Geret'kal, I appreciate your time and effort, and manuals are my first port of call. But sometimes, something in the manual may not be clear to the reader. Even if it seems obvious to someone else. Hence the benefit of SO for clearing things up. So a direct answer to the question is usually best - which is why I chose NullUserExceptionఠ_ఠ's answer. In this case (as it usually is) it was not just about plugging in the solution but to make the subject clearer to the questioner.Nix
@spiderplant0: Please use @notification syntax like everybody else; I found your comment only by chance.Plumy

© 2022 - 2024 — McMap. All rights reserved.