How do I match [ in a Smalltalk regular expression?

I want to match [ in a regular expression in Pharo 6.

This works fine:

| matcher |
matcher := RxMatcher forString: '\['.
matcher matches: '['. "produces true"

However, I can't see how to do this inside a character set ([...]). Neither [[] nor [\[] works.

I can match closing ] fine with []], but I can't work out how to do this with [.

Sulfonate answered 22/8, 2017 at 17:49 Comment(3)
Have you tried this? [\[] (Jobbery)
@Jobbery he said: "Neither [[] nor [\[] work." (Everard)
Ha, I missed that. Thanks for pointing it out, @Everard. (Jobbery)

Unsupported

Looking at the implementation of RxParser>>atom and RxParser>>characterSet, escaping characters in the range set is simply not supported.

According to the documentation, the other "special" characters (^, -, ]) can be handled only by placing them at a specific position within the set, so as not to trigger parsing of a different branch.

Workaround

A workaround would be to split the range set into an or-ed group, e.g.

[[a-z]

into

(\[|[a-z])
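
As a minimal sketch of this workaround in Pharo, assuming the stock Regex package (RxMatcher and the matchesRegex: string extension) is present in the image; the sample inputs are illustrative only:

```smalltalk
| matcher |
"Alternation stands in for the unsupported set [[a-z]:
 either a literal [ (escaped outside a set) or one lowercase letter."
matcher := RxMatcher forString: '(\[|[a-z])'.
matcher matches: '['.   "expected: true"
matcher matches: 'q'.   "expected: true"
matcher matches: '9'.   "expected: false"

"The same whole-string check through the String convenience message."
'[' matchesRegex: '(\[|[a-z])'.
```

The escaped \[ sits outside any character set, which is the form the question already confirmed works.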

Better Tool

Note that Pharo users are typically directed to use PetitParser instead of regular expressions for text parsing, as PetitParser is easier to manage and debug. It is, in effect, a more object-oriented take on regular expressions.
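
For comparison, a rough PetitParser sketch of the same "[ or a lowercase letter" match. It assumes PetitParser has been loaded into the image, and the variable name bracketOrLower is made up for illustration:

```smalltalk
| bracketOrLower |
"Ordered choice: try a literal [ first, otherwise any lowercase letter."
bracketOrLower := $[ asParser / #lowercase asParser.

"matches: answers a Boolean saying whether the input parses."
bracketOrLower matches: '['.
bracketOrLower matches: 'q'.
bracketOrLower matches: '9'.
```

Because the bracket is an ordinary Character object here, no escaping rules come into play at all.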

Benz answered 22/8, 2017 at 18:57 Comment(0)

I am adding a GNU Smalltalk related answer because the question is tagged with [smalltalk] and therefore likely to turn up in internet search results.

In GNU Smalltalk, regexes have Perl-like syntax, and the character [ can be escaped as \[. For example:

st> '[ac' =~ '\[[ab]' 
MatchingRegexResults:'[a'
st> '[bc' =~ '\[[ab]' 
MatchingRegexResults:'[b'

Escaping works within a range as well:

st> '[bc' =~ '[\[b]' 
MatchingRegexResults:'['

It is also worth mentioning that the message =~ can be sent to a string with a regex as its argument.

Sulfate answered 24/8, 2017 at 23:29 Comment(2)
Your pattern isn't equivalent AFAICS. I'm trying to match [ inside a range, whereas that pattern is looking for a [ unconditionally. (Sulfonate)
@WilfredHughes I've edited my post to clarify that escaping works within a range as well. (Sulfate)
