Lua string manipulation pattern matching alternative "|"

Is there a way to write a string pattern like "ab|cd" that matches either "ab" or "cd" in the input string? I know you can use something like "[ab]" as a pattern and it will match either "a" or "b", but that only works for single characters.

Note that my actual problem is a lot more complicated, but essentially I just need to know whether there is an OR operator in Lua's string patterns. I would actually want to put other patterns on each side of the OR, and so on. But if it works with something like "hello|world" and matches both "hello" and "world" in "hello, world!", then that's great!

Novah answered 6/10, 2013 at 21:51 Comment(2)
I'm not sure it can be easily accomplished using built-in pattern matching. If you need to perform complex parsing/matching using Lua, you may try lpeg: inf.puc-rio.br/~roberto/lpeg - Sprung
@peterm Yes, this is quite easy with lpeg! (lpeg.P("foo") + "bar"):match(input) - Ondine

Using logical operators with Lua patterns can solve most problems. For instance, for the regular expression (hello|world)\d+, you can use

string.match(str, "hello%d+") or string.match(str, "world%d+")

The short-circuit evaluation of the or operator means the string is tried against hello%d+ first; only if that fails is it tried against world%d+.
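One difference from real alternation is worth noting: the or chain prefers the first pattern even when the second one occurs earlier in the string, whereas a regex engine would report the leftmost match. If the leftmost match matters, something like the following sketch might help. The helper name find_earliest is made up for illustration and is not part of Lua's standard library; it only relies on string.find and string.sub.

```lua
-- Hypothetical helper (not a standard function): tries every pattern and
-- returns the match that starts earliest in the string, plus its position.
-- On a tie, the pattern listed first wins.
local function find_earliest(str, patterns)
  local best_start, best_stop
  for _, pattern in ipairs(patterns) do
    local start, stop = string.find(str, pattern)
    if start and (best_start == nil or start < best_start) then
      best_start, best_stop = start, stop
    end
  end
  if best_start then
    return string.sub(str, best_start, best_stop), best_start
  end
  return nil
end

local str = "world7 then hello42"
-- The or chain returns the first pattern that matches anywhere:
print(string.match(str, "hello%d+") or string.match(str, "world%d+")) --> hello42
-- The helper returns whichever alternative appears first in the string:
print(find_earliest(str, { "hello%d+", "world%d+" }))                 --> world7  1
```

This scans the string once per pattern, so it is only a reasonable choice for a small number of alternatives.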

Harrow answered 7/10, 2013 at 2:3 Comment(0)

Unfortunately Lua patterns are not regular expressions and are less powerful. In particular they don't support alternation (that vertical bar | operator of Java or Perl regular expressions), which is what you want to do.

A simple workaround could be the following:

local function MatchAny( str, pattern_list )
    for _, pattern in ipairs( pattern_list ) do
        local w = string.match( str, pattern )
        if w then return w end
    end
end


s = "hello dolly!"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

s = "cruel world!"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

s = "hello world!"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

s = "got 1000 bucks"
print( MatchAny( s, { "hello", "world", "%d+" } ) )

Output:

hello
world
hello
1000

The function MatchAny will match its first argument (a string) against a list of Lua patterns and return the result of the first successful match.
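Note that "first" here means first in the pattern list, and only one match is returned. If the goal is instead to collect every occurrence of "hello" or "world" in the input, as the question hints, a rough sketch along these lines may be enough. It assumes the alternatives are plain whole words rather than Lua patterns, and the name match_all_words is invented for this example.

```lua
-- Hypothetical helper: returns, in order, every word of str that appears
-- in the given list. Assumes the alternatives are literal whole words.
local function match_all_words(str, words)
  -- Build a set for constant-time lookup.
  local wanted = {}
  for _, w in ipairs(words) do
    wanted[w] = true
  end
  local found = {}
  -- %a+ splits the input into runs of letters.
  for word in string.gmatch(str, "%a+") do
    if wanted[word] then
      found[#found + 1] = word
    end
  end
  return found
end

local found = match_all_words("hello, world! hellonomatch", { "hello", "world" })
print(table.concat(found, " ")) --> hello world
```

Because the lookup is on whole words, "hellonomatch" is not reported as a match.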

Slim answered 6/10, 2013 at 22:22 Comment(0)

Just to expand on peterm's suggestion, lpeg also provides a re module that exposes an interface similar to Lua's standard string library while still preserving the extra power and flexibility offered by lpeg.

I would say try out the re module first since its syntax is a bit less esoteric compared to lpeg. Here's an example usage that can match your hello world example:

local dump = require 'pl.pretty'.dump  -- from Penlight; used only to print the result table
local re = require 're'                -- the re module ships with lpeg


local subj = "hello, world! padding world1 !hello hello hellonomatch nohello"
pat = re.compile [[
  toks  <-  tok (%W+ tok)*
  tok   <-  {'hello' / 'world'} !%w / %w+
]]

res = { re.match(subj, pat) }
dump(res)

which would output:

{
  "hello",
  "world",
  "hello",
  "hello"
}

If you're interested in capturing the position of the matches just modify the grammar slightly for positional capture:

tok   <-  {}('hello' / 'world') !%w / %w+
Arlie answered 7/10, 2013 at 1:45 Comment(0)
