I am trying to take a logical match criterion like:
(("Foo" OR "Foo Bar" OR FooBar) AND ("test" OR "testA" OR "TestB")) OR TestZ
and apply it as a match against a file in Pig using
result = filter inputfields by text matches (some regex expression here);
The problem is that I have no idea how to turn the logical expression above into a regular expression for the matches operator.
I have fiddled around with various things, and the closest I have come is something like this:
((?=.*?\bFoo\b | \bFoo Bar\b))(?=.*?\bTestZ\b)
Any ideas? I also need to do this conversion programmatically if possible.
Some examples:
a - The quick brown Foo jumped over the lazy test (This should pass as it contains Foo and test)
b - there was something going on in TestZ (This also passes as it contains TestZ)
c - the quick brown Foo jumped over the lazy dog (This should fail as it contains Foo but not test, testA or TestB)
Thanks
Is test mandatory after the foo bar part? If so, should it also be included in the match (you are using a look-ahead (?=...), so probably not)? Also, you are saying that there should be a ) before OR TestZ, so is it right that TestZ alone is enough for a match? – Colombes
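
One possible way to do the conversion programmatically, sketched in Python rather than Pig: treat every term as a look-ahead of the form (?=.*\bterm\b), concatenate look-aheads for AND, and wrap alternatives in a non-capturing group for OR. The nested-tuple representation, the names to_regex, EXPR, PATTERN and passes, and the case-sensitive whole-word matching are all assumptions made for this illustration; they are not taken from the question. re.fullmatch is used because Pig's matches is understood to require the whole string to match, hence the trailing .* on the pattern.

```python
import re

def to_regex(node):
    """Convert a term, or a nested ("AND" | "OR", child, ...) tuple, into a
    zero-width regex fragment made of look-aheads."""
    if isinstance(node, str):
        # A single term: it must occur somewhere in the text as a whole word.
        return r"(?=.*\b" + re.escape(node) + r"\b)"
    op, *children = node
    parts = [to_regex(child) for child in children]
    if op == "AND":
        # Consecutive look-aheads must all succeed at the same position.
        return "".join(parts)
    if op == "OR":
        # At least one alternative must succeed.
        return "(?:" + "|".join(parts) + ")"
    raise ValueError("unknown operator: %r" % (op,))

# (("Foo" OR "Foo Bar" OR FooBar) AND ("test" OR "testA" OR "TestB")) OR TestZ
EXPR = ("OR",
        ("AND",
         ("OR", "Foo", "Foo Bar", "FooBar"),
         ("OR", "test", "testA", "TestB")),
        "TestZ")

# The look-aheads consume nothing, so ".*" is appended to cover the whole line.
PATTERN = to_regex(EXPR) + ".*"

def passes(text):
    """Return True if the whole line satisfies the logical expression."""
    return re.fullmatch(PATTERN, text) is not None

if __name__ == "__main__":
    print(PATTERN)
    for line in ("The quick brown Foo jumped over the lazy test",
                 "there was something going on in TestZ",
                 "the quick brown Foo jumped over the lazy dog"):
        print(passes(line), "-", line)
```

The printed pattern could then be tried as the string after matches in the filter statement. Backslashes probably need to be doubled inside a Pig string literal, and that part is untested here.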