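Even with the "two problems" proverb in mind, I'd still say this is a job for a regular expression. A compiled regex tests all the alternatives inside the regex engine in a single call, rather than one by one in a Python-level loop. (Python's re is a backtracking engine, so the alternatives are not literally checked in parallel, but the loop over them no longer runs in Python code.)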
Here's an implementation that leverages that:
import re

def split_string(string, prefixes):
    regex = re.compile('|'.join(map(re.escape, prefixes)))  # (1)
    while True:
        match = regex.match(string)
        if not match:
            break
        end = match.end()
        yield string[:end]
        string = string[end:]
    if string:
        yield string  # (2)

prefixes = ['over', 'under', 're', 'un', 'co']
assert (list(split_string('recouncoundo', prefixes))
        == ['re', 'co', 'un', 'co', 'un', 'do'])
Note how the regular expression is constructed in (1):
- the prefixes are escaped using re.escape so that special characters don't interfere
- the escaped prefixes are joined using the | (or) regex operator
- the whole thing gets compiled.
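One detail worth keeping in mind when joining the alternatives: Python's re tries them left to right and takes the first one that matches, so if 'un' were listed before 'under', the longer prefix would never match. The following is a minimal sketch, assuming you want the longest prefix to win regardless of the order of the input list. The name build_prefix_regex is a hypothetical helper, not part of the answer above. It also drops empty strings, which would otherwise produce zero-length matches.

```python
import re

def build_prefix_regex(prefixes):
    # Hypothetical helper: put the longest prefixes first so that 'under'
    # wins over 'un'; drop empty strings, which would match with zero length.
    ordered = sorted(filter(None, set(prefixes)), key=len, reverse=True)
    if not ordered:
        # '(?!)' is an empty negative lookahead, i.e. a pattern that never matches.
        return re.compile(r'(?!)')
    return re.compile('|'.join(map(re.escape, ordered)))

# Order matters with a plain join: the first alternative that matches wins.
naive = re.compile('|'.join(map(re.escape, ['un', 'under'])))
print(naive.match('understand').group())          # un

longest_first = build_prefix_regex(['un', 'under'])
print(longest_first.match('understand').group())  # under
```

In the example list above, 'under' already comes before 'un', so the original code behaves as intended; the sorting only matters when the caller does not control the order.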
Line (2) yields the final word, if anything is left over after splitting off the prefixes. Remove the if string check if you want the function to yield an empty string when nothing remains after prefix stripping.
Also note that re.match (unlike re.search) only looks for the pattern at the beginning of the input string, so there's no need to prepend ^ to the regex.
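The loop in split_string re-slices the string on every iteration. The match method of a compiled pattern also accepts a start position, which lets you advance an index instead. Here is a sketch of that variant; split_string_pos is a hypothetical name, and the extra check for zero-length matches is an added safeguard (for example against an empty prefix list), not something the original code does.

```python
import re

def split_string_pos(string, prefixes):
    # Hypothetical variant of split_string: advance an index instead of
    # slicing off the matched prefix each time.
    regex = re.compile('|'.join(map(re.escape, prefixes)))
    pos = 0
    while True:
        match = regex.match(string, pos)  # anchored at pos, not at index 0
        if not match or match.end() == pos:  # no prefix, or zero-length match
            break
        yield string[pos:match.end()]
        pos = match.end()
    if pos < len(string):
        yield string[pos:]  # the remaining word, if any

prefixes = ['over', 'under', 're', 'un', 'co']
print(list(split_string_pos('recouncoundo', prefixes)))
# ['re', 'co', 'un', 'co', 'un', 'do']
```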
match(word, i), I never noticed that match also has pos and endpos. – Heeley