Does anybody know if Python (any version) used NFAs (Non-Deterministic Finite Automata) to evaluate regular expressions or does it use some other mechanism? Please provide links/reference if available.
Does Python use NFAs for regular expression evaluation in the re module?
Since most RE engines nowadays allow for non-regular languages to be matched I doubt any modern RE engine actually still uses NFAs or DFAs anymore. –
Danelle
Well, since an RE engine can identify a subset of RE's that are regular, and those are in common use, it makes sense to optimize for those scenarios. So it's entirely possible that they sometimes use NFA's or DFA's. –
Matless
NFA.
See Friedl's Mastering Regular Expressions, 3rd edition, chapter 4 - table 4-1, page 145.
Google books has a preview to it.
This should take less than a ms on a DFA:
$ time python3 -c 'import re; re.match("a?"*25+"a"*25, "a"*25)'
real 0m7.273s
Change 25 with 100 and it won't terminate for a lifetime.
Here is how it looks on a DFA (grep):
$ time echo "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" |grep "a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?a\?aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
real 0m0.063s
There is a great discussion of the topic at http://swtch.com/~rsc/regexp/regexp1.html
NFA.
See Friedl's Mastering Regular Expressions, 3rd edition, chapter 4 - table 4-1, page 145.
Google books has a preview to it.
© 2022 - 2024 — McMap. All rights reserved.