Two elementary results in the conversion of regular expressions of complements of regular languages involve truly superexponential runtime. See this Computer Science question for more details.
The definition of 'superexponential'
There are various differeing definitions of exponential time.
- The most restrictive definition defines exponential time as being upper bounded by a simple exponential function
B
x
for some B > 1
. This defines the E complexity class.
- The most conventional definition, EXPTIME, consists of all algorithms with runtime requiring only a single exponent to express. Namely, it is of the form
B
g(x)
where g
is polynomial in x
. Polynomials do not require any exponents to express.
- Factorial time
n!
, which asymptotically dominates any simple exponential function, is bounded above by n
n
, which can be converted to e
n ln n
. Therefore, the expression n
n
is still singly exponential, and factorial time still qualifies under EXPTIME.
- The function
2
2
x
is a doubly exponential function. This function cannot be expressed using two font sizes. This function is outside of EXPTIME, but is in 2-EXPTIME.
- The least restrictive definition, ELEMENTARY, consists of all algorithms with runtime being an elementary expression, which may contain any number of nested exponents. It is a union of all
k
-EXPTIME time complexities.
- Tetration is an example of a nonelementary primitive recursive operation. The same applies for all higher hyperoperations.
If you are thinking about any algorithm, including those with little practical value, the Ackermann function and the Knuth-up-arrow function comes into mind. This is among the first algorithms known to be non-primitive recursive, also it is asymptotically faster than any primitive recursive function.
Computing a Knuth-up arrow function A(b,x,n)
= b
↑
n
x
for nontrivial base b
and 'superexponent' x
requires 'super-primitive recursive' time, also of the order of the (n + 2)
-th hyperoperation.
Two regular expression conversion problems
A regular expression R
(not a POSIX regex) is a template that specifies a match pattern in text. Let L
be the set of all text strings of the form described by R
. They are used to determine whether a string is valid or not as an input in constant space; the regular expression ^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$
describes a valid email address.
Kleene's theorem: It is well-known that for every regular language L
in some finite alphabet Σ
, there exists a regular expression R
and a deterministic finite-state machine M
that produces exactly the language L
.
A double exponential problem
From Klenne's theorem, the complement of the regular language ∁L
is another regular language. The DFA M-
for ∁L
is identical to the DFA M
for L
except that the states being accepting or nonaccepting are all inverted. Derive the regular expression for ∁L
without the negation bar.
Answer: This problem is surprisingly nasty. The regular expression for the complement language is doubly exponential to the length of the regular expression of the base language. A proof has been provided on Page 8 of an Arxiv article by Wouter Gelade and Frank Neven.
A superexponential algorithm
Definition: A generalized regular expression is a regular expression that also allows the complementaion operator, allowing sections that is not of the regex form to precede or follow sections that are of the regex form.
Letting ¬
as the complementation operator, an example is ^¬((0|1)*00(0|1)*)[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$
, which describes all strings that can be partitioned into a part that do not contain two zeroes in a row and a part that represents an email address.
Generalized regular expressions have the exact same strength as standard regular expressions without complementation, but can be used to succinctly represent a regular language containing a part that is described as the complement of a regular language with a short regular expression.
Your task is to convert the generalized regular expression into a standard regular expression.
Answer: You can recursively convert each complemented expression into a NFA, and then convert each NFA into a DFA. However, the running time of this algorithm is nonelementart 2
↑↑
Θ(n)
, since each layer of recursive negation exponentially increases the number of states of the DFA. In fact, this problem still remains nonelementary if Kleene closure is disallowed. (source: Algorithms by Jeff Erickson)