On minimising automata with errors
Pawel Gawrychowski¹, Artur Jeż¹, Andreas Maletti²
¹ University of Wroclaw, Poland
² University of Stuttgart, Germany
25 August 2011
P. Gawrychowski, A. Jeż, A. Maletti: On minimising automata with errors, MFCS 2011 (Warsaw)
Introduction
DFA are omnipresent.
Important problem: minimisation.

Problem (DFA minimisation)
For a DFA M, find the smallest DFA recognizing the same language L(M).
Well studied; an O(n log n) solution is known, and there is not much hope of improving it.

Relaxations (allow some errors!)
- cover automata (errors on words of length ≥ k)
- hyper-minimisation (finitely many errors / errors on short words)
- k-minimisation (errors on words of length < k)
- almost-minimisation (finitely many errors on prefixes of an infinite word)
Minimisation algorithms are known for each of these cases.
Our results
Problem (k-minimisation)
Given a DFA M, find the smallest DFA N such that L(M) ∩ Σ^{≥k} = L(N) ∩ Σ^{≥k}.

Positive
Simple O(n log² n) algorithm for k-minimisation. For all k in parallel, in a certain sense.

Negative
Hardness for some generalisations (limiting both the length and the number of erroneous words).
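The condition in the k-minimisation problem can be checked mechanically. The following is a minimal sketch, not part of the talk: it explores the product of two complete DFAs while tracking the word length capped at k, and compares acceptance only once the length has reached k. The function name `agree_from_length`, the tuple encoding `(delta, start, finals)` and the two example automata are assumptions made for illustration.

```python
def agree_from_length(k, alphabet, dfa_m, dfa_n):
    """Check L(M) ∩ Σ^{≥k} = L(N) ∩ Σ^{≥k} for two complete DFAs.

    Each DFA is a hypothetical triple (delta, start, finals), where
    delta maps (state, letter) to a state.
    """
    (delta_m, start_m, finals_m) = dfa_m
    (delta_n, start_n, finals_n) = dfa_n
    start = (start_m, start_n, 0)
    seen = {start}
    todo = [start]
    while todo:
        p, q, length = todo.pop()
        # Acceptance only has to agree on words of length at least k.
        if length >= k and ((p in finals_m) != (q in finals_n)):
            return False
        for a in alphabet:
            # Lengths beyond k behave like k, so the search space is finite.
            nxt = (delta_m[p, a], delta_n[q, a], min(length + 1, k))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


# Hypothetical example: M accepts exactly {aa}, N accepts nothing.
M = ({(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3}, 0, {2})
N = ({(0, "a"): 0}, 0, set())
ok_for_3 = agree_from_length(3, "a", M, N)
ok_for_2 = agree_from_length(2, "a", M, N)
```

In this example the one-state automaton N is a valid answer for k = 3, because the only word on which it differs from M has length 2, while for k = 2 it is not.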
k-minimisation, k-similarity
Hopcroft's algorithm: find identical states and merge them. What does "identical" mean now?

Definition (k-similarity, [Gawrychowski, Jeż, MFCS 2009])
q ∼_k p ⇐⇒ d(p, q) + min(level(p), level(q)) < k
d(p, q) = max{|w| : w ∈ L(p) Δ L(q)}
level(p): length of the longest word leading to p (may be +∞)

New simple algorithm!
1. calculate ∼_k
2. while there are p ∼_k q with level(p) ≤ level(q): merge p into q.
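The definitions above can be evaluated by brute force on a small automaton. This is only an illustrative sketch, far slower than the algorithm of the talk: it assumes a complete DFA in which every state is reachable, encodes d(p, q) for states with equal languages as -inf, and uses the fact that a word longer than the number of states (or of state pairs) forces a cycle. The function names and the example DFA are assumptions.

```python
from itertools import product
from math import inf


def levels(states, alphabet, delta, start):
    """level(p): length of the longest word leading from start to p."""
    n = len(states)
    level = {p: -inf for p in states}  # unreachable states would keep -inf
    layer = {start}  # states reached by words of the current length
    for length in range(2 * n):
        for p in layer:
            # A word of length >= n repeats a state, so longer words exist too.
            level[p] = inf if length >= n else length
        layer = {delta[p, a] for p in layer for a in alphabet}
    return level


def distances(states, alphabet, delta, finals):
    """d(p, q): length of the longest word in L(p) Δ L(q)."""
    pairs = list(product(states, repeat=2))
    bound = len(pairs)
    d = {pq: -inf for pq in pairs}  # -inf encodes L(p) = L(q)
    # Pairs distinguished by a word of the current length (initially ε).
    layer = {(p, q) for p, q in pairs if (p in finals) != (q in finals)}
    for length in range(2 * bound):
        for pq in layer:
            d[pq] = inf if length >= bound else length
        layer = {(p, q) for p, q in pairs
                 if any((delta[p, a], delta[q, a]) in layer for a in alphabet)}
    return d


def k_similar(p, q, k, d, level):
    if d[p, q] == -inf:  # equal languages: similar for every k
        return True
    return d[p, q] + min(level[p], level[q]) < k


# Hypothetical example: a chain 0 -a-> 1 -a-> 2 -a-> 3 with sink 3, accepting {aa}.
states = [0, 1, 2, 3]
delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3}
level = levels(states, "a", delta, 0)
d = distances(states, "a", delta, {2})
```

Here states 2 and 3 differ only on the empty word, so d(2, 3) = 0, and they become 3-similar but not 2-similar because level(2) = 2.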
Distance tree
- need a compact representation of d
- observation: d is an ultrametric, d(p, q) ≤ max(d(p, r), d(q, r))
- represent d as a tree

Definition (distance-tree)
- rooted tree (with weighted edges)
- each state is a leaf
- d(p, q) = height of lca(p, q) (+∞ if in different trees)

[Figure: example distance tree with node labels 8, 5, 4, 2, 1 and leaves p and q]
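A small sketch of how such a tree answers distance queries. The forest below is hypothetical and stores a height per node instead of edge weights, which is an assumption made to keep the lookup short; the naive ancestor walk stands in for a proper lowest-common-ancestor structure, and returning -inf for d(x, x) is a convention chosen here.

```python
from math import inf

# Hypothetical forest: parent pointers and a height for every node.
# Leaves p, q, r, s are states; u and v are internal nodes.
parent = {"p": "u", "q": "u", "r": "v", "u": "v", "v": None, "s": None}
height = {"p": 0, "q": 0, "r": 0, "s": 0, "u": 2, "v": 5}


def ancestors(node):
    """Return the path from node up to the root of its tree."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain


def d(x, y):
    """Height of the lowest common ancestor, or inf for different trees."""
    if x == y:
        return -inf  # assumption: a state has no distance to itself
    seen = set(ancestors(x))
    for node in ancestors(y):
        if node in seen:
            return height[node]
    return inf


leaves = ["p", "q", "r", "s"]
ultrametric = all(d(x, y) <= max(d(x, z), d(y, z))
                  for x in leaves for y in leaves for z in leaves)
```

Any distance read off such a tree satisfies the ultrametric inequality, which is why the tree is a faithful compact representation of d.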
Building distance tree bottom-up
Work in phases. In the ℓ-th phase we want to glue together states p, q with d(p, q) = ℓ. How to detect such pairs?

Σ = {a, b}: if d(δ(p, a), δ(q, a)) = ℓ and d(δ(p, b), δ(q, b)) = ℓ′, then we can glue p and q in the (max(ℓ, ℓ′) + 1)-th phase!

Dictionary
- lists states with the same successors
- allows updates when p and q are merged

Theorem
We can build the distance tree using O(n log n) dictionary operations.
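The phase idea can be sketched naively as follows. This is not the O(n log n) procedure of the theorem: it recomputes every successor signature in every phase and fills a full distance table instead of a tree. It assumes a complete DFA, starts from the classes of language-equivalent states (distance -inf, found by Moore-style refinement), and in phase ℓ glues states whose successors already lie in a common class. The Python dict `ids` plays the role of the dictionary; all names are illustrative.

```python
from math import inf


def partition_ids(states, signature):
    """Give states with equal signatures the same class id."""
    ids = {}
    return {p: ids.setdefault(signature(p), len(ids)) for p in states}


def distances_by_phases(states, alphabet, delta, finals):
    # Before phase 0: refine until only language-equivalent states share a class.
    cls = {p: int(p in finals) for p in states}
    while True:
        new = partition_ids(
            states,
            lambda p: (cls[p],) + tuple(cls[delta[p, a]] for a in alphabet))
        if len(set(new.values())) == len(set(cls.values())):
            break
        cls = new
    d = {(p, q): (-inf if cls[p] == cls[q] else inf)
         for p in states for q in states}
    # Phase 0, 1, 2, ...: glue states whose successors already share a class.
    phase = 0
    while True:
        new = partition_ids(
            states, lambda p: tuple(cls[delta[p, a]] for a in alphabet))
        if len(set(new.values())) == len(set(cls.values())):
            return d  # nothing glued: remaining pairs stay at distance inf
        for p in states:
            for q in states:
                if new[p] == new[q] and d[p, q] == inf:
                    d[p, q] = phase
        cls, phase = new, phase + 1


# Hypothetical example: a chain 0 -a-> 1 -a-> 2 -a-> 3 with sink 3, accepting {aa}.
states = [0, 1, 2, 3]
delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3}
d = distances_by_phases(states, "a", delta, {2})
```

On the chain, states 2 and 3 are glued in phase 0, state 1 joins them in phase 1 and state 0 in phase 2, matching the rule that a pair is glued one phase after its successors.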
Types of dictionaries
- Deterministic: balanced tree, Θ(log n)
- Randomized: hashing, Θ(1)
- Quadratic memory without initialization (trick), Θ(1)
- Exponential trees in the RAM model, Θ(log² log n / log log log n)
How to use distance tree
Theorem
Given a distance tree we can
1. calculate the sizes of all k-minimal DFAs (for all valid k) in time O(n),
2. construct a k-minimal DFA for a given k in time O(n),
3. iteratively construct (representations of) k-minimal DFAs for all k in time O(n log n).

Idea
- the bigger k, the more states we can glue together,
- for each state p, calculate the smallest k for which it is merged,
- this depends on level(p) and the distance tree,
- it can be done in a single traversal of the tree.
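One way to read the idea is as a per-state threshold. The sketch below is an interpretation, not the linear-time traversal from the theorem: for each state p it computes, by brute force over a distance table, the smallest k for which some state q that comes later in the (level, name) order satisfies p ∼_k q. The tie-breaking order, the function name and the example values (taken from a four-state chain accepting {aa}) are assumptions.

```python
from math import inf


def merge_thresholds(states, d, level):
    """Smallest k at which each state has a k-similar partner later in the order."""
    order = sorted(states, key=lambda p: (level[p], p))
    threshold = {}
    for i, p in enumerate(order):
        later = order[i + 1:]
        best = min((d[p, q] for q in later), default=inf)
        if best == -inf:
            threshold[p] = -inf  # a language-equivalent partner exists
        else:
            # d(p, q) + level(p) < k holds first for k = d(p, q) + level(p) + 1.
            threshold[p] = level[p] + best + 1
    return threshold


# Hypothetical input: levels and distances of the chain accepting {aa}.
level = {0: 0, 1: 1, 2: 2, 3: inf}
half = {(0, 1): 2, (0, 2): 2, (0, 3): 2, (1, 2): 1, (1, 3): 1, (2, 3): 0}
d = {**half, **{(q, p): v for (p, q), v in half.items()}}
threshold = merge_thresholds([0, 1, 2, 3], d, level)
# Number of states that have no such partner for a given k.
unmerged = {k: sum(t > k for t in threshold.values()) for k in (2, 3)}
```

In this example every state except the sink gets threshold 3, so four states remain unmerged for k = 2 and one for k = 3.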
Generalisations
Improvements
- partial transition function
  - known results for DFA minimisation (on trees as well)
- take the number of errors into account
  - for hyper-minimisation: O(n²) algorithm returning a DFA that is
    - hyper-minimal
    - committing the least number of errors
Partial transition function
Usually, the DFA has few meaningful transitions: δ is partial. Can the running time depend on |δ| rather than on |Σ|n?

Known results
- minimisation: O(|δ| log n)
- hyper-minimisation: O(|δ| log² n)
- minimisation for tree automata: O(|δ| log n)
- but not for k-minimisation
  - obstacle: finite languages and ∅ are close
  - construction of the distance tree for acyclic automata

Theorem
We can construct a distance tree for an acyclic DFA in O(|δ| log n) time.
Idea
Sort the states topologically. All states with the same length of the longest recognized word can be processed at once. Assume we have a set of states S such that we know the fragment of the tree corresponding to their successors. Can we build the fragment of the tree corresponding to S?

Yes! Use divide-and-conquer. We can fairly easily detect which states in S are at distance m, for any m (by looking at their successors).
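The grouping step can be sketched as below; the divide-and-conquer on each group is not shown. The code assumes an acyclic DFA with a partial transition function given as a dict from (state, letter) to state, and assigns -inf to states that accept no word. The function name and the example automaton are assumptions.

```python
from collections import defaultdict
from functools import lru_cache
from math import inf


def group_by_longest_word(states, delta, finals):
    """Group states of an acyclic partial DFA by the length of their longest accepted word."""
    succ = defaultdict(list)
    for (p, _letter), q in delta.items():
        succ[p].append(q)

    @lru_cache(maxsize=None)
    def longest(p):
        # 0 if p accepts the empty word, -inf if p accepts nothing at all.
        best = 0 if p in finals else -inf
        for q in succ[p]:
            best = max(best, longest(q) + 1)
        return best

    groups = defaultdict(list)
    for p in states:
        groups[longest(p)].append(p)
    # Increasing order: successors' groups come before their predecessors' groups.
    return dict(sorted(groups.items()))


# Hypothetical acyclic DFA with partial transitions; state 4 accepts nothing.
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 3, (2, "a"): 3}
groups = group_by_longest_word([0, 1, 2, 3, 4], delta, {3})
```

Processing the groups in increasing order guarantees that, when a group S is handled, every successor of a state in S already belongs to an earlier group.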
Limiting the number of errors
Can we control both the type and the number of errors?

Theorem (A. Maletti, CIAA 2010)
Given a DFA, we can construct in O(n²) time a DFA which
- is hyper-minimal,
- commits the least number of errors among the hyper-minimal automata.

Generalize to k-minimisation?
Limiting the number of errors
Problem
Given a DFA M, is there a DFA N such that
1. N is k-minimal for M, and
2. N commits at most m errors compared to M; i.e., |L(N) Δ L(M)| ≤ m?

Problem
Given a DFA M, is there a DFA N such that
1. N has at most s states, and
2. N commits at most m errors compared to M; i.e., |L(N) Δ L(M)| ≤ m?

Hardness
Both problems are NP-hard.
Open problems
1. Can we build the distance tree for an acyclic automaton in linear time?
2. Is there a deterministic O(n log n) time hyper-minimisation algorithm (without the memory trick)?
3. Can we find the best hyper-minimal DFA faster than in O(n²)?
4. Is it possible to somehow relax the condition that we want to get at most m errors so that k-minimisation with errors is polynomial? Say, m must be a function of the smallest possible number of errors, or n...
5. Can we find the smallest N which agrees with M on words of lengths k₁, k₁ + 1, ..., k₂ in polynomial time?
Thank you for your attention!