On minimising automata with errors

Pawel Gawrychowski (1), Artur Jeż (1), Andreas Maletti (2)

(1) University of Wroclaw, Poland
(2) University of Stuttgart, Germany

25 August 2011

P. Gawrychowski, A. Jeż, A. Maletti | On minimising automata with errors | MFCS 2011 (Warsaw)

Introduction

DFA are omnipresent.
Important problem: minimisation.

Problem (DFA minimisation)
For a DFA M, find the smallest DFA recognizing the same language L(M).
Well studied; an O(n log n) solution is known, and there is not much hope of improving it.
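As a point of reference for the problem statement above, here is a minimal sketch of exact DFA minimisation by Moore-style partition refinement. It is a simple quadratic method, not the O(n log n) algorithm the slide alludes to. The representation (`delta[p][a]` for a complete transition function) and the function name `minimise` are assumptions made for this illustration.

```python
def minimise(alphabet, delta, finals, start):
    """Return (delta, finals, start) of an equivalent minimal complete DFA."""
    # Keep only the states reachable from the start state.
    reachable, stack = {start}, [start]
    while stack:
        p = stack.pop()
        for a in alphabet:
            q = delta[p][a]
            if q not in reachable:
                reachable.add(q)
                stack.append(q)
    # Begin with the accepting / non-accepting split and refine until stable.
    block = {p: int(p in finals) for p in reachable}
    while True:
        ids = {}
        refined = {}
        for p in reachable:
            signature = (block[p],) + tuple(block[delta[p][a]] for a in alphabet)
            refined[p] = ids.setdefault(signature, len(ids))
        # Refinement only splits blocks, so an unchanged count means stability.
        stable = len(ids) == len(set(block.values()))
        block = refined
        if stable:
            break
    new_delta = {block[p]: {a: block[delta[p][a]] for a in alphabet}
                 for p in reachable}
    new_finals = {block[p] for p in reachable if p in finals}
    return new_delta, new_finals, block[start]


# Hypothetical example: states 1 and 2 accept the same language, so they collapse.
example = {0: {"a": 1}, 1: {"a": 2}, 2: {"a": 1}}
min_delta, min_finals, min_start = minimise("a", example, {1, 2}, 0)
```

The later sketches assume their input DFA has already been minimised in this exact sense.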


Relaxations (allow some errors!)
cover automata (errors on words of length ≥ k)
hyper-minimisation (finitely many errors / errors on short words)
k-minimisation (errors on words of length < k)
almost-minimisation (finitely many errors on prefixes of an infinite word)

There are minimisation algorithms for each of these cases.

Our results

Problem (k-minimisation)
Given a DFA M, find the smallest DFA N such that
$L(M) \cap \Sigma^{\geq k} = L(N) \cap \Sigma^{\geq k}$.

Positive
Simple $O(n \log^2 n)$ algorithm for k-minimisation.
For all k in parallel, in a certain sense.

Negative
Hardness for some generalisations (limiting both the length and number of erroneous words).
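To make the condition in the k-minimisation problem concrete, the following sketch checks by brute force whether two DFAs agree on every word of length at least k, up to a cut-off length. The tuple layout `(delta, finals, start)`, the cut-off `max_len` and both function names are assumptions of this example. It is a sanity check for small inputs, not part of the algorithm from the talk.

```python
from itertools import product


def accepts(delta, finals, start, word):
    """Run a complete DFA on a word given as a sequence of symbols."""
    state = start
    for a in word:
        state = delta[state][a]
    return state in finals


def agree_from_length(m, n, alphabet, k, max_len):
    """True if m and n accept the same words w with k <= |w| <= max_len."""
    for length in range(k, max_len + 1):
        for word in product(alphabet, repeat=length):
            if accepts(*m, word) != accepts(*n, word):
                return False
    return True


# Hypothetical example: M accepts exactly {aa}, N accepts nothing.
M = ({0: {"a": 1}, 1: {"a": 2}, 2: {"a": 3}, 3: {"a": 3}}, {2}, 0)
N = ({0: {"a": 0}}, set(), 0)
print(agree_from_length(M, N, "a", 3, 8))  # the only disagreement, aa, is shorter than 3
print(agree_from_length(M, N, "a", 2, 8))  # aa has length 2, so the DFAs disagree
```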


k-minimisation, k-similarity

Hopcroft’s algorithm: find identical states and merge them.
What does "identical" mean now?


Definition (k-similarity, [Gawrychowski, Jeż, MFCS 2009])
$q \sim_k p \iff d(p, q) + \min(\mathrm{level}(p), \mathrm{level}(q)) < k$
$d(p, q) = \max\{|w| : w \in L(p) \mathbin{\triangle} L(q)\}$
level(p): length of the longest word leading to p (may be $+\infty$)
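The definition can be evaluated directly on small automata. The sketch below computes level, d and the k-similarity test by brute force. It assumes a complete DFA stored as `delta[p][a]` in which every state is reachable, uses `-inf` for d when the two right languages coincide, and makes no attempt at efficiency. All function names are chosen for this example only.

```python
import math


def levels(alphabet, delta, start):
    """level(p): length of the longest word leading to p, or +inf."""
    n = len(delta)
    last_seen = {}
    frontier = {start}  # states reached by words of exactly t letters
    for t in range(2 * n):
        for q in frontier:
            last_seen[q] = t
        frontier = {delta[q][a] for q in frontier for a in alphabet}
    # A word of n or more letters repeats a state, so it can be pumped.
    return {q: (math.inf if t >= n else t) for q, t in last_seen.items()}


def distances(alphabet, delta, finals):
    """d(p, q): length of the longest word in L(p) symmetric-difference L(q)."""
    pairs = [(p, q) for p in delta for q in delta]
    bound = len(pairs)
    d = {pq: -math.inf for pq in pairs}
    # Pairs separated by some word of exactly t letters, starting with t = 0.
    separated = {(p, q) for p, q in pairs if (p in finals) != (q in finals)}
    for t in range(2 * bound):
        for pq in separated:
            d[pq] = t
        separated = {(p, q) for p, q in pairs
                     if any((delta[p][a], delta[q][a]) in separated
                            for a in alphabet)}
    return {pq: (math.inf if t >= bound else t) for pq, t in d.items()}


def similar(p, q, k, level, d):
    """The k-similarity test from the definition."""
    if d[(p, q)] == -math.inf:  # equal right languages are always similar
        return True
    return d[(p, q)] + min(level[p], level[q]) < k


# Hypothetical example: the minimal DFA of {aa} over a one-letter alphabet.
chain = {0: {"a": 1}, 1: {"a": 2}, 2: {"a": 3}, 3: {"a": 3}}
level = levels("a", chain, 0)
d = distances("a", chain, {2})
```

On this example state 3 lies on a loop, so its level is infinite, while states 2 and 3 differ only on the empty word.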


New simple algorithm!
1. Calculate $\sim_k$.
2. While there are distinct states $p \sim_k q$ with level(p) ≤ level(q): merge p into q.
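The two steps above can be sketched as a naive loop. This example assumes that the input DFA is already minimal, that `level` and `d` have been computed beforehand (here they are written out by hand for a tiny automaton), and that merging p into q means redirecting every transition entering p so that it enters q. The function name `k_minimise` and the data layout are illustrative, and the quadratic pair search is not the efficient implementation from the talk.

```python
import math


def k_minimise(delta, finals, start, level, d, k):
    """Greedy merging: while some p ~k q exists, merge the lower-level state."""
    delta = {p: dict(row) for p, row in delta.items()}  # work on a copy
    finals = set(finals)
    alive = list(delta)

    def similar(p, q):
        return d[frozenset((p, q))] + min(level[p], level[q]) < k

    merged = True
    while merged:
        merged = False
        for p in alive:
            for q in alive:
                if p != q and level[p] <= level[q] and similar(p, q):
                    # Redirect every transition entering p so that it enters q.
                    for row in delta.values():
                        for a in list(row):
                            if row[a] == p:
                                row[a] = q
                    if start == p:
                        start = q
                    del delta[p]
                    finals.discard(p)
                    alive.remove(p)
                    merged = True
                    break
            if merged:
                break
    return delta, finals, start


# Hypothetical example: the minimal DFA of {aa}, with hand-computed level and d.
chain = {0: {"a": 1}, 1: {"a": 2}, 2: {"a": 3}, 3: {"a": 3}}
level = {0: 0, 1: 1, 2: 2, 3: math.inf}
d = {frozenset((0, 1)): 2, frozenset((0, 2)): 2, frozenset((0, 3)): 2,
     frozenset((1, 2)): 1, frozenset((1, 3)): 1, frozenset((2, 3)): 0}
small, small_finals, small_start = k_minimise(chain, {2}, 0, level, d, 3)
```

For k = 3 every pair of states is 3-similar, so the chain collapses to a single non-accepting state. For k = 2 no pair qualifies and the automaton is returned unchanged.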


Distance tree

We need a compact representation of d.
Observation: d is an ultra-metric, $d(p, q) \leq \max(d(p, r), d(q, r))$.
Represent d as a tree.

[Figure: example distance tree with leaves p and q; the numbers 8, 5, 4, 2, 1 label the tree]

Definition (distance-tree)
rooted tree (with weighted edges)
each state is a leaf
d(p, q) = height of lca(p, q) ($+\infty$ if p and q are in different trees)
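A minimal sketch of how such a tree answers distance queries is given below. It stores the tree as parent pointers plus a height for every inner node, and finds the lowest common ancestor by walking up from both leaves. The dictionary layout, the node names `x0`, `x1`, `x2` and the function names are assumptions of this example. The naive walk takes time proportional to the depth of the tree.

```python
import math


def ancestors(parent, node):
    """The path from a node up to the root of its tree, node included."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def tree_distance(parent, height, p, q):
    """d(p, q) = height of lca(p, q); +inf if p and q lie in different trees."""
    if p == q:
        return -math.inf  # a state never differs from itself
    seen = set(ancestors(parent, p))
    for node in ancestors(parent, q):
        if node in seen:
            return height[node]
    return math.inf


# Hypothetical forest: leaves 0..3 in one tree, leaf 4 alone in a second tree.
parent = {2: "x0", 3: "x0", "x0": "x1", 1: "x1", "x1": "x2", 0: "x2"}
height = {"x0": 0, "x1": 1, "x2": 2}
print(tree_distance(parent, height, 2, 3))
print(tree_distance(parent, height, 0, 4))
```

Any distance read off a tree in this way satisfies the ultra-metric inequality, because the lowest common ancestor of p and q lies at or below the higher of the ancestors shared with a third leaf r.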


Building distance tree bottom-up

Work in phases. In the $\ell$-th phase we want to glue together states p, q with $d(p, q) = \ell$. How to detect such pairs?

$\Sigma = \{a, b\}$
If $d(\delta(p, a), \delta(q, a)) = \ell$ and $d(\delta(p, b), \delta(q, b)) = \ell'$, then we can glue p and q in the $(\max(\ell, \ell') + 1)$-th phase!


Dictionary
lists states with the same successors
allows update when p and q are merged


Theorem
We can build the distance tree using O(n log n) dictionary operations.
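The phase structure can be illustrated with a deliberately naive sketch. In every phase it rebuilds a dictionary from successor signatures to the classes that share them, then glues each group under a new inner node whose height is the phase number. It assumes a minimal, complete DFA and uses a Python `dict` as the dictionary. Because it recomputes all signatures in each phase, it does not achieve the O(n log n) bound of the theorem. All names, including the `("inner", phase, state)` node labels, are hypothetical.

```python
import math


def build_distance_tree(alphabet, delta):
    """Return (parent, height) of the distance forest of a minimal complete DFA."""
    rep = {p: p for p in delta}   # union-find pointers over states
    node = {p: p for p in delta}  # tree node standing for each class
    parent, height = {}, {}

    def find(p):
        while rep[p] != p:
            rep[p] = rep[rep[p]]
            p = rep[p]
        return p

    phase = 0
    while True:
        # Dictionary: successor-class signature -> class representatives.
        groups = {}
        for p in delta:
            if find(p) == p:
                signature = tuple(find(delta[p][a]) for a in alphabet)
                groups.setdefault(signature, []).append(p)
        merges = [g for g in groups.values() if len(g) > 1]
        if not merges:
            return parent, height
        for g in merges:
            inner = ("inner", phase, g[0])
            height[inner] = phase
            for p in g:
                parent[node[p]] = inner  # hang the old subtree under the new node
                rep[p] = g[0]
            node[g[0]] = inner
        phase += 1


def lca_height(parent, height, p, q):
    """Distance of two distinct states read off the tree (+inf across trees)."""
    seen, x = {p}, p
    while x in parent:
        x = parent[x]
        seen.add(x)
    x = q
    while x not in seen and x in parent:
        x = parent[x]
    return height[x] if x in seen and x in height else math.inf


# Hypothetical example: the minimal DFA of {aa} over a one-letter alphabet.
chain = {0: {"a": 1}, 1: {"a": 2}, 2: {"a": 3}, 3: {"a": 3}}
parent, height = build_distance_tree("a", chain)
```

On the example, states 2 and 3 are glued in phase 0, state 1 joins them in phase 1, and state 0 joins in phase 2, matching the rule that a pair is glued one phase after its successors.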


Types of dictionaries

Deterministic: balanced tree, $\Theta(\log n)$
Randomized: hashing, $\Theta(1)$
Quadratic memory without initialization trick, $\Theta(1)$
Exponential trees in the RAM model, $\Theta\left(\frac{(\log \log n)^2}{\log \log \log n}\right)$


How to use distance tree

Theorem
Given a distance tree we can
1. calculate the sizes of all k-minimal DFA (for all valid k) in time O(n),
2. construct a k-minimal DFA for a given k in time O(n),
3. iteratively construct (representations of) k-minimal DFA for all k in time O(n log n).


Idea
the bigger k, the more states we can glue together,
for each state p calculate the smallest k for which it is merged,
this depends on level(p) and the distance tree,
it can be done in a single traversal of the tree.
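One possible reading of this idea, written as a quadratic brute force rather than the single tree traversal from the theorem, is sketched below. It orders the states by level, lets a state p be merged only into states that come later in that order, and takes level(p) + d(p, q) + 1 as the smallest k with p ∼k q. The tie-breaking by input order, the function names and the hand-computed inputs are assumptions of this example, and the input DFA is assumed to be minimal.

```python
import math


def merge_thresholds(states, level, d):
    """For each state p, the smallest k at which p is merged into a later state."""
    order = sorted(states, key=lambda p: level[p])  # stable: ties keep input order
    threshold = {}
    for i, p in enumerate(order):
        later = order[i + 1:]
        best = min((d[frozenset((p, q))] for q in later), default=math.inf)
        # p ~k q holds exactly when d(p, q) + level(p) < k for these q.
        threshold[p] = level[p] + best + 1
    return threshold


def size_of_k_minimal(states, level, d, k):
    """Number of states that are not merged away for the given k."""
    threshold = merge_thresholds(states, level, d)
    return sum(1 for p in states if threshold[p] > k)


# Hypothetical example: the minimal DFA of {aa}, with hand-computed level and d.
states = [0, 1, 2, 3]
level = {0: 0, 1: 1, 2: 2, 3: math.inf}
d = {frozenset((0, 1)): 2, frozenset((0, 2)): 2, frozenset((0, 3)): 2,
     frozenset((1, 2)): 1, frozenset((1, 3)): 1, frozenset((2, 3)): 0}
sizes = {k: size_of_k_minimal(states, level, d, k) for k in range(5)}
```

In the example every state of finite level has threshold 3, so the size stays at 4 for k ≤ 2 and drops to 1 from k = 3 onwards.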


Generalisations

Improvements
partial transition function
  known results for DFA minimisation (on trees as well)
take the number of errors into account
  for hyper-minimisation: $O(n^2)$ algorithm returning a DFA that is
    hyper-minimal
    committing the least number of errors

Partial transition function

Usually, the DFA has few meaningful transitions: δ is partial.
Dependency on |δ| and not |Σ|n?


Known results
minimisation: O(|δ| log n)
hyper-minimisation: $O(|\delta| \log^2 n)$
minimisation for tree automata: O(|δ| log n)
but not for k-minimisation
  obstacle: finite languages and ∅ are close
  construction of distance-tree for acyclic automata

Theorem We can construct a distance tree for an acyclic DFA in O(|δ| log n) time.


Idea

Sort the states topologically. All states with the same length of the longest recognized word can be processed at once. Assume we have a set of states S such that we know the fragment of the tree corresponding to their successors. Can we build the fragment of the tree corresponding to S?

Yes! Use divide-and-conquer. We can fairly easily detect which states in S are at distance m, for any m (by looking at their successors).


Limiting the number of errors

Can we control both type and number of errors?

Theorem (A. Maletti, CIAA 2010)
Given a DFA, we can construct in $O(n^2)$ time a DFA which
is hyper-minimal,
commits the least number of errors among the hyper-minimal automata.

Generalize to k-minimisation?


Limiting the number of errors

Problem
Given a DFA M, is there a DFA N such that
1. N is k-minimal for M, and
2. N commits at most m errors compared to M; i.e., $|L(N) \mathbin{\triangle} L(M)| \leq m$?

Problem
Given a DFA M, is there a DFA N such that
1. N has at most s states, and
2. N commits at most m errors compared to M; i.e., $|L(N) \mathbin{\triangle} L(M)| \leq m$?


Hardness
Both problems are NP-hard.
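Deciding whether a good N exists is hard, but counting the errors of a given candidate N is easy. The sketch below counts the words up to a chosen length on which two complete DFAs disagree, by propagating word counts through the product automaton. The tuple layout `(delta, finals, start)`, the cut-off `max_len` and the function name are assumptions of this example. When N agrees with M on all words of length at least k, choosing `max_len = k - 1` gives the full value of |L(N) △ L(M)|.

```python
from collections import Counter


def count_errors(m, n, alphabet, max_len):
    """Number of words w with |w| <= max_len on which the two DFAs disagree."""
    (delta_m, finals_m, start_m), (delta_n, finals_n, start_n) = m, n
    # Product state -> number of words of the current length that reach it.
    current = Counter({(start_m, start_n): 1})
    errors = 0
    for _ in range(max_len + 1):
        errors += sum(count for (p, q), count in current.items()
                      if (p in finals_m) != (q in finals_n))
        following = Counter()
        for (p, q), count in current.items():
            for a in alphabet:
                following[(delta_m[p][a], delta_n[q][a])] += count
        current = following
    return errors


# Hypothetical example: M accepts exactly {aa}, N accepts nothing.
M = ({0: {"a": 1}, 1: {"a": 2}, 2: {"a": 3}, 3: {"a": 3}}, {2}, 0)
N = ({0: {"a": 0}}, set(), 0)
print(count_errors(M, N, "a", 2))  # the single error is the word aa
```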


Open problems
1. Can we build the distance tree for an acyclic automaton in linear time?
2. Is there a deterministic O(n log n) time hyper-minimisation algorithm (without the memory trick)?
3. Can we find the best hyper-minimal DFA faster than in $O(n^2)$?
4. Is it possible to somehow relax the condition that we want at most m errors so that k-minimisation with errors becomes polynomial? Say, m must be a function of the smallest possible number of errors, or of n...
5. Can we find the smallest N which agrees with M on words of lengths $k_1, k_1 + 1, \ldots, k_2$ in polynomial time?


Thank you for your attention!
