
International Journal of Foundations of Computer Science
© World Scientific Publishing Company

A CHARACTERIZATION OF THE ARITHMETICAL HIERARCHY BY LANGUAGE EQUATIONS

ALEXANDER OKHOTIN
School of Computing, Queen's University, Kingston, Ontario, Canada
(Present address: Department of Mathematics, University of Turku, 20014 Turku, Finland)
[email protected]

Received (received date)
Revised (revised date)
Communicated by Editor's name

ABSTRACT

Language equations with all Boolean operations and concatenation and a particular order on the set of solutions are proved to be equal in expressive power to the first-order Peano arithmetic. In particular, it is shown that the class of sets representable using k variables (for every k ≥ 2) is exactly the k-th level of the arithmetical hierarchy, i.e., the sets definable by recursive predicates with k alternating quantifiers. The property of having an extremal solution is shown to be nonrepresentable in first-order arithmetic.

Keywords: Language equations, arithmetical hierarchy, decision problem, descriptional complexity, descriptive complexity.

1. Introduction

A language equation is a formally specified relationship between sets of strings, which are among the fundamental mathematical objects in computer science. Their significance was recognized in the 1960s, when two important notions of formal language theory, finite automata [24, 4, 25, 14] and context-free grammars [6, 3, 1], were characterized by systems of language equations Xi = ϕi(X1, . . . , Xn) (1 ≤ i ≤ n).

More characterizations of this kind have recently been found. First, a class of transformational grammars obtained by extending the context-free rules and the context-free derivation with a new "conjunction" operation, called conjunctive grammars [15], has been represented using the standard language equations [6, 3, 1] augmented with intersection [16]. Second, trellis automata [5, 10] (also known as one-way real-time cellular automata), a simple model of massively parallel computation introduced in the 1980s, have been characterized by a subclass of language equations with intersection in which concatenation is restricted to linear [18, 20]. Third, for a more general class of language equations that allows the use of complement, it was shown that their unique, componentwise least and componentwise greatest solutions characterize the recursive, the recursively enumerable and the co-recursively enumerable languages, respectively [17, 21].



Together these results establish a certain correspondence between the specification of languages by algebraic relations and by computation. This can be compared to the systematically investigated correspondence between the computational models studied in complexity theory and various restricted logics [12], known as the theory of descriptive complexity, which includes characterizations as precise as the equality between DSPACE(n^k) and the set of languages definable by partial fixed points of first-order Boolean queries with k + 1 variables [11, 13]. The present paper establishes a remotely similar result for language equations with all Boolean operations and concatenation: it is shown that the number of variables in these equations precisely equals the level in the arithmetical hierarchy their extremal solutions describe, while for an unbounded number of variables such solutions define exactly the class of sets specifiable in first-order arithmetic.

Since these language equations are a generalization of the classical context-free language equations [6], the new result can also be compared to the hierarchy of n-nonterminal context-free languages [7, 8]. Indeed, in both cases languages are defined by solutions of systems of language equations of the form Xi = ϕi(X1, . . . , Xn) (1 ≤ i ≤ n), which are least with respect to a certain partial order on vectors of languages. If the right-hand sides ϕi may contain concatenation and union, the solutions of such systems are context-free and line up into a proper hierarchy according to the number of variables [7]; if the set of operations is augmented with complement, the arithmetical hierarchy is obtained.

The known types of explicit language equations without negation are reviewed in Section 2; in particular, the representation of trellis automata by one of these classes is described. Section 3 then discusses the impact of negation on the properties of language equations and proposes a new method of partially ordering their solutions by sequentially checking the inclusion of components. In Section 4 it is proved that the components of sequentially least and sequentially greatest solutions are arithmetical sets, and that the i-th component of every such solution is in the i-th level of the arithmetical hierarchy. Section 5 presents a construction of a k-variable language equation for every given set in Σk (k ≥ 2). The technical results of the paper are put together in Section 6: besides characterizing every level of the arithmetical hierarchy, these results imply that the decision problem of determining the existence of a sequentially least or a sequentially greatest solution cannot be specified by a first-order formula of formal arithmetic.

2. Language equations: the monotone case

Quite a few different types of language equations have been considered in the literature. The first to be studied [6, 24] were the explicit systems, resolved with respect to the unknowns:

    X1 = ϕ1(X1, . . . , Xn)
    . . .
    Xn = ϕn(X1, . . . , Xn)        (1)

There are n ≥ 1 language variables, which assume values of languages over a common alphabet Σ, and there is an equation for every variable, with this variable on the left and with the right-hand side containing constant languages from a predefined set and any variables, connected using some set of operations on languages. The minimal set of constants, {ε} and {a} for all a ∈ Σ, is assumed in this paper. If the set of operations consists of union and concatenation, the resulting systems have a formal interpretation in terms of semirings, and have been shown to characterize the context-free grammars [6, 1]. If concatenation is restricted to linear, then linear context-free grammars are similarly characterized. If concatenation is further restricted to one-sided linear, an algebraic representation of nondeterministic finite automata is obtained [24, 25].

Recently it was shown that explicit systems of language equations (1) with union, intersection and concatenation are equivalent to conjunctive grammars [15, 16], a generalization of context-free grammars with a conjunction operation that specifies the condition of satisfying several rules simultaneously. This also applies to the linear case, where the language equations with union, intersection and linear concatenation characterize the corresponding class of linear conjunctive grammars. The latter were then found to be equivalent [18] to trellis automata, a model of parallel computation in electronic circuits [5, 10]. Trellis automata will be essentially used later on, so let us give their definition and review their representation by language equations.

A trellis automaton, defined as a quintuple (Σ, Q, I, δ, F), processes an input string of length n using a uniform array of n(n+1)/2 processor nodes, as in Figure 1. Each processor computes a value from a fixed finite set Q. The processors in the bottom row obtain their values directly from the input symbols using a function I : Σ → Q. The rest of the processors compute the function δ : Q × Q → Q of the values in their predecessors. The string is accepted if and only if the value computed by the topmost processor belongs to the set of accepting states F ⊆ Q.
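To make the model concrete, here is a minimal Python simulation sketch; the particular automaton below, which merely checks that the first and last symbols of the input coincide, is chosen only for illustration and is not from the paper.

    from itertools import product

    def trellis_accepts(word, I, delta, F):
        """Simulate a trellis automaton (Sigma, Q, I, delta, F) on a nonempty word:
        the bottom row holds I(a) for each input symbol a, every other processor
        applies delta to its two predecessors, and the word is accepted iff the
        topmost value lies in F."""
        row = [I[a] for a in word]                # bottom row, one node per symbol
        while len(row) > 1:                       # each next row is one node shorter
            row = [delta[(row[j], row[j + 1])] for j in range(len(row) - 1)]
        return row[0] in F                        # decision of the topmost processor

    # Illustrative automaton: a state is the pair (first, last) of the substring
    # a node covers, so the topmost value is (first symbol, last symbol) of the input.
    states = list(product("ab", repeat=2))
    I = {c: (c, c) for c in "ab"}
    delta = {(p, q): (p[0], q[1]) for p, q in product(states, repeat=2)}
    F = {("a", "a"), ("b", "b")}
    print(trellis_accepts("abba", I, delta, F))   # True
    print(trellis_accepts("abab", I, delta, F))   # False

By Theorem 1 below, the languages recognizable in this way are exactly the components of unique solutions of systems (1) with union, intersection and linear concatenation.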

Figure 1: Computation done by a trellis automaton.

Evidently, trellis automata are one of the simplest computational models, and they are known to be equivalent to one of the simplest types of language equations:

Theorem 1 ([18]) A language L is a component of a unique solution of a system (1) with {∪, ∩, lin·} if and only if L is recognized by a trellis automaton.

The proof of Theorem 1 is by effective construction of a trellis automaton out of a system of equations, and vice versa. It is also known that a single variable in such systems is enough to simulate the computation of any trellis automaton, though in this case some "junk" will inevitably remain in the solution:

Theorem 2 ([20]) For every trellis automaton M over Σ there exists and can be effectively constructed a one-variable language equation X = ϕ(X) over Σ ∪ {#} (# ∉ Σ) with the unique solution #L(M) ∪ L′M, where L′M ⊆ Σ∗.

3. Nonmonotone case and partial orders on solutions

Language equations without negation described in the previous section share the important property of the monotonicity of their right-hand sides with respect to the following partial order:

Definition 1 (Partial order of componentwise inclusion) Let Σ be an alphabet, let n ≥ 1, let L′i, L″i ⊆ Σ∗ (1 ≤ i ≤ n). Then (L′1, . . . , L′n) ≼ (L″1, . . . , L″n) if and only if L′1 ⊆ L″1, . . . , L′n ⊆ L″n.

By the lattice-theoretic fixpoint theorem, this monotonicity guarantees the existence of a least and a greatest solution with respect to this order [1, 16], and it is the least solution which is typically taken as the vector defined by a system.

If the operation of complementation is allowed in a system (1), then the properties of having a solution, or having a solution least or greatest with respect to the order "≼", become nontrivial. For instance, the equation X = ∼X (where ∼L denotes the complement Σ∗ \ L) has no solutions, while the system X = X, Y = ∼X has multiple pairwise incomparable solutions {(L, ∼L) | L ⊆ Σ∗}. In fact, the problem of checking whether a system has a unique, a least or a greatest solution is Π2-complete [17]. The classes of languages specified by components of unique, least and greatest solutions of these systems are the recursive, the recursively enumerable and the co-recursively enumerable sets, respectively.

Though the solutions of the system X = X, Y = ∼X are incomparable with respect to Definition 1, the solution (∅, Σ∗) can in some sense be considered the "right" one, because, informally, X specifies ∅, as no string is required to be in X by the equation X = X, and then Y has to be Σ∗. This suggests the following method of choosing a minimal solution: minimize the first component, then minimize the second component, and so on. Thus the notion of a "right" solution can be formalized as minimality with respect to the following partial order on vectors of languages:

Definition 2 (Partial order of sequential componentwise inclusion) (L′1, . . . , L′n) ≼seq (L″1, . . . , L″n) if either these vectors coincide, or there exists a number i (1 ≤ i ≤ n), such that L′1 = L″1, . . . , L′i−1 = L″i−1 and L′i ⊂ L″i.

In other words, vectors are compared lexicographically: if one's first component is less than the other's, the former vector is deemed less than the latter; if their first components coincide, then the second components are similarly compared, etc. Once a pair L′i ⊂ L″i is found, the rest of the components are ignored. If a pair of incomparable components is encountered, the vectors are considered incomparable. A solution of a language equation least or greatest with respect to this order will be called a sequentially least (greatest) solution; the system X = X, Y = ∼X above has the sequentially least solution (∅, Σ∗) and the sequentially greatest solution (Σ∗, ∅).

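As a small illustration of Definition 2, here is a sketch not taken from the paper, with components restricted to finite languages represented as Python sets so that the comparison is computable:

    def seq_compare(u, v):
        """Compare two equal-length vectors of languages under the sequential
        componentwise inclusion of Definition 2.
        Returns '<', '>', '=' or None (incomparable)."""
        for a, b in zip(u, v):
            if a == b:
                continue          # equal components: move on to the next position
            if a < b:             # proper inclusion at the first difference decides
                return '<'
            if a > b:
                return '>'
            return None           # incomparable components: incomparable vectors
        return '='

    # The "right" solution (emptyset, Sigma*) of the system X = X, Y = ~X is
    # sequentially below every other solution (L, ~L); restricted to the universe {eps, a}:
    print(seq_compare((set(), {"", "a"}), ({""}, {"a"})))   # '<'
    print(seq_compare(({""}, {"a"}), ({"a"}, {""})))        # None: incomparable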

Note that if a system has a least or a greatest solution according to Definition 1, then this solution is at the same time sequentially least or greatest.

This paper considers sequentially least and greatest solutions of language equations with all Boolean operations and concatenation, with constant languages {a} (a ∈ Σ) and {ε}. For convenience, instead of explicit systems (1), the following simpler form of language equations will be used:

    ϕ(X1, . . . , Xn) = ∅        (2)

A system (1) can be written in the form (2) as (X1 ∆ ϕ1(X1, . . . , Xn)) ∪ . . . ∪ (Xn ∆ ϕn(X1, . . . , Xn)) = ∅, where ∆ denotes the symmetric difference of sets. Similarly, an equation (2) can be trivially "resolved" by adding a variable T and using the equations Xi = Xi (1 ≤ i ≤ n) and T = ∼(T ∩ ϕ(X1, . . . , Xn)) [17], where the last equation effectively requires (2). Another simple fact worth noting is that, as long as complement can be freely specified, least and greatest solutions can be transformed into each other by complementing every variable:

Proposition 1 Let ϕ(X1, . . . , Xk) be an expression. Then:

1. (L1, . . . , Lk) is a solution of ϕ(X1, . . . , Xk) = ∅ if and only if (∼L1, . . . , ∼Lk) is a solution of ϕ(∼X1, . . . , ∼Xk) = ∅.

2. ϕ(X1, . . . , Xk) = ∅ has a unique (a least, a greatest, a sequentially least, a sequentially greatest) solution if and only if ϕ(∼X1, . . . , ∼Xk) = ∅ has a unique (a greatest, a least, a sequentially greatest, a sequentially least, respectively) solution.

3. (L1, . . . , Lk) is the unique (the least, the greatest, the sequentially least, the sequentially greatest) solution of ϕ(X1, . . . , Xk) = ∅ if and only if (∼L1, . . . , ∼Lk) is the unique (the greatest, the least, the sequentially greatest, the sequentially least, respectively) solution of ϕ(∼X1, . . . , ∼Xk) = ∅.

Indeed, the statement "(L1, . . . , Lk) is a solution of ϕ(X1, . . . , Xk) = ∅" means that ϕ(L1, . . . , Lk) = ∅, which is equivalent to ϕ(∼∼L1, . . . , ∼∼Lk) = ∅, i.e., to the vector (∼L1, . . . , ∼Lk) being a solution of ϕ(∼X1, . . . , ∼Xk) = ∅. The second and the third parts of Proposition 1 easily follow from the first part. By this duality, all the technical results of this paper can be established for least solutions only, and each claim will automatically extend to greatest solutions.

4. Components of solutions are arithmetical sets

Suppose a language equation ϕ(X1, . . . , Xn) = ∅ has a sequentially least or a sequentially greatest solution. To what classes of languages do the components of this solution belong? In this section it is shown that every i-th component of a sequentially least or a sequentially greatest solution of any equation is in the i-th level of the arithmetical hierarchy: in Σi or in Πi, respectively.

According to classical recursion theory [22], a language L is in Σk if it can be specified as {w | ∃x1 ∀x2 . . . Qk xk R(w, x1, . . . , xk)} for some recursive predicate R, where Qk = ∃ if k is odd and Qk = ∀ if k is even. Similarly, L is in Πk if its complement is in Σk, i.e., if it admits a representation {w | ∀x1 ∃x2 . . . Qk xk R(w, x1, . . . , xk)}. The following simple notion proved to be important in the study of language equations:

Definition 3 ([17]) Two languages L1, L2 ⊆ Σ∗ are said to be equal modulo a third language M ⊆ Σ∗ (denoted L1 = L2 (mod M)), if L1 ∩ M = L2 ∩ M. A vector (L1, . . . , Ln) is said to be a solution modulo M of a language equation ϕ(X1, . . . , Xn) = ∅, if ϕ(L1, . . . , Ln) = ∅ (mod M).

A solution in the usual sense is a solution modulo Σ∗. The language M is typically finite and substring-closed (i.e., for every w ∈ M, all substrings of w should be in M). A solution modulo a finite language is a vector of finite languages, hence it can be finitely represented, and, given a vector of finite languages and a language equation, the property of being a solution modulo a given finite M can be algorithmically checked. This makes the notion of equality modulo finite languages indispensable in the analysis of the computational properties of language equations [17]. Let us use this notion to formulate the following auxiliary result on sequentially least solutions:

Lemma 1 Let ϕ(X1, . . . , Xn) = ∅ be a language equation that has a sequentially least solution. Let (L1, . . . , Ln) be this solution. Fix a variable Xi (1 ≤ i ≤ n). Then for every finite substring-closed language M there exists a finite substring-closed language M′ ⊇ M, such that for every solution modulo M′ of the form (L1 ∩ M′, . . . , Li−1 ∩ M′, L′i, L′i+1, . . . , L′n) it holds that Li ∩ M ⊆ L′i ∩ M.

In order to explain this result, let us first consider a weaker statement: for every finite substring-closed language M and for every solution

    (L1, . . . , Li−1, L′i, L′i+1, . . . , L′n)        (3)


of the equation (where L1, . . . , Li−1 are taken from the known sequentially least solution, while L′i, L′i+1, . . . , L′n are arbitrary), it should hold that Li ∩ M ⊆ L′i ∩ M. That is easily seen to be true, because Li ⊆ L′i by the definition of the sequentially least solution. The lemma provides a stronger statement: the condition Li ∩ M ⊆ L′i ∩ M must hold not only for solutions, but already for solutions modulo some finite M′. Unlike solutions in the usual sense, solutions modulo a finite M′ have a finite representation, which allows one to reason about them using first-order logic and to manipulate them algorithmically. This will be used below to determine an unknown sequentially least solution by conducting an infinite search for an unknown finite M′.

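The following sketch shows why being a solution modulo a finite substring-closed M is algorithmically checkable, as claimed after Definition 3; the encoding of expressions as nested tuples is an assumption made only for this illustration. Since M is substring-closed, every operation, including concatenation and complement, can be evaluated exactly on the fragments Li ∩ M.

    def eval_mod(expr, assignment, M):
        """Return (value of expr) intersected with M, where expr is a nested tuple:
        ('var', i) | ('const', S) | ('compl', e) | ('union'|'inter'|'concat', e1, e2),
        and assignment[i] is the known fragment L_i intersected with M."""
        op = expr[0]
        if op == 'var':
            return assignment[expr[1]] & M
        if op == 'const':
            return expr[1] & M
        if op == 'compl':
            return M - eval_mod(expr[1], assignment, M)
        a = eval_mod(expr[1], assignment, M)
        b = eval_mod(expr[2], assignment, M)
        if op == 'union':
            return a | b
        if op == 'inter':
            return a & b
        if op == 'concat':        # substrings of w in M are again in M, so this is exact
            return {w for w in M
                    if any(w[:k] in a and w[k:] in b for k in range(len(w) + 1))}
        raise ValueError(op)

    def is_solution_mod(expr, assignment, M):
        """Is the given vector of fragments a solution of  expr = empty  modulo M?"""
        return not eval_mod(expr, assignment, M)

    # Example: X symmetric-difference ({eps} union {a}X) = empty, i.e. the resolved
    # equation X = {eps} union aX, checked modulo M = {eps, a, aa}.
    E = ('union', ('const', {""}), ('concat', ('const', {"a"}), ('var', 1)))
    phi = ('union', ('inter', ('var', 1), ('compl', E)),
                    ('inter', ('compl', ('var', 1)), E))
    M = {"", "a", "aa"}
    print(is_solution_mod(phi, {1: {"", "a", "aa"}}, M))      # True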

The proof of Lemma 1 given below relies upon the following useful property of language equations with Boolean operations and concatenation:

Lemma 2 ([17]) Let ϕ(X1, . . . , Xn) = ∅ be a language equation and let M be a finite substring-closed language. Then there exists a finite substring-closed language M′ ⊇ M, such that for every solution (L′1, . . . , L′n) modulo M′ there exists a solution in the normal sense (L1, . . . , Ln), such that L′i = Li (mod M) for all i.

Proof of Lemma 1. Define ψi(Xi, . . . , Xn) = ϕ(L1, . . . , Li−1, Xi, . . . , Xn) and consider the language equation ψi(Xi, . . . , Xn) = ∅. Note that a vector (L′i, . . . , L′n) is a solution of the new equation if and only if the vector (L1, . . . , Li−1, L′i, L′i+1, . . . , L′n) is a solution of the original equation. Let M′ be the language given by Lemma 2 for the equation ψi(Xi, . . . , Xn) = ∅ and for the given M. Suppose there exists a solution

    (L′i, L′i+1, . . . , L′n)        (4)


modulo M′, such that Li ∩ M ⊈ L′i ∩ M. Then Lemma 2 asserts that there exists a solution in the normal sense (L″i, L″i+1, . . . , L″n), such that L″i = L′i (mod M). Therefore, Li ∩ M ⊈ L″i ∩ M and consequently Li ⊈ L″i. On the other hand, by the construction of the equation ψi = ∅, (L1, . . . , Li−1, L″i, L″i+1, . . . , L″n) is a solution of ϕ = ∅, and its i-th component is not a superset of Li. Since (L1, . . . , Li−1, Li, Li+1, . . . , Ln) is the sequentially least solution of ϕ(X1, . . . , Xi, . . . , Xn) = ∅, this is a contradiction. □

Lemma 3 Let ϕ(X1, . . . , Xn) = ∅ be a language equation with a sequentially least solution (L1, . . . , Ln). Then each Li (1 ≤ i ≤ n) is recursively enumerable in {L1, . . . , Li−1}. In other words, there exists a Turing machine with i − 1 oracles for L1, . . . , Li−1, which recognizes the language Li.

Proof. Construct the following procedure that determines the membership of strings in Li:

    Input: w ∈ Σ∗.
    Let M be the set of all substrings of w.
    For all finite substring-closed languages M′ ⊇ M:
        For every (L′i, L′i+1, . . . , L′n), such that L′i, . . . , L′n ⊆ M′ and w ∉ L′i:
            If ϕ(L1, . . . , Li−1, L′i, L′i+1, . . . , L′n) = ∅ (mod M′), then
                proceed to the next M′.
        Halt and accept.

The outer loop considers all finite substring-closed languages (there are countably many of them) in any order. For every finite M′ there are finitely many different vectors (L′i, L′i+1, . . . , L′n) to consider. For each of these vectors the algorithm checks the equality of ϕ(L1, . . . , Li−1, L′i, L′i+1, . . . , L′n) to ∅ modulo M′. This can easily be done, provided that the languages L1, . . . , Li−1, L′i, . . . , L′n are known modulo M′. Of these, L′i, . . . , L′n have just been constructed and thus are known, while the required languages L1 ∩ M′, . . . , Li−1 ∩ M′ can be determined by querying each of the i − 1 oracles on every string of M′, using (i − 1) · |M′| queries in total. Thus each iteration of the outer loop terminates in a finite number of steps, while the whole computation terminates if a suitable M′ is found, and continues infinitely otherwise.
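The search performed by this procedure can be transcribed into Python as follows; this is an illustration only, reusing is_solution_mod from the earlier sketch, with the oracles modelled as membership functions, the variables numbered 1, . . . , n, and candidate_Ms standing for any enumeration of all finite substring-closed languages (for example, all substring-closed subsets of the strings of length at most m, for m = 0, 1, 2, . . .).

    from itertools import combinations, product

    def substrings(w):
        return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

    def all_subsets(universe):
        u = sorted(universe)
        return [set(c) for r in range(len(u) + 1) for c in combinations(u, r)]

    def accepts(phi, n, i, oracles, w, candidate_Ms):
        """Semi-decide 'w in L_i' as in the proof of Lemma 3; runs forever if w is not in L_i.
        oracles[j] is a membership test for L_j (j = 1, ..., i-1)."""
        M = substrings(w)
        for Mp in candidate_Ms:                      # outer loop over finite M'
            if not M <= Mp:
                continue                             # only M' containing M are of interest
            base = {j: {u for u in Mp if oracles[j](u)} for j in range(1, i)}
            refuted = False
            for tail in product(all_subsets(Mp), repeat=n - i + 1):
                if w in tail[0]:
                    continue                         # only vectors with w not in L'_i matter
                assignment = dict(base)
                for k, part in enumerate(tail):
                    assignment[i + k] = part
                if is_solution_mod(phi, assignment, Mp):
                    refuted = True                   # a solution modulo M' avoids w:
                    break                            # proceed to the next M'
            if not refuted:
                return True                          # halt and accept

This is only a direct transcription of the procedure; the enumeration of all subsets makes it exponential in |M′|, which is irrelevant here, since only computability is being established.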

By Lemma 1, for the chosen M there should exist a finite substring-closed language M′0 ⊇ M, such that every solution modulo M′0, taken modulo M, has its i-th component greater than or equal to Li ∩ M. Consider the following two alternatives:

• If w ∈ Li, where Li is the i-th component of the sequentially least solution of the equation, then for every solution (L1, . . . , Li−1, L′i, . . . , L′n) modulo M′0, w ∈ Li ∩ M ⊆ L′i ∩ M. When M′0 is eventually considered by the procedure (at the iteration M′ = M′0), the condition in the if statement is false for all solutions modulo M′, and thus the string is accepted.

• Let w ∉ Li. Since for every M′ the vector (L1 ∩ M′, . . . , Li−1 ∩ M′, Li ∩ M′, Li+1 ∩ M′, . . . , Ln ∩ M′) is among the solutions modulo M′ and w ∉ Li, this vector is considered in the inner loop and the condition in the if statement evaluates to true. Therefore, every time the procedure continues with the next M′, and thus never terminates.

Hence the procedure accepts every string from Li, and does not terminate on the strings not in Li. This is done using oracles for the languages L1, . . . , Li−1, so the language Li is recursively enumerable in {L1, . . . , Li−1}. □

Theorem 3 If an equation ϕ(X1, . . . , Xn) = ∅ has a sequentially least solution (L1, . . . , Ln), then each Lk is in Σk. If an equation has a sequentially greatest solution (L′1, . . . , L′n), then each L′k is in Πk.

Proof. It is sufficient to prove the first case. Induction on k.

Basis, k = 1. By Lemma 3, L1 is recursively enumerable (without any oracles), i.e., is in Σ1.

Induction step. Suppose that each Li (1 ≤ i < k) is in Σi. Then L1, . . . , Lk−1 are all in Σk−1. Since, according to Lemma 3, Lk is recursively enumerable in {L1, . . . , Lk−1}, Lk is recursively enumerable in some set from Σk−1 (the finitely many oracles can be replaced by a single oracle for their disjoint union, which is again in Σk−1), and hence Lk is in Σk [22]. □

5. Representation of arithmetical sets

Let us now use language equations to represent the sets from the arithmetical hierarchy. A language {w | ∃x1 ∀x2 . . . Qk xk R(w, x1, . . . , xk)} ∈ Σk will be specified almost exactly according to this definition, by starting from the base recursive set R and then implementing each of the k quantifiers. However, it requires a certain precision to fit all of this into exactly k variables (which, by the results of the previous section and by the strictness of the arithmetical hierarchy, is the best that can possibly be achieved). Let us first rewrite the condition on w as ∃x1 ¬∃x2 . . . ¬∃xk R(w, x1, . . . , xk), which leaves us with one type of quantifier and with negation, the latter directly expressible using complement. The construction proceeds inductively on k and actually starts from a recursively enumerable (or Σ1) set {w#x1# . . . #xk−2#xk−1 | ∃xk R(w, x1, . . . , xk)}. While no technique to specify an arbitrary recursively enumerable set using a one-variable language equation is known, there is a method to specify a language closely related to a given r.e. language using, indeed, just one variable.


Theorem 4 ([21]) For every recursively enumerable language L ⊆ Σ∗ there exists a language equation ξ(X) = ∅ over an alphabet Σ′ ⊃ Σ, such that its least solution is L† ∪ L′ for some L′ ⊆ (Σ′ \ {†})∗. Given a Turing machine for L, ξ can be effectively constructed.

The complete construction, accompanied by a proof of its correctness, can be found in the cited paper. Here it is used as the basis of induction for the representation of the upper levels of the arithmetical hierarchy. Let us explain its essential idea.

It is well known that for every Turing machine T its computations can be encoded as strings, so that the language of all valid accepting computations of T, defined as

    VALC(T) = {w#CT(w) | T accepts w}        (5)

(where CT(w) is a suitable encoding of the computation of T on w), is an intersection of two linear context-free languages [2, 9]. Consequently, it can be described using a system of language equations with union, intersection and linear concatenation. Using language equations further equipped with complement, it is possible to extract the language recognized by T out of the language of its computations, basically by expressing the condition that every string w#CT(w) ∈ VALC(T) should be in X#Σ∗, which effectively means w ∈ X, thus making X = L(T) the least solution of the equation [17]. About half a dozen variables are needed to implement these ideas directly [17]: consider the nonterminals of the linear context-free grammars becoming variables. This is where trellis automata come forth: the very same language VALC(T) is represented using a trellis automaton M, which is then simulated by a one-variable language equation. Afterwards L(T) is carefully extracted from this encoding of VALC(T), with both languages sharing the same variable.

The construction used in the proof of Theorem 4 thus goes as follows:

i. Two linear context-free grammars, G1 and G2, such that VALC(T) = L(G1) ∩ L(G2), are constructed using the well-known method [9, 2].

ii. These two grammars are converted into two trellis automata, and the closure properties of the latter [5] allow us to construct a single trellis automaton M, such that L(M) = VALC(T).

iii. According to Theorem 2, this trellis automaton M is converted into a single-variable language equation X = ϕ(X) with the unique solution #L(M) ∪ L′M.

iv. An expression ψ(Y, Z), such that ψ(#L(M) ∪ L′M, Z) = ∅ if and only if L(T)† ⊆ Z, is constructed (see also Lemma 4 below). This implements the aforementioned method of extracting L(T) out of VALC(T).

v. The expressions ϕ and ψ are combined to produce another expression ξ(X), such that the language equation ξ(X) = ∅ has the unique solution L(T)† ∪ #VALC(T) ∪ L′M.

The subsets #VALC(T) and L′M of this solution are two layers of garbage left by the chain of simulations. Each of the two steps (from L′M to L(M) = VALC(T) and from VALC(T) to L(T)) produces a separate layer, containing the encoded computations of M and the encoded computations of T, respectively. Their union #VALC(T) ∪ L′M is the language L′ from the statement of the theorem. While L(T)† ∪ L′ is not precisely L(T), the first quantifier will carry away the garbage, so that for every k ≥ 2 the k-th variable can define any language from Σk exactly.

Let us show how existential quantification can be implemented by language equations and the notion of their least solution.

Lemma 4 Let Σ be an alphabet, let # ∉ Σ. Let L ⊆ Σ∗#Σ∗. Define the function of two variables ψ(Y, Z) = Y \ Z#Σ∗. Then the least value of Z, such that ψ(L, Z) = ∅, is L′ = {w ∈ Σ∗ | ∃x ∈ Σ∗ : w#x ∈ L}.

Proof. It is easy to see that ψ(L, L′) = ∅: indeed, L ⊆ {w ∈ Σ∗ | ∃x ∈ Σ∗ : w#x ∈ L}#Σ∗, because for every w#x ∈ L, w ∈ L′ by the definition of L′, and hence w#x ∈ L′#Σ∗. Consider any L″, such that ψ(L, L″) = ∅. Then L ⊆ L″#Σ∗, and hence for every string w#x ∈ L the string w has to be in L″. This effectively means that L′ ⊆ L″, showing that Z = L′ is the least value, such that ψ(L, Z) = ∅. □

Theorem 5 Let k ≥ 2. Then for every L ∈ Σk there exists a language equation ϕ(X1, . . . , Xk) = ∅, such that the last component of its sequentially least solution is L. Given a representation of L by a quantified recursive predicate, this language equation can be effectively constructed.

Proof. Induction on k.

Basis, k = 2. Let L be in Σ2. Then it can be obtained from a co-RE set using existential quantification: L = {w ∈ Σ∗ | ∃x ∈ Σ∗ : w#x ∈ ∼L0}, where L0 is recursively enumerable. By Theorem 4, there exists a language equation ξ(X) = ∅, such that its least solution is X = L0† ∪ L′. Define ψ(Y, Z) = Y \ Z#(Σ ∪ {†})∗. According to Lemma 4, the least Z, such that ψ((∼L0 ∩ Σ∗#Σ∗)†, Z) = ∅, is precisely the given language L ∈ Σ2. Hence the language equation ξ(X1) ∪ ψ(∼X1 ∩ Σ∗#Σ∗†, X2) = ∅ has the sequentially least solution (L0† ∪ L′, L).

Induction step k → k + 1. If L is in Σk+1 for some k ≥ 2, it can be obtained from a Πk set (i.e., from the complement of a Σk set) using existential quantification: L = {w ∈ Σ∗ | ∃x ∈ Σ∗ : w#x ∈ ∼L′}, where L′ ⊆ Σ∗#Σ∗ is in Σk. By the induction hypothesis, there exists a language equation ϕ(X1, . . . , Xk) = ∅ with a sequentially least solution (L1, . . . , Lk), such that Lk = L′. Define the function of two variables ψ(Y, Z) = Y \ Z#Σ∗. Again, Lemma 4 asserts that L is the least value of Z, such that ψ(∼L′ ∩ Σ∗#Σ∗, Z) = ∅. Hence the equation ϕ(X1, . . . , Xk) ∪ ψ(∼Xk ∩ Σ∗#Σ∗, Xk+1) = ∅ has the sequentially least solution (L1, . . . , Lk, L). □

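The way Lemma 4 implements existential quantification can be illustrated on a small finite example (the example itself is an assumption made for this illustration; ψ(Y, Z) = Y \ Z#Σ∗ is evaluated directly on finite sets, with # written as a literal marker character):

    def psi_is_empty(Y, Z):
        """psi(Y, Z) = Y \\ Z#Sigma* is empty iff every string w#x in Y has w in Z."""
        return all(y.split("#", 1)[0] in Z for y in Y)

    def projection(Y):
        """The least Z with psi(Y, Z) = empty, namely {w | exists x: w#x in Y} (Lemma 4)."""
        return {y.split("#", 1)[0] for y in Y}

    L = {"a#ba", "a#b", "bb#a"}        # a finite subset of Sigma*#Sigma* for the demonstration
    Z0 = projection(L)
    print(Z0)                          # {'a', 'bb'}
    print(psi_is_empty(L, Z0))         # True:  Z0 solves psi(L, Z) = empty
    print(psi_is_empty(L, {"a"}))      # False: removing 'bb' breaks the inclusion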

6. Characterizations

Now the simulations of language equations by recursive predicates with quantifier prefixes (Theorem 3) and vice versa (Theorem 5) can be combined to obtain the main result of this paper.

Theorem 6 (Characterization of the arithmetical hierarchy) For every k ≥ 2, the class of languages representable as components of sequentially least (sequentially greatest) solutions of k-variable language equations is exactly Σk (Πk, respectively).

Proof. According to Proposition 1, it is sufficient to prove the case of sequentially least solutions only. Let ϕ(X1, . . . , Xk) = ∅ (k ≥ 1) be an arbitrary language equation that has a sequentially least solution, and let (L1, . . . , Lk) be this solution. By Theorem 3, every Li is in Σi and hence all of them are in Σk. Conversely, for every set L in Σk (k ≥ 2), by Theorem 5, there is a language equation ϕ(X1, . . . , Xk) = ∅, such that the last component of its sequentially least solution is L. □

Lifting the restriction on the number of variables, we obtain the equivalence of language equations with an unbounded number of variables to the infinite union of the arithmetical hierarchy. This infinite union is the set of all sets expressible in formal Peano arithmetic.

Corollary 1 (Characterization of first-order arithmetic) A language L can be represented as a component of a sequentially least (or sequentially greatest) solution of a language equation if and only if L is definable in first-order arithmetic.

Now consider the problem of determining whether a given language equation has a sequentially least or a sequentially greatest solution. This property is nontrivial, which is demonstrated by the one-variable equation X = (X ∩ ε) ∪ a(∼X ∩ ε), which has two incomparable solutions, {ε} and {a}. It is known that the property of having a unique solution can be expressed by a first-order formula with two quantifiers [17], and the corresponding decision problem is in fact Π2-complete. A similar result for the property of having a componentwise least or greatest solution has also been obtained [17]. However, the case of sequentially least and greatest solutions turns out to be much more difficult.

Theorem 7 The set of language equations of the form ϕ(X1, . . . , Xn) = ∅ (n ≥ 1) that have a sequentially least (sequentially greatest) solution is not an arithmetical set, i.e., it cannot be specified by a first-order formula of elementary number theory.

Proof. It suffices to prove the case of a sequentially least solution; the case of a sequentially greatest solution then follows by Proposition 1. Let Lseq.ℓ be the language of string representations of those and only those language equations that have a sequentially least solution. Let us demonstrate that every arithmetical set is one-to-one reducible to Lseq.ℓ.

Since every arithmetical set is in Σk for some k ≥ 2 [22], by Theorem 5, there exists an equation

    ϕ(X1, . . . , Xk−1, Xk) = ∅        (6)

with a sequentially least solution (L1, . . . , Lk−1, Lk), where Lk is the set in question. Let us augment this equation with a new variable Y and with the following equation:

    Y = (Y ∩ ∼Xk ∩ w) ∪ a(∼Y ∩ w)        (7)



Formally, (6) and (7) can be equivalently written as a single equation

    ϕ(X1, . . . , Xk−1, Xk) ∪ (Y ∆ ((Y ∩ ∼Xk ∩ w) ∪ a(∼Y ∩ w))) = ∅,        (8)

where the left-hand side of (8) is denoted by ψ(X1, . . . , Xk, Y).

The candidate for being the sequentially least solution of this equation must be of the form (L1, . . . , Lk, L̃) for some L̃ ⊆ Σ∗. Consider two possible cases:

• If w ∈ Lk, then ∼Xk ∩ w equals ∅, and thus (7) becomes equivalent to Y = a(∼Y ∩ w). The string w cannot be in Y, hence w ∈ ∼Y and ∼Y ∩ w = {w}. Therefore, Y has to be {aw}, and (L1, . . . , Lk, {aw}) is the sequentially least solution of the system.

• If w ∉ Lk, then ∼Xk ∩ w is {w}, and the equation (7) is equivalent to Y = (Y ∩ w) ∪ a(∼Y ∩ w). Now, if w ∈ Y, then w ∉ ∼Y, aw ∉ a(∼Y ∩ w) and accordingly aw ∉ Y. On the other hand, if w ∉ Y, then w ∈ ∼Y, aw ∈ a(∼Y ∩ w) and therefore aw ∈ Y. This means that there cannot be a least Y, because (L1, . . . , Lk, {w}) and (L1, . . . , Lk, {aw}) are both solutions of the equation, while (L1, . . . , Lk, ∅) is not. Hence there is no sequentially least solution.

Therefore, the equation (8) has a sequentially least solution if and only if w ∈ Lk, which completes the reduction of an arbitrary arithmetical set Lk to Lseq.ℓ.

Now suppose that Lseq.ℓ is arithmetical itself, i.e., is in Σk for some k ≥ 1. Then every arithmetical set is reducible to a set in Σk, and hence is in Σk itself. The arithmetical hierarchy is thus supposed to collapse, which is a contradiction, since it is known to be proper [22]. □

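The two cases of this reduction can be checked concretely by brute force; the following sketch is an illustration only, assuming Σ = {a, b} and w = b, and using the fact that every solution of (7) is contained in {w, aw}, so that it suffices to examine the four subsets of that set:

    def rhs(Y, w_in_Lk, w="b", a="a"):
        """Right-hand side of equation (7), (Y ∩ ~X_k ∩ {w}) ∪ a(~Y ∩ {w}),
        evaluated exactly: only the strings w and aw can ever belong to it."""
        out = set()
        if w in Y and not w_in_Lk:     # w is in Y ∩ ~X_k ∩ {w}
            out.add(w)
        if w not in Y:                 # w is in ~Y, hence aw is in a(~Y ∩ {w})
            out.add(a + w)
        return out

    for w_in_Lk in (True, False):
        sols = [Y for Y in [set(), {"b"}, {"ab"}, {"b", "ab"}]
                if rhs(Y, w_in_Lk) == Y]
        print("w in L_k:" if w_in_Lk else "w not in L_k:", sols)
    # w in L_k:      [{'ab'}]          -> Y is forced to {aw}, a least Y exists
    # w not in L_k:  [{'b'}, {'ab'}]   -> two incomparable values of Y, no least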

7. Conclusion

It has been shown that solutions of language equations with all Boolean operations and concatenation, least or greatest with respect to a certain quite natural partial order, define precisely the sets representable in first-order arithmetic. The number of variables in these equations was found to correspond to the level in the arithmetical hierarchy. This result complements the previously known facts on the representation of different language families by language equations.

These families of languages are shown in Figure 2, where arrows indicate inclusion. The bottom level of the hierarchy is formed by the regular languages, which admit numerous representations by language equations [24, 25]. Next come the classical language equations that specify the linear context-free and the context-free languages [1, 6]. Going up in the hierarchy, one finds the language equations equivalent to trellis automata (or linear conjunctive grammars) [18], the language equations equivalent to conjunctive grammars [16], and the language equations used in the formal definition of Boolean grammars [19]; these three families are contained in the cubic-time deterministic context-sensitive languages [10, 15, 19], and already trellis automata can recognize some P-complete sets [10].

Figure 2: Classes of languages characterized by language equations.

Then a wide gap in the known hierarchy is encountered: from a proper subset of P to the whole family of recursive languages, which is characterized by unique solutions of language equations with all Boolean operations and concatenation [17]. Componentwise least and greatest solutions of these equations specify the recursively enumerable and the co-recursively enumerable sets. Finally, as has just been proved, sequentially least and sequentially greatest solutions of equations with different numbers of variables characterize all levels of the arithmetical hierarchy starting from the second, while the whole class of arithmetical sets is represented by equations with an unbounded number of variables.

The new result suggests a new perspective on language equations: equations such as those studied in this paper could be viewed as an applied logic. Indeed, first of all, there is an evident similarity in the expressive means: set-theoretic operations act as propositional connectives, while certain techniques described in this paper allow one to simulate quantifiers. Second, the fact that a naturally defined class of language equations is equal in power to first-order Peano arithmetic speaks for itself. Perhaps a logic-oriented approach to the further study of language equations could eventually lead to some beautiful results.

Certainly there remain many undiscovered types of language equations, and some of them would likely give further characterizations of different computational models, such as the models studied in complexity theory, which have already been successfully characterized by various restricted logics [12, 23]. Finding classes of language equations to fill the gap between Boolean grammars [19] and the language equations that specify the recursive languages [17] can be proposed as a worthy research problem. Indeed, most of the interesting families of languages lie there (for instance, can one characterize the polynomial hierarchy in the same way as the arithmetical hierarchy has been characterized in this paper?), and the corresponding algebraic methods of language specification await their discovery.

References

1. J. Autebert, J. Berstel, L. Boasson, "Context-Free Languages and Pushdown Automata", in: Rozenberg, Salomaa (Eds.), Handbook of Formal Languages, Vol. 1, Springer-Verlag, Berlin, 1997, 111–174.
2. B. S. Baker, R. V. Book, "Reversal-bounded multipushdown machines", Journal of Computer and System Sciences, 8 (1974), 315–332.


3. N. Chomsky, M. P. Schützenberger, "The algebraic theory of context-free languages", in: Braffort, Hirschberg (Eds.), Computer Programming and Formal Systems, North-Holland, Amsterdam, 1963, 118–161.
4. J. H. Conway, Regular Algebra and Finite Machines, Chapman and Hall, London, 1971.
5. K. Culik II, J. Gruska, A. Salomaa, "Systolic trellis automata", I and II, International Journal of Computer Mathematics, 15 (1984), 195–212, and 16 (1984), 3–22.
6. S. Ginsburg, H. G. Rice, "Two families of languages related to ALGOL", Journal of the ACM, 9 (1962), 350–371.
7. J. Gruska, "On a classification of context-free grammars", Kybernetika, 3 (1967), 22–29.
8. J. Gruska, "Some classifications of context-free languages", Information and Control, 14 (1969), 152–179.
9. J. Hartmanis, "Context-free languages and Turing machine computations", Proceedings of Symposia in Applied Mathematics, Vol. 19, AMS, 1967, 42–51.
10. O. H. Ibarra, S. M. Kim, "Characterizations and computational complexity of systolic trellis automata", Theoretical Computer Science, 29 (1984), 123–153.
11. N. Immerman, "DSPACE[n^k] = VAR[k + 1]", Sixth IEEE Symposium on Structure in Complexity Theory (July 1991), 334–340.
12. N. Immerman, Descriptive Complexity, Springer-Verlag, New York, 1998.
13. N. Immerman, J. F. Buss, D. A. M. Barrington, "Number of variables is equivalent to space", Journal of Symbolic Logic, 66 (2001), 1217–1230.
14. E. L. Leiss, Language Equations, Springer-Verlag, New York, 1999.
15. A. Okhotin, "Conjunctive grammars", Journal of Automata, Languages and Combinatorics, 6:4 (2001), 519–535.
16. A. Okhotin, "Conjunctive grammars and systems of language equations", Programming and Computer Software, 28 (2002), 243–249.
17. A. Okhotin, "Decision problems for language equations with Boolean operations", Automata, Languages and Programming (ICALP 2003, Eindhoven, The Netherlands, June 30–July 4, 2003), LNCS 2719, 239–251; journal version submitted.
18. A. Okhotin, "On the equivalence of linear conjunctive grammars to trellis automata", RAIRO Informatique Théorique et Applications, 38 (2004), 69–88.
19. A. Okhotin, "Boolean grammars", Information and Computation, 194 (2004), 19–48.
20. A. Okhotin, "On the number of nonterminals in linear conjunctive grammars", Theoretical Computer Science, 320:2–3 (2004), 419–448.
21. A. Okhotin, "On computational universality in language equations", Machines, Computations and Universality (Proceedings of MCU 2004, Saint-Petersburg, Russia, September 21–24, 2004), LNCS 3354, 292–303.
22. H. Rogers, Jr., Theory of Recursive Functions and Effective Computability, McGraw-Hill, 1967.
23. W. C. Rounds, "LFP: a logic for linguistic descriptions and an analysis of its complexity", Computational Linguistics, 14:4 (1988), 1–9.
24. A. Salomaa, Theory of Automata, Pergamon Press, Oxford, 1969.
25. S. Yu, "Regular Languages", in: Rozenberg, Salomaa (Eds.), Handbook of Formal Languages, Vol. 1, Springer-Verlag, Berlin, 1997, 41–110.
