Infinite sets that admit fast exhaustive search
Martín Escardó
School of Computer Science, University of Birmingham, UK
Version of January 27, 2008
Abstract. Perhaps surprisingly, there are infinite sets that admit mechanical exhaustive search in finite time. An old example is the Cantor space of infinite sequences of binary digits. We investigate three related questions, in the realm of higher-type computation: (i) What kinds of infinite sets are exhaustible? It turns out that they have to be compact in the topological sense. A complete description is obtained by developing a computational version of an Arzela–Ascoli type characterization of compact subsets of function spaces. Another, less explicit, one is perhaps more appealing: a non-empty set is exhaustible if and only if it is a computable image of the Cantor space. (ii) How do we systematically build infinite exhaustible sets? Here some well-known topological closure properties of compact sets are shown to hold for exhaustible sets. This includes a computational version of the Tychonoff theorem, asserting that exhaustible sets are closed under countable products. (iii) How fast can exhaustive search over infinite sets be performed? Although exhaustive search over infinite sets is of course intractable, as is the case for finite sets, algorithms that are fast in surprising classes of instances are exhibited. We formulate time-complexity conjectures, which are backed by experiments, where the size of the input predicate for the search algorithm is taken as its modulus of uniform continuity.

Keywords. Higher-type computability and complexity, Kleene–Kreisel functional, PCF, domain theory, programming-language semantics, topology, k-space, compactly generated space, Haskell, functional programming.

Mathematics Subject Classification 2000. 03D65, 68Q55, 06B35, 54D50.
1 Introduction
A wealth of computational problems of interest have the following form: given a set K and a property p of elements of K, check whether or not all elements of K satisfy p. For K fixed in advance, this is known as the emptiness problem for p. One is often interested in suitable restrictions on the possible syntactical forms of the predicate p that guarantee that the emptiness problem is decidable (or, less ambitiously, that the non-emptiness problem is semi-decidable) uniformly in the syntactical form of p. In this work, on the other hand, the emphasis is on the set K rather than the predicate p, and, moreover, we don’t look at the syntactic structure of p: we consider all decidable predicates on K, in the sense of computability theory, given as black boxes.

Exhaustively searchable sets. We say that the set K is exhaustible if the above problem can be algorithmically solved in finite time, for any decidable property p, uniformly in p. Thus, the input of the decision algorithm is a black box for p and the output is the truth value of the statement that all elements of K satisfy p. In the realm of higher-type computability theory [23], the algorithm has type (C → B) → B, where C is a type, K ⊆ C, and B is the
type of booleans, so that (C → B) is the type of decidable predicates on C. The question investigated in this work is what sets, if any, are exhaustible in the realm of higher-type computability theory [19]. Clearly, finite sets of computable elements are exhaustible. What may be rather unclear is whether there are infinite examples. Intuitively, there can be none: how could one possibly check infinitely many cases in finite time? This intuition is correct when K is a set of natural numbers: it is a theorem that, in this case, K is exhaustible if and only if it is finite. This can be proved by reduction to the halting problem, but there is also a purely topological argument (Remark 3.3.5). However, it turns out that there is a rich supply of infinite exhaustible sets. A first example, the Cantor space of infinite sequences of binary digits, goes back to the 1950’s, or even earlier, with the work of Brouwer, as discussed in the related-work paragraph below. The infinite case. Our primary contribution is a comprehensive investigation of infinite exhaustible sets in the realm of higher-type computability theory (with classical logic as our meta-theory). We develop tools for systematically building them and some characterizations, including the following: they are closed under intersections with decidable sets, under the formation of computable images and of finite and countably infinite products, and in the non-empty case they are precisely the computable images of the Cantor space. An Arzela–Ascoli type characterization is formulated and proved in Section 3. If a problem of the above form has a negative solution, one would like to be able to algorithmically find a counter-example. If this is possible, we say that the set K is searchable. It turns out that exhaustibility coincides with searchability, which supports the intuitive understanding of exhaustive search, but involves an elaborate construction. 
The closure properties and characterizations of exhaustibility resemble those of compactness in topology. This is no accident: exhaustible sets are to compact sets as computable functions are to continuous maps. This plays a crucial role in the correctness proofs of some of the algorithms, and, indeed, in their very construction. Thus, the specifications of all of our algorithms can be understood without much background, but an understanding of the working of some of them requires some amount of topology. We have organized the construction of these algorithms in two parts: Section 2 addresses those that are motivated by topology but don’t rely on knowledge of topology for their formulation or correctness proofs, and Section 3 collects those that crucially depend on topological considerations for their proofs, together with the necessary topological background.

Complexity considerations. Our secondary contribution is a preliminary investigation of the run-time behaviour of some of our algorithms and of the complexity of the search problem in the infinite case. Although we don’t have conclusive results in this direction, we have some surprising experimental results and tentative theoretical explanations and conjectures (Section 4). These experiments are implemented in a fragment of the higher-type functional programming language Haskell [13], which is essentially the same as PCF (simply-typed lambda-calculus with arithmetic and fixed-point recursion) [29, 25].

Related work. Brouwer’s Fan functional gives the modulus of uniform continuity of a discrete-valued continuous functional on the Cantor space. According to personal communication by Normann, computability of the Fan functional was known in the late 1950s. This immediately gives rise to the exhaustibility of the Cantor space. A number of authors have considered definability of the Fan functional in various formal systems.
Normann [23] cites Tait (1958, unpublished), Gandy (around 1982, unpublished) and Berger [5] (1990). Tait showed that the Fan functional is not definable from Kleene’s schemes S1–S9 interpreted over total functionals. Berger showed that it is PCF definable, and, in order to do that, he first explicitly defined a search functional for the Cantor space. Berger observed
that, for partial functionals, PCF definability coincides with S1–S9 definability. Then Hyland informed the community that Gandy was aware of the S1–S9 definability of the Fan functional for the partial interpretation of Kleene’s schemes, although Gandy’s definition seems to be lost.

Totality assumptions. Some of the above results crucially rely on a notion of totality. For example, to show that exhaustible sets are searchable, we need to assume that they consist of total elements. But there are two contenders for a notion of totality in higher-type computation, namely Kleene–Kreisel totality and hereditarily effective totality. Our results hold for the former but fail for the latter. This failure is to be expected: it is well known that, for the hereditarily effective notion, there is no total Fan functional [4], and hence the set of total elements of the Cantor type cannot be exhaustible. Put another way, the above algorithms for the Fan functional are total in the Kleene–Kreisel sense, but not in the hereditarily effective sense. In the language of Plotkin [27], we work with PCF-definable functionals under the semantic notion of totality.

Acknowledgements. I have benefited from discussions with Andrej Bauer, Ulrich Berger, Dan Ghica, Achim Jung, John Longley, Matthias Schröder, and Alex Simpson. I also thank Dag Normann for having answered questions regarding the history and technical ramifications of the subject, and for sending me a copy of Tait’s unpublished manuscript, but the reader should consult his paper [23] for a more accurate and detailed account of the development of the subject.
2 Topologically inspired algorithms
In this section we develop algorithms that don’t require knowledge of topology but are motivated by topological considerations. The intuition behind the topological notion of compactness is that compact sets behave, in many relevant respects, as if they were finite. Infinite sets that admit exhaustive search in finite time share the same intuition. Hence it is natural to conjecture that they also share similar structural properties. For example, compact sets are closed under the formation of products (Tychonoff theorem). Motivated by this, in this section we show that exhaustible sets are closed under countable products, and we also export other closure properties from topology to computation. After introducing the relevant computational background, we define the two main notions investigated in this work, namely those of exhaustibility and searchability, and then develop algorithms that implement the closure properties.
2.1 Background on higher-type computation
As discussed in e.g. [23, 16, 17], there are many equivalent approaches to higher-type computation. Kleene defined the total functionals directly, but it has been found more convenient to work with the larger collection of partial functionals and isolate the total ones within them, as done by Kreisel. The approaches are equivalent, and such total functionals are often referred to as Kleene–Kreisel functionals. It turns out that, as discussed by Normann [23], this coincides with another approach known in the computer-science community: equivalence classes of total functionals on Scott domains. In this section we work with total functionals on Scott domains, and in Section 3 we work with a characterization of Kleene–Kreisel functionals, due to Hyland, in terms of compactly generated topological spaces.

Types. The simple types are defined by induction as
\[ \sigma, \tau ::= o \mid \iota \mid \sigma \times \tau \mid \sigma \to \tau, \]
with the usual rules for bracketing, where $o$ and $\iota$ are ground types for booleans and natural numbers respectively. The subset of pure types is defined by
\[ \sigma ::= \iota \mid \sigma \to \iota. \]
As usual, we’ll occasionally reduce statements about simple types to statements about pure types.

Partial functionals. For each type $\sigma$, define a Scott domain $D_\sigma$ of partial functionals of type $\sigma$ by induction as follows:
\[ D_o = \mathcal{B} = \mathbb{B}_\bot, \qquad D_\iota = \mathcal{N} = \mathbb{N}_\bot, \qquad D_{\sigma \times \tau} = D_\sigma \times D_\tau, \qquad D_{\sigma \to \tau} = (D_\sigma \to D_\tau) = D_\tau^{D_\sigma}, \]
where $\mathbb{B} = \{\mathrm{ff}, \mathrm{tt}\}$ is the set of booleans and the products and exponentials are calculated in the cartesian closed category of continuous maps of Scott domains, where a Scott domain is an algebraic, bounded complete, and directed complete poset [1].

Total functionals. For each type $\sigma$, define a set $T_\sigma \subseteq D_\sigma$ of total functionals and a relation $\sim_\sigma$ on $D_\sigma$ as follows, where $\gamma$ ranges over the ground types $o$ and $\iota$:
\[ T_o = \mathbb{B}, \qquad T_\iota = \mathbb{N}, \qquad x \sim_\gamma y \iff x, y \in T_\gamma \text{ and } x = y, \]
\[ T_{\sigma \times \tau} = T_\sigma \times T_\tau, \qquad (x, x') \sim_{\sigma \times \tau} (y, y') \iff x \sim_\sigma y \wedge x' \sim_\tau y', \]
\[ T_{\sigma \to \tau} = \{ f \in D_{\sigma \to \tau} \mid f(T_\sigma) \subseteq T_\tau \}, \qquad f \sim_{\sigma \to \tau} g \iff \forall x \sim_\sigma y.\ f(x) \sim_\tau g(y). \]
Then the set $T_\sigma$ can be recovered from the relation $\sim_\sigma$ as
\[ x \in T_\sigma \iff x \sim_\sigma x, \]
and the relation can be recovered from the set as
\[ x \sim_\sigma y \iff x \sqcap y \in T_\sigma \iff x, y \in T_\sigma \text{ and } x \text{ and } y \text{ are bounded above.} \]
See e.g. [6] and [27]. In particular, $\sim_\sigma$ is an equivalence relation on $T_\sigma$.

Computability. A partial functional is computable iff it is PCF-definable from parallel-or and parallel-exists [25]. This is a theorem, but we take it as our definition. All computable functionals we construct are defined in PCF without parallel extensions. This definition includes, in particular, total functionals. An interesting fact, which we don’t use, is that every total functional definable in PCF with parallel extensions is equivalent to one definable in PCF without parallel extensions [20].
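The grammar of simple types and the subset of pure types from this section can be written down as a small Haskell data type. This is only an illustrative sketch: the names `Ty`, `O`, `I`, `:*:`, `:->:` and `isPure` are hypothetical and do not come from the paper.

```haskell
-- Simple types: sigma, tau ::= o | iota | sigma x tau | sigma -> tau.
-- All names below are hypothetical, chosen for illustration.
infixr 5 :->:
infixl 6 :*:

data Ty
  = O            -- ground type of booleans
  | I            -- ground type of natural numbers
  | Ty :*: Ty    -- product type
  | Ty :->: Ty   -- function type
  deriving (Eq, Show)

-- Pure types: sigma ::= iota | sigma -> iota.
isPure :: Ty -> Bool
isPure I          = True
isPure (s :->: I) = isPure s
isPure _          = False
```

For instance, `(I :->: I) :->: I` is pure, whereas `O :->: I` and `I :*: I` are not.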
2.2 Exhaustible and searchable sets
We now formulate the central notions investigated in this work. Throughout this section, $D = D_\sigma$, $T = T_\sigma$ and $D' = D_{\sigma'}$, $T' = T_{\sigma'}$ for arbitrary simple types $\sigma$ and $\sigma'$.

Definition 2.2.1. If $K$ is a subset of $D$, we say that a predicate $p \in (D \to \mathcal{B})$ is defined on $K$, or total on $K$, if $p(x) \neq \bot$ for every $x \in K$.

Definition 2.2.2. We say that a set $K \subseteq D$ is exhaustible if there is a computable functional $\forall_K : (D \to \mathcal{B}) \to \mathcal{B}$ such that for any $p \in (D \to \mathcal{B})$ defined on $K$,
\[ \forall_K(p) = \begin{cases} \mathrm{tt} & \text{if } p(x) = \mathrm{tt} \text{ for all } x \in K, \\ \mathrm{ff} & \text{if } p(x) = \mathrm{ff} \text{ for some } x \in K. \end{cases} \]
Such a functional is not uniquely determined, because its behaviour is not specified for predicates $p$ that are not defined on $K$. For the sake of clarity, we’ll often write “$\forall_K(\lambda x.\ \dots)$” as “$\forall x \in K.\ \dots$”. Clearly, it is equivalent to instead require the existence of a computable functional $\exists_K : (D \to \mathcal{B}) \to \mathcal{B}$ such that for any $p \in (D \to \mathcal{B})$ defined on $K$,
\[ \exists_K(p) = \begin{cases} \mathrm{tt} & \text{if } p(x) = \mathrm{tt} \text{ for some } x \in K, \\ \mathrm{ff} & \text{if } p(x) = \mathrm{ff} \text{ for all } x \in K, \end{cases} \]
because such functionals are inter-definable by the De Morgan Laws, and hence we’ll freely switch between them.

Definition 2.2.3. We say that a set $K \subseteq D$ is searchable if there is a computable functional $\varepsilon_K : (D \to \mathcal{B}) \to D$ such that, for every predicate $p \in (D \to \mathcal{B})$ defined on $K$,
1. $\varepsilon_K(p) \in K$, and
2. $p(\varepsilon_K(p)) = \mathrm{tt}$ if $p(x) = \mathrm{tt}$ for some $x \in K$.
Again, notice that $\varepsilon_K$ is not uniquely determined by $K$. Thus, $\varepsilon_K(p)$ is an example of an element of $K$ for which $p$ holds, if such an element exists, or a counter-example in $K$ if no such example exists.

Lemma 2.2.4. Searchable sets are exhaustible.
Proof. Define $\exists_K(p) = p(\varepsilon_K(p))$.

With $1 = \{\star\}$, an equivalent definition of searchability, which will not be used, is that
1. $K$ has a computable element $e_K$, and
2. there is $\varepsilon'_K : (D \to \mathcal{B}) \to 1 + D$ computable such that $\varepsilon'_K(p) = \star$ if there is no example, and otherwise $\varepsilon'_K(p) \in K$ and $p(\varepsilon'_K(p)) = \mathrm{tt}$.
In fact, given $\varepsilon_K$ one can define $e_K = \varepsilon_K(\lambda x.\mathrm{tt})$ and $\varepsilon'_K(p) = \text{if } p(\varepsilon_K(p)) \text{ then } \varepsilon_K(p) \text{ else } \star$. Conversely, given $\varepsilon'_K$ and $e_K$ as specified, one can define $\varepsilon_K(p) = \text{if } \varepsilon'_K(p) = \star \text{ then } e_K \text{ else } \varepsilon'_K(p)$.

The empty set is exhaustible with realizer $\forall_\emptyset(p) = \mathrm{tt}$, but it is not searchable because the condition $\varepsilon_\emptyset(p) \in \emptyset$ cannot hold. But we’ll see in Section 3.4 that, for non-empty entire sets, defined below, the two notions turn out to agree.
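The passage from search to quantification in Lemma 2.2.4, and the De Morgan duality between the two quantifiers, can be illustrated with a minimal Haskell sketch. It assumes total predicates valued in Haskell's `Bool`. The names `Searcher`, `searchFinite`, `forsome` and `forevery` are hypothetical, and the finite-list searcher is only there to have a concrete set to quantify over.

```haskell
-- A search operator for a set K: returns an element of K satisfying p
-- whenever some element of K does (hypothetical name).
type Searcher a = (a -> Bool) -> a

-- Illustrative searcher for a non-empty finite set given as a list.
-- If no element satisfies p, it still returns an element of the set.
searchFinite :: [a] -> Searcher a
searchFinite []       _ = error "searchFinite: the empty set is not searchable"
searchFinite [x]      _ = x
searchFinite (x : xs) p = if p x then x else searchFinite xs p

-- Lemma 2.2.4: existential quantification from search.
forsome :: Searcher a -> (a -> Bool) -> Bool
forsome eps p = p (eps p)

-- Universal quantification via the De Morgan law.
forevery :: Searcher a -> (a -> Bool) -> Bool
forevery eps p = not (forsome eps (not . p))
```

For example, `forsome (searchFinite [1, 3, 5 :: Int]) even` evaluates the predicate at the candidate returned by the searcher and finds that it fails, so no element of the set is even.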
Definition 2.2.5. We say that a set $K$ is entire if it consists of total elements and is closed under total equivalence.

Notice that if $p$ is total then it is defined on every entire set. Even if $p$ is not total and $K$ is not entire, $p(x) = p(x')$ for all $x \sim x'$ in $K$, because if $x \sim x'$ then $x$ and $x'$ are bounded above and hence so are $p(x)$ and $p(x')$, which then must be equal as they are non-bottom by definition. But if $x \in K$ and $x' \sim x$ for $x'$ outside $K$, it doesn’t follow that $p(x') \neq \bot$ (consider e.g. $K = \{\lambda i.\mathrm{tt}\}$ for $\sigma = \iota \to o$ and $p(\alpha) = \alpha(\bot)$).

Let $D^\omega = (\mathcal{N} \to D)$ and, for any sequence $K_i$ of subsets of $D$, let $\prod_i K_i$ be the set of functions $\alpha \in D^\omega$ with $\alpha_i = \alpha(i) \in K_i$ for all $i \in \mathbb{N} \subseteq \mathcal{N}$. The following closure properties of entire sets are easily verified:
1. If $K \subseteq D$ and $K' \subseteq D'$ are entire, so is $K \times K' \subseteq D \times D'$.
2. If $K_i$ is a sequence of entire subsets of $D$, then $\prod_i K_i$ is an entire subset of $D^\omega$.

Definition 2.2.6. The image of an entire set by a total function doesn’t need to be entire, but it consists of total elements, and hence its closure under total equivalence is entire. We refer to this as its entire image. (Thus, entire images are defined for total functions and entire sets only.)

Definition 2.2.7.
1. An entire set $F \subseteq D$ is decidable if there is a total computable map $\psi_F : D \to \mathcal{B}$ such that, for all total $x \in D$, $\psi_F(x) = \mathrm{tt}$ iff $x \in F$.
2. For given $K \subseteq D$, we say that a set $F \subseteq D$ is decidable on $K$ if there is a computable map $\psi_F : D \to \mathcal{B}$ defined on $K$ such that, for all $x \in K$, $\psi_F(x) = \mathrm{tt}$ iff $x \in F$. (Then an entire set $F$ is decidable if and only if it is decidable on $T$.)
3. An entire set $F \subseteq D$ is semi-decidable if there is a computable map $\chi_F : D \to \mathcal{B}$ such that, for all total $x \in D$, $\chi_F(x) = \mathrm{tt}$ if $x \in F$, and $\chi_F(x) = \bot$ otherwise.
4. $F$ is co-semi-decidable if its complement in the set of total elements is semi-decidable.
Notice that the functions $\psi_F$ and $\chi_F$ are not uniquely determined by $F$, because their behaviours are specified only on a subset of $D$.
2.3 Building new searchable sets from old
We develop algorithms that show that exhaustible and searchable sets are closed under various constructions. Starting from the finite sets, this allows one to systematically build plenty of infinite searchable sets.

Proposition 2.3.1. Let $K, F \subseteq D$ with $F$ decidable on $K$.
1. If $K$ is exhaustible then so is $K \cap F$.
2. If $K$ is searchable then so is $K \cap F$, provided it is non-empty.
Proof. Define
\[ \exists_{K \cap F}(p) = \exists x \in K.\ \psi_F(x) \wedge p(x), \]
\[ \varepsilon_{K \cap F}(p) = \text{if } \exists x \in K \cap F.\ p(x) \text{ then } \varepsilon_K(\lambda x.\ \psi_F(x) \wedge p(x)) \text{ else } \varepsilon_K(\psi_F). \]

The topological motivation for the above proposition is that the intersection of a closed set with a compact set is compact. Decidable sets correspond to sets that are open and closed, and hence, bearing in mind that exhaustible sets (ought to) correspond to compact sets, the above proposition ought to be true, which it is.

The motivation for the following proposition is that, in topology, continuous images of compact sets are compact. In fact, it arises by replacing continuity by computability and compactness by exhaustibility. For later use, we also make sure it holds in the world of entire sets.
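The search operator from the proof of Proposition 2.3.1 can be sketched in Haskell as follows, assuming total predicates valued in `Bool`. The names `Searcher`, `searchFinite`, `forsome` and `intersectS` are hypothetical, and the finite-list searcher only serves as a concrete stand-in for a searchable set K.

```haskell
type Searcher a = (a -> Bool) -> a

-- Illustrative searcher for a non-empty finite set (stand-in for K).
searchFinite :: [a] -> Searcher a
searchFinite []       _ = error "searchFinite: the empty set is not searchable"
searchFinite [x]      _ = x
searchFinite (x : xs) p = if p x then x else searchFinite xs p

forsome :: Searcher a -> (a -> Bool) -> Bool
forsome eps p = p (eps p)

-- Search over K intersected with F, where psi decides F on K and the
-- intersection is assumed non-empty, as in Proposition 2.3.1(2).
intersectS :: Searcher a -> (a -> Bool) -> Searcher a
intersectS epsK psi p =
  if forsome epsK (\x -> psi x && p x)
    then epsK (\x -> psi x && p x)  -- an element of K and F satisfying p
    else epsK psi                   -- no witness: any element of K and F
```

With `K = [1 .. 10]` and `F` the even numbers, searching for an element greater than 6 returns an even element of that kind, while searching for an element greater than 10 falls back to some even element of K.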
Proposition 2.3.2. Exhaustible and searchable sets are closed under the formation of computable images, and also under the formation of computable entire images.
Proof. Given $f : D \to D'$ and $K \subseteq D$ exhaustible, define
\[ \forall_{f(K)}(q) = \forall x \in K.\ q(f(x)). \]
This proves closure of exhaustible sets under images. Regarding entire images of entire exhaustible sets, if $f$ is total and $K$ is entire with entire image $L$, then we can take $\forall_L = \forall_{f(K)}$. To verify this, let $q$ be defined on $L$. Then $q$ is defined on $f(K) \subseteq L$, and hence if $q(l) = \mathrm{tt}$ for all $l \in L$, then $\forall_L(q) = \mathrm{tt}$. If, on the other hand, $q(l) = \mathrm{ff}$ for some $l \in L$, then $l \sim f(x)$ for some $x \in K$. But then $q(f(x)) = \mathrm{ff}$, and so $\forall_L(q) = \mathrm{ff}$, which concludes the verification. For $K \subseteq D$ searchable, define
\[ \varepsilon_{f(K)}(q) = f(\varepsilon_K(\lambda x.\ q(f(x)))). \]
That is, first find $x$ such that $q(f(x))$ holds, using $\varepsilon_K$, and then apply $f$ to such $x$. This proves closure under images, and the argument for entire images is similar to the previous one.

The following corresponds to the fact that compact sets in topology are closed under finite products:

Proposition 2.3.3. Exhaustible and searchable sets are closed under the formation of finite products.
Proof. For $K \subseteq D$ and $K' \subseteq D'$ exhaustible, define
\[ \forall_{K \times K'}(p) = \forall x \in K.\ \forall x' \in K'.\ p(x, x'). \]
For $K \subseteq D$ and $K' \subseteq D'$ searchable, to compute $\varepsilon_{K \times K'}(p)$ we first find $x \in K$ such that there is $x' \in K'$ with $p(x, x')$, and then find $x' \in K'$ such that $p(x, x')$, i.e.
\[ x = \varepsilon_K(\lambda x.\ \exists x' \in K'.\ p(x, x')), \qquad x' = \varepsilon_{K'}(\lambda x'.\ p(x, x')), \]
using the fact that searchable sets are exhaustible, and let $\varepsilon_{K \times K'}(p) = (x, x')$.

Compact sets in topology are closed under arbitrary products. We now show that searchable sets are closed under countable products. We would like to show that for any sequence of searchable sets $K_i \subseteq D_i$, their product $\prod_i K_i \subseteq \prod_i D_i$ is also searchable, but this would require dependent types, which are not part of the traditional higher-type computation formalism.
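The search operators in the proofs of Propositions 2.3.2 and 2.3.3 translate almost literally into Haskell. The following is a minimal sketch assuming total predicates valued in `Bool`. The names `Searcher`, `searchBool`, `imageS` and `timesS` are hypothetical, and `searchBool` is an illustrative searcher for the two-point set of booleans.

```haskell
type Searcher a = (a -> Bool) -> a

-- Illustrative searcher for {False, True}: answer True if p True holds,
-- and otherwise False (which is a witness whenever one exists).
searchBool :: Searcher Bool
searchBool p = p True

-- Proposition 2.3.2: search over the image f(K).
imageS :: (a -> b) -> Searcher a -> Searcher b
imageS f epsK q = f (epsK (q . f))

-- Proposition 2.3.3: search over the binary product K x K'.
-- The inner existential quantifier over K' is expressed via search,
-- as in Lemma 2.2.4.
timesS :: Searcher a -> Searcher b -> Searcher (a, b)
timesS epsK epsK' p = (x, x')
  where
    x  = epsK  (\y  -> p (y, epsK' (\y' -> p (y, y'))))
    x' = epsK' (\y' -> p (x, y'))
```

For instance, `timesS searchBool searchBool (\(a, b) -> a /= b)` yields a pair of distinct booleans.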
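So we assume that the components $K_i$ of the product are all subsets of the same type $D$, so that $\prod_i K_i \subseteq D^\omega$ instead. Given search functionals $\varepsilon_{K_i} \in ((D \to \mathcal{B}) \to D)$, we wish to find a search functional $\varepsilon_{\prod_i K_i} \in ((D^\omega \to \mathcal{B}) \to D^\omega)$. The idea, which iterates the proof of Proposition 2.3.3, is to let
\[ \varepsilon_{\prod_i K_i}(p) = x_0 x_1 x_2 \dots x_n \dots, \]
where
\[ \begin{array}{l}
x_0 \in K_0 \text{ is such that } \exists \alpha \in \prod_i K_{i+1}.\ p(x_0 \alpha), \\
x_1 \in K_1 \text{ is such that } \exists \alpha \in \prod_i K_{i+2}.\ p(x_0 x_1 \alpha), \\
\quad \vdots \\
x_n \in K_n \text{ is such that } \exists \alpha \in \prod_i K_{i+n+1}.\ p(x_0 x_1 \dots x_n \alpha), \\
\quad \vdots
\end{array} \]
The component $x_n$ will be found using $\varepsilon_{K_n}$, and existential quantifications will be recursively reduced to search. To make this precise, we change notation. Given $\varepsilon \in ((D \to \mathcal{B}) \to D)^\omega$ such that $\varepsilon_i$ searches over $K_i$, we wish to find $\Pi(\varepsilon) \in (D^\omega \to \mathcal{B}) \to D^\omega$ that searches over $\prod_i K_i$. That is, we want a functional
\[ \Pi : ((D \to \mathcal{B}) \to D)^\omega \to ((D^\omega \to \mathcal{B}) \to D^\omega) \]
that transforms a sequence of search operators over $D$ into a search operator over $D^\omega$:
\[ \begin{array}{l}
\Pi(\varepsilon)(p)(0) = x_0 \text{ s.t. } \exists \alpha \in \prod_i K_{i+1}.\ p(x_0 \alpha), \\
\Pi(\varepsilon)(p)(1) = x_1 \text{ s.t. } \exists \alpha \in \prod_i K_{i+2}.\ p(x_0 x_1 \alpha), \\
\quad \vdots \\
\Pi(\varepsilon)(p)(n) = x_n \text{ s.t. } \exists \alpha \in \prod_i K_{i+n+1}.\ p(x_0 x_1 \dots x_n \alpha), \\
\quad \vdots
\end{array} \]
To complete the derivation of the functional $\Pi$, we reduce the existential quantification to a suitable recursive call to $\Pi$. If the functional $\Pi$ is to meet its specification, $\Pi(\lambda i.\varepsilon_{i+n+1})$ should search over $\prod_i K_{i+n+1}$. But a searchable set is exhaustible by Lemma 2.2.4. To implement the proof of this lemma in our situation, for any given $p$, $n$, $x_n$, define
\[ p_{n,x_n}(\alpha) = p(x_0 x_1 \dots x_{n-1} x_n \alpha) = p(\Pi(\varepsilon)(p)(0)\, \Pi(\varepsilon)(p)(1) \dots \Pi(\varepsilon)(p)(n-1)\, x_n \alpha). \]
Then
\[ \exists \alpha \in \prod_i K_{i+n+1}.\ p(x_0 x_1 \dots x_n \alpha) \]
is equivalent to
\[ p_{n,x_n}(\Pi(\lambda i.\varepsilon_{n+i+1})(p_{n,x_n})). \]
To find $x_n$ such that this holds, we use $\varepsilon_n$:
\[ \Pi(\varepsilon)(p)(n) = \varepsilon_n(\lambda x_n.\ p_{n,x_n}(\Pi(\lambda i.\varepsilon_{n+i+1})(p_{n,x_n}))). \]
Because we don’t want a different variable $x_n$ for each $n$, we rename the variable to simply $x$. This completes our derivation of the product functional:

Definition 2.3.4. The product functional $\Pi : ((D \to \mathcal{B}) \to D)^\omega \to ((D^\omega \to \mathcal{B}) \to D^\omega)$ is recursively defined by
\[ \Pi(\varepsilon)(p)(n) = \varepsilon_n(\lambda x.\ p_{n,x}(\Pi(\lambda i.\varepsilon_{n+i+1})(p_{n,x}))) \]
where
\[ p_{n,x}(\alpha) = p\left(\lambda i.\ \begin{cases} \Pi(\varepsilon)(p)(i) & \text{if } i < n, \\ x & \text{if } i = n, \\ \alpha_{i-n-1} & \text{if } i > n. \end{cases}\right) \]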
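A hedged Haskell sketch of the product functional follows. It is not a literal transcription of Definition 2.3.4: sequences are represented as lazy lists rather than functions on the natural numbers, and the recursion is organized head-first (find the first component, then recurse on the tail with the predicate specialized to that component) instead of being indexed by n. The names `Searcher`, `searchBool`, `prodS`, `searchCantor` and `forsome` are hypothetical, and predicates are assumed to be total and to inspect only finitely many positions of their argument.

```haskell
type Searcher a = (a -> Bool) -> a

-- Illustrative searcher for the two-point set {False, True}.
searchBool :: Searcher Bool
searchBool p = p True

-- Product of a sequence of searchers, with sequences as lazy lists
-- (an assumption of this sketch). The head x is chosen so that some
-- tail makes p true, whenever that is possible; the tail is then
-- searched with the predicate specialized to x.
prodS :: [Searcher a] -> Searcher [a]
prodS []       _ = []
prodS (e : es) p = x : prodS es (\xs -> p (x : xs))
  where
    x = e (\y -> p (y : prodS es (\xs -> p (y : xs))))

-- Search over the Cantor space of infinite binary sequences,
-- obtained as a countable product of copies of {False, True}.
searchCantor :: Searcher [Bool]
searchCantor = prodS (repeat searchBool)

-- Lemma 2.2.4: quantification reduces to search.
forsome :: Searcher a -> (a -> Bool) -> Bool
forsome eps p = p (eps p)
```

For example, `forsome searchCantor (\a -> a !! 3 && not (a !! 5))` terminates although the search space is infinite, because laziness ensures that only the finitely many positions inspected by the predicate are ever computed.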
Theorem 2.3.5. If each $\varepsilon_i$ searches over a set $K_i \subseteq D$ then $\Pi(\varepsilon)$ searches over $\prod_i K_i$.
Proof. By construction, it is clear that $p(\Pi(\varepsilon)(p)) = \mathrm{tt}$ iff there is $\alpha \in \prod_i K_i$ with $p(\alpha) = \mathrm{tt}$, provided the recursion converges in the sense that $\Pi(\varepsilon)(p) \in \prod_i K_i$. To establish this, we first show that $p(\Pi(\varepsilon)(p)) \neq \bot$ for any $p$ defined on $\prod_i K_i$. For $\varepsilon$ and $p$ fixed, and for each finite sequence $\beta$ over $D$, define
\[ W(\beta) = \Pi(\lambda i.\varepsilon_{i+|\beta|})(\lambda \alpha.\ p(\beta\alpha)), \]
where $\beta\alpha$ denotes the concatenation of $\beta$ and $\alpha$, and $|\beta|$ denotes the length of $\beta$. It is easy to see that $W(\beta)$ satisfies the equation
\[ W(\beta) = x\,W(\beta x) \quad \text{where } x = \varepsilon_{|\beta|}(\lambda y.\ p(\beta y\, W(\beta y))). \]
Here $\beta x$ denotes the sequence $\beta$ extended by the element $x$, and $x\,W(\beta x)$ is the sequence with first element $x$ followed by the sequence $W(\beta x)$. Claim: For any $\beta \in \prod_i$

and a limit point $\bot$. Then a function $p : X \to S$ is continuous iff $p^{-1}(\top)$ is open, and a set $U \subseteq X$ is open iff its characteristic function $\chi_U$, defined by $\chi_U(x) = \top \iff x \in U$, is continuous. Thus, using the Sierpinski space, the notion of openness is reduced to that of continuity. The following reduces the notion of compactness to that of continuity (a particular case of this is proved in [7], with essentially the same proof as the one given here).

Lemma 3.1.1. If $X$ is a k-space, a set $K \subseteq X$ is compact if and only if the universal quantification functional $\forall_K : S^X \to S$ defined by $\forall_K(p) = \top$ iff $p(x) = \top$ for all $x \in K$ is continuous.
Proof. A set $K$ is compact if and only if every directed cover of $K$ by open sets has a member that covers $K$, because from any cover one obtains a directed cover with the same union by adding the finite unions of the members of the cover. Hence by definition of the Scott topology, a set is compact if and only if its open-neighbourhood filter is open in the Scott topology of the lattice of open sets. But $U \mapsto \chi_U$ is a bijection from the lattice of open sets to the points of $S^X$, and it was shown in [9] that the topology of the exponential $S^X$ is the one induced by this bijection. Hence the functional $\forall_K$ is continuous iff $\forall_K^{-1}(\top)$ is open iff the set of characteristic functions $\chi_U$ with $K \subseteq U$ is open iff the open-neighbourhood filter of $K$ is open iff $K$ is compact.
3.2 Background on Kleene–Kreisel functionals
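For each type $\sigma$, define by induction a set $C_\sigma$ of Kleene–Kreisel functionals of type $\sigma$ and a surjection $\rho_\sigma : T_\sigma \to C_\sigma$ as follows, so that $C_\sigma \cong T_\sigma / {\sim_\sigma}$. For ground types and product types, define
\[ C_o = T_o, \qquad C_\iota = T_\iota, \qquad \rho_\gamma(x) = x, \]
\[ C_{\sigma \times \tau} = C_\sigma \times C_\tau, \qquad \rho_{\sigma \times \tau} = \rho_\sigma \times \rho_\tau. \]
For function types, consider the diagram
\[ (\dagger) \qquad
\begin{array}{ccccc}
D_\sigma & \supseteq & T_\sigma & \xrightarrow{\ \rho_\sigma\ } & C_\sigma \\
{\scriptstyle f}\big\downarrow & (1) & \big\downarrow & (2) & \big\downarrow{\scriptstyle \phi} \\
D_\tau & \supseteq & T_\tau & \xrightarrow{\ \rho_\tau\ } & C_\tau.
\end{array} \]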
The square (1) commutes for some map Tσ → Tτ if and only if f ∈ Tσ→τ , and in this case the map is uniquely determined as the (co)restriction of f . Moreover, in this case, there is a unique map φ making the square (2) commute, because ρσ is a surjection. We define Cσ→τ = {φ : Cσ → Cτ | ∃f ∈ Tσ→τ .(2) commutes}, ρσ→τ (f ) = the unique φ such that (2) commutes. Then, by construction, for any σ and all x, y ∈ Dσ , we have that x ∼σ y iff x, y ∈ Tσ and ρσ (x) = ρσ (y). If (†) commutes, we say that f is a realizer of φ. A Kleene–Kreisel functional is computable iff it has a computable realizer. Lemma 3.2.1. Every Cσ is a computable retract of Cτ →ι for some τ . A stronger form of this is known as “simple types are retracts of pure types” (see e.g. [17]). Here we use the fact that every pure type is either ι or of the form τ → ι, and that ι is a retract of τ → ι for any τ .
Hyland’s characterization of Kleene–Kreisel functionals. For certain proofs and constructions of algorithms, we consider a topology on the set of Kleene–Kreisel functionals.

Definition 3.2.2. Endow $T_\sigma$ with the relative Scott topology and $C_\sigma$ with the quotient topology of the surjection $\rho_\sigma : T_\sigma \to C_\sigma$. We refer to this topology on $C_\sigma$ as the Kleene–Kreisel topology, and to the resulting spaces $C_\sigma$ as Kleene–Kreisel spaces.

The points of the Kleene–Kreisel spaces are often referred to as the continuous functionals in the higher-type computability (or higher-type recursion theory) literature. A proof of the following inductive topological characterization of the Kleene–Kreisel spaces, attributed to Hyland, can be found in Normann [19].

Lemma 3.2.3.
1. $C_\gamma$ has the discrete topology for $\gamma$ ground,
2. $C_{\sigma \times \tau} = C_\sigma \times C_\tau$, and
3. $C_{\sigma \to \tau} = C_\tau^{C_\sigma}$,
where the product and exponential are calculated in the cartesian closed category of Hausdorff k-spaces.

The following two lemmas, which are part of the folklore of the subject, are applied in order to show that exhaustible sets of total elements are compact in the Kleene–Kreisel topology (Lemma 3.3.4(1)). A set is called clopen if it is both closed and open.

Lemma 3.2.4. For every clopen $U \subseteq C_\sigma$ there is a total predicate $p \in (D_\sigma \to \mathcal{B})$ such that
\[ \rho_\sigma^{-1}(U) \subseteq p^{-1}(\mathrm{tt}) \quad \text{and} \quad \rho_\sigma^{-1}(C_\sigma \setminus U) \subseteq p^{-1}(\mathrm{ff}). \]
Proof. Because $U$ is clopen, its characteristic function $\chi_U : C_\sigma \to \mathbb{B}$ is continuous, and hence so is the composite $i \circ \chi_U \circ \rho_\sigma : T_\sigma \to \mathcal{B}$, where $i : \mathbb{B} \to \mathcal{B}$ is the inclusion. Because $T_\sigma$ is dense in $D_\sigma$ (see e.g. [6]) and because Scott domains, and hence $\mathcal{B}$, are densely injective (see e.g. [12]), by definition of injectivity this extends to a continuous function $p : D_\sigma \to \mathcal{B}$. Then $p$ is total by construction, and the extension property amounts to the above set inclusions.

A space is zero-dimensional iff it has a base of clopen sets. It is an open problem whether the spaces $C_\sigma$ are zero-dimensional [3, 21]. If they are, the following lemma becomes superfluous.
The zero-dimensional reflection $ZC$ of a space $C$ is obtained by taking the same set of points and the clopen sets as a base.

Lemma 3.2.5. $ZC_\sigma$ and $C_\sigma$ have the same compact subsets.
Proof. We first show that $KZC_\sigma = C_\sigma$ for any type $\sigma$, where $K$ is the coreflector into the category of k-spaces. The property $KZC = C$ is easily seen to be inherited by retracts, and hence, by Lemma 3.2.1, it is enough to consider $\sigma = \tau \to \iota$, and hence $C = \mathbb{N}^Y$ for some k-space $Y$. Exponentials in k-spaces are given by the k-coreflection of the compact-open topology on the set of continuous maps. When the target is $\mathbb{N}$, the compact-open topology is clearly zero-dimensional and Hausdorff. Now, it is easy to see that $KZC = C$ iff there is some zero-dimensional topology whose k-reflection is $C$, and hence we are done. The result then follows from the well-known fact that a Hausdorff space has the same compact sets as its k-coreflection.
3.3 Compactness of exhaustible sets
A notion analogous to exhaustibility, with the Sierpinski domain $S$ playing the role of the boolean domain $\mathcal{B}$, is considered in [7]. A crucial fact, formulated here as Lemma 3.1.1, is that the (now unique) exhaustion functional $\forall_K : (D \to S) \to S$ is continuous iff the set $K$ is compact in the Scott topology of $D$. Hence, because computable functionals are continuous, Sierpinski-exhaustible sets are compact, and so Sierpinski exhaustibility is seen as articulating an algorithmic version of the topological notion of compactness. The computational idea is that, given any semi-decidable property of $D$, one can semi-decide whether it holds for all elements of $K$. Closure properties analogous to the above are established for Sierpinski exhaustibility in [7] (and redeveloped from a purely operational point of view in [8]). The present investigation can be seen as a natural follow-up of that work that arises by asking what changes if one moves from semi-decision problems to decision problems. One significant change is that continuity of a boolean exhaustion operator $\forall_K : (D \to \mathcal{B}) \to \mathcal{B}$ doesn’t entail the compactness of $K$ in the Scott topology any longer.

Examples 3.3.1.
1. There are exhaustible sets that fail to be compact in the Scott topology. By [28, 26], any second-countable $T_0$ space, e.g. the real line $\mathbb{R}$, can be embedded into the domain $D = \mathcal{B}^\omega$. But $\mathbb{R}$ is a connected space, which is equivalent to saying that every continuous boolean-valued map defined on it is constant. Hence a predicate $p \in (D \to \mathcal{B})$ is defined on $\mathbb{R}$ iff it is constant on $\mathbb{R}$. Therefore $\mathbb{R}$ is trivially exhaustible: $\forall_{\mathbb{R}}(p) = p(0)$. But it is not compact. Notice also that any space embedded into the total elements of $\mathcal{B}^\omega$ must be totally disconnected, and hence any embedding of $\mathbb{R}$ into $\mathcal{B}^\omega$ must assign non-total elements of $\mathcal{B}^\omega$ to some real numbers. One may suspect that if such embeddings are ruled out, this problem would disappear. But this is not the case, as the next example shows.
2. There are exhaustible sets of total elements that fail to be Scott compact. In fact, there is a trivial and pervasive counter-example. Let $f \in D$ where $D = ((\mathcal{N} \to \mathcal{N}) \to \mathcal{N})$. Then the total equivalence class $K$ of $f$, as is well known and easy to verify, doesn’t have a minimal element, and hence cannot be compact in the Scott topology. But it is exhaustible with $\forall_K(p) = p(f)$.

However, although exhaustible sets fail to be compact in the Scott topology, if they consist of total elements then they are compact in the Kleene–Kreisel topology. In order to formulate and prove this, we need some definitions. As in the previous sections, $D = D_\sigma$ for some unspecified $\sigma$, and, additionally, $T = T_\sigma$, $C = C_\sigma$ and $\rho = \rho_\sigma : T_\sigma \to C_\sigma$.

Definition 3.3.2.
1. By the shadow of a set $K \subseteq T$ we mean its $\rho$-image in $C$.
2. A set $K \subseteq T$ is called Kleene–Kreisel compact if its shadow is compact.
3. The Cantor space is the set of strict total elements of $\mathcal{B}^\omega = (\mathcal{N} \to \mathcal{B})$, that is, those total elements $\alpha$ with $\alpha(\bot) = \bot$, under the relative Scott topology.

Notice that:
1. The Cantor space is a proper subset of the set of total elements of $\mathcal{B}^\omega$, because it excludes precisely the two non-strict elements $\lambda i.\mathrm{tt}$ and $\lambda i.\mathrm{ff}$. However, it still includes their strict versions.
2. The Cantor space (under the relative Scott topology) is homeomorphic to its shadow $2^{\mathbb{N}}$ (under the Kleene–Kreisel topology) and hence is a compact Hausdorff space.
3. The set of maximal elements of B ω is not homeomorphic to the Cantor space. This is because the two elements λi. tt and λi. ff are finite (or order compact), and hence isolated in the relative Scott topology (meaning that the two corresponding singletons are open), and hence the maximal elements have a topology strictly finner than that of the Cantor space, as there are no isolated points in the Cantor space. As is well known in topology, no compact Hausdorff topology can have another compact Hausdorff topology as a strict refinement. Every (computationally) exhaustible set is topologically exhaustible in the sense of the following definition, because computable maps are continuous. Definition 3.3.3. 1. We say that a set K ⊆ D is topologically exhaustible if there is a continuous map ∀K ∈ ((D → B) → B) satisfying the conditions of Definition 2.2.2. 2. Similarly, we say topologically decidable etc. taking the continuous versions of Definition 2.2.7. The following is our main tool in the constructions and proofs of correctness and termination of the algorithms developed in this section. Lemma 3.3.4. 1. Any topologically exhaustible set of total elements is Kleene–Kreisel compact. 2. Any non-empty, Kleene–Kreisel compact entire set is an entire continuous image of the Cantor space and hence is topologically exhaustible. (The empty set is trivially exhaustible, as we have already seen, and hence all Kleene– Kreisel compact entire sets are topologically exhaustible.) 3. Any Kleene–Kreisel compact entire set has a Scott compact subset with the same shadow. Proof. (1): Let K ⊆ T be exhaustible. By Lemma 3.2.5 and the fact that clopen sets are closed under finite unions, to establish compactness of ρ(K), it is enough to consider a directed clopen cover U. By Lemma 3.2.4, for every U ∈ U there is a total pU ∈ (D → B) with −1 (†) ρ−1 (U ) ⊆ pU (tt) and ρ−1 (C \ U ) ⊆ p−1 U (ff).
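Define predicates q_U, r ∈ (D → B) by
\[ q_U^{-1}(\mathrm{tt}) = p_U^{-1}(\mathrm{tt}), \qquad r^{-1}(\mathrm{tt}) = \bigcup_{U \in \mathcal{U}} p_U^{-1}(\mathrm{tt}) \qquad\text{and}\qquad q_U^{-1}(\mathrm{ff}) = r^{-1}(\mathrm{ff}) = \emptyset. \]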
Then q_U ⊑ p_U, the set {q_U | U ∈ 𝒰} is directed, and r = ⨆_{U ∈ 𝒰} q_U. Because ρ(K) ⊆ ⋃𝒰, we have that K ⊆ r⁻¹(tt) and hence ∀_K(r) = tt. So, by continuity of ∀_K, there is U ∈ 𝒰 with ∀_K(q_U) = tt, and hence with ∀_K(p_U) = tt by monotonicity. Let x ∈ K. Then p_U(x) = tt by specification of ∀_K and the fact that p_U is total and hence defined on K. But then ρ(x) ∈ U, for otherwise (†) would entail p_U(x) = ff. This shows that ρ(K) ⊆ U, and so ρ(K) is compact.

(2): By e.g. [9], any compact subset of C is countably based (even though C is not). But any non-empty compact Hausdorff countably based space is a continuous image of the Cantor space. Hence there is a continuous map B^ℕ → C with image ρ(K) for any entire set K ⊆ D. Then the entire image of the Cantor space under any realizer B^ω → D is K.

(3): This follows from the argument given in (2).
Remark 3.3.5. In particular, this gives a topological view of the computational fact stated in the introduction that exhaustible sets of natural numbers must be finite: all compact sets are finite in a discrete space. (There is a natural topology on D, coarser than the Scott topology, in which all exhaustible sets are compact. Part of the argument of Lemma 3.3.4(1) shows that any exhaustible set K is compact in the coarsest topology on D such that all predicates p ∈ (D → B) defined on K are continuous. This is generated by directed unions of basic open sets of the form p⁻¹(tt) with p as above, because such sets are closed under finite unions and intersections. This construction is analogous to the zero-dimensional reflection of a topology, and happens to coincide with it in the case considered in Lemma 3.3.4(1), modulo quotienting.)
3.4
Some computational properties of exhaustible sets
We already know that every searchable set is exhaustible (Lemma 2.2.4). This implication is uniform, in the sense that there is a computable functional ((D → B) → D) → ((D → B) → B) that transforms search operators into exhaustion operators, namely ε ↦ (λp.p(ε(p))). We now establish the converse for non-empty entire sets, and some additional results.

Definition 3.4.1. We say that a set S ⊆ D = D_σ is a total retract if there is a function r ∈ (D → D) such that
1. r(x) ∈ S for all total x ∈ D,
2. r(s) ∼ s for all s ∈ S.
In this case, r is total, all elements of S are total, and r(r(x)) ∼ r(x) for all total x. Notice that a total retract is not necessarily a retract. But r is a total retract iff it is a total function and its Kleene–Kreisel shadow ρ(r) : C → C is a retract in the usual topological sense, where C = C_σ.

Theorem 3.4.2. If K ⊆ D_σ is a non-empty, exhaustible entire set then, uniformly in any exhaustor of K:
1. K is searchable.
2. K is a computable entire image of the Cantor space.
3. The shadow of K is (computably) homeomorphic to the shadow of some entire exhaustible subset of the Baire domain N^ω.
4. K is a computable total retract.
5. K is co-semi-decidable.

In particular, after the theorem is proved, one can w.l.o.g. work with total predicates rather than predicates defined on K, as for any predicate p ∈ (D_σ → B) defined on K one can uniformly find a total predicate that agrees with p on K, by composition with the total retraction.
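The passage from search operators to quantifiers can be sketched in Haskell. This is a minimal sketch, not the paper's code: the names Search, Quantifier, existsFromSearch, forallFromSearch and searchList are hypothetical, and it assumes the convention that ε(p) returns a witness of p whenever one exists in K, so that λp.p(ε(p)) is the existential quantifier and the universal one follows by De Morgan duality. A non-empty finite list merely stands in for a searchable set K.

```haskell
-- Hypothetical names, for illustration only.
type Search a     = (a -> Bool) -> a
type Quantifier a = (a -> Bool) -> Bool

-- The functional  eps |-> \p -> p (eps p).
-- Assumption: eps p satisfies p whenever some element of K does.
existsFromSearch :: Search a -> Quantifier a
existsFromSearch eps p = p (eps p)

-- Universal quantification by De Morgan duality.
forallFromSearch :: Search a -> Quantifier a
forallFromSearch eps p = not (existsFromSearch eps (not . p))

-- A search operator for the non-empty finite set {x0} ∪ xs:
-- return the first element satisfying p, or x0 if there is none.
searchList :: a -> [a] -> Search a
searchList x0 xs p = case filter p (x0 : xs) of
  (x : _) -> x
  []      -> x0
```

For instance, with eps = searchList 1 [2, 3], the quantifier existsFromSearch eps calls the predicate once more after the search has finished, which is all the transformation does.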
Proof. We proceed by cases, of increasing generality, on the type of K. The case K ⊆ N is trivial and is implicitly used in the case K ⊆ N^ω, which in turn is used in the next case K ⊆ (D → N). The general case K ⊆ D is reduced to this last case via retracts using Lemma 3.2.1.

(1) Case K ⊆ N: We can define
\[ \varepsilon_K(p) = \begin{cases} \mu n.\, \exists m \in K.\, n = m \wedge p(n) & \text{if } \exists n \in K.\, p(n), \\ \mu n.\, \exists m \in K.\, n = m & \text{otherwise.} \end{cases} \]
Notice that this construction defines ε_K uniformly in ∃_K. We could now easily show that K satisfies the other conditions of the theorem, but this won’t be required for our proof, as this will follow in later cases. For future use, notice that if K ⊆ N is entire and exhaustible, then the supremum of the finite set K (which is zero if K is empty and the largest element of K otherwise) can be computed uniformly in any exhaustor of K as
\[ \sup K = \mu m.\, \forall n \in K.\, n \le m. \]
Hence, the finite enumeration e_n of the elements of K, in ascending order, for 0 ≤ n < cardinality(K), is uniformly computable as
\[ e_n = \mu y.\, \exists m \in K.\, \forall i < n.\, m \ne e_i \wedge m = y. \]
We stop when we find n such that e_n = sup K, and we include e_n if and only if ∃m ∈ K. m = sup K.

(2) Case K ⊆ N^ω. We first argue that we can find some α ∈ K, uniformly in ∃_K, by the following algorithm defined by course-of-values induction on n:
\[ \alpha_n = \mu k.\, \exists \beta \in K.\, \alpha =_n \beta \wedge \beta_n = k. \]
Recall that we defined α =_n β ⇐⇒ ∀i < n. β_i = α_i in the paragraph preceding Theorem 2.3.6. By construction, for every n there is β ∈ K with α =_n β, and in particular α is total. Because the shadow of K is compact, it is closed, and because K is entire, α ∈ K, as required. Then we can define, using Proposition 2.3.1 and the above algorithm to construct α in both cases,
\[ \varepsilon_K(p) = \begin{cases} \text{some } \alpha \in K \cap p^{-1}(\mathrm{tt}) & \text{if } \exists \alpha \in K.\, p(\alpha), \\ \text{some } \alpha \in K & \text{otherwise.} \end{cases} \]
Again, this construction defines ε_K uniformly in ∃_K.

We now show that K is a computable total retract, uniformly in any exhaustor of K. Define r = r_K : N^ω → N^ω by course-of-values induction on n:
\[ r(\alpha)(n) = \begin{cases} \alpha_n & \text{if } \exists \beta \in K.\, \beta =_n r(\alpha) \wedge \beta_n = \alpha_n, \\ \mu m.\, \exists \beta \in K.\, \beta =_n r(\alpha) \wedge \beta_n = m & \text{otherwise.} \end{cases} \]
Because the shadow of K is closed, the finite prefixes of its members form a tree whose infinite paths correspond to the elements of K. The above algorithm follows the infinite path α through the tree, either for ever (always following the first case) or until the path
exits the tree (reaching the second case). If and when α exits the tree, we replace the remainder of α by the left-most infinite branch of the subtree at which α exits the tree. Then r clearly satisfies the required conditions. A semi-decision procedure for the complement of K is given by
\[ \alpha \notin K \iff r(\alpha) \ne \alpha, \]
using the fact that apartness of total elements of N^ω is semi-decidable. (This is a computational version of the topological fact and proof that retracts of Hausdorff spaces are closed.)

We now show that K is a computable entire image of the Cantor space. For any i, the set K_i = {α_i | α ∈ K} is exhaustible by Proposition 2.3.2, as evaluation at i is computable. It is enough to show that ∏_i {0, 1, ..., sup K_i} ⊆ N^ω is an entire image of the Cantor space by a computable map t : B^ω → N^ω, because then r ∘ t has K as its entire image since K is contained in that product. But this is straightforward: at each stage j of the computation of t(α), look at the next ⌈log₂(sup K_j + 1)⌉ digits of the input α (regarding ff and tt as digits 0 and 1, and taking enough digits to represent every number up to sup K_j), compute the natural number f(j) represented by this finite sequence, and let t(α)(j) = min(sup K_j, f(j)).

(3) Case K ⊆ (D → N) where D = D_σ for an arbitrary type σ: In order to reduce this to case (2), we invoke the Kleene–Kreisel density theorem (see e.g. [6]), which gives a computable sequence d ∈ D^ω such that the shadow sequence ⟨ρ(d_n) | n ∈ ℕ⟩ is dense in C = C_σ. Define
\[ P : (D \to N) \to N^\omega, \qquad P(f) = \lambda n.\, f(d_n). \]
We will define a total function E = E_K in the other direction,
\[ E : N^\omega \to (D \to N), \]
such that R = E ∘ P : (D → N) → (D → N) exhibits K as a total retract of (D → N). This means that, for f ∈ K, one can recover the behaviour of f at total elements from its behaviour on the dense sequence d. Because this implies that K is the entire image of P(K), and because P(K) is searchable by case (2), it will follow that K is searchable and an entire image of the Cantor space.
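The map t of case (2) can be sketched as follows. This is an illustrative sketch under stated assumptions: the names Cantor, Baire, width and cantorToProduct are hypothetical, the argument bound stands for the function j ↦ sup K_j (assumed to be given), and the sketch reads ⌈log₂(s + 1)⌉ binary digits per coordinate so that every value in {0, ..., s} is reachable.

```haskell
type Cantor = Integer -> Bool      -- infinite sequences of binary digits
type Baire  = Integer -> Integer   -- infinite sequences of naturals

-- Number of binary digits needed to write every number in 0..s,
-- that is, the least k with 2^k > s.
width :: Integer -> Integer
width s = until (\k -> 2 ^ k > s) (+ 1) 0

-- cantorToProduct bound a j: read the block of digits reserved for
-- coordinate j, interpret it as a binary numeral, and truncate at bound j.
-- Assumption: bound j plays the role of sup K_j.
cantorToProduct :: (Integer -> Integer) -> Cantor -> Baire
cantorToProduct bound a j = min (bound j) value
  where
    -- digits consumed by the coordinates before j
    offset = sum [width (bound i) | i <- [0 .. j - 1]]
    digits = [a (offset + k) | k <- [0 .. width (bound j) - 1]]
    value  = foldl (\acc d -> 2 * acc + (if d then 1 else 0)) 0 digits
```

Every sequence in the product is hit because each block of digits can encode any number up to bound j, and the truncation by min only affects numerals that overshoot the bound.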
For α ∈ N^ω and n ∈ N, define
\[ F_n^\alpha = \{ f \in (D \to N) \mid \forall i < n.\, f(d_i) = \alpha_i \}, \qquad K_n^\alpha = K \cap F_n^\alpha. \]
Then F_n^α is decidable on K uniformly in α and n, and hence K_n^α is uniformly exhaustible by Proposition 2.3.1.

Lemma 3.4.3. If K ⊆ (D → N) is a Kleene–Kreisel compact entire set, then for all total α ∈ N^ω and x ∈ D there is n such that f(x) = f′(x) for all f, f′ ∈ K_n^α.

Proof. For any g ∈ C_{τ→ι} the set B_g = {f ∈ C_{τ→ι} | f(x̄) = g(x̄)} is clopen, where we write x̄ = ρ(x). By density of the shadow of the sequence d_n, the set ⋂_n K̄_n^α has at most one element, where K̄_n^α denotes the shadow of K_n^α. Hence if g ∈ ⋂_n K̄_n^α then ⋂_n K̄_n^α = {g} ⊆ B_g. Because C_{τ→ι} is Hausdorff and because each K̄_n^α is compact and B_g is open, there is n such that already K̄_n^α ⊆ B_g. So for all f ∈ K̄_n^α one has f(x̄) = g(x̄), and hence for all f, f′ ∈ K_n^α one has f(x) = f′(x).
By Proposition 2.3.2, the entire P-image L ⊆ N^ω of K is exhaustible. Let r = r_L be defined as in case (2), and define
\[ E(\alpha)(x) = \mu y.\, \exists f \in K_n^{r(\alpha)}.\, f(x) = y, \]
where n is the least number such that ∀f, f′ ∈ K_n^{r(α)}. f(x) = f′(x). By exhaustibility of K_n^{r(α)}, this can be found uniformly in α, and hence E is computable uniformly in K.

Proof of correctness of E.
(i) E is total and maps L into K. Let α ∈ N^ω be total. Then r(α) ∈ L, by construction of r, and hence there is g ∈ K with r(α) ∼ P(g), and so with g ∈ K_n^{r(α)} for every n. Let x ∈ D be total and n be the least number such that f(x) = f′(x) for all f, f′ ∈ K_n^{r(α)}. Then f(x) = g(x) for all f ∈ K_n^{r(α)}, and hence E(α)(x) = g(x). Therefore E(α) ∼ g ∈ K, and hence E(α) ∈ K as K is entire, and in particular E is total. By construction E ∼ E ∘ r, and hence, because r exhibits L as a total retract and K is entire, the E-image of L is K.
(ii) If f ∈ (D → N) is total then R(f) = E(P(f)) ∈ K. Because P(f) ∈ L.
(iii) If f ∈ K then R(f) ∼ f. Continuing from the proof of (i), for α = P(f) we have r(α) ∼ α by construction of r, and hence for any g ∈ K such that P(g) = r(α) we have
\[ g(d_i) = P(g)(i) = r(\alpha)(i) = \alpha_i = P(f)(i) = f(d_i) \]
and so g ∼ f by density, which shows that R(f) = E(P(f)) ∼ f, as required.

A semi-decision procedure for the complement of K is given as in case (2),
\[ f \notin K \iff R(f) \ne f, \]
because f ≠ f′ ⇐⇒ ∃n ∈ N. f(d_n) ≠ f′(d_n), for total functions f′ and f, since K is entire and d is dense.

Because E and P are total, they induce computable Kleene–Kreisel functionals Ē = ρ(E) : ℕ^ℕ → ℕ^C and P̄ = ρ(P) : ℕ^C → ℕ^ℕ where C = C_σ. If K̄ ⊆ ℕ^C is the shadow of K, then the restriction of P̄ to K̄ followed by the co-restriction to its image is a homeomorphism K̄ → P̄(K̄): abstractly because any continuous bijection of compact Hausdorff spaces is a homeomorphism, and concretely because the bi-restriction of Ē is a continuous inverse. Hence the shadow of any exhaustible subset of (D → N) is computably homeomorphic to the shadow of some exhaustible subset of the Baire domain N^ω.

(4) General case. We derive this from the case (3).
By Lemma 3.2.1, for any D = D_σ there are D′ = D_{τ→ι} and computable P : D′ → D and E : D → D′ such that R = E ∘ P is a total retraction and T_σ is the entire image of P. Let K ⊆ D be a non-empty, exhaustible entire set, and let K′ be the entire E-image of K. Then K is the entire image of P(K′), and, because K is entire, a predicate p′ ∈ (D′ → B) defined on K′ holds for all x′ ∈ K′ if and only if p′ ∘ E holds for all x ∈ K. Hence K′ is exhaustible with ∀_{K′}(p′) = ∀_K(p′ ∘ E). By case (3) above, K′ is searchable. Therefore K is searchable by Proposition 2.3.2. Similarly, the other properties we need to establish are closed under the formation of retracts and hence are inherited from case (3). This concludes the proof of Theorem 3.4.2.
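Case (1) of the proof is directly executable. The following is a minimal sketch under stated assumptions: a set K of natural numbers is represented only by its existential quantifier, the names Exists, mu, searchN, supN, elementsN and exampleK are hypothetical, and the enumeration is computed by filtering the interval [0, sup K] instead of iterating the formula for e_n.

```haskell
-- A finite set K of naturals, given only by its existential quantifier.
type Exists = (Integer -> Bool) -> Bool

-- Unbounded minimisation: the least n satisfying p (diverges if none).
mu :: (Integer -> Bool) -> Integer
mu p = until p (+ 1) 0

-- Search operator defined from the quantifier, as in case (1).
-- Diverges in the second branch if K is empty (the theorem assumes
-- K non-empty).
searchN :: Exists -> (Integer -> Bool) -> Integer
searchN ex p
  | ex p      = mu (\n -> ex (\m -> n == m && p n))
  | otherwise = mu (\n -> ex (\m -> n == m))

-- sup K = least m such that every n in K satisfies n <= m.
supN :: Exists -> Integer
supN ex = mu (\m -> not (ex (\n -> n > m)))

-- Ascending enumeration of K, using sup K as the search bound.
elementsN :: Exists -> [Integer]
elementsN ex = [n | n <- [0 .. supN ex], ex (== n)]

-- Hypothetical example: the quantifier of K = {3, 7, 10}.
exampleK :: Exists
exampleK p = any p [3, 7, 10]
```

Note that searchN only ever applies p to elements of K, because the conjunction n == m && p n evaluates p n only after n == m has succeeded for some m in K, in keeping with predicates that are merely defined on K.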
3.5
Arzela–Ascoli type characterizations of compact sets
We reformulate a theorem of Gale’s [10] that characterizes compact subsets of function spaces (Theorem 3.5.1). The reformulation suggests a characterization of exhaustible entire sets (Theorem 3.6.1), whose topological version is developed here (Theorem 3.5.3). The main idea is to replace a condition in Gale’s theorem by a continuity condition, and then further replace it by a computability condition in Section 3.6. This method of transforming
topological theorems into computational theorems is the main thrust of the paper [7], which develops many instances of computational manifestations of topological theorems.

The Heine–Borel theorem characterizes the compact subsets of Euclidean space R^n as those that are closed and bounded. The Arzela–Ascoli theorem generalizes this to subsets of R^X, where X is a compact metric space and R^X is the set of continuous functions endowed with the metric defined by d(f, g) = max{d(f(x), g(x)) | x ∈ X}. A set K ⊆ R^X is compact if and only if it is closed, bounded and equi-continuous. Equi-continuity of K means that the functions f ∈ K are simultaneously continuous, in the sense that for every x ∈ X and every ε > 0, there is δ > 0 such that d(x, x′) < δ ⟹ d(f(x), f(x′)) < ε for all x′ ∈ X and all f ∈ K. The Heine–Borel theorem is the particular case in which X is the discrete space {1, ..., n}, for equi-continuity holds automatically for any subset of R^X in this case. The above metric on R^X induces the compact-open topology. More general Arzela–Ascoli type theorems characterize compact subsets of spaces Y^X of continuous functions under the compact-open topology, for a variety of kinds of spaces X and Y, with a number of generalizations or versions of the notion of equi-continuity, notably even continuity in the sense of Kelley [14].

Among a multitude of generalizations of the Arzela–Ascoli theorem, that of Gale [10, Theorem 1] proves to be relevant concerning exhaustibility: Let X and Y be Hausdorff k-spaces with Y regular. A set K ⊆ Y^X is compact if and only if
1. K is closed,
2. the set K(x) = {f(x) | f ∈ K} is compact for every x ∈ X,
3. the set ⋂_{f ∈ K∩F} f⁻¹(V) is open for every closed set F ⊆ Y^X and every open set V ⊆ Y.
Gale didn’t assume Y to be a k-space and formulated this for the compact-open topology, but his theorem holds for the exponential topology if we require Y to be a k-space.
Regarding compactness, we have already mentioned that a Hausdorff space has the same compact sets as its k-coreflection, and that the exponential topology is the k-reflection of the compact-open topology. Although there are more closed sets in the exponential topology, Gale’s argument works with closedness of K in the exponential topology. This follows from the general considerations of Kelley [14, Chapter 7]. The last condition is a version of equi-continuity. Because X is not assumed to be compact, the set K cannot be globally bounded in any sense, but it is pointwise bounded in the sense of the second condition. This gives a characterization of compact subsets of Kleene–Kreisel spaces of the form NC and in particular of Kleene–Kreisel spaces of pure type, because N is regular. However, it is an open problem whether arbitrary Kleene– Kreisel spaces are regular [3, 22], and this is partly why we are able to formulate and prove characterizations of exhaustible entire sets only for particular kinds of types in Section 3.6. Notice that when X = Y = N, this amounts to the well known characterization of compact subsets K of the Baire space NN as finitely branching trees. The equi-continuity condition, as in the case of the Heine–Borel theorem, is superfluous, because any set is equi-continuous in this case as the topology of the exponent is discrete. Condition (1) says that the elements of K are the paths of a tree, and (2) says that the tree is finitely branching, because the compact subsets of the base space are finite. Lemma 3.1.1 and the remarks preceding it allow one to consider continuity of functions involving points of a k-space X, open sets and closed sets (using the function space S X and representing open sets and closed sets by their characteristic functions), and compact
sets (using the function space S^{S^X} and representing compact sets by their universal quantification functionals). We now reformulate Gale’s theorem by expressing condition (3) as a continuous version of a slight strengthening of condition (2).

Theorem 3.5.1. If X and Y are Hausdorff k-spaces with Y regular, a set K ⊆ Y^X is compact if and only if
1. K is closed, and
2. (K ∩ F)(x) is compact, continuously in F and x, for any closed set F ⊆ Y^X and any x ∈ X.
The dependence of (K ∩ F)(x) on the parameters F and x is given by the functional Φ : S^{Y^X} × X → S^{S^Y} defined by Φ(χ̄_F, x) = ∀_{(K∩F)(x)}, where we write χ̄_F = χ_{F^c}.

Proof. (⇒): The set K is closed because Y^X is Hausdorff. The set K ∩ F is compact because F is closed. Because the evaluation map is continuous and because (K ∩ F)(x) is the continuous image of K ∩ F under evaluation at x, it is compact. To see that Φ is continuous, let v ∈ S^Y. Then
\[ v(y) = \top \text{ for all } y \in (K \cap F)(x) \iff v(f(x)) = \top \text{ for all } f \in K \cap F \iff f \in F^c \text{ or } v(f(x)) = \top, \text{ for all } f \in K. \]
Hence Φ(w, x) = λv.∀f ∈ K. w(f) ∨ v(f(x)), where (∨) : S × S → S is defined by a ∨ b = ⊤ iff a = ⊤ or b = ⊤. Because the functional ∀_K is continuous as K is compact, and because the category of k-spaces is cartesian closed and the above is a λ-definition from continuous maps, Φ is continuous.

(⇐): It suffices to show that Gale’s conditions (1)-(3) hold. Condition (1) is the same as ours, and Gale (2) follows from our condition (2) with F = Y^X. To prove Gale (3), let F ⊆ Y^X be closed and V ⊆ Y be open. Then the set U = {x ∈ X | Φ(χ̄_F, x)(χ_V) = ⊤} is open because Φ is continuous, and
\[ \begin{aligned} x \in U &\iff \forall_{(K \cap F)(x)}(\chi_V) = \top \\ &\iff \chi_V(y) = \top \text{ for all } y \in (K \cap F)(x) \\ &\iff \chi_V(f(x)) = \top \text{ for all } f \in K \cap F \\ &\iff f(x) \in V \text{ for all } f \in K \cap F \\ &\iff x \in \textstyle\bigcap_{f \in K \cap F} f^{-1}(V), \end{aligned} \]
which shows that the set ⋂_{f ∈ K∩F} f⁻¹(V) is the same as U and hence is open.
We now formulate and prove an analogue of Theorem 3.5.1 (Theorem 3.5.3 below), which replaces (i) the Sierpinski space S by the boolean domain B, (ii) Hausdorff k-spaces by Scott domains, (iii) compact subsets by topologically exhaustible entire subsets, (iv) closed subsets by topologically decidable sets (cf. Definitions 2.2.2 and 3.3.3). We again apply Gale’s theorem, exploiting Hyland’s characterization of Kleene–Kreisel spaces as k-spaces. The proof follows the same pattern as that of Theorem 3.5.1, but there are a number of additional steps. Firstly, using Gale’s theorem, we get continuous maps defined on Kleene–Kreisel spaces. These are extended to continuous maps on domains using the Kleene–Kreisel density theorem and Scott’s injectivity theorem, as in Lemma 3.2.4. (In Theorem 3.6.1, such an extension will be instead defined by an algorithm, but still relying on the density theorem.) Secondly, the set F in condition (2) is closed in Theorem 3.5.1 but is neither open nor closed in Theorem 3.5.3, although it has clopen shadow, because the Sierpinski space has been replaced by the boolean domain. To overcome this difficulty, we rely on the following version of Gale’s theorem:
Remark 3.5.2. An inspection of the proof of Gale’s theorem shows that it also holds if, in condition (3), the set F ranges over subbasic closed sets in the compact-open topology:
3′. the set ⋂_{f ∈ K∩N(Q,B)} f⁻¹(V) is open for every compact set Q ⊆ X, every closed set B ⊆ Y, and every open set V ⊆ Y.
In one direction this is clear: if condition (3) holds for all closed F, then it holds for F = N(Q, B). For the other direction, notice that condition (3) is used only in the “Lemma” [10, page 305] for F of this form (the sets W_x in the second last line of that page, and the set T of page 306).

Let D = D_σ and C = C_σ for an arbitrary type σ, and recall the concepts and notation introduced in Definitions 2.2.7 and 3.3.3.

Theorem 3.5.3. An entire set K ⊆ (D → N) is topologically exhaustible if and only if the following two conditions hold:
1. K is topologically co-semi-decidable.
2. The set (K ∩ F)(x) is topologically exhaustible for any F that is topologically decidable on K, and any x ∈ D total, continuously in F and x.
Here the dependence of (K ∩ F)(x) on F and x is to be given by a functional Γ : ((D → N) → B) × D → ((N → B) → B) such that Γ(ψ_F, x) = ∀_{(K∩F)(x)}.

Proof. (⇒): (1): By Lemma 3.3.4, the shadow K̄ = ρ(K) ⊆ ℕ^C of K is compact and hence closed. Hence the map ℕ^C → B that sends f ∈ K̄ to ⊥ and f ∉ K̄ to tt is continuous. By composition with the quotient map ρ : T → ℕ^C, where T = T_{σ→ι}, we get a map T → B. Because T is dense in (D → N) and B is densely injective, the domain D_{σ→ι} under the Scott topology is injective over dense embeddings, which means that this map extends to a continuous map (D → N) → B. By construction, this exhibits K as a topologically co-semi-decidable subset of (D → N).

(2): Define Γ(ψ_F, x) = λp.∀f ∈ K. ψ_F(f) ⟹ p(f(x)). The result then follows from the fact that the category of Scott domains under the Scott topology is cartesian closed, and hence functions that are λ-definable from continuous maps are themselves continuous.
(⇐): We apply Gale’s theorem to show that the shadow K̄ = ρ(K) is compact. Then it is topologically exhaustible by Lemma 3.3.4.

Gale (1): If K is topologically co-semi-decidable, then, by definition, we have a continuous function (D → N) → B that maps f ∈ K to ⊥ and f ∉ K to tt. Hence K is closed in T because it is the inverse image of the closed set {⊥} restricted to T. Because K is entire, it is closed under total equivalence by definition, and hence, because ρ : T → ℕ^C is a quotient map, K̄ is closed.

Gale (2): The assumption gives that for any x ∈ D total, K(x) is exhaustible, considering F = (D → N). Because K is entire and x is total, K(x) ⊆ ℕ. Hence by Lemma 3.3.4, K(x) is compact in ℕ ⊆ N.

Gale (3): Let F̄ ⊆ ℕ^C be a subbasic open set of the form N(Q̄, V) with Q̄ ⊆ C compact and V ⊆ ℕ (necessarily) clopen. Then the set Q = ρ⁻¹(Q̄) is entire and Kleene–Kreisel compact, and hence, by Lemma 3.3.4, it is topologically exhaustible. Also, V is a topologically decidable subset of N. So the predicate p : (D → N) → B defined by p(f) = ∀x ∈ Q. χ_V(f(x)) is continuous and defined on K, and p = ψ_F for F = T ∩ p⁻¹(tt). Now define u : D → B by u(x) = Γ(ψ_F, x)(χ_V). Then u is continuous and
\[ u(x) = \forall_{(K \cap F)(x)}(\chi_V) = \forall f \in K \cap F.\, \chi_V(f(x)). \]
Hence the set U = u⁻¹(tt) = {x ∈ T | ∀f ∈ K ∩ F. χ_V(f(x)) = tt} is open. Therefore its shadow ⋂_{f ∈ K̄∩F̄} f⁻¹(V) is open, because it is closed under total equivalence and because ρ is a quotient map.
3.6
Arzela–Ascoli type characterization of exhaustible sets
At this stage of our investigation, such a characterization is available only for certain types, which include pure types, and for entire sets (for the reasons explained in Section 3.5). Let D = D_σ and C = C_σ for an arbitrary type σ. We establish the computational version of Theorem 3.5.3.

Theorem 3.6.1. An entire set K ⊆ (D → N) is exhaustible if and only if the following two conditions hold:
1. K is co-semi-decidable.
2. The set (K ∩ F)(x) is exhaustible for any F decidable on K, and any x ∈ D total, uniformly in F and x.
Moreover, the equivalence is uniform.

A few remarks are in order before embarking on the proof. The claim holds, with the same proof, if conditions (1) and (2) are replaced by any of the following conditions, respectively:
1′. K is topologically co-semi-decidable.
1″. K has closed shadow.
1‴. The shadow of K is closed in the topology of pointwise convergence.
1⁗. K has compact shadow.
2′. The set (K ∩ F_n^α)(x) is exhaustible, uniformly in n ∈ N, α ∈ N^ω and x ∈ D total.
Recall (Section 3.4, proof of Theorem 3.4.2) that we defined F_n^α = {f ∈ (D → N) | ∀i < n. f(d_i) = α_i}.

In the formulation of the theorem, the fact that conditions (1) and (2) uniformly imply the exhaustibility of K is in principle given by a computable functional of type
\[ \underbrace{\overbrace{((D \to N) \to B)}^{\chi_{K^c}}}_{\text{condition 1}} \times \underbrace{\bigl(\overbrace{((D \to N) \to B)}^{\psi_F} \times \overbrace{D}^{x} \to \overbrace{((N \to B) \to B)}^{\forall_{(K \cap F)(x)}}\bigr)}_{\text{condition 2}} \to \underbrace{\overbrace{(((D \to N) \to B) \to B)}^{\forall_K}}_{\text{conclusion}}. \]
However, the computational information given by condition (1) is not used in the construction of the conclusion (although the topological information is used in its correctness proof). Moreover, the information given by condition (2) is not fully used in the construction. Replacing it by (2′) we get
\[ \underbrace{\bigl(\overbrace{N^\omega}^{\alpha} \times \overbrace{N}^{n} \times \overbrace{D}^{x} \to \overbrace{((N \to B) \to B)}^{\forall_{(K \cap F_n^\alpha)(x)}}\bigr)}_{\text{condition } 2'} \to \underbrace{(((D \to N) \to B) \to B)}_{\text{conclusion}}. \]
Additionally the pair α, n is really coding a finite sequence, and, as we have seen, exhaustible sets of natural numbers are uniformly equivalent to finite enumerations of natural numbers. Hence the above can be written as
\[ \underbrace{\bigl(\overbrace{N^*}^{\alpha, n} \times \overbrace{D}^{x} \to \overbrace{N^*}^{\forall_{(K \cap F_n^\alpha)(x)}}\bigr)}_{\text{condition } 2'} \to \underbrace{(((D \to N) \to B) \to B)}_{\text{conclusion}}. \]
Therefore the above characterization reduces the type level of ∀_K by two.

The last step of the proof of this theorem mimics topological proofs of Arzela–Ascoli type theorems (which we haven’t included): to show that K ⊆ Y^X is compact under assumptions such as those of Gale’s theorem (Section 3.5), one first concludes that ∏_{x∈X} K(x) is compact by the Tychonoff theorem, then shows that the relative topology of K is the topology of pointwise convergence, and that it is pointwise closed, and hence concludes that it is homeomorphically embedded into the product as a closed subset, and therefore that it must be compact. In the proof below, we have replaced the Tychonoff theorem by its countable computational version given by Theorem 2.3.5, using a dense sequence of the exponent. The previous steps of the proof are needed in order to make this replacement possible, and they are modifications of the constructions developed in Section 3.4.

Proof. (⇒) (1): Theorem 3.4.2. (2): Define ∀_{(K∩F)(x)}(p) = ∀f ∈ K. ψ_F(f) ⟹ p(f(x)).

(⇐): By Theorem 3.5.3, the set K is topologically exhaustible, and hence is Kleene–Kreisel compact by Lemma 3.3.4. We apply this to establish the correctness of the algorithms defined below. Define P : (D → N) → N^ω by P(f)(i) = f(d_i), as in the proof of Theorem 3.4.2, where d ∈ D^ω is a computable dense sequence, and let L = P(K). Because F_n^α is decidable on K, the set K_n^α = K ∩ F_n^α is exhaustible by Proposition 2.3.1, and K_n^α(x) is exhaustible uniformly in α, n and x ∈ K by Proposition 2.3.2 applied to evaluation at x. Now modify the definition of r : N^ω → N^ω given in Theorem 3.4.2 as follows:
\[ r(\alpha)(n) = \begin{cases} \alpha_n & \text{if } \exists y \in K_n^{r(\alpha)}(d_n).\, \alpha_n = y, \\ \mu y.\, \exists y' \in K_n^{r(\alpha)}(d_n).\, y = y' & \text{otherwise.} \end{cases} \]
Then r is computable, and satisfies
\[ r(\alpha)(n) = \begin{cases} \alpha_n & \text{if } \exists f \in K.\, f(d_n) = \alpha_n \wedge \forall i < n.\, f(d_i) = r(\alpha)(i), \\ \mu y.\, \exists f \in K.\, f(d_n) = y \wedge \forall i < n.\, f(d_i) = r(\alpha)(i) & \text{otherwise.} \end{cases} \]
Hence it also satisfies
\[ r(\alpha)(n) = \begin{cases} \alpha_n & \text{if } \exists \beta \in L.\, \beta_n = \alpha_n \wedge \beta =_n r(\alpha), \\ \mu y.\, \exists \beta \in L.\, \beta_n = y \wedge \beta =_n r(\alpha) & \text{otherwise.} \end{cases} \]
This shows that r = r_L for r_L as defined in Theorem 3.4.2. But notice that, although the second and third equations hold, the algorithm is not the same as in Theorem 3.4.2. In fact, the second and third equations don’t establish computability of r, because exhaustibility of K and L are not known at this stage of the proof. In any case, the last equation shows that r exhibits L as a total retract, using the fact that L, being the continuous P-image of K, is topologically exhaustible and hence is Kleene–Kreisel compact, as in Theorem 3.4.2.

Similarly, modify the definition of E : N^ω → (D → N) in Theorem 3.4.2 as follows:
\[ E(\alpha)(x) = \mu y.\, \exists y' \in K_n^{r(\alpha)}(x).\, y = y', \]
where n is the least number such that ∀y, y′ ∈ K_n^{r(α)}(x). y = y′. Because this condition is equivalent to ∀f, f′ ∈ K_n^{r(α)}. f(x) = f′(x), such a number exists by Lemma 3.4.3 and the compactness of the shadow of K. By uniform exhaustibility of the set K_n^{r(α)}(x), this can be found uniformly in α and x, and hence E is computable. Moreover, although the definition of E is not the same, we again have, as before, E = E_L for E_L defined as in Theorem 3.4.2.
Finally, because K = K ∩ F for F = (D → N), the set K(x) is exhaustible uniformly in x ∈ D total, and hence the set M = ∏_i K(d_i) ⊆ N^ω is searchable uniformly in x ↦ ∀_{K(x)}. In fact, each K(d_i) is searchable uniformly in i, by Theorem 3.4.2, and hence M is searchable by Theorem 2.3.5. Now L ⊆ M and hence the entire r-image of M is L, and hence L is searchable by Proposition 2.3.2. In turn K is the entire E-image of L and hence is also searchable. Therefore it is exhaustible.
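The last two steps transfer searchability along the maps r and E. Assuming that Proposition 2.3.2 (not reproduced in this section) is the usual image construction, the transfer can be sketched as follows. The names Search, imageSearch and searchFinite are hypothetical, a finite list stands in for the searchable set, and the countable product of Theorem 2.3.5 is not attempted here.

```haskell
type Search a = (a -> Bool) -> a

-- To search the image f(K) for a witness of q, search K for a witness
-- of q . f and map the result through f.
imageSearch :: (a -> b) -> Search a -> Search b
imageSearch f eps q = f (eps (q . f))

-- Hypothetical stand-in for a searchable set: the non-empty finite
-- set {x0} ∪ xs, returning x0 when no element satisfies the predicate.
searchFinite :: a -> [a] -> Search a
searchFinite x0 xs p = case filter p (x0 : xs) of
  (x : _) -> x
  []      -> x0
```

In the proof, f would be r (from M to L) and then E (from L to K); the sketch only shows why a search operator for the source yields one for the image.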
4
Time-complexity considerations
We now discuss the experimental and theoretical run-time behaviour of algorithms for exhaustive search. The main contributions of this section are: (1) The development of an algorithm for exhaustive search that is experimentally fast in surprising instances, and that seems to be asymptotically faster than all known search algorithms, and possibly all search algorithms. (2) The introduction of a semantical notion of size for predicates so that the run-time of search algorithms can be expressed as a function of the size of the input predicate. (3) Conjectures about the run time of several algorithms. (4) Experimental evidence. While the considerations of the previous sections are conclusive and rigorously developed, those of this one are necessarily tentative: we lack mathematical techniques for reasoning about higher-type computational resources, in particular time. Whether or not our conjectures hold, the algorithm developed in this section is faster, in practice, than untrained intuition would concede to be possible.
4.1
The model of computation
We consider call-by-need evaluation [15, 18, 30]. For the language PCF, this is semantically equivalent to the call-by-name computational model. The essential operational difference between the call-by-name and the call-by-need computational models is that the latter avoids re-evaluations that take place in the former when the same variable occurs twice in a program, which speeds up computations. Another crucial difference of call-by-need evaluation regarding speed is that substitutions are computed in constant time by pointer manipulation. One of the algorithms discussed here turns out to work directly in the call-by-value model, but (1) it is experimentally exponentially slower in this model, (2) we don’t understand its run-time behaviour in this model at present, not even intuitively, and (3) the other algorithms diverge in this model (although, of course, they can be adapted to this model by the application of standard translation techniques of call-by-name into call-by-value [24]). These three issues regarding call-by-value computation should certainly be investigated in future research.
4.2
The experimental setup
The main emphasis of Section 4 is asymptotic rather than actual run-time behaviour. But it is also interesting to experimentally determine whether non-trivial instances of search over infinite sets can be computed fast by practical standards (say in a few seconds). Moreover, experiments can disprove our asymptotic conjectures or confirm them to some extent. For such experimental purposes, we need to consider a particular programming language, under a particular implementation, running on a particular architecture, machine and operating system. This is what we describe here.

The programming language. We use the language Haskell [13], which includes PCF as a subset, and whose computation mechanism, known as lazy evaluation, coincides with call-by-need apart from a few minor and inessential differences [18]. The Haskell code for the experiments is included in Section 4.8 below.
The language implementation. We use the Glasgow Haskell interpreter ghci. For experiments regarding asymptotic behaviour, the use of an interpreter makes no difference. For practical considerations of speed, the compiler ghc of course gives a linear speedup, perhaps by an order of magnitude. The machine, architecture and operating system. The technological details given here are included for the sake of reproducibility of our experiments. What is important is that there is a machine that, at the time of writing, is capable of running the experiments with the reported speeds. We have used a Dell D410 Latitude laptop computer (manufactured in 2005), with 1Gb of memory running at 1.73 GHz (Intel Pentium M processor) under the operating system Gnu/Linux Debian/Ubuntu 7.10.
4.3
A first experiment
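Consider the three total functions u, v, w : B^ω → B defined (in mathematical vernacular or PCF) by
\[ \begin{aligned} u(\alpha) &= \alpha(19\,\alpha(2^{20}) + 399\,\alpha(5^{20}) + 9177\,\alpha(3^{20})), \\ v(\alpha) &= \alpha(19\,\alpha(2^{20}) + 399\,\alpha(6^{20}) + 9177\,\alpha(3^{20})), \\ w(\alpha) &= \alpha_k, \end{aligned} \]
where, for the definition of w, we take
\[ \begin{aligned} i &= \text{if } \alpha(3^{20}) \text{ then } 483 \text{ else } 0, \\ j &= 19\,(\text{if } \alpha(2^{20}) \text{ then } 1 + i \text{ else } i), \\ k &= j + 19\,(\text{if } \alpha(5^{20}) \text{ then } 21 \text{ else } 0). \end{aligned} \]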
In the above definitions, for the purposes of multiplication and addition, we identify the booleans ff, tt with the numbers 0, 1. Then clearly $u \neq v$, and it is not hard to see that $u = w$, although this was deliberately designed not to be immediate. Given an algorithm $\forall \colon (B^\omega \to B) \to B$ for exhaustive search, one can define a total-equality functional $(==)$ by
\[
(f == g) \;=\; \forall\, \text{total } \alpha \in B^\omega.\ f(\alpha) == g(\alpha).
\]
For all total $f, g$, we have that $f \sim g$ if and only if $(f == g) = \mathrm{tt}$.

Experiment 4.3.1. How long does it take, in practice, to determine that $u \neq v$ and $u = w$ using exhaustive search over the Cantor space? Of course, this depends on the exhaustive-search algorithm $\forall$ that one uses. Using any of the algorithms discussed above, the experiment runs out of memory before one gets the answer (and we predict that with unbounded memory one would have to wait for longer than the universe has existed to get an answer). But using an algorithm for exhaustive search defined in Section 4.7 below, the expressions u == v and u == w evaluate to ff and tt in respectively 0.03 secs and 0.15 secs.

How can this be so fast, given that the algorithm uses the functions $u$, $v$ and $w$ as black boxes, without any knowledge of their syntactical representations? It is easy to see that the moduli of uniform continuity of $u$, $v$ and $w$ are $5^{20}+1$, $6^{20}+1$ and $5^{20}+1$ respectively (see Corollary 2.3.7). Hence the moduli of continuity of the predicates $\lambda\alpha.\,u(\alpha) = v(\alpha)$ and $\lambda\alpha.\,u(\alpha) = w(\alpha)$ are $6^{20}+1$ and $5^{20}+1$ respectively. Thus, in principle, if the moduli of continuity are known, it is enough to check $2^{6^{20}+1}$ and $2^{5^{20}+1}$ cases respectively in order to determine that $u$ and $v$ are not equal, and that $u$ and $w$ are equal, respectively. Obviously, it is not possible to check that many cases in the reported time.
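The way a total-equality functional arises from an exhaustive-search algorithm can be sketched as follows. This is a minimal, self-contained illustration rather than the fast algorithm used for the timings above: it uses a Berger-style selection function (the variation in the code listing of Section 4.8), whose cost grows exponentially with the modulus of the predicate, so it is only suitable for predicates that look at small positions. The test functions `f1`, `f2`, `f3` and the choice `N = Integer` are assumptions made for this example.

```haskell
type N = Integer          -- assumption: natural numbers as Integer
type Cantor = N -> Bool

-- Prepend a bit to an infinite sequence.
(#) :: Bool -> Cantor -> Cantor
(x # a) i = if i == 0 then x else a (i - 1)

-- Berger-style selection function: returns a sequence satisfying p
-- whenever some sequence does.
find :: (Cantor -> Bool) -> Cantor
find p = if p l then l else r
  where l = True  # find (\a -> p (True  # a))
        r = False # find (\a -> p (False # a))

forsome, forevery :: (Cantor -> Bool) -> Bool
forsome  p = p (find p)
forevery p = not (forsome (not . p))     -- De Morgan

-- Total equality: f and g agree on every total sequence.
equal :: Eq y => (Cantor -> y) -> (Cantor -> y) -> Bool
equal f g = forevery (\a -> f a == g a)

-- Hypothetical small test functions.
f1, f2, f3 :: Cantor -> Bool
f1 a = a 2 && a 4
f2 a = not (not (a 4) || not (a 2))   -- same function, written differently
f3 a = a 2 && a 5                     -- differs from f1
```

Here `equal f1 f2` is `True` and `equal f1 f3` is `False`, although `equal` only uses its arguments as black boxes.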
4.4
Run-time as a function of the size of a functional input
We wish to mathematically predict the run time of $\forall(p)$ where $\forall \colon (B^\omega \to B) \to B$ is an exhaustive search algorithm and $p \colon B^\omega \to B$ is total. Moreover, we wish to have a mathematical guideline to get better algorithms if the current algorithms are not optimal. In fact, all algorithms considered so far in this paper, and that we know from the literature, are far from optimal, and we will propose an algorithm in Section 4.7, already alluded to in Section 4.3, which may be asymptotically optimal and in any case is experimentally faster than all the known algorithms.

In order to predict the run time of $\forall(p)$, it is natural to attempt to define a notion of size for $p$, and express the run time of $\forall(p)$ as a function of the size of $p$. At first sight, it may seem that the size of $p$ can be measured syntactically only, or at least only from semantic models with enough intensional information, such as games models without extensional collapse. For example, we can have two equivalent predicates $p$ and $p'$ such that $p(\alpha)$ “uses” $\alpha_{17}$ once, but $p'(\alpha)$ uses $\alpha_{17}$ twice. While the run times of $\forall(p)$ and $\forall(p')$ will be different, we argue that use-repetition doesn’t affect the asymptotic run-time behaviour of the algorithms considered here, and perhaps of all possible algorithms. Moreover, we argue that it is possible to define a natural notion of size based on Scott semantics that allows us to formulate predictions of the run times. We also argue that, although $p$ is assumed to be total in our considerations, a sensible notion of size cannot be defined in terms of Kleene–Kreisel semantics.

The size of a total predicate defined on the Cantor space. We propose a modification of the notion of modulus of uniform continuity as a notion of size for total predicates on the Cantor space. Two notions of modulus of uniform continuity on the Cantor set arise often. The first one, as in Corollary 2.3.6, says that there is a smallest number $n = \mathrm{fan}(p)$ such that for all total $\alpha, \beta \in B^\omega$, if $\alpha =_n \beta$ then $p(\alpha) = p(\beta)$.
The second one says that there is a smallest number $m = m(p)$ such that $p(\alpha) = p(\alpha|_m)$ for all total $\alpha \in B^\omega$. Then $\mathrm{fan}(p) \le m(p)$ holds.

Example 4.4.1. Define $p, p' \colon B^\omega \to B$ by
\[
p(\alpha) = \alpha_{17}, \qquad p'(\alpha) = \alpha_{17} \wedge (\alpha_{100} \vee \neg\alpha_{100}).
\]
Then $p \sim p'$ but $p \neq p'$, and $\mathrm{fan}(p) = m(p) = \mathrm{fan}(p') = 18$ but $m(p') = 101$. Moreover, notice that $p' \sqsubseteq p$, because for any total $\alpha$, one has $p(\alpha|_{17}) = \mathrm{tt}$ but $p'(\alpha|_{17}) = \bot$. One can say that $p'$ may look at position 100 of its argument $\alpha$ (this happens if $\alpha_{17} = \mathrm{tt}$), but $p$ doesn’t look at it. Notice that the fan-modulus is zero iff the predicate is constant on total elements, but that again the $m$-modulus of such a predicate can be bigger.

We consider refined versions of these two notions.

Definition 4.4.2. Given a set $I \subseteq \omega$, define, for $\alpha, \beta \in B^\omega$, $\alpha =_I \beta$ iff $\alpha_i = \beta_i$ for all $i \in I$, and
\[
\alpha|_I(i) =
\begin{cases}
\alpha_i & \text{if } i \in I,\\
\bot & \text{otherwise.}
\end{cases}
\]
Then there is a smallest set $I = \mathrm{FAN}(p)$ such that $\alpha =_I \beta$ implies $p(\alpha) = p(\beta)$ for all total $\alpha, \beta \in B^\omega$, and there is a smallest set $I = M(p)$ such that $p(\alpha) = p(\alpha|_I)$ for all total $\alpha \in B^\omega$. These sets exist and are finite by uniform continuity, and moreover $\mathrm{FAN}(p) \subseteq M(p) \subseteq \{i \mid i < m(p)\}$.
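The restriction operation of Definition 4.4.2 and the difference between two equivalent predicates can be illustrated by a Haskell sketch. It rests on several assumptions made only for this example: the undefined element $\bot$ is modelled by a call to `error`, the helper `observe` (hypothetical) turns divergence of this particular kind into `Nothing` by catching the exception, and positions 3 and 5 are used as scaled-down stand-ins for positions 17 and 100 of Example 4.4.1.

```haskell
import Control.Exception (SomeException, evaluate, try)

type N = Integer          -- assumption: natural numbers as Integer
type Cantor = N -> Bool

-- restrict is a: agrees with a on the positions in is, bottom elsewhere.
restrict :: [N] -> Cantor -> Cantor
restrict is a i = if i `elem` is then a i else error "bottom"

-- Scaled-down analogues of p and p' from Example 4.4.1.
p, p' :: Cantor -> Bool
p  a = a 3
p' a = a 3 && (a 5 || not (a 5))

-- Just b if the boolean is defined, Nothing if evaluating it hits bottom.
observe :: Bool -> IO (Maybe Bool)
observe b = do
  r <- try (evaluate b) :: IO (Either SomeException Bool)
  return (either (const Nothing) Just r)
```

With `a = const True`, the value `p (restrict [3] a)` is defined, whereas `p' (restrict [3] a)` is not, and `p' (restrict [3, 5] a)` is defined again. This matches the reading that, for these stand-ins, position 3 alone suffices for `p` while `p'` may also look at position 5.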
Example 4.4.3. For $p$ and $p'$ as defined in Example 4.4.1,
\[
\mathrm{FAN}(p) = M(p) = \mathrm{FAN}(p') = \{17\}, \qquad M(p') = \{17, 100\}.
\]
Similarly, $\mathrm{FAN}(\lambda\alpha.\,\mathrm{tt}) = M(\lambda\alpha.\,\mathrm{tt}) = \emptyset$. For the predicates $u, v, w$ defined in Section 4.3, we have that their FAN- and $M$-moduli agree and
\[
\begin{aligned}
M(u) = M(w) &= \{2^{20}, 5^{20}, 3^{20}\} \cup \bigl\{\textstyle\sum S \mid S \subseteq \{19, 399, 9177\}\bigr\},\\
M(v) &= \{2^{20}, 6^{20}, 3^{20}\} \cup \bigl\{\textstyle\sum S \mid S \subseteq \{19, 399, 9177\}\bigr\}.
\end{aligned}
\]
The $M$-moduli of the predicates $\lambda\alpha.\,u(\alpha) = v(\alpha)$ and $\lambda\alpha.\,u(\alpha) = w(\alpha)$ are respectively $M(u) \cup \{6^{20}\}$ and $M(u)$, which have cardinalities 12 and 11.

The set $\mathrm{FAN}(p)$ is uniformly decidable in any total $p$, with decision procedure
\[
i \in \mathrm{FAN}(p) \iff \exists \alpha, \beta.\ \alpha_i \neq \beta_i \wedge p(\alpha) \neq p(\beta).
\]
But the set $M(p)$ is not, because its characteristic function depends non-monotonically on $p$ (consider Example 4.4.1). But it is the set $M(p)$ that arises in the time-complexity considerations of this section. One notion of size for $p$ that we’ll consider is the cardinality $|M(p)|$ of the set $M(p)$. The other is the number $m(p)$.
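Deciding membership in $\mathrm{FAN}(p)$ by exhaustive search can be sketched in Haskell as follows. The sketch makes two assumptions that are not taken from the text: it reads the decision procedure as "there is a total sequence on which flipping position $i$ alone changes the value of $p$", and it uses a Berger-style search, which is only practical for predicates that look at small positions. The predicate `p'` is a scaled-down stand-in for the one in Example 4.4.1, with positions 3 and 5 in place of 17 and 100; `flipAt` and `inFAN` are hypothetical names.

```haskell
type N = Integer          -- assumption: natural numbers as Integer
type Cantor = N -> Bool

(#) :: Bool -> Cantor -> Cantor
(x # a) i = if i == 0 then x else a (i - 1)

-- Berger-style selection function and the derived existential quantifier.
find :: (Cantor -> Bool) -> Cantor
find p = if p l then l else r
  where l = True  # find (\a -> p (True  # a))
        r = False # find (\a -> p (False # a))

forsome :: (Cantor -> Bool) -> Bool
forsome p = p (find p)

-- Flip the bit at position i and leave all other positions unchanged.
flipAt :: N -> Cantor -> Cantor
flipAt i a j = if j == i then not (a j) else a j

-- i is in FAN(p) iff some sequence changes its p-value when bit i is flipped.
inFAN :: (Cantor -> Bool) -> N -> Bool
inFAN p i = forsome (\a -> p a /= p (flipAt i a))

-- Scaled-down analogue of p' from Example 4.4.1.
p' :: Cantor -> Bool
p' a = a 3 && (a 5 || not (a 5))
```

For this stand-in, `[i | i <- [0 .. 7], inFAN p' i]` yields `[3]`: position 5 is inspected by `p'` but never influences its value on total inputs, so it does not belong to the FAN-modulus.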
4.5
The constant-cost hypothesis
We refer to the following, for either PCF or a subset of Haskell corresponding to PCF, as the constant-cost conjecture:

Conjecture 4.5.1. For any total program $p \colon B^\omega \to B$ there is a number $n$ such that, for any program or oracle $\alpha$, the evaluation of $p(\alpha)$ takes $n$ steps or fewer.

Here we are deliberately confusing syntax with semantics, but we hope the reader will both excuse us and be able to resolve or ignore the confusion. The idea is that, operationally speaking, the computation of $p(\alpha)$ ought not to be able to look at $\alpha_i$ for $i$ outside $M(p)$, and hence there is a bound for how much of $\alpha$ the predicate $p$ can look at, which is determined by $p$ independently of $\alpha$, even if $\alpha_i$ is evaluated for the same $i \in M(p)$ more than once. Although we are officially working with the call-by-need model, our intuitive argument applies to the call-by-name model too, for which it should be easier to establish the conjecture, perhaps relying on computational adequacy of the Scott model or a related technique. Then the result would follow from the belief that call-by-need is bounded by call-by-name in number of evaluation steps, which should be true but, at the time of writing, we don’t know whether it has been rigorously formulated or proved.

Hypothesis 4.5.2. We assume that such a number $n$ exists, and we denote the smallest such number by $E(p)$.
4.6
Berger’s algorithm run time
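Recall Berger’s algorithm $\varepsilon^{\mathrm{Berger}}$ defined in Remark 2.3.8, and consider the following simple modification, which avoids re-evaluation of $l$ in the call-by-need model of computation:
\[
\varepsilon^{\mathrm{Berger}'}(p) =
\begin{cases}
l & \text{if } p(l),\\
r & \text{otherwise,}
\end{cases}
\]
where
\[
\begin{aligned}
l &= \mathrm{tt} \mathbin{\#} \varepsilon^{\mathrm{Berger}'}(\lambda\alpha.\,p(\mathrm{tt} \mathbin{\#} \alpha)),\\
r &= \mathrm{ff} \mathbin{\#} \varepsilon^{\mathrm{Berger}'}(\lambda\alpha.\,p(\mathrm{ff} \mathbin{\#} \alpha)).
\end{aligned}
\]
Denote by $\exists^{\mathrm{Berger}}$ and $\exists^{\mathrm{Berger}'}$ the exhaustors derived from $\varepsilon^{\mathrm{Berger}}$ and $\varepsilon^{\mathrm{Berger}'}$ respectively as in Lemma 2.2.4, and by $\forall^{\mathrm{Berger}}$ and $\forall^{\mathrm{Berger}'}$ the universal quantifiers derived from De Morgan’s law.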
Experiment 4.6.1. $\forall^{\mathrm{Berger}}(\lambda\alpha.\,\alpha_n)$ runs in time exponential in $n$, but $\forall^{\mathrm{Berger}'}(\lambda\alpha.\,\alpha_n)$ runs in time linear in $n$. However, $\exists^{\mathrm{Berger}}(\lambda\alpha.\,\alpha_n == \alpha_n)$ and $\exists^{\mathrm{Berger}'}(\lambda\alpha.\,\alpha_n == \alpha_n)$ run in time exponential in $n$ and roughly equal. For example, for $n = 12, 13, 14, 15, 16$ the times are 0.22, 0.45, 0.93, 1.97, 4.19 secs. Both $\forall^{\mathrm{Berger}}(\lambda\alpha.\,\bigwedge_{j=1}^{k} \alpha_n == \alpha_n)$ and $\forall^{\mathrm{Berger}'}(\lambda\alpha.\,\bigwedge_{j=1}^{k} \alpha_n == \alpha_n)$ run in time linear in $k$ and exponential in $n$. For $n = 12$ and $k = 1, 2, 3, 4, 5$ one gets 0.23, 0.39, 0.55, 0.72, 0.86 seconds. Thus, apart from costs in evaluating the predicate, $k$ doesn’t affect the run time.

Similar examples lead to:

Conjecture 4.6.2. Exhaustive search $\forall(p)$ with both Berger’s algorithm and the above variation runs in time $O\bigl(2^{m(p)} \cdot E(p)\bigr)$.

Recall that $E(p)$ is defined in Section 4.5 and $m(p)$ in Section 4.4. The intuition behind the conjecture is that Berger’s algorithm and its variation explicitly construct a witness from left to right, and all witnesses of length smaller than the modulus of continuity are constructed in the worst case. In order to prove this conjecture and others, one could attempt to establish a connection between the denotational notion of modulus of continuity and the operational notion of call-by-need evaluation, similar to the notion of computational adequacy, or perhaps even generalizing the notion of computational adequacy. Notice that all our conjectures are for ground terms with a higher-type parameter.
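As a rough consistency check, the timings of Experiment 4.6.1 can be compared with the shape of the conjectured bound. The following worked arithmetic is only a sketch: it writes $p_n$ for $\lambda\alpha.\,\alpha_n == \alpha_n$ and $T(n)$ for the measured time (both notations are introduced here), uses $m(p_n) = n + 1$ since $p_n$ looks at position $n$, and assumes that $E(p_n)$ changes little between consecutive values of $n$.

```latex
% Predicted ratio under Conjecture 4.6.2, assuming E(p_{n+1}) \approx E(p_n):
\[
  \frac{T(n+1)}{T(n)} \;\approx\;
  \frac{2^{\,n+2}\cdot E(p_{n+1})}{2^{\,n+1}\cdot E(p_n)} \;\approx\; 2 .
\]
% Observed ratios for n = 12, ..., 16 (times 0.22, 0.45, 0.93, 1.97, 4.19 secs):
\[
  \frac{0.45}{0.22} \approx 2.05, \qquad
  \frac{0.93}{0.45} \approx 2.07, \qquad
  \frac{1.97}{0.93} \approx 2.12, \qquad
  \frac{4.19}{1.97} \approx 2.13 .
\]
% Observed increments in k for n = 12 (times 0.23, 0.39, 0.55, 0.72, 0.86 secs):
\[
  0.39 - 0.23 = 0.16, \quad
  0.55 - 0.39 = 0.16, \quad
  0.72 - 0.55 = 0.17, \quad
  0.86 - 0.72 = 0.14 .
\]
```

The ratios close to 2 are what doubling per unit of $m(p)$ would give, and the roughly constant increments in $k$ are what a cost linear in $k$ would give.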
4.7
Product-algorithms run time
We consider the product algorithm (Definition 2.3.4) and two variations that are intended to asymptotically improve its run time, the first of which showed up in the proof of Theorem 2.3.5.

The algorithm Π. As in Remark 2.3.8, use the product algorithm Π to define exhaustive search over the Cantor space.

Experiment 4.7.1. We consider the computation of $\forall(\lambda\alpha.\,\bigwedge_{i}$ [...]

    type Searcher d = (d -> Bool) -> d
    type Quantifier d = (d -> Bool) -> Bool
Berger’s functional and a variation:

    -- Prepend an element to a sequence; drop the head of a sequence.
    (x # a)(i) = if i == 0 then x else a(i-1)
    tl a = \i -> a(i+1)

    berger, berger' :: Searcher Cantor

    berger p = if p(True # berger(\a -> p(True # a)))
               then True  # berger(\a -> p(True  # a))
               else False # berger(\a -> p(False # a))

    -- Variation: l is named once, so call-by-need shares its evaluation.
    berger' p = if p l then l else r
      where l = True  # berger'(\a -> p(True  # a))
            r = False # berger'(\a -> p(False # a))
The three product functionals:

    prod, prod', prod'' :: (N -> Searcher d) -> Searcher(N -> d)

    prod e p n = e n (\x->q n x(prod(\i->e(i+n+1))(q n x)))
      where q n x a = p(\i -> if i < n then prod e p i
                              else if i == n then x
                              else a(i-n-1))

    prod' e p = x#(prod'(tl e)(\a->p(x#a)))
      where x = e 0(\x->p(x#(prod'(tl e)(\a->p(x#a)))))

    -- A sequence viewed as a binary tree: root at 0, subtrees at odd and even positions.
    branch x l r n = if n == 0 then x
                     else if odd n then l ((n-1) `div` 2)
                     else r ((n-2) `div` 2)

    root t = t 0
    left t = \n -> t(2 * n + 1)
    right t = \n -> t(2 * n + 2)

    prod'' t p = branch x l r
      where findx = root t
            findl = prod''(left t)
            findr = prod''(right t)
            forsomel p = p(findl p)
            forsomer p = p(findr p)
            x = findx(\x -> forsomel(\l -> forsomer(\r -> p(branch x l r))))
            l = findl(\l -> forsomer(\r -> p(branch x l r)))
            r = findr(\r -> p(branch x l r))
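The tree view of sequences used by the third product functional can be checked in isolation. The sketch below restates `branch`, `root`, `left` and `right` with explicit type signatures (the signatures and the choice `N = Integer` are assumptions added here) and tests that splitting a sequence into its root and two subtrees and reassembling it with `branch` gives back the same sequence; `roundTrip` and the sample sequence are hypothetical.

```haskell
type N = Integer          -- assumption: natural numbers as Integer

-- Position 0 holds the root; odd positions hold the left subtree and
-- positive even positions the right subtree.
branch :: d -> (N -> d) -> (N -> d) -> (N -> d)
branch x l r n
  | n == 0    = x
  | odd n     = l ((n - 1) `div` 2)
  | otherwise = r ((n - 2) `div` 2)

root :: (N -> d) -> d
root t = t 0

left, right :: (N -> d) -> (N -> d)
left  t = \n -> t (2 * n + 1)
right t = \n -> t (2 * n + 2)

-- Reassembling the three parts of a sequence recovers it pointwise,
-- checked on the positions 0 .. bound of an arbitrary sample sequence.
roundTrip :: N -> Bool
roundTrip bound =
  and [ branch (root t) (left t) (right t) n == t n | n <- [0 .. bound] ]
  where t n = n * n + 7
```

The point of this layout is that position $n$ of the reassembled sequence is reached through a path of length about $\log_2 n$ rather than $n$, which is presumably what the variation exploits.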
A search operator for booleans:

    findbool :: Searcher Bool
    findbool p = p True
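That `p True` is a correct search operator for the booleans can be checked by brute force, since there are only four predicates on a two-element set. The sketch below spells out the type of `findbool` without the `Searcher` synonym so that it is self-contained; `predicates` and `searcherSpec` are hypothetical names. The property tested is that `findbool p` satisfies `p` whenever some boolean does.

```haskell
findbool :: (Bool -> Bool) -> Bool
findbool p = p True

-- All four predicates on the booleans.
predicates :: [Bool -> Bool]
predicates = [const False, id, not, const True]

-- p (findbool p) holds exactly when p holds for True or for False.
searcherSpec :: Bool
searcherSpec = and [ p (findbool p) == (p True || p False) | p <- predicates ]
```

The reasoning is short: if `p True` holds, the answer `True` is a witness; otherwise the answer is `False`, which is the only remaining candidate.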
We have five versions of search operators for the Cantor space and of each quantifier. The version of the chosen algorithm is ranged over by v.

    type Version = Int

    find :: Version -> Searcher Cantor
    find 0 = berger
    find 1 = berger'
    find 2 = prod   (\i -> findbool)
    find 3 = prod'  (\i -> findbool)
    find 4 = prod'' (\i -> findbool)

    forsome, forevery :: Version -> Quantifier Cantor
    forsome v p = p(find v p)
    forevery v p = not(forsome v (not.p))

Function equality again comes in five versions:

    equal :: Eq y => Version -> (Cantor -> y) -> (Cantor -> y) -> Bool
    equal v f g = forevery v(\x -> f x == g x)

The first experiment:

    coerce :: Bool -> N
    coerce x = if x then 1 else 0

    u,v,w :: Cantor -> Bool
    u a = a(19*a'(2^20)+399*a'(5^20)+9177*a'(3^20)) where a' i = coerce(a i)
    v a = a(19*a'(2^20)+399*a'(6^20)+9177*a'(3^20)) where a' i = coerce(a i)
    w a = a k
      where i = if a(3^20) then 483 else 0
            j = 19 * if a(2^20) then 1+i else i
            k = j + 19 * if a(5^20) then 21 else 0
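Why `u` and `w` denote the same function can be checked directly on the index computations, independently of any search algorithm. The sketch below isolates the position at which each of them inspects its argument, as a function of the three bits `a(2^20)`, `a(5^20)` and `a(3^20)`; the names `indexU`, `indexW`, `indicesAgree` and `subsetSums` are hypothetical. The identities behind the agreement are 19 · 483 = 9177 and 19 · 21 = 399.

```haskell
import Data.List (nub)

-- The three arguments stand for the bits a(2^20), a(5^20), a(3^20).
indexU, indexW :: Bool -> Bool -> Bool -> Integer

-- Position inspected by u.
indexU x y z = 19 * c x + 399 * c y + 9177 * c z
  where c b = if b then 1 else 0

-- Position inspected by w, following the definitions of i, j and k.
indexW x y z = k
  where i = if z then 483 else 0
        j = 19 * (if x then 1 + i else i)
        k = j + 19 * (if y then 21 else 0)

-- The two index computations agree on all eight combinations of bits.
indicesAgree :: Bool
indicesAgree =
  and [ indexU x y z == indexW x y z | x <- bs, y <- bs, z <- bs ]
  where bs = [False, True]

-- The positions that can be inspected: the subset sums of {19, 399, 9177}.
subsetSums :: [Integer]
subsetSums = nub [ indexU x y z | x <- bs, y <- bs, z <- bs ]
  where bs = [False, True]
```

The eight subset sums are pairwise distinct, which together with the three positions `2^20`, `5^20` and `3^20` accounts for the cardinality 11 of $M(u)$ stated in Section 4.4.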
The other experiments are coded similarly.
5
Concluding remarks
The role of topology in computability theory. The algorithms developed in this work have purely computational specifications, which allow them to be applied without knowledge of specialized mathematical techniques in the theory of computation. However, the correctness proofs of some of the algorithms crucially rely on topological techniques. In this sense, this work is a genuine application of topology to computation: theorems formulated in the language of computation, proofs developed in the language of topology. But there is another sense in which topology proves to play a crucial role. Compact sets in topology are advertised as sets that behave, in many important respects, as if they were finite. Then exhaustively searchable sets ought to be compact. And compact sets are known to be closed under continuous images and under finite and infinite products. Moreover, for countably based Hausdorff spaces, they are the continuous images of the Cantor space. Hence searchable sets ought to have corresponding closure properties and characterization, which is what this work establishes, motivated by these considerations. Thus, at a more abstract level, topology is applied as a paradigm for discovering unforeseen notions, algorithms and theorems in computability theory.

The role of topology in complexity. Moreover, topology also plays a role in higher-type complexity: we have applied the notion of uniform continuity to measure the size of functional inputs in the formulation of run times for higher-type algorithms.
Operational perspective. The correctness proofs of Section 2 can be directly interpreted in the operational setting [7, 8]. But a development of operational counterparts for those of Section 3 is left as an open problem. This requires an operational reworking of Section 3.3, which seems challenging.

Programming languages for total and partial computation. As we have seen, Kleene–Kreisel functionals and Scott domains live in the cartesian closed category of compactly generated spaces. Additionally, the inclusion of Scott domains (the Scott-topology functor) into the category of compactly generated spaces preserves the cartesian-closed structure [9, 2]. Hence total and partial higher-type functionals coexist in the same cartesian closed category. One can envisage a higher-type system that simultaneously incorporates total and partial objects, and corresponding PCF-style languages. Among the formation rules, it would make sense to stipulate that σ → τ is a partial type whenever τ is a partial type. In its simplest form, such a language could include Gödel’s system T for total types and PCF for partial types. Such a language would have simplified, and made more transparent, much of the development of Section 3, where we could have benefited from functionals that take total inputs and produce potentially partial outputs. Such functionals are actually total, but their construction uses modes of definition that belong to the realm of partial computation. The system-T fragment could be further extended with computable functionals such as bar recursion and some of those developed here, once one has shown they are indeed total.

Semantics of practical programming languages. For experimental considerations regarding theoretical conjectures, it is important that the programming language under consideration has precisely specified denotational and/or operational semantics.
For correctness proofs, a denotational semantics is enough, but, for complexity considerations, one also needs a sufficiently concrete operational semantics. Very often, and particularly in higher-type computation, an abstract operational semantics is useless for complexity considerations. While it may be rather hard in practice to prove that an implementation of the language respects a given semantics, one should expect the implementer of the language to at least promise that they intend the language to satisfy a precisely defined semantical specification, as is the case of the language ML. In Section 4, we could have been more precise in our experiments regarding correctness and complexity if the Haskell definition had included such a semantics. Run-time behaviour in the higher-type setting. Continuing from the previous paragraph, of course, it is natural to suppose that such definitions of a language will be backed by theoretical work, such as [25, 15, 18, 30]. But, even if the specifications of practical languages such as Haskell were based on this kind of work, this wouldn’t have been enough to prove the conjectures of Section 4. At present, we lack mathematical technology to reason about the run-time behaviour of higher-type programs under call-by-need evaluation (the Haskell programming community is well aware of this). For higher-type computability theory to have a tangible influence in computer science, these questions have to be investigated and elucidated.
References

[1] S. Abramsky and A. Jung. Domain theory. In S. Abramsky, D.M. Gabbay, and T.S.E. Maibaum, editors, Handbook of Logic in Computer Science, volume 3 of Oxford science publications, pages 1–168. 1994.
[2] I. Battenfeld, M. Schröder, and A. Simpson. A convenient category of domains. In Computation, meaning, and logic: articles dedicated to Gordon Plotkin, volume 172 of Electron. Notes Theor. Comput. Sci., pages 69–99. Elsevier, Amsterdam, 2007.
[3] A. Bauer, M.H. Escardó, and A.K. Simpson. Comparing functional paradigms for exact real-number computation. volume 2380 of Lect. Not. Comp. Sci., pages 488–500, 2002.
[4] M.J. Beeson. Foundations of Constructive Mathematics. Springer, 1985.
[5] U. Berger. Totale Objekte und Mengen in der Bereichstheorie. PhD thesis, Mathematisches Institut der Universität München, 1990.
[6] U. Berger. Total sets and objects in domain theory. Ann. Pure Appl. Logic, 60(2):91–117, 1993.
[7] M.H. Escardó. Synthetic topology of data types and classical spaces. Electron. Notes Theor. Comput. Sci., 87:21–156, 2004.
[8] M.H. Escardó and W.K. Ho. Operational domain theory and topology of a sequential programming language. In Proceedings of the 20th Annual IEEE Symposium on Logic In Computer Science, pages 427–436, 2005.
[9] M.H. Escardó, J. Lawson, and A. Simpson. Comparing Cartesian closed categories of (core) compactly generated spaces. Topology Appl., 143(1-3):105–145, 2004.
[10] D. Gale. Compact sets of functions and function rings. Proc. Amer. Math. Soc., 1:303–308, 1950.
[11] R.O. Gandy and J.M.E. Hyland. Computable and recursively countable functions of higher type. In Logic Colloquium 76 (Oxford, 1976), pages 407–438. Studies in Logic and Found. Math., Vol. 87. North-Holland, Amsterdam, 1977.
[12] G. Gierz, K.H. Hofmann, K. Keimel, J.D. Lawson, M. Mislove, and D.S. Scott. Continuous Lattices and Domains. Cambridge University Press, 2003.
[13] G. Hutton. Programming in Haskell. Cambridge University Press, 2007.
[14] J.L. Kelley. General Topology. D. van Nostrand, New York, 1955.
[15] J. Launchbury. A natural semantics for lazy evaluation. In Proceedings of the Twentieth Annual ACM SIGPLAN–SIGACT on Principles of Programming Languages, pages 144–154. ACM Press, 1993.
[16] J.R. Longley. Notions of computability at higher types. I. In Logic Colloquium 2000, volume 19 of Lect. Notes Log., pages 32–142. Assoc. Symbol. Logic, Urbana, IL, 2005.
[17] J.R. Longley. On the ubiquity of certain type structures. Mathematical Structures in Computer Science, 17:841–953, 2007.
[18] J. Maraist, M. Odersky, and P. Wadler. The call-by-need lambda calculus. J. Funct. Programming, 8(3):275–317, 1998.
[19] D. Normann. Recursion on the countable functionals, volume 811 of Lec. Not. Math. Springer, 1980.
[20] D. Normann. Computability over the partial continuous functionals. J. Symbolic Logic, 65(3):1133–1142, 2000.
[21] D. Normann. Comparing hierarchies of total functionals. Logical Methods in Computer Science, 1(2):1–28, 2005.
[22] D. Normann. Comparing hierarchies of total functionals. Log. Methods Comput. Sci., 1(2):2:4, 28, 2005.
[23] D. Normann. Computing with functionals—computability theory or computer science? Bull. Symbolic Logic, 12(1):43–59, 2006.
[24] G.D. Plotkin. Call-by-name, call-by-value and the λ-calculus. Theoret. Comput. Sci., 1(2):125–159, 1975.
[25] G.D. Plotkin. LCF considered as a programming language. Theoret. Comput. Sci., 5(1):223–255, 1977.
[26] G.D. Plotkin. Tω as a universal domain. J. Comput. System Sci., 17:209–236, 1978.
[27] G.D. Plotkin. Full abstraction, totality and PCF. Math. Structures Comput. Sci., 9(1):1–20, 1999.
[28] D.S. Scott. Data types as lattices. SIAM J. Comput., 5:522–587, 1976.
[29] D.S. Scott. A type-theoretical alternative to CUCH, ISWIM and OWHY. Theoret. Comput. Sci., 121:411–440, 1993. Reprint of a 1969 manuscript.
[30] F.-R. Sinot. Call-by-need in token-passing nets. Math. Structures Comput. Sci., 16(4):639–666, 2006.
[31] M.B. Smyth. Topology. In S. Abramsky, D.M. Gabbay, and T.S.E. Maibaum, editors, Handbook of Logic in Computer Science, volume 1 of Oxford science publications, pages 641–761. 1992.
Contents

1 Introduction
2 Topologically inspired algorithms
2.1 Background on higher-type computation
2.2 Exhaustible and searchable sets
2.3 Building new searchable sets from old
3 Topologically based algorithms
3.1 Background on compactly generated spaces
3.2 Background on Kleene–Kreisel functionals
3.3 Compactness of exhaustible sets
3.4 Some computational properties of exhaustible sets
3.5 Arzela–Ascoli type characterizations of compact sets
3.6 Arzela–Ascoli type characterization of exhaustible sets
4 Time-complexity considerations
4.1 The model of computation
4.2 The experimental setup
4.3 A first experiment
4.4 Run-time as a function of the size of a functional input
4.5 The constant-cost hypothesis
4.6 Berger’s algorithm run time
4.7 Product-algorithms run time
4.8 Haskell code for the experiments
5 Concluding remarks
This file, a Haskell file, and more experimental results can be found at:
http://www.cs.bham.ac.uk/~mhe/papers/exhaustive-journal.pdf
http://www.cs.bham.ac.uk/~mhe/papers/exhaustive-journal.hs
http://www.cs.bham.ac.uk/~mhe/papers/exhaustive-experimental-results.txt