A HAUSDORFF DIMENSION FOR FINITE SETS

arXiv:1508.02946v1 [cs.DM] 12 Aug 2015

JUAN M. ALONSO

Abstract. The classical Hausdorff dimension, denoted dim_H, of finite or countable sets is zero. We define an analog for finite sets, called finite Hausdorff dimension and denoted dim_fH, which is non-trivial. It turns out that a finite bound for dim_fH(F) guarantees that every point of F has "nearby" neighbors. This property is important for many computer algorithms of great practical value that obtain solutions by finding nearest neighbors. We also define dim_fB, an analog for finite sets of the classical box-counting dimension, and compute examples. The main result of the paper is a Convergence Theorem. It gives conditions under which, if F_n → X (convergence of compact subsets of R^n in the Hausdorff metric), then dim_fH(F_n) → dim_H(X).

1. Introduction

The initial motivation for this work was concentration of distance. This is a particular instance of the curse of dimensionality, a term coined by Richard Bellman in [2] to refer to various phenomena that arise in high-dimensional vector spaces. When searching for nearest neighbors, concentration of distance usually refers to the following: as the dimensionality of the data increases, the longest and shortest distances between points tend to become so close that the distinction between "near" and "far" becomes meaningless. The lack of a clear contrast between distances to a query point compromises the quality of the search. The problem is a long-standing one in database research [1, 3, 18]. Awareness of this threat is spreading to other domains; in particular, major concerns have been raised in Cancer Research [5]. This has prompted quite a bit of research aimed at better understanding both the problem and its implications [7, 9, 11, 12, 14, 15, 16, 18, 19].
In the papers cited above, concentration of distance is studied probabilistically. Data (a finite metric space embedded in a high-dimensional vector space) typically has some sort of structure that varies quite a lot depending on the data source (the different domains of application). We thought it worthwhile to try to understand this structure, however subtle it might be, in a more geometric vein. The ideas inherent in the study of fractals and fractal dimension seemed particularly appealing to us. In this regard [17] was inspiring: the authors study "the form" of Word Space (one of the finite metric spaces mentioned above) using Statistics and a fractal dimension defined by considering "the underlying space as a continuum and randomly making a finite number of observations from which one tries to obtain a

2010 Mathematics Subject Classification. 68R99 (primary), 28A78, 68P10 (secondary).
Keywords and phrases. Finite metric spaces, finite Hausdorff dimension, finite Box-counting dimension, Hausdorff metric, concentration of distance, nearest point search.
Acknowledgements. The author would like to thank the support of Secretaría de Ciencia, Técnica y Posgrado (SeCTyP) at UN Cuyo.


maximum likelihood approximation of the underlying dimension" [13]. This is the usual way, when estimating dimension, of coping with finiteness [6].
In a radical departure from the classical theory, we decided to start directly with finite metric spaces. The problem, of course, is that the Hausdorff dimension of finite sets is zero. In this paper we define finite Hausdorff dimension, a non-trivial analog for finite spaces of the Hausdorff dimension. For the classical theory of Hausdorff and other fractal dimensions, see [8] and the bibliography therein. Throughout the paper we use X, Y, etc., to denote arbitrary metric spaces, and reserve F, F′, etc., to denote finite ones.
Here is a summary of the contents of the paper. We begin Section 2 by recalling the definition of the classical Hausdorff measure and dimension. We then introduce 2-coverings. This key modification of the classical notion is responsible for making dim_fH(F) non-zero on most finite F (in fact, dim_fH(F) = 0 if and only if F has a single point). Two basic notions for this work, covering diameter and focal points, are also defined. The section ends with a brief discussion of how the results of the paper apply to the motivating problem: concentration of distance and the search for nearest neighbors.
Section 3 deals with the definition of finite Hausdorff dimension. Following Hausdorff's steps, we introduce H^s(F), an analog of Hausdorff's outer measure H^s(X). In contrast to the classical case, H^s(F) is not a measure. In Section 3.2 we study the behaviour of H^s(F) under Hölder equivalences. The definition of dim_fH(F) proper is given in Section 3.3. In the classical case, H^s(X) has a "natural" break-point s_0 := dim_H(X), with the property that H^s(X) = 0, for all s > s_0, and H^s(X) = ∞, for all s < s_0. There is no such break-point in the finite case, hence the need to "manufacture" one. For this purpose, we consider the equation:

(1.0.1)   H^s(F) = ∆(F)^s,

and solve for s, where ∆(F) denotes the diameter of F. It turns out that (1.0.1) has a unique solution s_0 if and only if F has no focal points. The solution is a positive real number, and we set dim_fH(F) := s_0 (cf. Theorem 3.15).
In Section 4 we introduce, following the same pattern, finite box-counting dimension, dim_fB(F). As in the classical case, there is an explicit formula to compute dim_fB(F). In Section 5 we show that dim_fH(F) ≤ dim_fB(F), just as in the classical case. We define locally uniform spaces, and show this is a class of spaces where finite Hausdorff and finite box-counting dimensions coincide. Both finite dimensions are easier to compute for these spaces, and we even have an explicit formula. Several examples are computed in Section 6, together with results of a more general nature. For instance, we show that every non-negative extended real number is the finite Hausdorff [resp. finite box-counting] dimension of some finite space.
In Section 7 we prove the Convergence Theorems, the main results of the paper. Any compact space X ⊆ R^n can be approximated, in the Hausdorff metric, by a sequence {F_k} of locally uniform spaces (Proposition 7.13). In Section 7.3 we prove, under some extra conditions on X (cf. Theorem 7.17), that

lim_{k→∞} dim_fB(F_k) = dim_B(X),

and when, moreover, X is the attractor of an IFS (cf. Theorem 7.18), that

lim_{k→∞} dim_fH(F_k) = dim_H(X).


2. 2-coverings and focal points

For the benefit of the reader, we start by reviewing the classical definitions of Hausdorff measure and dimension (see [8] for details). All subsets of R^n are metric spaces with the distance d induced from R^n. Recall that the diameter of a non-empty subset U of R^n is defined as ∆(U) := sup{d(x, y) | x, y ∈ U}. Let U = {U_i}_{i=1}^∞ be a countable family of non-empty subsets of R^n. We say that U has diameter at most δ, denoted ∆(U) ≤ δ, if ∆(U_i) ≤ δ for all i. The family U is called a covering of a subset X of R^n if X ⊆ ∪_{i=1}^∞ U_i. For a covering U of X, and a number s ≥ 0, we use the notation H^s_U(X) := Σ_{i=1}^∞ ∆(U_i)^s. Given a subset X ⊆ R^n and numbers s ≥ 0 and δ > 0, we define

H^s_δ(X) := inf{ H^s_U(X) | U is a cover of X, and ∆(U) ≤ δ }.

For fixed s, H^s_δ(X) clearly does not decrease as δ decreases, hence the limit as δ → 0 exists, and we define:

H^s(X) := lim_{δ→0} H^s_δ(X) = sup_{δ>0} H^s_δ(X).

Note that H^s(X) is defined for any subset X of R^n, and it is an extended number in [0, ∞]; it is called the s-dimensional Hausdorff (outer) measure of X. It turns out that there exists a critical value s_0 where H^s(X) jumps from ∞ to 0. More precisely, for all s > s_0, H^s(X) = 0, and for all s < s_0, H^s(X) = ∞. The Hausdorff dimension of X is defined to be this critical value: dim_H(X) := s_0, and we have

H^s(X) = ∞, if 0 ≤ s < dim_H(X);   H^s(X) = 0, if s > dim_H(X).

2.1. Finite metric spaces. Let F denote a finite metric space. Unless explicit mention to the contrary, F will be assumed to contain at least two elements. We usually assume F is contained in some metric space from which it inherits its metric. Although the finite dimensions are strongly dependent on the metric (cf. Example 6.7), we sometimes refer to F as a set. The separation of F, i.e. the minimum distance between distinct points of F, will be denoted δ(F). Note that 0 < δ(F) ≤ ∆(F) < ∞. We let |F| denote the number of elements of F. The next definition is basic for this work.

Definition 2.1. A 2-covering of F is a family U = {U_i | i ∈ I} of subsets of F satisfying
(i) F = ∪_{i∈I} U_i,
(ii) |U_i| ≥ 2, for all i ∈ I.

Remark 2.2. In condition (ii) we depart from the classical definition. It is thanks to this condition that non-trivial dimensions can be assigned to finite spaces. Note that (ii) is equivalent to U_i having positive diameter. Finally, note that I is finite, since U ⊆ P(F), the power set of F.

We denote the set of all 2-coverings of F by K(F). There is exactly one 2-covering which consists of one element, denoted U_0 = {F}. Notice that U_0 ∈ K(F) because |F| ≥ 2.
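Definition 2.1 is easy to check mechanically. The following sketch (plain Python; the helper name `is_2_covering` is ours, not the paper's) tests both conditions for a candidate family of subsets of a finite set of points in the plane.

```python
def is_2_covering(F, family):
    """Check Definition 2.1: the family covers F and every part has >= 2 points."""
    covers = set().union(*family) == set(F)
    big_enough = all(len(U) >= 2 for U in family)
    return covers and big_enough

F = [(0, 0), (1, 0), (0, 1), (3, 3)]
U0 = [frozenset(F)]                                # the one-element 2-covering {F}
U1 = [frozenset(F[:2]), frozenset(F[2:])]
assert is_2_covering(F, U0)
assert is_2_covering(F, U1)
assert not is_2_covering(F, [frozenset({F[0]})])   # singleton part violates (ii)
```

Since U ⊆ P(F), a brute-force enumeration of K(F) is possible for tiny F, though exponential in |F|.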


Definition 2.3. Let U ∈ K(F). The diameter of U, denoted ∆(U), is defined by ∆(U) := max{∆(U_i) | U_i ∈ U}.

Definition 2.4. The covering diameter of F, denoted ∇(F), is defined by ∇(F) := min{∆(U) | U ∈ K(F)}.

Remark 2.5. Note that 0 < δ(F) ≤ ∇(F) ≤ ∆(F). Given δ > 0, we let K_δ(F) denote the set of 2-coverings of F with diameter at most δ: K_δ(F) := {U ∈ K(F) | ∆(U) ≤ δ}.

Lemma 2.6. Suppose F is a finite set, and δ > 0. The following conditions are equivalent: (i) K_δ(F) ≠ ∅, (ii) δ ≥ ∇(F).

Proof. Obvious. □

Corollary 2.7. ∇(F) = min{δ | K_δ(F) ≠ ∅}.

Definition 2.8. Let ν_F : F → R denote the function that gives the distance of a point to a nearest neighbor (in F). It is defined by ν_F(a) := min{d(a, x) | x ∈ F, x ≠ a}.

Lemma 2.9. Given a finite space F, suppose r > 0 satisfies the condition that ν_F(a) ≤ r, for all a ∈ F. Then K_r(F) is not empty.

Proof. Given a ∈ F, choose x_a ∈ F such that d(a, x_a) = ν_F(a), and define U_a := {a, x_a}. It follows that U := {U_a | a ∈ F} is a 2-covering, and ∆(U_a) ≤ r. In other words, K_r(F) ≠ ∅, as desired. □

Remark 2.10. We note, for further reference, that the 2-covering constructed in 2.9 has |F| elements. The reader can easily modify the construction and show that F has a 2-covering with at most |F| − 1 elements. In general this number is optimal: for example, F := {(0,0), (1,0), (0,1), (−1,0), (0,−1)} ⊆ R^2 cannot be 2-covered with fewer than 4 sets.

Proposition 2.11. Let F and ν_F be as above. Then:
(i) min{ν_F(a) | a ∈ F} = δ(F).
(ii) max{ν_F(a) | a ∈ F} = ∇(F).

Proof. We simplify the notation by setting m(F) := min{ν_F(a) | a ∈ F}, and M(F) := max{ν_F(a) | a ∈ F}. To prove (i), notice that ν_F(a) ≥ δ(F), for all a ∈ F, so that m(F) ≥ δ(F). On the other hand, δ(F) = d(a_0, a_1), for some a_0, a_1 ∈ F. Then m(F) ≤ ν_F(a_0) ≤ d(a_0, a_1) = δ(F), as desired.
To prove (ii), take r = M(F) in Lemma 2.9, and let U be the 2-cover constructed in that lemma. By Lemma 2.6, M(F) ≥ ∇(F). To prove the reverse inequality, let ∇(F) = ∆(V), for some 2-covering V. Take an arbitrary a ∈ F, and suppose that a ∈ V_i ∈ V. Then, for all b ∈ V_i, b ≠ a, we have:

ν_F(a) ≤ d(a, b) ≤ ∆(V_i) ≤ ∆(V) = ∇(F).

It follows that M(F) ≤ ∇(F), as desired. This concludes the proof. □
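Proposition 2.11 gives a direct way to compute the separation δ(F) and the covering diameter ∇(F) from the nearest-point distances ν_F. A minimal sketch (our helper names; Euclidean metric assumed):

```python
from math import dist  # Euclidean distance (Python 3.8+)

def nu(F):
    """Nearest-point distances: nu_F(a) = min over x != a of d(a, x) (Definition 2.8)."""
    return {a: min(dist(a, x) for x in F if x != a) for a in F}

def separation(F):          # delta(F): minimum distance between distinct points
    return min(nu(F).values())

def covering_diameter(F):   # nabla(F) = max_a nu_F(a), by Proposition 2.11(ii)
    return max(nu(F).values())

def diameter(F):            # Delta(F)
    return max(dist(a, b) for a in F for b in F)

F = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
assert separation(F) == 1.0
assert covering_diameter(F) == dist((1.0, 0.0), (3.0, 3.0))
```

Note that Proposition 2.11(ii) turns ∇(F), a minimum over all 2-coverings, into a linear-time scan once pairwise distances are known.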


2.2. Focal points. In this section we introduce focal points, an important notion for the rest of the paper. We start by decomposing K(F) into three disjoint subsets

K(F) = K^0(F) ∪ K^1(F) ∪ K^2(F),

as follows:

K^0(F) := {U_0},
K^1(F) := {U ∈ K(F) : |U| ≥ 2 and ∆(U_i) < ∆(F) for all U_i ∈ U},

and K^2(F) := K(F) \ (K^0(F) ∪ K^1(F)). Thus, U ∈ K^2(F) if and only if |U| ≥ 2 and there exists U_i ∈ U such that ∆(U_i) = ∆(F). It is easy to see that:

K^1(F) = {U ∈ K(F) | ∆(U) < ∆(F)}.

Finally, we set K^1_δ(F) := K^1(F) ∩ K_δ(F).

Definition 2.12. Let F be a finite space. We call a_0 ∈ F a focal point of F if ν_F(a_0) = ∆(F). Explicitly, d(a_0, a) = ∆(F), for all a ∈ F, a ≠ a_0.

Remark 2.13. In other words, all neighbors of a focal point are "far away" (equally so, at diameter distance). A point is not focal when it has "nearby" neighbors (i.e. neighbors at distances strictly less than ∆(F)). The next result characterizes the existence of focal points in terms of 2-coverings, covering diameter, and diameter. Note that condition (i) implies that F has at least three points.

Theorem 2.14. Let F be a finite space. Then the following conditions are equivalent:
(i) F has no focal points,
(ii) K^1(F) ≠ ∅,
(iii) ∇(F) < ∆(F).

Proof. We assume that (i) holds and prove (iii). By definition, (i) means that ν_F(a) < ∆(F), for all a ∈ F. By Proposition 2.11, ∇(F) = max_{a∈F} ν_F(a) < ∆(F), as desired.
We now show that (iii) implies (ii). Recall that K^1(F) can also be described as the set of 2-covers U with ∆(U) < ∆(F). By Lemma 2.6, K_{∇(F)}(F) ≠ ∅, so that we can find a 2-cover U with ∆(U) = ∇(F) < ∆(F). Hence U ∈ K^1(F), as required.
Finally, suppose (ii) holds. Let U ∈ K^1(F), and suppose, for contradiction, that p ∈ F is a focal point. Let p ∈ U_i ∈ U. For a ∈ U_i, a ≠ p, we have

∆(F) = d(a, p) ≤ ∆(U_i) ≤ ∆(U) < ∆(F),

a contradiction. This proves (i), and the Theorem. □
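Theorem 2.14 suggests two equivalent focality tests: check Definition 2.12 pointwise, or compare ∇(F) with ∆(F). The sketch below uses the pointwise definition; the examples (a collinear triple and an equilateral triangle) are our choices for illustration.

```python
from math import dist, isclose, sqrt

def is_focal(a, F):
    """Definition 2.12: a is focal iff every other point of F lies at distance Delta(F)."""
    Delta = max(dist(p, q) for p in F for q in F)
    return all(isclose(dist(a, x), Delta) for x in F if x != a)

# Collinear triple: the middle point has neighbors at distance < Delta, no focal points.
F_line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
assert not any(is_focal(a, F_line) for a in F_line)

# Equilateral triangle: all pairwise distances equal Delta, so every point is focal.
F_tri = [(0.0, 0.0), (2.0, 0.0), (1.0, sqrt(3.0))]
assert all(is_focal(a, F_tri) for a in F_tri)
```

By the theorem, the same verdicts follow from comparing ∇ and ∆: the triple has ∇ = 1 < 2 = ∆, while the triangle has ∇ = ∆ = 2.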

2.3. Application to nearest neighbors. Finding nearest neighbors in finite metric spaces is a method used to solve many important problems. The fields of application include Databases, Pattern Recognition, Computer Vision, DNA-Sequencing, Coding Theory, Data Compression, real-time Text Analysis, Recommendation Systems, Spell Checking, Data Mining, etc. Typically, one represents the objects of interest (and one's knowledge of them) by points in a vector space, and finds solutions by searching for a point nearest a given query point. The whole concept is based on the assumption that nearby points have properties similar to those of the query point.
By the curse of dimensionality in the case of nearest neighbors, one usually means the phenomenon of concentration of distance: the longest and shortest distance


between points in the space are so close that the distinction between "near" and "far" becomes meaningless. In terms of the parameters we have introduced, this means that the quotient ∆(F)/δ(F) is close to one. Concentration of distance poses an obvious threat to solution methods based on finding nearest points. Hence the need to determine whether the sets of points one usually obtains in a specific field of application suffer from concentration of distance, and whether or not the problem is severe enough to defeat the assumption that "nearby" points have properties similar to those of the query point.
In this section we discuss concentration of distance in light of the results obtained so far. Actually, we contend that rather than looking at the quotient ∆(F)/δ(F) or, equivalently, δ(F)/∆(F), one should look at ∇(F)/∆(F) instead. Indeed, we consider the question of how meaningful a nearest neighbor is, and express the answer in terms of this quotient. Our results are summarized in Theorem 2.29 below (observe that this theorem includes notions and results obtained later in the paper). As usual, let (F, d) denote an arbitrary finite metric space with at least two points.

Definition 2.15. Given arbitrary x, x′ ∈ F, we say x′ is a point nearest x if x ≠ x′ and d(x, x′′) ≥ d(x, x′), for all x′′ ≠ x.

Remark 2.16. The reader should be aware of the fact that we distinguish between "nearest point", defined above, and "nearest neighbor", to be defined presently. The difference lies with the notion of "neighbor" which, for us, excludes points lying "far away" (cf. Defs. 2.21 and 2.23).

Lemma 2.17. For any x, x′ ∈ F, x′ is nearest x iff ν_F(x) = d(x, x′).

Definition 2.18. An arbitrary function n : F → F is called a nearest point function if n(x) is a point nearest x, for all x ∈ F.

Lemma 2.19. Every finite metric space has a nearest point function.

Remark 2.20. It follows that the existence of a nearest point function imposes no condition on F: such a function always exists. This raises the question of meaningfulness (cf. Def. 2.25). Consider the definition of a nearest point function n(x) at a focal point x. At x, we have exactly |F| − 1 possible choices for n(x), and no metric criterion to distinguish between them. So, any such choice will give us a definition of a nearest point function but, clearly, distance gives no help to find points with properties similar to those of the query point.

Definition 2.21. Let x, x′ denote arbitrary points of F. We say that x′ is a neighbor of x if x ≠ x′ and d(x, x′) < ∆(F).

Intuitively, a neighbor of x is a point different from x, and not far away from it. Clearly, a focal point has no neighbors. In fact:

Lemma 2.22. A point is focal iff it has no neighbors.

Definition 2.23. An arbitrary function n : F → F is called a nearest neighbor function if n(x) is a nearest neighbor of x, for all x ∈ F.

Lemma 2.24. A function n : F → F is a nearest neighbor function iff the following conditions hold:


(i) d(x, n(x)) = ν_F(x), for all x ∈ F.
(ii) ν_F(x) < ∆(F), for all x ∈ F.

Consider now the important question of when a nearest neighbor is meaningful. We believe this notion depends on the specific field of application: what is meaningful for databases need not be meaningful for, say, DNA-sequencing. This is why, instead of considering an absolute notion of meaningfulness, we introduce the following relative notion.

Definition 2.25. Let λ denote a real number. An arbitrary function n : F → F is a λ-meaningful nearest neighbor function, abbreviated λ-MNN function, if
(i) n is a nearest neighbor function,
(ii) 0 < λ < 1,
(iii) λ is the smallest positive real number with d(x, n(x)) ≤ λ · ∆(F), for all x ∈ F.

Remark 2.26. Note that there is always a point x_0 ∈ F satisfying d(x_0, n(x_0)) = ∇(F). It follows that, if n is a λ-MNN function, then λ ≥ ∇(F)/∆(F).

Lemma 2.27. Suppose n : F → F is a λ-MNN function. Then λ = ∇(F)/∆(F).

Remark 2.28. Note that λ = ∇(F)/∆(F) is a sharp bound, since for some x_0 ∈ F, d(x_0, n(x_0)) = λ · ∆(F) = ∇(F).
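Lemma 2.27 pins down the meaningfulness level as λ = ∇(F)/∆(F). The sketch below (our helper names, Euclidean metric) computes λ for two toy configurations: evenly spaced points, where every point has a close neighbor, and a cluster with a far outlier, where the outlier is nearly focal and λ is close to 1.

```python
from math import dist, isclose

def nu(F):
    """nu_F(a): distance from a to a point nearest a (Definition 2.8)."""
    return {a: min(dist(a, x) for x in F if x != a) for a in F}

def meaningfulness(F):
    """lambda = nabla(F)/Delta(F): by Lemma 2.27, the unique level at which
    a nearest neighbor function (if one exists) is lambda-meaningful."""
    Delta = max(dist(a, b) for a in F for b in F)
    return max(nu(F).values()) / Delta

# Evenly spaced points on a line: nabla = 1, Delta = 9, lambda = 1/9.
F_good = [(float(i), 0.0) for i in range(10)]
assert isclose(meaningfulness(F_good), 1.0 / 9.0)

# A tight cluster plus a far outlier: the outlier's "nearest neighbor" is
# almost at diameter distance, so lambda is close to 1.
F_bad = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 10.0)]
assert meaningfulness(F_bad) > 0.9
```

The second example illustrates the paper's point: a nearest point function always exists, but for F_bad its answers near the outlier carry almost no metric information.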

Taking advantage of results that will be proved in later sections, we can summarize the discussion in the following omnibus theorem:

Theorem 2.29. Let F be a finite metric space and n : F → F an arbitrary nearest point function. Then the following conditions are equivalent:
(i) n is a nearest neighbor function.
(ii) n is a ∇(F)/∆(F)-MNN function.
(iii) n is a λ-MNN function.
(iv) F has no focal points.
(v) K^1(F) is not empty.
(vi) ∇(F)/∆(F) < 1.
(vii) dim_fH(F) is finite.
(viii) dim_fB(F) is finite.

Remark 2.30. (a) The equivalences (iv)-(vi) constitute Theorem 2.14. The last two equivalences follow from Theorems 3.15 and 4.11 below. It follows from the theorem that a nearest neighbor function is always λ-meaningful for a unique λ ∈ (0, 1), namely for λ = ∇(F)/∆(F). Hence, both the existence of a nearest neighbor function, as well as its meaningfulness, depend on the quotient λ = ∇(F)/∆(F): the function exists if λ < 1, and it is more meaningful the smaller λ is. The question for those working in a given field of application of nearest point search, then, is to decide whether or not λ = ∇(F)/∆(F) is small enough so that knowing that d(x, n(x)) ≤ λ · ∆(F) will guarantee that the "similarity" between x and n(x) is strong for their particular field.
We now consider our contention that concentration of distance is only partially relevant to the nearest neighbor method. We begin by observing that 0 < δ(F)/∆(F) ≤ ∇(F)/∆(F) ≤ 1.

3.2. Hölder equivalences.

Definition 3.5. Let X, X′ be metric spaces, and η : X → X′ a function. Suppose there exist r, β > 0 such that d′(η(x_1), η(x_2)) = r d(x_1, x_2)^β for all x_1, x_2 ∈ X. We say that η is (r, β)-Hölder, or an (r, β)-Hölder equivalence. In the special case when β = 1, we say that η is a similarity, or an r-similarity.

Example 3.6. This example is obtained by "folding" an equally spaced linear set. Let F_n := {x_0, . . . , x_{n−1}} ⊆ R consist of the following n points: x_i = i, (i =


0, . . . , n−1). Then d(x_i, x_{i+j}) = |j|. Consider the space F′_n := {y_0, . . . , y_{n−1}} ⊆ R^n, where

y_i := (1, . . . , 1, 0, . . . , 0)

has its first i coordinates equal to 1 and the remaining ones equal to 0. Then, for i, j ≥ 0, the difference y_{i+j} − y_i has exactly j coordinates equal to 1, so

d′(y_i, y_{i+j}) = ‖(0, . . . , 0, 1, . . . , 1, 0, . . . , 0)‖ = √j.

Define η : F_n → F′_n by η(x_i) := y_i. Then d′(η(x_i), η(x_{i+j})) = √j, and d(x_i, x_{i+j}) = j. In other words, η is (1, 1/2)-Hölder.
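The folding map of Example 3.6 can be checked numerically. The sketch below builds the points y_i and verifies d′(y_i, y_{i+j}) = √j, which is exactly the (1, 1/2)-Hölder relation, since d(x_i, x_{i+j}) = j on the line.

```python
from math import dist, sqrt, isclose

n = 6
# y_i has its first i coordinates equal to 1 (Example 3.6).
Y = [tuple(1.0 if k < i else 0.0 for k in range(n)) for i in range(n)]

# d'(y_i, y_{i+j}) = sqrt(j) = sqrt(d(x_i, x_{i+j})), so eta(x_i) := y_i
# satisfies the (1, 1/2)-Hoelder identity of Definition 3.5.
for i in range(n):
    for j in range(n - i):
        assert isclose(dist(Y[i], Y[i + j]), sqrt(j))
```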

Lemma 3.7. Suppose that η : X → X′ is (r, β)-Hölder. Then:

(i) For all Y ⊆ X, η : Y → η(Y) is a bijection, and its inverse η′ is (r^{−1/β}, 1/β)-Hölder.
(ii) If F ⊆ X is finite, then ∆(η(F)) = r ∆(F)^β.
(iii) If F ⊆ X is finite and has focal points, then so does η(F) ⊆ X′.
(iv) Let F ⊆ X be finite. Then F has focal points iff η(F) has focal points.

Proof. (i) It is obvious from the definition that η is injective. Let η′ : η(Y) → Y be the inverse of η, and let x′, y′ ∈ η(Y). Then x′ = η(x), y′ = η(y) for unique x, y ∈ Y, and d(η′(x′), η′(y′)) = d(x, y) = [(1/r) d′(η(x), η(y))]^{1/β}. (i) follows immediately.
(ii) Suppose ∆(η(F)) = d′(η(x_1), η(x_2)), for x_1, x_2 ∈ F. Then ∆(η(F)) = r d(x_1, x_2)^β ≤ r ∆(F)^β. For the reverse inequality, assume ∆(F) = d(u_1, u_2), u_i ∈ F. Then r d(u_1, u_2)^β = d′(η(u_1), η(u_2)) ≤ ∆(η(F)), as desired.
(iii) Suppose x_0 ∈ F is focal. Then d(x_0, x) = ∆(F), for all x ≠ x_0. Hence d′(η(x_0), η(x)) = r d(x_0, x)^β = r ∆(F)^β = ∆(η(F)), by (ii). This shows that η(x_0) is a focal point of η(F), because every x′ ∈ η(F) different from η(x_0) is of the form η(x) for some x ∈ F, x ≠ x_0.
Finally, (iv) follows immediately from (i) and (iii). □

Lemma 3.8. Let η : X → X′ be (r, β)-Hölder, and F ⊆ X a finite set. Then:
(i) η induces bijections:
(a) η∗ : K(F) → K(η(F)),
(b) η∗ : K_δ(F) → K_{rδ^β}(η(F)), for all δ ≥ ∇(F),
(c) η∗ : K^1_δ(F) → K^1_{rδ^β}(η(F)), for all δ ≥ ∇(F).
(ii) |η∗(U)| = |U|.
(iii) ∆(η∗(U)) = r ∆(U)^β.
(iv) ∇(η(F)) = r ∇(F)^β.

Proof. (i) For U = {U_1, . . . , U_n} ∈ K(F), define η∗(U) := {η(U_1), . . . , η(U_n)}. Then η∗(U) is a 2-covering because η|_F : F → η(F) is bijective, with inverse η′. Indeed, if F = ∪ U_i, then η(F) = ∪ η(U_i), and |η(U_i)| = |U_i| ≥ 2, as required. To see that η∗ is bijective, recall that η′ is (r^{−1/β}, 1/β)-Hölder, by Lemma 3.7(i). We then have η′∗ : K_{δ′}(η(F)) → K_{(δ′/r)^{1/β}}(F) and, clearly, η∗ and η′∗ are inverse to each other. This completes the proof of (a). Using Lemma 3.7(ii), if U ∈ K_δ(F), then ∆(η(U_i)) = r ∆(U_i)^β ≤ rδ^β. Hence η∗(U) ∈ K_{rδ^β}(η(F)), as desired. Similarly, η∗(K^1(F)) ⊆ K^1(η(F)), because ∆(U) < ∆(F) implies ∆(η∗(U)) = r ∆(U)^β < r ∆(F)^β = ∆(η(F)).


(ii) is obvious from the definition of η∗, and (iii) follows immediately from (i) and Lemma 3.7. To prove (iv), we use (i) and (iii):

∇(η(F)) = min{∆(V) | V ∈ K(η(F))} = min{∆(η∗(U)) | U ∈ K(F)} = min{r ∆(U)^β | U ∈ K(F)} = r ∇(F)^β.

This proves (iv), and concludes the proof of the lemma. □

Proposition 3.9. Let η : X → X′ be (r, β)-Hölder, and F ⊆ X a finite space. Then, for all s ∈ [0, ∞):
(i) H^s_{η∗(U)}(η(F)) = r^s H^{sβ}_U(F), for all U ∈ K(F).
(ii) H^s_{rδ^β}(η(F)) = r^s H^{sβ}_δ(F), for all δ ≥ ∇(F).
(iii) H^s(η(F)) = r^s H^{sβ}(F).

Proof. (i) Let U = {U_i} ∈ K(F). By Lemma 3.8, η∗(U) ∈ K(η(F)), and:

H^s_{η∗(U)}(η(F)) = Σ ∆(η(U_i))^s = r^s Σ ∆(U_i)^{sβ} = r^s H^{sβ}_U(F).

(ii) Given s, δ, suppose H^{sβ}_δ(F) = H^{sβ}_U(F), where (a) U ∈ K^1_δ(F), when F has no focal points, and (b) U ∈ K(F), otherwise. We consider (a) first. By (i) and Lemma 3.8, H^s_{η∗(U)}(η(F)) = r^s H^{sβ}_δ(F). Since η∗(U) ∈ K^1_{rδ^β}(η(F)), we get H^s_{rδ^β}(η(F)) ≤ H^s_{η∗(U)}(η(F)) = r^s H^{sβ}_δ(F). For the reverse inequality, let H^s_{rδ^β}(η(F)) = H^s_V(η(F)), for some V ∈ K^1_{rδ^β}(η(F)). Using Lemma 3.8, η′∗(V) ∈ K^1_δ(F), and

H^{sβ}_δ(F) ≤ H^{sβ}_{η′∗(V)}(F) = (1/r^s) H^s_V(η(F)) = (1/r^s) H^s_{rδ^β}(η(F)).

This completes the proof of (ii) in case (a). The proof in case (b) is similar: we need only use the fact that now η∗(U) ∈ K(η(F)) and, if V ∈ K(η(F)), then η′∗(V) ∈ K(F).
(iii) is a special case of (ii). Here are the details. Suppose first that F has no focal points. By Lemma 3.7(iv), the same is true of η(F). According to Lemma 3.3, H^{sβ}(F) = H^{sβ}_{∇(F)}(F), and H^s(η(F)) = H^s_{∇(η(F))}(η(F)). By Lemma 3.8(iv), ∇(η(F)) = r ∇(F)^β. Using (ii),

H^{sβ}_{∇(F)}(F) = (1/r^s) H^s_{r∇(F)^β}(η(F)) = (1/r^s) H^s_{∇(η(F))}(η(F)).

Hence, r^s H^{sβ}(F) = H^s(η(F)), as desired. The case when F has focal points is left to the reader. This completes the proof. □

Recall the following relaxations of Hölder equivalence and of similarity, defined here for arbitrary metric spaces.

Definition 3.10. Let X, X′ be metric spaces, η : X → X′ a function, and r, β > 0. Then:
(i) η satisfies a Hölder condition, or an (r, β)-Hölder condition, if d′(η(x), η(y)) ≤ r d(x, y)^β.
(ii) η is Lipschitz if it satisfies an (r, 1)-Hölder condition, for some r > 0.


(iii) η is bi-Lipschitz if r_1 d(x, y) ≤ d′(η(x), η(y)) ≤ r_2 d(x, y), for some r_1, r_2 > 0. We say that X and η(X) are Lipschitz equivalent.

It turns out that these relaxations are not as interesting in the finite case as they are in the classical case. This is shown by the following lemma, whose easy proof we leave to the reader.

Lemma 3.11. Suppose η : F → X′ is a function defined on a finite F ⊆ X. Then:
(i) Any such η is Lipschitz.
(ii) η is bi-Lipschitz iff it is injective.
(iii) F and η(F) are Lipschitz equivalent iff |F| = |η(F)|.

3.3. Definition of dim_fH(F). We define finite Hausdorff dimension, dim_fH(F), by solving the equation:

(3.3.1)   H^s(F) = ∆(F)^s.

Equation (3.3.1) has exactly one solution s_0 ∈ (0, ∞) precisely when F has no focal points. More generally, we have:

Proposition 3.12. Consider the following equations:
(i) ∆(F)^s = H^s_U(F), for all U ∈ K(F),
(ii) ∆(F)^s = H^s_δ(F), for all δ ≥ ∇(F),
(iii) ∆(F)^s = H^s(F).
Then, in each of these cases, the equation has a unique solution iff F has no focal points. When this is the case, the solutions are positive real numbers, and will be denoted, respectively, s_U, s_δ, and s_0.

Proof. Suppose F is a subspace of (X, d). The identity map id_X : (X, d) → (X, (1/r)d) is an r^{−1}-similarity. To prove (i), note that by Proposition 3.9(i), H^s_{(id_X)∗(U)}(id_X(F)) = r^{−s} H^s_U(F). Taking r = ∆(F), we see that ∆(F)^s = H^s_U(F) is equivalent to:

(3.3.2)   H^s_{(id_X)∗(U)}(id_X(F)) = (1/∆(F)^s) H^s_U(F) = 1 = ∆(id_X(F))^s.

It follows from (3.3.2) that in the proof of (i) we may assume, without loss of generality, that ∆(F) = 1. Consider first the reverse implication. If F has no focal points, Lemma 3.4 guarantees the existence of a unique s_U ∈ (0, ∞) such that H^{s_U}_U(F) = 1, as desired. To prove the direct implication, suppose that K^1(F) = ∅. Using Def. 3.1, we see that the equation in (i) has infinitely many solutions when U ∈ K^0(F), and no solution when U ∈ K^2(F). This completes the proof of (i).
The proof of (ii) is completely analogous, except, perhaps, for the last part. So assume K^1(F) = ∅. Then H^s_δ(F) = H^s(F) = ∆(F)^s, by Lemma 3.3, and the equation has infinitely many solutions, as before. Finally, (iii) is a special case of (ii). This completes the proof. □

Lemma 3.13. Suppose F has no focal points. Then:
(i) s_0 = s_{∇(F)}.
(ii) s_0 = max{s_δ | δ ≥ ∇(F)}.


(iii) s_δ = min{s_U | U ∈ K^1_δ(F)}, for all δ ≥ ∇(F).

Proof. (i) is obvious, since H^s(F) = H^s_{∇(F)}(F). (ii) If δ ≥ ∇(F), then H^s_δ(F) ≤ H^s_{∇(F)}(F) by Lemma 3.2. Hence s_δ ≤ s_{∇(F)} = s_0. Thus, max{s_δ | δ ≥ ∇(F)} = s_0, as required. To prove (iii), note that Def. 3.1 implies H^s_δ(F) ≤ H^s_U(F), for all U ∈ K^1_δ(F); hence, s_δ ≤ min{s_U | U ∈ K^1_δ(F)}. To prove the reverse inequality, recall that, given s_δ, there is U ∈ K^1_δ(F) such that ∆(F)^{s_δ} = H^{s_δ}_δ(F) = H^{s_δ}_U(F). By Proposition 3.12, s_δ = s_U. This completes the proof. □
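For a single 2-covering U, the solution s_U of ∆(F)^s = H^s_U(F) in Proposition 3.12(i) can be found by bisection: the normalized sum Σ (∆(U_i)/∆(F))^s is strictly decreasing in s when every part diameter lies strictly between 0 and ∆(F), i.e. when U ∈ K^1(F). A sketch with a hypothetical helper `s_U` (our name):

```python
def s_U(part_diams, Delta, tol=1e-12):
    """Solve Delta^s = sum(d_i^s) by bisection (Proposition 3.12(i)).
    Assumes 0 < d_i < Delta for every part diameter, i.e. U is in K^1(F),
    so g(s) = sum((d_i/Delta)^s) is strictly decreasing with g(0) = |U| >= 2."""
    g = lambda s: sum((d / Delta) ** s for d in part_diams)
    lo, hi = 0.0, 1.0
    while g(hi) > 1.0:              # bracket the unique root of g(s) = 1
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if g(mid) > 1.0 else (lo, mid)
    return (lo + hi) / 2.0

from math import log, isclose

# Four points 0, 1, 2, 3 on a line, covered by {0,1} and {2,3}: Delta = 3,
# both parts have diameter 1, so s solves 2*(1/3)^s = 1, i.e. s = ln2/ln3.
assert isclose(s_U([1.0, 1.0], 3.0), log(2) / log(3), rel_tol=1e-9)

# If U lies in K^2(F), some d_i equals Delta: that term contributes 1 for
# every s, and with |U| >= 2 the sum stays > 1, so no solution exists --
# matching the focal-point case of Proposition 3.12.
```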

Definition 3.14. For a finite, non-empty subset F ⊆ R^n, we define

dim_fH(F) := 0, if |F| = 1;   ∞, if K^1(F) = ∅;   s_0, if K^1(F) ≠ ∅.

We can summarize our results so far as follows:

Theorem 3.15. Let F be a non-empty, finite set. Then dim_fH(F) is a positive real number if and only if F has no focal points; it is infinite if and only if F has focal points; and it is zero when F has one element.

Theorem 3.16. Let η : X → X′ be (r, β)-Hölder, and F ⊆ X a finite space. Then β · dim_fH(η(F)) = dim_fH(F). In particular, dim_fH is preserved by similarities.

Proof. Suppose first that F has no focal points. Then dim_fH(η(F)) is the unique solution of the equation

(3.3.3)   H^s(η(F)) = ∆(η(F))^s.

Using Proposition 3.9 and Lemma 3.8, we see that (3.3.3) is equivalent to H^{sβ}(F) = ∆(F)^{sβ}, whose only solution is dim_fH(F), as desired. By Lemma 3.7, F has focal points [resp. |F| = 1] if and only if η(F) has focal points [resp. |η(F)| = 1]. Hence, the dimension of F is infinite [resp. zero] if and only if the dimension of η(F) is infinite [resp. zero]. This completes the proof. □

Theorem 3.17. Let F be a finite space with no focal points. Suppose U ∈ K^1_δ(F), for some δ ≥ ∇(F), and let a_1 [resp. a_k] denote the smallest [resp. largest] diameter of elements of U. Then:

(i)
(3.3.4)   ln|U| / ln(∆(F)/δ(F)) ≤ ln|U| / ln(∆(F)/a_1) ≤ s_U ≤ ln|U| / ln(∆(F)/a_k) ≤ ln|U| / ln(∆(F)/δ)

(ii)
(3.3.5)   ∆(F)/δ ≤ ∆(F)/a_k ≤ |U|^{1/s_U} ≤ ∆(F)/a_1 ≤ ∆(F)/δ(F)

Proof. By equation (3.1.1), H^s_U(F) = m_1 a_1^s + · · · + m_k a_k^s. Hence,

(3.3.6)   |U| δ(F)^s ≤ |U| a_1^s ≤ H^s_U(F) ≤ |U| a_k^s ≤ |U| δ^s


We introduce the following shorthand: f_1(s, U) := |U| δ(F)^s, f_2(s, U) := |U| a_1^s, f_3(s, U) := H^s_U(F), f_4(s, U) := |U| a_k^s, f_5(s, U) := |U| δ^s. If η is an r-similarity, then f_i(s, η∗(U)) = r^s f_i(s, U), by the results of Section 3.2. It follows that the equations

(3.3.7)   f_i(s, η∗(U)) = ∆(η(F))^s   and   f_i(s, U) = ∆(F)^s

are equivalent, i.e. have the same solutions. As in the proof of Proposition 3.12, we may assume, without loss of generality, that ∆(F) = 1. When this is the case, all five functions f_i are decreasing, f_i(0, U) = |U|, and they all tend to zero as s goes to infinity. Since every number in (3.3.4) is the solution of an equation (3.3.7), and these can be computed by solving f_i(s, U) = 1, we see that (3.3.4) follows from (3.3.6). (ii) (3.3.5) is an immediate consequence of (3.3.4). This proves the theorem. □

Corollary 3.18. Suppose δ = ∇(F), and U ∈ K^1_{∇(F)}(F). Then:
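The sandwich (3.3.4) can be checked on a small example where s_U is known in closed form; the set {0, 1, 3} and the covering {{0,1}, {1,3}} below are our choices, not the paper's.

```python
from math import log

# F = {0, 1, 3} on the line, U = {{0,1}, {1,3}}: a 2-covering with part
# diameters a1 = 1, ak = 2, and |U| = 2.  Here delta(F) = 1, Delta(F) = 3,
# and Delta^s = H_U^s(F) reads 3^s = 1^s + 2^s, whose solution is s_U = 1.
s_val = 1.0
Delta, delta_F, a1, ak = 3.0, 1.0, 1.0, 2.0
dlt = 2.0                       # any delta with Delta(U) <= delta; here Delta(U) = 2
bounds = [log(2) / log(Delta / d) for d in (delta_F, a1, ak, dlt)]
assert bounds[0] <= bounds[1] <= s_val <= bounds[2] <= bounds[3]   # (3.3.4)

# Equivalently, (3.3.5): Delta/ak <= |U|^(1/s_U) <= Delta/a1.
assert Delta / ak <= 2.0 ** (1.0 / s_val) <= Delta / a1
```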

(i) ln|U| / ln(∆(F)/δ(F)) ≤ s_U ≤ ln|U| / ln(∆(F)/∇(F)),
(ii) ∆(F)/∇(F) ≤ |U|^{1/s_U} ≤ ∆(F)/δ(F).

The first upper bound in the next corollary follows from Remark 2.10.

Corollary 3.19. Suppose F has no focal points. Then:
(i) ln 2 / ln(∆(F)/δ(F)) ≤ dim_fH(F) ≤ ln(|F| − 1) / ln(∆(F)/∇(F)),
(ii) 2 ≤ (∆(F)/δ(F))^{dim_fH(F)}, and (∆(F)/∇(F))^{dim_fH(F)} ≤ |F| − 1.

4. Finite Box Dimension, dimfB (F ).

The classical box-counting (or Minkowski-Bouligand) dimension will be denoted dim_B(−). In this section we define an analog for finite metric spaces, denoted dim_fB(−), and called finite box dimension. We follow the same pattern we used to define finite Hausdorff dimension. The proofs for finite box dimension are similar to, but usually simpler than, those for finite Hausdorff dimension, and will be left to the reader.

Definition 4.1. For U ∈ K(F), set B^s_U(F) := |U| ∆(U)^s. For δ ≥ ∇(F), set:

B^s_δ(F) := min{B^s_U(F) | U ∈ K^1_δ(F)}, when K^1(F) ≠ ∅;   B^s_δ(F) := min{B^s_U(F) | U ∈ K(F)}, when K^1(F) = ∅.

Finally,

B^s(F) := max{B^s_δ(F) | δ ≥ ∇(F)}.

Lemma 4.2. Suppose ∇(F) ≤ δ ≤ δ′. Then B^s_δ(F) ≥ B^s_{δ′}(F).


Lemma 4.3. For any finite space F we have:
B^s(F) = B^s_{∇(F)}(F) = min{B_U^s(F) | ∆(U) = ∇(F)}, when K^1(F) ≠ ∅;
B^s(F) = ∆(F)^s = B_δ^s(F), for all δ ≥ ∇(F), when K^1(F) = ∅.

Definition 4.4. Let F be a finite metric space with no focal points, and suppose δ ≥ ∇(F). Define:

T_δ(F) := min{|U| : U ∈ K_δ^1(F)}.

Note that T_δ(F) ≤ |F| − 1, by Remark 2.10. Also, T_δ(F) ≥ 2.

Corollary 4.5. If F has no focal points, then B^s(F) = T_{∇(F)}(F) ∇(F)^s.

Lemma 4.6. Suppose that ∆(F) = 1, and K^1(F) ≠ ∅. Let f(s) denote any of the following functions: (i) B_U^s(F), for all U ∈ K^1(F); (ii) B_δ^s(F), for all δ ≥ ∇(F); (iii) B^s(F). Then f is a positive, strictly decreasing function, f(0) ≥ 2, and lim_{s→∞} f(s) = 0.

Proposition 4.7. Let η : X → X′ be an (r, β)-Hölder equivalence, and F ⊆ X a finite space. Then, for all s ∈ [0, ∞):
(i) B^s_{η∗(U)}(η(F)) = r^s B_U^{sβ}(F), for all U ∈ K(F).
(ii) B^s_{rδ^β}(η(F)) = r^s B_δ^{sβ}(F), for all δ ≥ ∇(F).
(iii) B^s(η(F)) = r^s B^{sβ}(F).
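Definition 4.4 is directly computable for very small spaces: T_δ(F) is the size of a smallest covering of F by subsets with at least two elements, each of diameter at most δ. The sketch below (the helper names `diam` and `T` are ours, not the paper's) finds it by exhaustive search; the search is exponential in |F| and only meant to make the definition concrete.

```python
from itertools import combinations
from math import dist

def diam(points, idxs):
    """Diameter of the subset of `points` selected by the index tuple `idxs`."""
    return max(dist(points[i], points[j]) for i, j in combinations(idxs, 2))

def T(points, delta):
    """T_delta: size of a smallest covering of `points` by sets of at least
    two elements, each of diameter <= delta (Definition 4.4)."""
    n = len(points)
    cands = [c for r in range(2, n + 1)
             for c in combinations(range(n), r)
             if diam(points, c) <= delta + 1e-12]
    for m in range(1, len(cands) + 1):
        for choice in combinations(cands, m):
            if set().union(*choice) == set(range(n)):
                return m
    return None                                   # no admissible covering

# five equally spaced points on a line: a smallest covering by sets
# of diameter <= 1 needs three adjacent pairs
F = [(0,), (1,), (2,), (3,), (4,)]
print(T(F, 1))   # -> 3
```

Allowing δ = 2 admits triples such as {0, 1, 2}, so the minimum drops to two sets, in line with Lemma 4.2.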

4.1. Definition of dimfB(F). We define the finite box dimension, dimfB(F), by solving the equation:

(4.1.1) B^s(F) = ∆(F)^s,

which has exactly one solution ŝ_0 ∈ (0, ∞) precisely when F has no focal points. More generally, we have:

Proposition 4.8. Consider the following equations: (i) ∆(F)^s = B_U^s(F), for all U ∈ K(F); (ii) ∆(F)^s = B_δ^s(F), for all δ ≥ ∇(F); (iii) ∆(F)^s = B^s(F). Then, in each of these cases, the equation has a unique solution iff F has no focal points. When this is the case, the solutions are positive real numbers, and will be denoted, respectively, ŝ_U, ŝ_δ, and ŝ_0. Moreover,

(4.1.2) ŝ_0 = ln T_{∇(F)}(F) / ln(∆(F)/∇(F)), and ŝ_U = ln|U| / ln(∆(F)/∆(U)).

Lemma 4.9. Suppose F has no focal points. Then (i) ŝ_0 = ŝ_{∇(F)}. (ii) ŝ_0 = max{ŝ_δ | δ ≥ ∇(F)}. (iii) ŝ_δ = min{ŝ_U | U ∈ K_δ^1(F)}, for all δ ≥ ∇(F).


Definition 4.10. For a finite, non-empty subset F ⊆ (X, d), we define

dimfB(F) := 0, if |F| = 1; ∞, if K^1(F) = ∅; ŝ_0, if K^1(F) ≠ ∅.

We can summarize our results so far as follows:

Theorem 4.11. Let F be a non-empty, finite set. Then dimfB(F) is a positive real number if and only if F has no focal points; it is infinity if and only if F has focal points; and it is zero when F has one element. When F has no focal points,

(4.1.3) dimfB(F) = ln T_{∇(F)}(F) / ln(∆(F)/∇(F)).

Theorem 4.12. Let η : X → X′ be an (r, β)-Hölder equivalence, and F ⊆ X a finite space. Then

(4.1.4) β · dimfB(η(F)) = dimfB(F).

5. Bounds.

In this section we collect technical results that are useful when computing finite dimensions. Most of the results are classical ones adapted to the present situation. We start with the relationship between finite Hausdorff and finite box dimension.

Lemma 5.1. Suppose F has no focal points, δ ≥ ∇(F), and s ∈ [0, ∞). Then:
(i) T_δ(F) ∇(F)^s ≤ B_U^s(F) ≤ |U| δ^s, for all U ∈ K_δ^1(F).
(ii) T_δ(F) ∇(F)^s ≤ B_δ^s(F) ≤ T_δ(F) δ^s.
(iii) H_U^s(F) ≤ B_U^s(F), for all U ∈ K^1(F).
(iv) H_δ^s(F) ≤ B_δ^s(F).
(v) H^s(F) ≤ B^s(F).

Proof. (i) This follows from the definitions and the inequalities ∇(F) ≤ ∆(U) ≤ δ, valid for all U ∈ K_δ^1(F). (ii) Both inequalities follow from (i) and the definition of B_δ^s. (iii) Let U = {U_1, ..., U_n} ∈ K^1(F). Then

H_U^s(F) = Σ_{i=1}^n ∆(U_i)^s ≤ Σ_{i=1}^n ∆(U)^s = |U| ∆(U)^s = B_U^s(F), as desired.

(iv) Using (iii) and the fact that K^1(F) ≠ ∅, we have

H_δ^s(F) := min{H_U^s(F) | U ∈ K_δ^1(F)} ≤ min{B_U^s(F) | U ∈ K_δ^1(F)} =: B_δ^s(F),

as was to be proved. The proof of (v) is similar, but one uses (iv) instead of (iii). This completes the proof of the lemma. □

Corollary 5.2. For any U ∈ K^1_{∇(F)}(F), H^s(F) ≤ |U| ∇(F)^s.

Proof. Given that F has no focal points, for any such U, H^s(F) ≤ H_U^s(F) ≤ B_U^s(F) = |U| ∆(U)^s ≤ |U| ∇(F)^s, by (iii) of the lemma. □

Proposition 5.3. Let F be a finite metric space. Then,

(5.0.5) dimfH(F) ≤ dimfB(F) ≤ ln(|F| − 1) / ln(∆(F)/∇(F)).


Proof. Clearly, the first inequality holds (with equality) when F has only one element, or when it has focal points. So suppose F has no focal points. Since both H^s and B^s are invariant under similarities, we can assume, without loss of generality, that ∆(F) = 1. In this case, the desired inequality follows from the fact that H^s(F) ≤ B^s(F), proved in Lemma 5.1. The last inequality follows from Theorem 4.11 and Definition 4.4. This completes the proof. □

5.1. Locally uniform spaces. These are spaces for which the two finite dimensions we introduced coincide.

Definition 5.4. A finite metric space is called locally uniform when δ(F) = ∇(F); equivalently, when ν_F is constant.

Proposition 5.5. If F has no focal points and is locally uniform, then dimfH(F) = dimfB(F). Consequently, T_{∇(F)}(F) = (∆(F)/∇(F))^{dimfH(F)}.

Proof. Recall the notation a_1, ..., a_k introduced just before equation (3.1.1). In general, δ(F) ≤ a_1 ≤ a_k ≤ ∇(F). Our hypothesis implies k = 1, and δ(F) = a_1 = ∇(F). The proposition follows from Corollary 3.18. □

Example 5.6. Let F be an arbitrary finite metric space. Consider its double, D_x(F), defined as follows. Abstractly, it is the product of F with {0, 1}, where d(0, 1) = x, with the product metric. More concretely, we can assume F ⊆ R^n. Then

D_x(F) := {(b, ε) ∈ R^{n+1} | b ∈ F, ε = 0 or x}.

It is easy to see that D_x(F) is locally uniform when x < δ(F). In this case, T_{∇(D_x(F))}(D_x(F)) = |F|, ∇(D_x(F)) = x, and ∆(D_x(F)) = √(∆(F)² + x²). By Proposition 5.5,

dimfH(D_x(F)) = ln|F| / ln√(1 + (∆(F)/x)²).

5.2. Mass distributions. Mass distributions are used in the classical theory to obtain lower bounds for the Hausdorff dimension. A mass distribution is a function µ : F → [0, ∞). We extend µ to subsets F′ ⊆ F by

µ(F′) := Σ_{x∈F′} µ(x).

Lemma 5.7. For any family {U_i}_{i=1}^m of subsets of F, we have:

µ(⋃_{i=1}^m U_i) ≤ Σ_{i=1}^m µ(U_i).

Proof. Obvious. □


Proposition 5.8. Let µ be a mass distribution on a finite set F with no focal points. Suppose there exist c > 0, s > 0, such that µ(U) ≤ c ∆(U)^s for all U ⊆ F with ∆(U) ≤ ∇(F) and |U| ≥ 2. Then µ(F) ≤ c H^s(F). If, moreover, c ∆(F)^s ≤ µ(F), then s ≤ dimfH(F).

Proof. Let U = {U_1, ..., U_n} ∈ K^1_{∇(F)}(F) be arbitrary. By hypothesis, µ(U_i) ≤ c ∆(U_i)^s. Hence,

µ(F) = µ(⋃ U_i) ≤ Σ_{i=1}^n µ(U_i) ≤ c Σ_{i=1}^n ∆(U_i)^s = c H_U^s(F).

This readily implies that µ(F) ≤ c H^s(F), as was to be proved. If we also know that c ∆(F)^s ≤ µ(F), then ∆(F)^s ≤ H^s(F), which shows that s ≤ dimfH(F). This completes the proof. □

6. Computations.

We collect first results of a more or less general nature, and then compute several examples. We begin by showing that every positive real number is the dimension of some finite metric space.

Theorem 6.1. For every t ∈ [0, ∞] there exist finite spaces F_t such that dimfH(F_t) = t = dimfB(F_t).

Proof. We construct a family F_t of locally uniform spaces, so that both dimensions coincide. For t = 0 [resp. t = ∞] we can take F_t to be any singleton [resp. any two-point set]. Suppose now that t is a positive real number, and consider first the case where t ∈ [1, ∞). For ε ∈ (0, 1/2], define A(ε) := {a_1, a_2, a_3} ⊆ R², where a_1 = (0, 0), a_2 = (1, 0), and a_3 = (1/2, (1/2)√(3 − 8ε + 4ε²)). We have d(a_1, a_2) = 1, and d(a_1, a_3) = d(a_2, a_3) = 1 − ε. A(ε) is locally uniform, since δ(A(ε)) = 1 − ε = ∇(A(ε)). Clearly, T_{∇(A(ε))}(A(ε)) = 2, and ∆(A(ε)) = 1. By Proposition 5.5,

dimfH(A(ε)) = dimfB(A(ε)) = ln 2 / ln[1/(1 − ε)].

Setting ε_t := 1 − 2^{−1/t}, and F_t := A(ε_t), we have dimfH(F_t) = t, as desired. Suppose now that t ∈ (0, 1). Set x(t) := (4^{1/t} − 1)^{−1/2}, and use the double D_{x(t)}(F) of Ex. 5.6, for F := {0, 1} ⊆ R. Since x(t) ∈ (0, 1/√3), the double is locally uniform, and dimfH(D_{x(t)}(F)) = dimfB(D_{x(t)}(F)) = t, as desired. The proof is complete.
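Both constructions in the proof can be checked numerically. The sketch below (all helper names are ours) builds A(ε_t) for t ≥ 1, and the double D_{x(t)}({0, 1}) for t < 1, and verifies through Proposition 5.5 that both finite dimensions equal t.

```python
from itertools import combinations
from math import dist, log, sqrt, isclose

def A(eps):
    """The three-point space A(eps) of the proof (case t >= 1)."""
    return [(0.0, 0.0), (1.0, 0.0),
            (0.5, 0.5 * sqrt(3 - 8 * eps + 4 * eps ** 2))]

for t in (1.0, 1.7, 3.0):
    eps = 1.0 - 2.0 ** (-1.0 / t)
    P = A(eps)
    assert isclose(dist(P[0], P[2]), 1.0 - eps)      # the two equal sides
    # locally uniform, T = 2, Delta = 1, nabla = 1 - eps (Proposition 5.5):
    dim = log(2.0) / log(dist(P[0], P[1]) / dist(P[0], P[2]))
    assert isclose(dim, t)

for t in (0.3, 0.8):                                  # t < 1: use the double
    x = (4.0 ** (1.0 / t) - 1.0) ** -0.5              # x(t) of the proof
    D = [(0.0, 0.0), (1.0, 0.0), (0.0, x), (1.0, x)]  # D_x({0, 1})
    nu = [min(dist(p, q) for q in D if q != p) for p in D]
    assert min(nu) == max(nu) == x                    # locally uniform
    Delta = max(dist(p, q) for p, q in combinations(D, 2))
    dim = log(2.0) / log(Delta / x)                   # T = |{0, 1}| = 2
    assert isclose(dim, t)
print("ok")
```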



Example 6.2. Let L_n ⊆ R¹ denote a set with n equally spaced points. Then L_n is locally uniform and, for n ≥ 3, we have:

dimfH(L_{2k}) = ln k / ln(2k − 1);  dimfH(L_{2k+1}) = ln(k + 1) / ln(2k).

Indeed, if the distance between consecutive points is x > 0, then δ(L_n) = x = ∇(L_n), ∆(L_n) = (n − 1)x, and T_{∇(L_n)}(L_n) equals k [resp. k + 1], for n = 2k


[resp. n = 2k + 1]. Applying Proposition 5.5, the result follows. Note that lim_{n→∞} dimfH(L_n) = 1 (the sequence is strictly increasing for k ≥ 4).

Lemma 6.3. Let F denote a finite metric space with at least 3 elements, and let t be a positive real number. Then the following conditions are equivalent: (i) dimfH(F) < t; (ii) H^t(F) < ∆(F)^t; (iii) there exists U = {U_1, ..., U_m} ∈ K^1_{∇(F)}(F) such that

Σ_{i=1}^m ∆(U_i)^t < ∆(F)^t.

Proof. We may assume, without loss of generality, that ∆(F) = 1. Set s_0 = dimfH(F), so that H^{s_0}(F) = 1. By Lemma 3.4, if s_0 < t, then H^t(F) < 1, as desired. We show now that (ii) implies (iii). Given t, we can find U ∈ K^1_{∇(F)}(F) such that H^t(F) = H^t_U(F) = Σ ∆(U_i)^t, as required. Suppose now that (iii) holds. Given U ∈ K^1_{∇(F)}(F) satisfying H^t_U(F) < 1, we have H^t(F) ≤ H^t_U(F) < 1. Hence, s_0 < t. This completes the proof. □

Theorem 6.4. Let F be a subset of R¹. Then
(i) If |F| = 3, then dimfH(F) = 1, and dimfB(F) ≥ 1.
(ii) If |F| ≥ 4, then dimfH(F) < 1.

Proof. Assume F = {a_0, a_1, ..., a_n}, and, without loss of generality, that a_0 = 0 and 0 < a_1 < ··· < a_n. Let y_i := a_i − a_{i−1} > 0, for i = 1, ..., n. We have ∆(F) = a_n = y_1 + ··· + y_n.

We prove (i). When |F| = 3, ∇(F) = max{y_1, y_2} = y_2, say, and K^1(F) = K^1_{∇(F)}(F) = {U}, with U = {{a_0, a_1}, {a_1, a_2}}. Hence H^1(F) = y_1 + y_2, and the equation H^1(F) = ∆(F) yields dimfH(F) = 1. On the other hand, B^s(F) = 2 y_2^s, and dimfB(F) = ln 2 / ln(1 + (y_1/y_2)) ≥ 1, since y_1 ≤ y_2. The proof of (i) is complete.

Consider (ii). Note that ν_F(a_0) = y_1, ν_F(a_n) = y_n, and ν_F(a_i) = min{y_i, y_{i+1}} for 0 < i < n. Suppose ∇(F) = y_k, and consider the 2-covering (n ≥ 3):

U := {{a_0, a_1}, {a_1, a_2}, ..., {a_{n−1}, a_n}}.

We distinguish two cases: (a) y_k = max{y_i | 1 ≤ i ≤ n}, and (b) y_k < y_j, for some j ≤ n. In case (a), ∆(U) = max{y_i} = y_k = ∇(F). Then U′ := U \ {{a_1, a_2}} still covers (because n ≥ 3), and U′ ∈ K^1_{∇(F)}(F). Clearly, H^1_{U′}(F) < ∆(F), and the result follows from Lemma 6.3. In case (b), we set:

U″ := U \ {U_ℓ ∈ U | ∆(U_ℓ) > y_k},

where U_ℓ stands for {a_{ℓ−1}, a_ℓ}, so that ∆(U_ℓ) = y_ℓ. Now, U″ would fail to cover F only if: (1) two consecutive U_ℓ are removed, or (2) U_1 or U_n is removed. In case (1), suppose we have removed U_ℓ, U_{ℓ+1}, for some 1 < ℓ < n. Then ν_F(a_ℓ) = min{y_ℓ, y_{ℓ+1}} > ∇(F), a contradiction. To deal with (2), notice that always y_1, y_n ≤ ∇(F), so these sets will not be removed. In conclusion, U″ is in K^1_{∇(F)}(F), and H^1_{U″}(F) < ∆(F), and Lemma 6.3 gives the result. This proves (ii) and completes the proof. □

Remark 6.5. When |F| > 3, dimfB(F) can be larger or smaller than 1.
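Both parts of Theorem 6.4 can be illustrated by brute force. The sketch below assumes, as in Lemma 6.3, that for a subset of R without focal points H^s(F) is the minimum of Σ ∆(U_i)^s over 2-coverings whose members have diameter at most ∇(F), and then solves H^s(F) = ∆(F)^s by bisection; all helper names are ours.

```python
from itertools import combinations

def dim_fH(xs, tol=1e-10):
    """Finite Hausdorff dimension of a finite subset of R without focal
    points, assuming (cf. Lemma 6.3) that H^s(F) is the minimum of
    sum diam(U_i)^s over 2-coverings with diam(U_i) <= nabla(F);
    the equation H^s(F) = Delta(F)^s is then solved by bisection."""
    xs = sorted(xs)
    n = len(xs)
    span = xs[-1] - xs[0]
    pts = [(x - xs[0]) / span for x in xs]          # normalize: Delta = 1
    nabla = max(min(abs(p - q) for q in pts if q != p) for p in pts)
    sets = [c for r in range(2, n + 1) for c in combinations(range(n), r)
            if pts[c[-1]] - pts[c[0]] <= nabla + 1e-12]
    covers = [ch for m in range(1, len(sets) + 1)
              for ch in combinations(sets, m)
              if set().union(*ch) == set(range(n))]
    H = lambda s: min(sum((pts[c[-1]] - pts[c[0]]) ** s for c in ch)
                      for ch in covers)
    lo, hi = 0.0, 50.0                              # H is decreasing in s
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if H(mid) > 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

print(round(dim_fH([0.0, 0.3, 1.0]), 6))    # three collinear points: 1.0
print(dim_fH([0.0, 1.0, 2.0, 4.0]) < 1.0)   # four points: dimension < 1
```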


Corollary 6.6. Let F be a three-point subset of R^n. Then dimfH(F) = 1 if and only if F is collinear.

Proof. Suppose F has dimension one. Let a ≤ b ≤ c denote the three pairwise distances between F's elements. Since c = ∆(F) and b = ∇(F), we see that b < c. It follows that H^s(F) = a^s + b^s. By hypothesis, a + b = H^1(F) = ∆(F)¹ = c, hence the points are collinear. The converse follows from Theorem 6.4. The proof is complete. □

Example 6.7. Let F ⊆ R² consist of the points (0, 0), (0, 3), (4, 0). We let F_2 [resp. F_1, F_∞] denote F with the Euclidean [resp. ℓ¹, ℓ^∞] metric. Then dimfH(F_2) = 2 < ln 2 / ln(5/4) = dimfB(F_2); and dimfH(F_1) = 1 < ln 2 / ln(7/4) = dimfB(F_1). But dimfH(F_∞) = ∞ = dimfB(F_∞), because F_∞ has a focal point.

Example 6.8. (Continues Ex. 3.6). By Theorem 3.16, dimfH(F_n′) = 2 · dimfH(F_n). Using Ex. 6.2,

dimfH(F′_{2k}) = ln k² / ln(2k − 1);  dimfH(F′_{2k+1}) = ln(k + 1)² / ln(2k).

Note that the sequence dimfH (F3′ ) = 2, dimfH (F4′ ) ≈ 1.26, dimfH (F5′ ) ≈ 1.58, . . . , converges: dimfH (Fn′ ) ր 2, as n → ∞.
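The Euclidean and ℓ¹ values in Example 6.7 can be recovered numerically: in each case the only minimal 2-covering consists of the two pairs formed by the shorter sides, so dimfH solves a^s + b^s = c^s. A sketch (the function name is ours):

```python
from math import log

def dim_fH_two_cover(a, b, c, tol=1e-12):
    """Solve a^s + b^s = c^s by bisection: H^s(F) = Delta(F)^s for a space
    whose only minimal 2-covering is a pair of diameter a plus a pair of
    diameter b, with Delta(F) = c."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if a ** mid + b ** mid > c ** mid else (lo, mid)
    return 0.5 * (lo + hi)

dim_euclid = dim_fH_two_cover(3.0, 4.0, 5.0)   # Euclidean distances 3, 4, 5
dim_l1     = dim_fH_two_cover(3.0, 4.0, 7.0)   # l1 distances 3, 4, 7
dim_fB_euclid = log(2.0) / log(5.0 / 4.0)      # T = 2, Delta/nabla = 5/4

assert abs(dim_euclid - 2.0) < 1e-9            # 3^2 + 4^2 = 5^2
assert abs(dim_l1 - 1.0) < 1e-9                # 3 + 4 = 7
assert dim_euclid < dim_fB_euclid              # consistent with (5.0.5)
print("ok")
```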

While the classical Hausdorff dimension is well-behaved with respect to Hölder transformations, dimfH(−) is not. For instance, for a function η : X → X′, the following assertions hold (see Falconer [8]):

(i) If η satisfies an (r, β)-Hölder condition, then β · dimH(η(X)) ≤ dimH(X).

(ii) If η is bi-Lipschitz, then dimH(η(X)) = dimH(X).

Lemma 3.11 suggests that these results cannot hold for dimfH(−). The example below shows this for (i). We leave it to the reader to find examples where (ii) fails.

Example 6.9. Let F := {x_1, ..., x_4} ⊆ R, where x_1 = 0, x_2 = 1, x_3 = b > 1, x_4 = b + 1. Let F′ := {y_1, y_2, y_3} ⊆ R, where y_i = x_i (i = 1, 2, 3). Define η : F → F′ by η(x_i) := y_i, for i = 1, 2, 3, and η(x_4) := y_3. Clearly η satisfies a (1, 1)-Hölder condition, and η(F) = F′. However, dimfH(η(F)) = dimfH(F′) = 1, while dimfH(F) < 1, both claims by Theorem 6.4. Thus, dimfH(η(F)) > dimfH(F), contrary to (i) above.

Example 6.10. [Cantor set] This example is related to the classical Cantor set C. Define a sequence of finite spaces L_n ⊆ R, starting with L_0 := {0}. Next, add a point to L_0, at distance 2/3, to obtain L_1. For L_n, start from L_{n−1}, and add a point at distance 2/3^n to the right of every point of L_{n−1}. One can see that |L_n| = 2^n and δ(L_n) = 2/3^n = ∇(L_n), so that the L_n are locally uniform. Moreover, ∆(L_n) = (3^n − 1)/3^n, and T_{∇(L_n)}(L_n) = 2^{n−1}. Using Proposition 5.5,

dimfH(L_n) = dimfB(L_n) = ln 2^{n−1} / ln((3^n − 1)/2).

The attentive reader will have noticed the following convergence properties: L_n → C in the Hausdorff metric (see the next section for more details), and

lim_{n→∞} dimfH(L_n) = ln 2 / ln 3 = dimH(C).
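The approximants L_n of Example 6.10 are easy to generate, and both the constants used above and the convergence of the closed-form dimension to ln 2 / ln 3 can be checked numerically (a sketch; `cantor_level` is our name):

```python
from math import log, isclose

def cantor_level(n):
    """The finite approximant L_n of the Cantor set (2^n points in R)."""
    pts = [0.0]
    for k in range(1, n + 1):
        pts += [p + 2.0 / 3 ** k for p in pts]
    return sorted(pts)

for n in (2, 3, 6):
    L = cantor_level(n)
    assert len(L) == 2 ** n
    nu = [min(abs(p - q) for q in L if q != p) for p in L]
    # locally uniform: delta = nabla = 2/3^n, and Delta = (3^n - 1)/3^n
    assert isclose(min(nu), 2.0 / 3 ** n) and isclose(max(nu), 2.0 / 3 ** n)
    assert isclose(max(L) - min(L), (3.0 ** n - 1) / 3 ** n)

# dim_fH(L_n) = ln 2^(n-1) / ln((3^n - 1)/2)  ->  ln 2 / ln 3
dims = [(n - 1) * log(2) / log((3.0 ** n - 1) / 2) for n in range(2, 40)]
assert abs(dims[-1] - log(2) / log(3)) < 0.02
print("ok")
```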


Example 6.11. [Square of Cantor set] Consider the cartesian square of the previous example, L_n² ⊆ I × I ⊆ R². Thus, |L_n²| = 2^{2n}, ∆(L_n²) = √2 ∆(L_n) = √2 (3^n − 1)/3^n, and δ(L_n²) = δ(L_n) = 2/3^n = ∇(L_n²). Finally, T_{∇(L_n²)}(L_n²) = |L_n| · T_{∇(L_n)}(L_n) = 2^{2n−1}. Again, using Proposition 5.5,

dimfH(L_n²) = dimfB(L_n²) = ln 2^{2n−1} / ln(√2 (3^n − 1)/2).

As in the previous example, L_n² → C², and

lim_{n→∞} dimfH(L_n²) = 2 ln 2 / ln 3 = dimH(C²).
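The count T_{∇}(L_n²) = 2^{2n−1} can be confirmed by exhaustive search for small n. In L_n² the only subsets of diameter at most ∇ are pairs of lattice neighbors, so a smallest 2-covering is a minimum edge cover; the sketch below (helper names ours) verifies the case n = 2.

```python
from itertools import combinations, product
from math import dist

def cantor_level(n):
    pts = [0.0]
    for k in range(1, n + 1):
        pts += [p + 2.0 / 3 ** k for p in pts]
    return sorted(pts)

n = 2
L2 = list(product(cantor_level(n), repeat=2))    # L_n x L_n, 16 points
nabla = 2.0 / 3 ** n                             # = nu(x) for every x
idx = range(len(L2))

pairs = [c for c in combinations(idx, 2)
         if dist(L2[c[0]], L2[c[1]]) <= nabla + 1e-12]
# here every set of diameter <= nabla has exactly two points:
assert all(max(dist(L2[i], L2[j]) for i, j in combinations(t, 2))
           > nabla + 1e-12 for t in combinations(idx, 3))

def min_cover(sets, universe):
    """Size of a smallest subfamily of `sets` covering `universe`."""
    for m in range(1, len(sets) + 1):
        for ch in combinations(sets, m):
            if set().union(*ch) == universe:
                return m

# Example 6.11: T(L_n^2) = |L_n| * T(L_n) = 2^(2n-1)
assert min_cover(pairs, set(idx)) == 2 ** (2 * n - 1)
print("ok")
```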

Example 6.12. [Sierpinski triangle] Construct a sequence of finite spaces L_n as follows. L_0 consists of a single point, say the origin of R². Choose two directions: one along the x-axis, the other forming a 60 degree angle with the x-axis and pointing into the first quadrant. To build L_1, start with L_0 and add two points in the given directions, at distance 1/2. Inductively, construct L_n from L_{n−1} by adding two points in the given directions to each point of L_{n−1}, at distance 1/2^n. The following properties are easy to check: |L_n| = 3^n, ∆(L_n) = (2^n − 1)/2^n, and δ(L_n) = 1/2^n = ∇(L_n). Finally, T_{∇(L_n)}(L_n) = 3^{n−1}. Hence,

dimfH(L_n) = dimfB(L_n) = ln 3^{n−1} / ln(2^n − 1).

The reader can check that L_n → ST, where ST stands for the Sierpinski triangle, and

lim_{n→∞} dimfH(L_n) = ln 3 / ln 2 = dimH(ST).

Proceeding in a similar way, but starting with three directions in R³, each forming a 60 degree angle with the others, one can construct a sequence of finite spaces L_n related to the Sierpinski tetrahedron STh. It is not difficult to check that |L_n| = 4^n, ∆(L_n) = (2^n − 1)/2^n, and δ(L_n) = 1/2^n = ∇(L_n). Finally, T_{∇(L_n)}(L_n) = 4^{n−1}. Hence,

dimfH(L_n) = dimfB(L_n) = ln 4^{n−1} / ln(2^n − 1).

As before, L_n → STh, and

lim_{n→∞} dimfH(L_n) = ln 4 / ln 2 = 2 = dimH(STh).

Example 6.13. [Cantor carpet] Recall the classical Cantor carpet CC, and let Q_0 denote the unit square I² ⊆ R². Divide Q_0 into nine subsquares of side 1/3, and remove the interior of the central one; call this set Q_1. Let Q_{1,i} (i = 1, ..., 8) denote the eight remaining subsquares of Q_1. To obtain Q_2, replace each Q_{1,i} by Q_1 scaled by 1/3. Thus Q_2 has nine holes: a central square of side 1/3, and eight squares of side 1/3², surrounding the large one. In general, Q_{n+1} is obtained from Q_n by replacing each Q_n ∩ Q_{1,i} (i = 1, ..., 8) by Q_n scaled by 1/3. Thus, Q_{n+1} has 8^n holes that are squares of side 1/3^{n+1} (the holes of smallest size in Q_{n+1}). We approximate CC by an increasing sequence of finite spaces L_n (see Figure 1). Let L_0 consist of the four vertices of the unit square. To obtain L_1 we replace each Q_{1,i} (i = 1, ..., 8) by L_0 scaled by 1/3; so L_1 consists of 16 points. In general,


[Figure 1. Cantor Carpet: L_0 ⊆ L_1 ⊆ L_2 ⊆ L_3.]

L_{n+1} is obtained from L_n by replacing each L_n ∩ Q_{1,i} (i = 1, ..., 8) by L_n scaled by 1/3. Let's compute the cardinality of the L_n recursively. The four corner portions of L_{n+1} have 4|L_n| elements, and the rest (four portions which, together with the missing central part, build a "cross") contributes 4(|L_n| − 2(3^n + 1)); i.e. each of these four squares contributes |L_n| minus two sides (which they have in common with the corner portions). So we have the recursive formula:

(6.0.1) |L_0| = 4, |L_{n+1}| = 4|L_n| + 4(|L_n| − 2(3^n + 1)) (n ≥ 0).

We claim that

(6.0.2) 8^n ≤ |L_n| ≤ 2 · 8^n, for n ≥ 2.

To prove (6.0.2), write |L_n| = 8^n + k_n (n ≥ 0). We show that k_n > 0 for all n, by establishing the following claims: first, 8k_n ≥ 22(3^n + 1), which is valid for n ≥ 2; and second, k_n ≤ 8^n, valid for n ≥ 1. The inequalities (6.0.2) follow directly from these claims. For the first claim, let n ≥ 2, and consider the inductive step. Using (6.0.1), k_{n+1} = 8[k_n − (3^n + 1)]. By the induction hypothesis, 8k_{n+1} ≥ 8 · 22 · (3^n + 1) − 8² · (3^n + 1) = 112 · (3^n + 1). The desired inequality 8k_{n+1} ≥ 22 · (3^{n+1} + 1) follows, since 22 · (3^{n+1} + 1) = 66 · (3^n + 1) − 44, and 112 · (3^n + 1) ≥ 66 · (3^n + 1) − 44. The remaining inequality, k_n ≤ 8^n (n ≥ 1), follows easily by induction. For the inductive step, recall k_{n+1} = 8[k_n − (3^n + 1)] ≤ 8 · 8^n − 8 · (3^n + 1) ≤ 8^{n+1}. This establishes (6.0.2).

The L_n have the following properties: ∆(L_n) = √2, δ(L_n) = 1/3^n = ∇(L_n). Moreover, T_{∇(L_n)}(L_n) = |L_n|/2. Hence,

dimfH(L_n) = dimfB(L_n) = ln(|L_n|/2) / ln(3^n √2).
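The recursion (6.0.1) and the bounds (6.0.2) are easy to check by machine (a sketch; the function name is ours):

```python
def carpet_sizes(N):
    """|L_n| for n = 0..N, for the Cantor-carpet approximants,
    via the recursion (6.0.1)."""
    sizes = [4]
    for n in range(N):
        sizes.append(4 * sizes[-1] + 4 * (sizes[-1] - 2 * (3 ** n + 1)))
    return sizes

sizes = carpet_sizes(10)
assert sizes[:4] == [4, 16, 96, 688]
# the bounds (6.0.2): 8^n <= |L_n| <= 2 * 8^n for n >= 2
for n in range(2, 11):
    assert 8 ** n <= sizes[n] <= 2 * 8 ** n
print("ok")
```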

Clearly, L_n → CC and, in view of (6.0.2),

lim_{n→∞} dimfH(L_n) = ln 8 / ln 3 = dimH(CC).

7. Convergence

Let Z be an arbitrary metric space. The set of all closed and bounded subsets of Z with the Hausdorff distance d_H is a metric space, denoted (M(Z), d_H) or,


more simply, M(Z). In this section we prove two convergence theorems. They give conditions under which, if F_n is a sequence of finite spaces that converges in M(Z) to a space X, then dimfH(F_n) → dimH(X) [resp. dimfB(F_n) → dimB(X)].

7.1. Preliminaries. The next approximation result is well-known, see e.g. [4].

Proposition 7.1. Every compact subset of a metric space Z is the limit, in (M(Z), d_H), of a sequence of finite subsets.

Next, we extend previous definitions to spaces that are not necessarily finite.

Definition 7.2. Let (X, d) be an arbitrary metric space. The diameter of X, ∆(X), is sup{d(x, y) | x, y ∈ X}. When ∆(X) > 0 (equivalently, X has at least two points), we define ν_X : X → R by

ν_X(x) := inf{d(x, y) | y ∈ X, x ≠ y},

and the constants:

δ(X) := inf{ν_X(x) | x ∈ X} ∈ [0, ∞),  ∇(X) := sup{ν_X(x) | x ∈ X} ∈ [0, ∞].

Remark 7.3. In contrast to the diameter, which is defined for any (non-empty) space, we define ν_X, δ(X), and ∇(X) only for spaces with at least two points. As before, notice that 0 ≤ δ(X) ≤ ∇(X) ≤ ∆(X). However, δ(X) may now be zero. When this is the case, X is infinite (the converse is false, as X = N shows). The next result clarifies the meaning of ∇(X) = 0. Recall that we say that X has no isolated points if, for all ε > 0 and all x ∈ X, there is y ≠ x in X such that d(x, y) < ε.

Lemma 7.4. Let X be an arbitrary metric subspace of Z satisfying ∆(X) > 0. Then the following conditions are equivalent:
(i) ∇(X) = 0.
(ii) X ⊆ X′, where X′ denotes the derived set.
(iii) X has no isolated points.
(iv) X ∩ B(x, r) is infinite, for all x ∈ X and all r > 0.
(v) Let x ∈ X be arbitrary, and let {x_k} be any sequence in X that converges to x. Then there is a sequence {x′_k} in X such that (a) {x′_k} converges to x, (b) 0 < d(x_k, x′_k) < 1/k, for all k ∈ N, (c) for all k, ℓ ∈ N, if x′_k = x′_ℓ, then k = ℓ.

If, in addition, X is closed, then (ii) can be replaced by the condition that X is perfect (i.e. X = X′). In particular, X is uncountably infinite.

Proof. The equivalences (i)–(iv) will be left to the reader. Given (iv) and x_k → x, we define {x′_k} inductively. Using (iv), choose x′_1 ∈ X ∩ B(x_1, 1) with x′_1 ≠ x_1. Suppose x′_1, ..., x′_k are constructed and satisfy (b)–(c). Since X ∩ B(x_{k+1}, 1/(k + 1)) is infinite, we can find x′_{k+1} ∈ X ∩ B(x_{k+1}, 1/(k + 1)) such that x′_{k+1} ∉ {x′_1, ..., x′_k, x_{k+1}}. Clearly (b)–(c) are satisfied, and the constructed sequence satisfies (a) too, as desired. We show that (v) implies (iv). Let x ∈ X be arbitrary. From the constant sequence x_k := x, for all k, we obtain, by (v), a sequence satisfying (a)–(c). Then the intersection of X with any ball centered at x will contain a tail of the sequence, which consists of distinct points. Thus, the intersection must be infinite. When X is closed we have X′ ⊆ X. This, together with (ii), gives X = X′, i.e. X is perfect. The converse is obvious. The proof is complete. □


Lemma 7.5. Let (X, d) be a metric space with ∆(X) > 0. Then ν_X is continuous.

Proof. Suppose ε > 0 and x_0 ∈ X are given. Then there is x′_0 ∈ X such that 0 < d(x_0, x′_0) < ν_X(x_0) + ε. Let δ satisfy 0 < δ < min{d(x_0, x′_0), ε}. Then, for any x ∈ B(x_0, δ) ∩ X, d(x, x′_0) ≤ d(x, x_0) + d(x_0, x′_0) < δ + ν_X(x_0) + ε. Note that x is different from x′_0, for if x = x′_0, then δ > d(x, x_0) = d(x′_0, x_0) > δ, a contradiction. It follows that

ν_X(x) ≤ d(x, x′_0) < ν_X(x_0) + ε + δ < ν_X(x_0) + 2ε.

To prove ν_X(x_0) < ν_X(x) + 2ε, choose x′ ∈ X such that 0 < d(x, x′) < ν_X(x) + ε. If x′ ≠ x_0, then ν_X(x_0) ≤ d(x_0, x′) ≤ d(x_0, x) + d(x, x′) < δ + ν_X(x) + ε ≤ ν_X(x) + 2ε, as desired. If x′ = x_0, then ν_X(x_0) ≤ d(x, x_0) < ν_X(x) + ε. The proof is complete. □

Lemma 7.6. Suppose that X, Y ⊆ Z are arbitrary metric spaces, and δ > 0. If d_H(X, Y) < δ, then |∆(X) − ∆(Y)| ≤ 2δ.

Proof. For arbitrary x_1, x_2 ∈ X, one can find y_1, y_2 ∈ Y such that d(y_i, x_i) < δ. It follows that d(x_1, x_2) < 2δ + d(y_1, y_2) ≤ 2δ + ∆(Y). Hence, ∆(X) ≤ 2δ + ∆(Y). The reverse inequality is proved similarly. The proof is complete. □

Lemma 7.7. Suppose that X, Y ⊆ Z are arbitrary compact metric spaces, and ∇(X) = 0. Then, for all ε > 0, there exists δ > 0 such that, if d_H(X, Y) < δ, then ∇(Y) < ε.

Proof. Suppose, for contradiction, that we can find ε > 0, a compact space X with ∇(X) = 0, and, for all δ > 0, a compact Y_δ such that

d_H(X, Y_δ) < δ  and  ∇(Y_δ) ≥ ε.

In particular, for all k ∈ N, there exist compact spaces Y_k ⊆ Z such that

(7.1.1) d_H(X, Y_k) < 1/k  and  ∇(Y_k) ≥ ε.

By compactness of Y_k and continuity of ν_{Y_k} (Lemma 7.5), we can find points y_k ∈ Y_k (k ∈ N) such that

(7.1.2) ν_{Y_k}(y_k) = ∇(Y_k).

Since d_H(X, Y_k) < 1/k, Y_k ⊆ N_{1/k}(X), the tubular neighborhood of X of radius 1/k, defined by N_{1/k}(X) := ⋃_{x∈X} B(x, 1/k). Hence, we can find x_k ∈ X such that, for all k ∈ N,

(7.1.3) d(x_k, y_k) < 1/k.

By compactness of X, there exists a subsequence of {x_k}_{k=1}^∞, still denoted {x_k}_{k=1}^∞, and a point x_0 ∈ X, such that

(7.1.4) x_k → x_0.

Observe that, since ∇(X) = 0, by Lemma 7.4(v) we can further assume that x_k ≠ x_j whenever k ≠ j. In particular, x_k = x_0 for at most one k ≥ 1. From (7.1.3) and (7.1.4), we get:

(7.1.5) y_k → x_0.

By (7.1.4) and the remark following it, we can choose ℓ so large that

(7.1.6) 0 < d(x_0, x_ℓ) < ε/100.


We set ε′ := d(x_0, x_ℓ). Choose M > 3, and define r := ε′/M. By (7.1.6), r > 0. By (7.1.5), there is m so large that

(7.1.7) 1/m < r  and  d(x_0, y_m) < r.

Thus, by (7.1.1), d_H(X, Y_m) < 1/m < r, so that X ⊆ N_r(Y_m). Hence, there is a point y′_m ∈ Y_m such that

(7.1.8) d(x_ℓ, y′_m) < r.

Now, d(y_m, y′_m) ≤ d(y_m, x_0) + d(x_0, x_ℓ) + d(x_ℓ, y′_m) < ε′(1 + 2/M) < 2ε/100, where we have used (7.1.6)–(7.1.8). Hence,

(7.1.9) d(y_m, y′_m) < ε.

Moreover, we claim that d(y_m, y′_m) > 0. Indeed, using (7.1.7) and (7.1.8), we see that ε′ = d(x_0, x_ℓ) ≤ d(x_0, y_m) + d(y_m, y′_m) + d(y′_m, x_ℓ) < 2r + d(y_m, y′_m). Hence, d(y_m, y′_m) > ε′ − 2r = r(M − 2) > r > 0, as desired. It follows that

(7.1.10) ν_{Y_m}(y_m) ≤ d(y_m, y′_m).

Then (7.1.1), (7.1.2), (7.1.9), and (7.1.10) yield:

ε ≤ ∇(Y_m) = ν_{Y_m}(y_m) ≤ d(y_m, y′_m) < ε,

a contradiction. This concludes the proof.



Lemma 7.7 is false without the hypothesis ∇(X) = 0, as the following example shows.

Example 7.8. It follows from the lemma that, when ∇(X) = 0, Y_s → X implies ∇(Y_s) → 0. We present an example where Y_s → X, but ∇(Y_s) does not converge to ∇(X). Recall the double D_s(F) defined in Example 5.6. Clearly, D_s(F) → F when s → 0. From Example 5.6, we know that for s small enough, ∇(D_s(F)) = s, so that ∇(D_s(F)) → 0. Hence, we obtain the desired counterexample by choosing any F with ∇(F) > 0. For instance, F = {0, 1, 2, 3} ⊂ R.

Lemma 7.9. Suppose X ⊆ Z is compact and ∇(X) = 0 < ∆(X). Let {F_k} be a sequence of finite subsets of Z that converges to X in (M(Z), d_H). Then lim_{k→∞} T_{∇(F_k)}(F_k) = ∞.

Proof. Suppose, for contradiction, that (7.1.11)

∃ M ≥ 1 ∀ N ∃ k ≥ N : T_{∇(F_k)}(F_k) ≤ M.

Since ∆(X) > 0, we can choose distinct points x, x_1 ∈ X. The condition ∇(X) = 0 implies, by Lemma 7.4, that x is not isolated. Applying Lemma 7.4(iv) repeatedly, we can construct, starting from x_1, a sequence {x_i}_{i=1}^∞ ⊆ X such that:

(7.1.12) 0 < d(x, x_{i+1}) < d(x, x_i)/2, for all i ∈ N.

For 1 ≤ i < j, we have d(x_i, x_j) ≥ d(x, x_i) − d(x, x_j). Applying (7.1.12) repeatedly:

(7.1.13) d(x_i, x_j) > (2^{j−i} − 1) d(x, x_j), for all i < j ∈ N.

If d_H(F_k, X) < δ, there are points y_i^k ∈ F_k that correspond to the x_i, i.e. d(y_i^k, x_i) < δ (both δ and k will be determined presently). It follows that d(x_i, x_j) < 2δ + d(y_i^k, y_j^k) which, together with (7.1.13), yields: (7.1.14)

d(y_i^k, y_j^k) > (2^{j−i} − 1) d(x, x_j) − 2δ, for all i < j.



For M in (7.1.11), set α_{M+1} := min{d(x, x_i) | i = 1, ..., M + 1} > 0. Choose δ so that:

(7.1.15) 0 < δ < α_{M+1} / 2^{M+1}.

By Lemma 7.7, there exists L ≥ 1 such that 0 < ∇(F_k) < (α_{M+1}/2^M) − 2δ, for all k ≥ L. For this L we can find, by (7.1.11), ℓ ≥ L such that T_{∇(F_ℓ)}(F_ℓ) ≤ M. So, for this ℓ, we have:

(7.1.16) ∇(F_ℓ) < (α_{M+1}/2^M) − 2δ, and T_{∇(F_ℓ)}(F_ℓ) ≤ M.

Define S := {y_1^ℓ, ..., y_{M+1}^ℓ} ⊆ F_ℓ. It follows from (7.1.15) and (7.1.14) that the elements of S are distinct, i.e. |S| = M + 1. Let U ∈ K^1_{∇(F_ℓ)}(F_ℓ), U = {U_1, ..., U_n}, be a 2-covering of F_ℓ with n = T_{∇(F_ℓ)}(F_ℓ). We claim that |U_i ∩ S| ≤ 1, for all U_i ∈ U. Otherwise, there would be two elements, say y_i^ℓ, y_j^ℓ ∈ U_m ∩ S, for some i < j ≤ M + 1, 1 ≤ m ≤ n. Then d(y_i^ℓ, y_j^ℓ) ≤ ∆(U_m) ≤ ∇(F_ℓ) < (α_{M+1}/2^M) − 2δ, by (7.1.16). Using (7.1.14) and simplifying, we have

α_{M+1}/2^M > (2^{j−i} − 1) d(x, x_j) ≥ d(x, x_j) ≥ α_{M+1},

the last inequality by the definition of α_{M+1}. This is a contradiction, as desired. Now, since U covers F_ℓ, S = ⋃_{i=1}^n (U_i ∩ S), so that M + 1 = |S| ≤ Σ_{i=1}^n |U_i ∩ S| ≤ n ≤ M, a contradiction. This establishes the lemma. □

Lemma 7.10. Suppose given three convergent sequences of positive real numbers {T_k}, {∆_k}, and {∇_k}, such that ∇_k < ∆_k for all k, and T_k → ∞, ∆_k → ∆ > 0, ∇_k → 0. Then

lim_{k→∞} ln T_k / ln(∆_k/∇_k) = lim_{k→∞} ln T_k / (−ln ∇_k).

Proof. The proof is obvious: ln(∆_k/∇_k) = ln ∆_k − ln ∇_k, with ln ∆_k → ln ∆ finite and −ln ∇_k → ∞, so the two denominators are asymptotically equal. □



We can summarize these results in the following:

Proposition 7.11. Suppose that X is a compact metric space satisfying ∇(X) = 0 < ∆(X), and lim_{k→∞} F_k = X for some sequence of finite metric spaces. Then

lim_{k→∞} dimfB(F_k) = lim_{k→∞} ln T_{∇(F_k)}(F_k) / (−ln ∇(F_k)).
Proof. Let Tk := T∇(Fk ) (Fk ), ∆k := ∆(Fk ), and ∇k := ∇(Fk ). By lemmas 7.6, 7.7, and 7.9, the hypothesis of Lemma 7.10 are satisfied for all large enough k. This gives the result.  7.2. Subsets of Rn . In this section we exploit special properties of Euclidean space to refine Proposition 7.1. For any ε > 0, consider the lattice ε · Zn ⊆ Rn . Using the hyperplanes parallel to the coordinate hyperplanes through each point of the lattice, we obtain a tiling of Rn by hypercubes. Let Q(ε) denote the set of all these hypercubes, which we simply call cubes. The cubes have side-length ε, and √ diameter ε n. Each cube is compact and has 2n corners that lie in ε · Zn . Each corner belongs to 2n cubes. Definition 7.12. For X ⊆ Rn , Q(ε, X) will denote the set of cubes of Q(ε) that have non-empty intersection with X.

A HAUSDORFF DIMENSION FOR FINITE SETS

27

Proposition 7.13. For every compact X ⊆ Rn (n ≥ 1), √ and ε > 0, there is a finite, locally uniform Fε ⊆ Rn , such that dH (Fε , X) ≤ ε n. In particular, given 0 < c < 1, there is a convergent sequence of finite, locally uniform Fk ⊆ Rn , such that Fk → X in (M(Rn ), dH ), and ∇(Fk+1 ) ≥ c · ∇(Fk ), for all k ∈ N. Proof. Define Fε := ε·Zn ∩Q(ε, X). Since X is bounded, Q(ε, X) is finite with, say, N elements. Hence, Fε contains no more than N ·2n points. Observe that if x ∈ Fε , then there is at least one Q ∈ Q(ε, X) such that x is one of the corners of Q. Hence, Fε contains not only x, but all 2n corners of Q. It follows that δ(F √ε ) = ∇(Fε ) = ε; in other words, Fε is locally uniform. Moreover, dH (X, Fε ) ≤ ε n. For the last part, given 0 < c < 1, the sequence Fk := Fck satisfies the required conditions. This completes the proof.  Corollary 7.14. Let X and {Fk } be as in Proposition 7.13, and suppose that ∇(X) = 0 < ∆(X). Then lim dimfH (Fk ) = lim

k→∞

k→∞

ln T∇(Fk ) (Fk ) · − ln ∇(Fk )

Proof. By Proposition 7.11, ln T∇(Fk ) (Fk ) · k→∞ − ln ∇(Fk )

lim dimfB (Fk ) = lim

k→∞

By Proposition 7.13, the Fk are locally uniform. Hence dimfH (Fk ) = dimfB (Fk ), by Proposition 5.5. The result follows.  S Lemma 7.15. Let C ⊆ Q(ε), and put F (C) := ε · Zn ∩ ( Q∈C Q), the set of all lattice points of the cubes of C. Then (7.2.1)

|F (C)| ≥ |C|.

Pn Proof. Consider the linear map f : Rn → R defined by f (x) := i=1 xi . Put M = max {f (x)|x ∈ F (C)}, and let a ∈ F (C) be a point with f (a) = M . We show that there is a unique cube Q(a) ∈ C, such that a ∈ Q(a). Let ei := (0, . . . , 0, 1, 0, . . . , 0), i = 1, . . . , n, denote the unit vectors of Rn , and define points pi = a + εei , and pn+i = a − εei , where i = 1, . . . , n. Observe that these points belong to ε · Zn , and f (pi ) = f (a) ± εf (ei ), which is equal to M + ε, when i = 1, . . . , n, and to M − ε, when i = n + 1, . . . , 2n. By the definition of M , only pi with i = n + 1, . . . , 2n, can belong to F (C). Exactly 2n cubes of the tiling Q(ε) contain a. Each of these is determined by a together with n lattice points chosen among those nearest to a, i.e. among the pi (i = 1, . . . , 2n). From the previous paragraph, we see that there is only one possible choice, namely B := {a, a − εe1 , . . . , a − εen }. Since, by definition, a must belong to some cube of C, we conclude that Q(a), the cube defined by B, is the only one that contains a and belongs to C, as desired. It is now straightforward to prove (7.2.1) by induction on m = |C|. Indeed, the inequality is obvious for m = 1. For the inductive step, suppose (7.2.1) holds for every set of no more than m cubes. For a subset C with m + 1 elements, find a and Q(a), and remove Q(a). The resulting C ′ has m cubes, a ∈ / F (C ′ ), and C = ′ ′ ′ ′ C ∪ Q(a). By induction, |F (C )| ≥ |C |, and |F (C)| ≥ |F (C) | + 1 ≥ |C ′ | + 1 = |C|, as desired. 


Proposition 7.16. Suppose that X ⊆ R^n is a compact space with ∇(X) = 0 < ∆(X), and F_k → X as in Proposition 7.13. If T̄_{∇(F_k)}(X) := |Q(∇(F_k), X)|, then

(7.2.2)    (1/2)·T̄_{∇(F_k)}(X) ≤ T_{∇(F_k)}(F_k) ≤ 2^{n−1}·T̄_{∇(F_k)}(X).

Hence also:

(7.2.3)    2^{1−n}·T_{∇(F_k)}(F_k) ≤ T̄_{∇(F_k)}(X) ≤ 2·T_{∇(F_k)}(F_k).

Proof. Apply Lemma 7.15 to C = Q(∇(F_k), X), to obtain |F_k| ≥ T̄_{∇(F_k)}(X). Notice that any subset U of F_k with three or more elements has diameter > ∇(F_k), because all points of F_k belong to the lattice. As a consequence, every U in a 2-covering 𝒰 of F_k, with ∆(𝒰) = ∇(F_k), contains exactly two elements. Hence, for any such 𝒰, 2·|𝒰| ≥ |F_k|. It follows that 2·T_{∇(F_k)}(F_k) ≥ |F_k| ≥ T̄_{∇(F_k)}(X), which proves the first inequality of (7.2.2).

The second inequality of (7.2.2) follows from the fact that every cube has a 2-covering by 2^{n−1} sets. To see this assume, without loss of generality, that the cube has base contained in the hyperplane x_n = 0, and top contained in x_n = ∇(F_k). Then each element of the covering has the form {(x′, 0), (x′, ∇(F_k))}, where (x′, 0) runs over the 2^{n−1} corners of the base of the cube. Hence F_k can be covered by at most 2^{n−1}·T̄_{∇(F_k)}(X) sets of two elements each, as was to be proved. Finally, note that (7.2.3) follows immediately from (7.2.2). □

7.3. The Convergence Theorems. In this section we complete the proofs of the convergence theorems.

Theorem 7.17. Let X ⊆ R^n be compact, with ∇(X) = 0 < ∆(X), and F_k → X as in Proposition 7.13. Then

(7.3.1)    lim_{k→∞} dim_fB(F_k) = dim_B(X).

Proof. By Proposition 7.11:

    lim_{k→∞} dim_fB(F_k) = lim_{k→∞} ln T_{∇(F_k)}(F_k) / (−ln ∇(F_k)).

On the other hand, Falconer [8, pp. 44–45] shows that when ∇(F_{k+1}) ≥ c·∇(F_k), the box-counting dimension is given by:

    dim_B(X) = lim_{k→∞} ln T̄_{∇(F_k)}(X) / (−ln ∇(F_k)).

Hence, to prove (7.3.1), it suffices to show:

(7.3.2)    lim_{k→∞} ln T_{∇(F_k)}(F_k) / (−ln ∇(F_k)) = lim_{k→∞} ln T̄_{∇(F_k)}(X) / (−ln ∇(F_k)).

Now, suppose that the left-hand side of (7.3.2) exists and equals s_0. Then the inequalities

    ln(2^{1−n}·T_{∇(F_k)}(F_k)) / (−ln ∇(F_k)) ≤ ln T̄_{∇(F_k)}(X) / (−ln ∇(F_k)) ≤ ln(2·T_{∇(F_k)}(F_k)) / (−ln ∇(F_k)),

obtained from (7.2.3), show that lim_{k→∞} ln T̄_{∇(F_k)}(X) / (−ln ∇(F_k)) also exists and equals s_0, since the constant terms ln 2^{1−n} and ln 2 vanish in the limit because −ln ∇(F_k) → ∞. If we assume instead that it is the right-hand side of (7.3.2) that exists and equals, say, s_0, we proceed in the same way, starting this time from (7.2.2). The proof is complete. □


Theorem 7.18. Let X ⊆ R^n be compact, with ∇(X) = 0 < ∆(X), and F_k → X as in Proposition 7.13. Suppose, moreover, that X is the attractor of an iterated function system (IFS). Then

(7.3.3)    lim_{k→∞} dim_fH(F_k) = dim_H(X).

Proof. When X is the attractor of an IFS, we have dim_B(X) = dim_H(X) (see [8, p. 132]). By Proposition 5.5, dim_fH(F_k) = dim_fB(F_k). Hence, (7.3.3) follows from Theorem 7.17. □

References

[1] C.C. Aggarwal, A. Hinneburg, and D.A. Keim, 'On the Surprising Behavior of Distance Metrics in High Dimensional Spaces', Proc. Eighth Int'l Conf. Database Theory (ICDT '01), vol. 1973 (2001) 420–434.
[2] R. Bellman, Adaptive Control Processes: A Guided Tour (Princeton University Press, 1961).
[3] K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft, 'When is nearest neighbor meaningful?', Proc. Int'l Conf. on Database Theory (ICDT) (1999) 217–235.
[4] D. Burago, Y. Burago, and S. Ivanov, A Course in Metric Geometry (American Mathematical Society, Providence, Rhode Island, 2001).
[5] R. Clarke, H.W. Ressom, A. Wang, J. Xuan, M.C. Liu, E.A. Gehan, and Y. Wang, 'The properties of high dimensional data spaces: implications for exploring gene and protein expression data', Nature Reviews Cancer 8 (2008) 37–49.
[6] K.L. Clarkson, 'Nearest-Neighbor Searching and Metric Space Dimensions', in Nearest-Neighbor Methods for Learning and Vision: Theory and Practice (MIT Press, 2006) 15–59.
[7] R.J. Durrant and A. Kabán, 'When is "nearest neighbour" meaningful: a converse theorem and implications', Journal of Complexity 25 (2009) 385–397.
[8] K. Falconer, Fractal Geometry: Mathematical Foundations and Applications, 2nd ed. (John Wiley and Sons, Chichester, 2003).
[9] D. François, V. Wertz, and M. Verleysen, 'The Concentration of Fractional Distances', IEEE Trans. Knowledge and Data Eng. 19 (2007) 873–886.
[10] Gavagai AB, www.gavagai.se
[11] C. Gianella, 'New Instability Results for High Dimensional Nearest Neighbor Search', Information Processing Letters 109 (2009) 1109–1113.
[12] A. Hinneburg, C.C. Aggarwal, and D.A. Keim, 'What is the Nearest Neighbor in High Dimensional Spaces?', Proc. 26th Int'l Conf. Very Large Data Bases (VLDB '00) (2000) 506–515.
[13] A. Holst, Private communication (2014).
[14] C-M. Hsu and M-S. Chen, 'On the Design and Applicability of Distance Functions in High-Dimensional Data Space', IEEE Transactions on Knowledge and Data Engineering 21 (2009).
[15] A. Kabán, 'On the Distance Concentration Awareness of Certain Data Reduction Techniques', Pattern Recognition 44 (2011) 265–277.
[16] A. Kabán, 'Non-parametric Detection of Meaningless Distances in High Dimensional Data', Statistics and Computing 22 (2012) 375–385.
[17] J. Karlgren, A. Holst, and M. Sahlgren, 'Filaments of meaning in Word Space', 30th European Conference on Information Retrieval (ECIR 08), Glasgow (2008).
[18] S. Pramanik and J. Li, 'Fast Approximate Search Algorithm for Nearest Neighbor Queries in High Dimensions', Proc. ICDE (1999) 251.
[19] M. Radovanović, A. Nanopoulos, and M. Ivanović, 'Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data', Journal of Machine Learning Research 11 (2010) 2487–2531.

Juan M. Alonso, Dpto. de Matemática, Universidad Nacional de San Luis, and UN de Cuyo, Argentina
E-mail address: [email protected]