Mathematische Annalen manuscript No. (will be inserted by the editor)

Sobolev spaces and approximation by affine spanning systems

H.-Q. Bui · R. S. Laugesen

© Springer-Verlag 2007

Received: date / Revised version: date

Abstract. We develop conditions on a Sobolev function $\psi \in W^{m,p}(\mathbb{R}^d)$ such that if $\widehat{\psi}(0) = 1$ and $\psi$ satisfies the Strang–Fix conditions to order $m-1$, then a scale averaged approximation formula holds for all $f \in W^{m,p}(\mathbb{R}^d)$:

$$f(x) = \lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^{J} \sum_{k\in\mathbb{Z}^d} c_{j,k}\, \psi(a_j x - k) \qquad \text{in } W^{m,p}(\mathbb{R}^d).$$

The dilations $\{a_j\}$ are lacunary, for example $a_j = 2^j$, and the coefficients $c_{j,k}$ are explicit local averages of $f$, or even pointwise sampled values, when $f$ has some smoothness. For convergence just in $W^{m-1,p}(\mathbb{R}^d)$ the scale averaging is unnecessary and one has the simpler formula $f(x) = \lim_{j\to\infty} \sum_{k\in\mathbb{Z}^d} c_{j,k}\, \psi(a_j x - k)$. The Strang–Fix rates of approximation are recovered.

As a corollary of the scale averaged formula, we deduce new density or “spanning” criteria for the small scale affine system $\{\psi(a_j x - k) : j > 0, k \in \mathbb{Z}^d\}$ in $W^{m,p}(\mathbb{R}^d)$. We also span Sobolev space by derivatives and differences of affine systems, and we raise an open problem: does the Gaussian affine system span Sobolev space?

Mathematics Subject Classification (2000): Primary 41A25, 42C40, 46E35. Secondary 42B35, 42C30.

H.-Q. Bui, Department of Mathematics, University of Canterbury, Christchurch 8020, New Zealand, e-mail: [email protected]
R. S. Laugesen, Department of Mathematics, University of Illinois, Urbana, IL 61801, U.S.A., e-mail: [email protected]

1. Introduction

We seek conditions on $\psi$ under which every Sobolev function $f$ can be approximated explicitly by linear combinations of the integer translates and small-scale dilates of $\psi$, that is, by linear combinations of $\psi(a_j x - k)$ for $j > 0$, $k \in \mathbb{Z}^d$. The dilations $a_j$ here are assumed to grow at least exponentially; for example $a_j = 2^j$. Our work on this approximation problem yields answers to the spanning problem of determining whether the $\psi(a_j x - k)$ span Sobolev space.

We illustrate our results now by stating them in one dimension, for the special case of Sobolev functions possessing one derivative. Fix $1 \le p < \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$, take $\psi \in W^{1,p}$ and suppose $\phi \in L^q$ has compact support, for the remainder of this Introduction. Also assume $\widehat{\psi}(\ell) = 0$ for all integers $\ell \ne 0$, or equivalently that $\sum_{k\in\mathbb{Z}} \psi(x - k) \equiv \text{const}$.

Approximation results

Write

$$f_j(x) = \sum_{k\in\mathbb{Z}} \left( \int_{\mathbb{R}} f(a_j^{-1} y)\, \phi(y - k)\, dy \right) \psi(a_j x - k)$$

for the quasi-interpolant of $f$ with analyzer $\phi$ and synthesizer $\psi$. Define a local supremum operator $Qf(x) = \|f\|_{L^\infty(x-1,x+1)}$. (See Sections 2 and 3 for more general definitions of $Q$ and $f_j$.)

Theorem 1 proves scale averaged convergence for $f \in W^{1,p}$:

$$f = \lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^{J} f_j \qquad \text{if } Q\psi,\ Q((1+|x|)\psi') \in L^1 \text{ and } \widehat{\psi}(0) = \widehat{\phi}(0) = 1.$$

(These $Q$-hypotheses say roughly that $\psi$ and $(1+|x|)\psi'$ are bounded and decay integrably at infinity.)

Theorem 2 implies the same for the pointwise quasi-interpolant $f_j^\bullet(x) = \sum_{k\in\mathbb{Z}} f(a_j^{-1} k)\, \psi(a_j x - k)$, provided $f \in W^{1,p} \cap C^1$ with $Qf, Q(f') \in L^p$.

These two theorems give convergence rate $o(1)$ in the $W^{1,p}$ norm. For the $L^p$ norm, the “Strang–Fix” rate of convergence $O(|a_j|^{-1})$ is obtained as expected, by Theorem 3:

if $Q((1+|x|)\psi) \in L^1$ and $\widehat{\psi}(0) = \widehat{\phi}(0) = 1$ and $\widehat{\phi}\,'(0) = 0$, then $\|f - f_j\|_p \le C |f|_{W^{1,p}} |a_j|^{-1} = O(|a_j|^{-1})$ as $j \to \infty$, for each $f \in W^{1,p}$.

Spanning results

Corollary 2 deduces that:

if $\psi'$ decays like $|x|^{-2-\varepsilon}$ at infinity, and $\widehat{\psi}(0) \ne 0$, then the small scale affine system $\{\psi(a_j x - k) : j > 0, k \in \mathbb{Z}\}$ spans $W^{1,p}$.


Finally, taking derivatives and differences of known spanning systems will generate yet more spanning systems, as Proposition 1 and Theorem 4 explain.

Outline of the paper

The standing assumptions on dilations and translations are established in Section 2, along with some definitions. Section 3 gives approximation formulas for $W^{m,p}(\mathbb{R}^d)$, with relevant literature summarized in Section 3.5. Spanning results are deduced in Section 4.

Spanning properties of the second difference of the Gaussian are determined in Section 5. Spanning properties of the second derivative of the Gaussian, a function known as the Mexican hat, remain mostly unknown. This open problem is related in Section 5 to a spanning conjecture for the Gaussian $\psi(x) = e^{-x^2/2}$ itself: does the small scale dyadic system $\{\psi(2^j x - k) : j > 0, k \in \mathbb{Z}\}$ span $W^{m,p}$?

The technical core of the paper is in Section 6, where discretized approximate identities are studied and scale averaging is introduced through formula (23). Then Theorems 1, 2 and 3 are proved in Sections 8, 9 and 11, after which appear the remaining proofs and an appendix about the $Q$-operator.

Remark. This paper builds on our $L^p$ results in [5]. The Hardy space $H^1$ was treated in [7]. Surjectivity of the synthesis operator is developed in the new paper [8], and [25] handles $L^p$ for $p < 1$.

2. Definitions and notation

1. Fix the dimension $d \in \mathbb{N}$ and write $\mathcal{C} = [0,1)^d$ for the unit cube in $\mathbb{R}^d$.
2. Let the dilations $a_j$ for $j > 0$ be nonzero real numbers with $|a_j| \to \infty$ as $j \to \infty$. Define $a_{\min} = \min_{j>0} |a_j|$. Some of our results further assume the dilations grow exponentially, meaning $|a_{j+1}| \ge \gamma |a_j|$ for all $j > 0$, for some $\gamma > 1$ (so that the dilation sequence is lacunary).
3. Fix a translation matrix $b$, assumed to be an invertible $d \times d$ real matrix. Some of our constants and operators in this paper will depend implicitly on $b$ and the dimension $d$.
4. Write $L^p = L^p(\mathbb{R}^d)$ for the class of complex valued functions with finite $L^p$-norm, and $W^{m,p} = W^{m,p}(\mathbb{R}^d)$ for the Sobolev functions with $m$ derivatives in $L^p$. Given a multiindex $\mu$ of order $|\mu| = \mu_1 + \cdots + \mu_d$, we write $f^{(\mu)} = D^\mu f$ for the $\mu$-th derivative of $f$.


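5. Given $\psi \in L^p$ and $\phi \in L^q$, where by convention

$$\frac{1}{p} + \frac{1}{q} = 1,$$

we define

$$\psi_{j,k}(x) = |a_j|^{d/p}\, \psi(a_j x - bk), \qquad \phi_{j,k}(x) = |a_j|^{d/q}\, \phi(a_j x - bk), \qquad x \in \mathbb{R}^d,$$

for $j > 0$, $k \in \mathbb{Z}^d$. These rescalings satisfy $\|\psi_{j,k}\|_p = \|\psi\|_p$ and $\|\phi_{j,k}\|_q = \|\phi\|_q$.
6. The periodization of a function $f$ is

$$Pf(x) = |\det b| \sum_{k\in\mathbb{Z}^d} f(x - bk) \qquad \text{for } x \in \mathbb{R}^d.$$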

If $f \in L^1$, then this series for $Pf$ converges absolutely for almost every $x$, and $Pf$ is locally integrable.
7. Define a local supremum operator $Qf(x) = \operatorname{ess\,sup}$ over $|y - x| <$ […] $j > 0$, $x \in \mathbb{R}^d$, where $f$ is the signal, $\phi$ is the analyzer and $\psi$ is the synthesizer.

To understand $f_j$, suppose $\phi$ is a delta function (like in Theorem 2 below); then with $b = I$ we get the quasi-interpolant $f_j(x) = \sum_{k\in\mathbb{Z}^d} f(a_j^{-1} k)\, \psi(a_j x - k)$.

Our first theorem finds conditions under which the $f_j$ provide a good approximation to $f$.

Theorem 1. Assume $\psi \in W^{m,p}$ for some $1 \le p < \infty$, $m \in \mathbb{N}$, and suppose one of the following conditions holds:
(i) $P|\chi^{|\mu|} \psi^{(\mu)}| \in L^p_{loc}$ for all $|\mu| \le m$, and $\chi^m \phi \in L^1$, and $f \in C_c^m$;
(ii) $Q(\chi^{|\mu|} \psi^{(\mu)}) \in L^1$ for all $|\mu| \le m$, and $\phi \in L^q$ with $\phi$ having compact support, and $f \in W^{m,p}$.
Suppose

$$D^\mu \widehat{\psi}(\ell b^{-1}) = 0 \qquad \text{for all row vectors } \ell \in \mathbb{Z}^d \setminus \{0\}, \tag{2}$$

whenever $|\mu| < m$. Assume $\int_{\mathbb{R}^d} \psi\, dx = 1$ and $\int_{\mathbb{R}^d} \phi\, dx = 1$. Then (a)–(d) hold:
(a) [Strang–Fix approximation] If in addition (2) holds whenever $|\mu| = m$, then

$$f = \lim_{j\to\infty} f_j \qquad \text{in } W^{m,p}. \tag{3}$$

(b) [Scale-averaged approximation] If the dilations $a_j$ grow exponentially, then

$$f = \lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^{J} f_j \qquad \text{in } W^{m,p}.$$

(c) [Stability] If (ii) holds then $\|f_j\|_{W^{m,p}} \le C(\psi, \phi, m, p)\, \|f\|_{W^{m,p}}$ for all $j > 0$.
(d) [Span] $f_j \in W^{m,p}\text{-span}\{\psi_{j,k} : k \in \mathbb{Z}^d\}$ for each $j > 0$.

The proof is in Section 8.

Examples. A decay condition near infinity guarantees hypothesis (i) on $\psi$:


Lemma 1. Let $\psi \in W^{m,p}$ for some $1 \le p < \infty$, $m \in \mathbb{N}$. If

$$|\psi^{(\mu)}(x)| \le C |x|^{-d-m-\varepsilon} \qquad \text{for each } |\mu| = m \text{ and almost every } |x| > R, \tag{4}$$

for some constants $C, R, \varepsilon > 0$, then $P|\chi^{|\mu|} \psi^{(\mu)}| \in L^p_{loc}$ for all $|\mu| \le m$.

Hypothesis (ii) holds if $\psi$ and its derivatives are bounded and decay at infinity:

Lemma 2. Let $\psi \in W^{m,\infty}$ for some $m \in \mathbb{N}$. If decay condition (4) holds, then $Q(\chi^{|\mu|} \psi^{(\mu)}) \in L^1$ for all $|\mu| \le m$.

Lemmas 1 and 2 are proved in Appendix A.

Notes on Theorem 1. The $L^p$ result corresponding to Theorem 1 is [5, Theorem 1]. The hypotheses there are roughly the same as case (i) with $m = 0$, except that $f$ need not be continuous with compact support. Precisely, the $L^p$ result assumes that $\psi \in L^p$ with $P|\psi| \in L^p_{loc}$, $\phi \in L^q$ with $P|\phi| \in L^\infty$, and $f \in L^p$. The reason our Sobolev result Theorem 1 can only handle $f \in C_c^m$, in case (i), boils down to our inability to prove a stability estimate in Lemma 4 case (i) for the general function $h(x, y)$.

Case (ii) assumes more on $\psi$ than case (i) does (because $Q(\cdot) \in L^1$ implies $P|\cdot| \in L^\infty$ by [6, Lemma 23]). But case (ii) has the advantage of applying to all $f \in W^{m,p}$ and not just to $f \in C_c^m$. Also, case (ii) yields a stability estimate in Theorem 1(c).

We call condition (2) the Strang–Fix condition of order $m - 1$, in view of the work of Strang and Fix in [15, 33, 34] (although historically, Schoenberg [32, Theorem 2] seems to have been the first to use the condition, in the context of polynomial interpolation and smoothing in one dimension).

The Strang–Fix condition can be satisfied formally by putting

$$\psi = u * \cdots * u * \psi_0 \qquad \text{(with $m$ factors of $u$)} \tag{5}$$

where $u$ has constant periodization $Pu = 1$ a.e. (meaning the integer translates of $u$ form a partition of unity). Indeed $Pu = 1$ a.e. implies $\widehat{u}(0) = 1$ and $\widehat{u}(\ell b^{-1}) = 0$ for all $\ell \in \mathbb{Z}^d \setminus \{0\}$, by computing the Fourier coefficients of the $b\mathbb{Z}^d$-periodic function $Pu$, and thus the Strang–Fix condition (2) follows from the fact that $\widehat{\psi} = \widehat{u} \cdots \widehat{u}\, \widehat{\psi_0}$. For a different interpretation of the Strang–Fix condition, in terms of periodizations of moments of $\psi$, see Section 7.

Our methods for Theorem 1 extend to cover dilation matrices $a_j$ that expand both exponentially ($\sup_{j>0} \|a_j a_{j+1}^{-1}\| < 1$) and nicely ($\|a_j^{-1}\|^d \le C |\det a_j^{-1}|$); see [6, §7]. But our method breaks down for dilations like $\begin{pmatrix} 3^j & 0 \\ 0 & 2^j \end{pmatrix}$ that do not expand nicely.
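The partition-of-unity remark can be checked numerically in a simple case. The following is a minimal sketch, assuming $d = 1$, $b = 1$ and the hat function $u(x) = \max(0, 1 - |x|)$ as an illustrative choice of $u$ (the helper names `hat`, `periodization` and `fourier_transform` are hypothetical and not from the paper). It verifies that $Pu \equiv 1$ and that a quadrature approximation of $\widehat{u}(\xi) = \int u(x) e^{-2\pi i x \xi}\, dx$ is close to $1$ at $\xi = 0$ and close to $0$ at the nonzero integers.

```python
import numpy as np

def hat(x):
    """Hat function u(x) = max(0, 1 - |x|) (an illustrative choice of u)."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def periodization(u, x, kmax=5):
    """P u(x) = sum of u(x - k) over |k| <= kmax, assuming b = 1 so |det b| = 1."""
    return sum(u(x - k) for k in range(-kmax, kmax + 1))

def fourier_transform(u, xi, half_width=1.0, n=20001):
    """Trapezoid approximation of the integral of u(x) exp(-2 pi i x xi) over [-half_width, half_width]."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    vals = u(x) * np.exp(-2j * np.pi * x * xi)
    return (vals.sum() - 0.5 * (vals[0] + vals[-1])) * dx

x = np.linspace(-0.5, 0.5, 11)
partition_error = np.max(np.abs(periodization(hat, x) - 1.0))   # deviation of P u from 1
u_hat_at_0 = fourier_transform(hat, 0.0)                        # should be close to 1
u_hat_at_lattice = [abs(fourier_transform(hat, float(l))) for l in (1, 2, 3)]  # close to 0
```

Only the order-zero vanishing $\widehat{u}(\ell) = 0$ is tested here; the higher-order conditions obtained from convolution powers as in (5) are not checked by this sketch.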


Relevant literature for Theorem 1 will be discussed in Section 3.5.

3.2. Properties of $f_j$

Observe that $f_j$ discretizes a classical approximation to the identity:

$$\begin{aligned}
f(x) &= \lim_{j\to\infty} (f * \psi_{a_j^{-1}})(x) \\
&= \lim_{j\to\infty} \int_{\mathbb{R}^d} f(z)\, |a_j|^d\, \psi(a_j (x - z))\, dz \\
&= \lim_{j\to\infty} \int_{\mathbb{R}^d} f(a_j^{-1} y)\, \psi(a_j x - y)\, dy \qquad \text{by } z = a_j^{-1} y \\
&\approx \lim_{j\to\infty} \sum_{k\in\mathbb{Z}^d} \left( \int_{k+\mathcal{C}} f(a_j^{-1} y)\, dy \right) \psi(a_j x - k)
\end{aligned} \tag{6}$$

by a Riemann sum approximation. This last line (6) is exactly $\lim_{j\to\infty} f_j$, with $\phi = 1_{\mathcal{C}}$ and $b = I$. Caution is required in the Riemann sum approximation step, because we discretize with fixed step size 1. Theorem 1(a)(b) nonetheless shows the approximation (6) is exact in the $W^{m,p}$-norm as $j \to \infty$ provided either $\psi$ satisfies Strang–Fix conditions to order $m$ or else $\psi$ satisfies them to order $m - 1$ and the approximation formula is averaged over all dilation scales.

Second, we can express $f_j$ in terms of an integral kernel as $f_j(x) = \int_{\mathbb{R}^d} K_j(x, y) f(y)\, dy$ where $K_j(x, y) = |a_j|^d K(a_j x, a_j y)$ and

$$K(x, y) = |\det b| \sum_{k\in\mathbb{Z}^d} \psi(x - bk)\, \phi(y - bk).$$

The stability estimate in Theorem 1(c) says that $K_j : W^{m,p} \to W^{m,p}$ with a norm estimate that is independent of $j$, provided hypothesis (ii) holds.

3.3. Approximation using pointwise sampling

Now we develop an analogue of Theorem 1 that uses pointwise sampling. Write

$$f_j^\bullet(x) = |\det b| \sum_{k\in\mathbb{Z}^d} f(a_j^{-1} bk)\, \psi(a_j x - bk) \tag{7}$$

for the quasi-interpolant of $f$ at scale $j$, sampled on the uniform grid $a_j^{-1} b\mathbb{Z}^d$. The “•” notation refers to the pointwise nature of the sampling.
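To make definition (7) concrete, here is a minimal numerical sketch, assuming $d = 1$, $b = 1$, dyadic dilations $a_j = 2^j$, the hat function as synthesizer $\psi$, and a Gaussian test signal $f$; all of these choices, and the helper name `quasi_interpolant`, are illustrative assumptions rather than anything prescribed by the paper. With the hat function, $f_j^\bullet$ is simply the piecewise linear interpolant of $f$ on the grid $2^{-j}\mathbb{Z}$, so the sup-norm error recorded below is expected to shrink as $j$ grows.

```python
import numpy as np

def hat(x):
    """Synthesizer psi(x) = max(0, 1 - |x|) (illustrative choice)."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def quasi_interpolant(f, a, x, kmin, kmax):
    """Pointwise quasi-interpolant sum_k f(k / a) * psi(a x - k), assuming b = 1 and psi = hat."""
    k = np.arange(kmin, kmax + 1)
    # Rows index the evaluation points x, columns index the translates k.
    return (f(k / a)[None, :] * hat(a * x[:, None] - k[None, :])).sum(axis=1)

f = lambda t: np.exp(-t**2 / 2)          # smooth test signal (arbitrary choice)
x = np.linspace(-3.0, 3.0, 601)          # evaluation points
errors = []
for j in range(1, 6):
    a = 2.0**j                           # dyadic dilation a_j = 2^j
    approx = quasi_interpolant(f, a, x, kmin=int(-4 * a), kmax=int(4 * a))
    errors.append(np.max(np.abs(approx - f(x))))   # sup-norm error on the sample points
```

The sketch measures the error only in the sup norm on a finite set of points; it does not test convergence in the Sobolev norms treated by the theorems.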


Theorem 2. Assume $\psi \in W^{m,p}$ for some $1 \le p < \infty$, $m \in \mathbb{N}$, and that $Q(\chi^{|\mu|} \psi^{(\mu)}) \in L^1$ for all $|\mu| \le m$, and $f \in W^{m,p} \cap C^m$ with $Q(f^{(\mu)}) \in L^p$ for all $|\mu| \le m$. Suppose

$$D^\mu \widehat{\psi}(\ell b^{-1}) = 0 \qquad \text{for all row vectors } \ell \in \mathbb{Z}^d \setminus \{0\}, \tag{8}$$

whenever $|\mu| < m$. Assume $\int_{\mathbb{R}^d} \psi\, dx = 1$. Then (a)–(d) hold:
(a) [Strang–Fix approximation] If in addition (8) holds whenever $|\mu| = m$, then

$$f = \lim_{j\to\infty} f_j^\bullet \qquad \text{in } W^{m,p}.$$

(b) [Scale-averaged approximation] If the dilations $a_j$ grow exponentially, then

$$f = \lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^{J} f_j^\bullet \qquad \text{in } W^{m,p}.$$

(c) [Stability] $\|f_j^\bullet\|_{W^{m,p}} \le C(\psi, m, p, a_{\min}) \sum_{|\mu| \le m} \|Q(f^{(\mu)})\|_p$ for all $j > 0$, where $a_{\min} = \min_{j>0} |a_j|$.
(d) [Span] $f_j^\bullet \in W^{m,p}\text{-span}\{\psi_{j,k} : k \in \mathbb{Z}^d\}$ for each $j > 0$.

See Section 9 for the proof. The $C^m$-smoothness of $f$ in the theorem is convenient, but it could be weakened like in the corresponding $L^p$ result [5, Theorem 2]. For simplicity, Theorem 2 is stated only with hypothesis (ii) from Theorem 1, although it can be proved under hypothesis (i) also.

3.4. Approximation rates

The preceding two theorems can be adapted to give explicit rates of approximation of $f_j$ to $f$. But we must first construct analyzers and synthesizers with suitably normalized moments.

Lemma 3. Suppose $\phi, \psi \in L^1$ with $\chi^m \phi, \chi^{m-1} \psi \in L^1$ for some $m \in \mathbb{N}$. If $\int_{\mathbb{R}^d} \phi\, dx \ne 0$ and $\int_{\mathbb{R}^d} \psi\, dx \ne 0$, then there exists a finite set $K \subset \mathbb{Z}^d$ and coefficients $\alpha_k, \beta_k \in \mathbb{C}$ for $k \in K$ such that the linear combinations

$$\Phi(x) = \sum_{k\in K} \alpha_k\, \phi(x - bk) \qquad \text{and} \qquad \Psi(x) = \sum_{k\in K} \beta_k\, \psi(x + bk)$$

satisfy the moment conditions

$$\int_{\mathbb{R}^d} x^\mu\, \Phi(x)\, dx = \begin{cases} 1 & \text{if } \mu = 0, \\ 0 & \text{if } 0 < |\mu| \le m, \end{cases} \tag{9}$$

$$\int_{\mathbb{R}^d} (-x)^\mu\, \Psi(x)\, dx = \begin{cases} 1 & \text{if } \mu = 0, \\ 0 & \text{if } 0 < |\mu| \le m - 1. \end{cases} \tag{10}$$
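As an illustration of the moment conditions (9) in Lemma 3, the coefficients $\alpha_k$ can be found in simple cases by solving a small linear system. The sketch below assumes $d = 1$, $b = 1$, the analyzer $\phi = 1_{[0,1)}$, and hand-picked shift sets $K$ with $m + 1$ elements so that the system is square; these choices and the helper names are hypothetical and are not the construction used in Section 10.

```python
import numpy as np

def box_moment(n, k):
    """Integral of x**n over [k, k+1): the n-th moment of phi(x - k) for phi = 1_[0,1)."""
    return ((k + 1) ** (n + 1) - k ** (n + 1)) / (n + 1)

def moment_coefficients(shifts, m):
    """Solve for alpha_k so that Phi = sum_k alpha_k phi(x - k) has moments (1, 0, ..., 0) up to order m.

    Assumes len(shifts) == m + 1, so that the moment matrix is square.
    """
    A = np.array([[box_moment(n, k) for k in shifts] for n in range(m + 1)], dtype=float)
    rhs = np.zeros(m + 1)
    rhs[0] = 1.0                      # zeroth moment equals 1, higher moments vanish
    return np.linalg.solve(A, rhs)

alpha_m1 = moment_coefficients([-1, 0], m=1)       # symmetric average of two boxes
alpha_m2 = moment_coefficients([-1, 0, 1], m=2)    # three boxes, moments of order 1 and 2 vanish
```

For $m = 1$ the solution is $\alpha_{-1} = \alpha_0 = 1/2$, that is, $\Phi = \tfrac{1}{2} 1_{[-1,1)}$, whose first moment vanishes by symmetry.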


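The proof is in Section 10, along with examples of how to construct the linear combinations for $\Phi$ and $\Psi$.

Now we can determine the rate at which $f_j$ approximates $f \in W^{m,p}$ in the $W^{r,p}$-norm, for $0 \le r \le m$. Recall $|f|_{W^{r,p}} = \left( \sum_{|\mu| = r} \|D^\mu f\|_p^p \right)^{1/p}$ is the Sobolev seminorm.

Theorem 3. Assume $\psi \in W^{m-1,p}$ for some $1 \le p < \infty$, $m \in \mathbb{N}$, with $Q(\chi^m \psi^{(\mu)}) \in L^1$ for all $|\mu| < m$, and take $\phi \in L^q$ with compact support. Suppose

$$D^\mu \widehat{\psi}(\ell b^{-1}) = 0 \qquad \text{for all row vectors } \ell \in \mathbb{Z}^d \setminus \{0\},$$

whenever $|\mu| < m$. Assume $\int_{\mathbb{R}^d} \psi\, dx \ne 0$, $\int_{\mathbb{R}^d} \phi\, dx \ne 0$, and that $\Psi$ and $\Phi$ are as in Lemma 3.
(a) [Average sampling] If $f \in W^{m,p}$ then for each $r = 0, 1, \ldots, m - 1$,

$$|F_j - f|_{W^{r,p}} \le C(\psi, \phi, m, p)\, |f|_{W^{m,p}}\, |a_j|^{r-m} = O(|a_j|^{r-m}) \qquad \text{for all } j > 0,$$

where $F_j$ is defined by average sampling with analyzer $\Phi$ and synthesizer $\Psi$:

$$\begin{aligned}
F_j(x) &= |\det b| \sum_{k\in\mathbb{Z}^d} \left( \int_{\mathbb{R}^d} f(a_j^{-1} y)\, \Phi(y - bk)\, dy \right) \Psi(a_j x - bk) \\
&= |\det b| \sum_{k\in\mathbb{Z}^d} \left( \sum_{k_1, k_2 \in K} \alpha_{k_1} \beta_{k_2} \int_{\mathbb{R}^d} f(a_j^{-1} y)\, \phi(y - b(k + k_1 + k_2))\, dy \right) \psi(a_j x - bk).
\end{aligned}$$

(b) [Pointwise sampling] Suppose $f \in W^{m,p} \cap C^m$ and $Q(f^{(\mu)}) \in L^p$ for all $|\mu| \le m$. Then for each $r = 0, 1, \ldots, m - 1$,

$$|F_j^\bullet - f|_{W^{r,p}} \le C(\psi, m, p, a_{\min}) \sum_{|\mu| = m} \|Q(f^{(\mu)})\|_p\, |a_j|^{r-m} = O(|a_j|^{r-m})$$

for all $j > 0$, where $a_{\min} = \min_{j>0} |a_j|$ and $F_j^\bullet$ is defined by uniform pointwise sampling with synthesizer $\Psi$:

$$\begin{aligned}
F_j^\bullet(x) &= |\det b| \sum_{k\in\mathbb{Z}^d} f(a_j^{-1} bk)\, \Psi(a_j x - bk) \\
&= |\det b| \sum_{k\in\mathbb{Z}^d} \left( \sum_{k_2 \in K} \beta_{k_2}\, f(a_j^{-1} b(k + k_2)) \right) \psi(a_j x - bk).
\end{aligned}$$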


The theorem is proved in Section 11.

Remarks on Theorem 3.
1. Theorems 1 and 2 can give further information on $F_j$ and $F_j^\bullet$, such as stability estimates.
2. Theorem 3 does not consider case (i) of Theorem 1, because stability estimates underpin the proof and we only know stability in case (ii).
3. The scale averaging technique in Theorems 1(b) and 2(b) does not help obtain rates of approximation. The problem, when one digs into the proofs, is that the scale averaged periodization $\frac{1}{J} \sum_{j=1}^{J} P\psi(a_j x)$ will generally fail to converge uniformly to its mean value; in particular, convergence fails at $x = 0$ if $P\psi(0)$ is not equal to the mean value and $P\psi$ is continuous.
4. Theorem 3(a) implies that $F_j \to f$ in $W^{m-1,p}$, so that the $\psi_{j,k}$ span $W^{m,p}$ using the $W^{m-1,p}$-norm. In fact the $\psi_{j,k}$ span $W^{m,p}$ in its own norm, by Corollary 1. This illustrates the “gain of one order” provided by scale averaging, in our work.

3.5. The Sobolev approximation literature and prior results

Overview. Our main contribution in Section 3 is the scale averaged approximation in Theorem 1(b), which is genuinely new. The pointwise sampling results in Theorem 2 and Theorem 3(b) seem also to be new. The average sampling results (big-O approximation rates) in Theorem 3(a) are essentially known.

Detailed discussion. We now give a more complete account of the literature, and our contributions.

The original approximation results in $W^{m,p}$ with $\int_{\mathbb{R}^d} \psi\, dx \ne 0$ all assume that $p = 2$ and $\psi$ has compact support. See Babuška [4, Theorem 4.1] and Strang and Fix [34, Theorem I]. These approximation formulas are not explicit, in the sense that they use sampled values of $\widehat{f}$, rather than of $f$, to construct an approximation to $f$ by Fourier transform methods. These indirect Fourier methods are characteristic of the work of Strang and Fix and most of the papers inspired by them. By contrast, we work with explicit quasi-interpolants in this paper, namely the functions $f_j(x)$.

Di Guglielmo had earlier proved an explicit approximation result [18, Théorème 6] for $p = 2$, provided also $\psi$ is a convolution like in (5) with $u$ being the characteristic function of a unit cube. This means $\widehat{u}$ vanishes on the union of hyperplanes

$$\{\xi \in \mathbb{R}^d : \xi_i \in \mathbb{Z} \setminus \{0\} \text{ for some } i = 1, \ldots, d\},$$

and so $\widehat{\psi}$ vanishes on all these hyperplanes too, instead of just vanishing at the lattice points (where hyperplanes intersect) like in the work of Babuška, Strang and Fix.

For $p = 2$, these authors all prove big-O approximation rates that are analogous to our Theorem 3(a). That is, they show an arbitrary $f \in W^{m,2}$ can be approximated in the $W^{r,2}$ norm at rate $O(|a_j|^{r-m})$ as $j \to \infty$, for each $r = 0, 1, \ldots, m - 1$. The best possible result of this kind is due to Jetter and Zhou [21, Theorem 1], who completely characterized the functions $\psi$ and $\phi$ for which these approximation rates can hold, when $p = 2$. See also Holtz and Ron [20, Theorems 7, 9].

For all $1 \le p < \infty$, Jia [22, Theorem 3.1] has proved analogous approximation rates under the assumption that $\psi$ and $\phi$ have compact support. Thus Theorem 3(a) is known already in the compactly supported case. Jia’s proof is different from ours, although both proofs avoid the Fourier transform and hence can treat $p \ne 2$ along with $p = 2$.

Theorem 3(a) improves on all these results in a technical sense (except for Jetter–Zhou and Holtz–Ron when $p = 2$), because the hypothesis $Q(\chi^m \psi^{(\mu)}) \in L^1$ can hold even when $\psi$ does not have compact support.

Much more attention has been paid in the literature to the case $r = 0$ of Theorem 3(a) (approximation of Sobolev functions in the $L^p$-norm) than to the case $r > 0$ (approximation of Sobolev functions in Sobolev norms). See for example [24, §7] for all $p$, and the references in [20] for $p = 2$.

The approximation rates in Theorem 3(b), for pointwise sampling, seem to be new except when $p = 2$, which was considered by Jetter and Zhou [21, Theorem 5] provided $m > d/2$. When $p = \infty$ (uniform approximation), Strang and Fix [34, Theorem III] did use pointwise sampling to approximate $f$ in the $W^{r,\infty}$ norm, and other authors have since extended those results. Pointwise sampling in Besov and Triebel–Lizorkin spaces is treated in the recent paper [30].

Turn now to Theorem 1.
Part (a) was essentially proved by Di Guglielmo [18, Théorème 2′] for $p \ge 2$, under the strong “convolution” assumption on $\psi$ mentioned above. Notice Theorem 1(a) only gives convergence at the rate $o(1)$, although to its credit this is accomplished without the vanishing moment assumption on the analyzer and synthesizer needed in Theorem 3.

Theorem 1(b) proves scale averaged convergence, which is new. We are aware of no precedents in the Strang–Fix tradition or in related approximation theory. Note that Theorem 1(b) gives convergence in the $W^{m,p}$-norm in a situation where Strang–Fix type results like Theorem 3 can only prove convergence in the $W^{m-1,p}$-norm (because the Strang–Fix condition is assumed only to order $m - 1$). The convergence rate in Theorem 1 is merely $o(1)$, but that will later suffice to yield interesting spanning results, in Section 4.

Lastly, Theorem 2 is the analogue of Theorem 1 for approximation in $W^{m,p}$ by pointwise sampling. It seems not to have direct forebears in the literature.

Additional remarks. Strang and Fix [34, Theorem I] proved a converse saying that the Strang–Fix condition to order $m - 1$ is necessary for approximating an arbitrary $f \in W^{m,2}$ in the $W^{r,2}$-norm (for $r = 0, 1, \ldots, m - 1$) at rate $O(|a_j|^{r-m})$ in a “controlled” fashion by functions of the form $\sum_{k\in\mathbb{Z}^d} c_{j,k} \psi_{j,k}$. The point of Theorem 3(a) in this paper is to prove sufficient conditions under which the quasi-interpolant $f_j$ achieves this best possible rate of approximation. We do not consider necessary conditions.

Mikhlin’s monograph [29] develops Strang–Fix type approximation results using “primitive functions”. Unfortunately the number of such generators must grow with $m$.

Maz’ya and Schmidt [26, 27, 31] developed a theory of approximate approximations that can be viewed as Strang–Fix theory without the full Strang–Fix conditions. Their approximations possess inescapable saturation errors and thus do not actually converge. Nonetheless, Maz’ya and Schmidt make a case that the saturation errors can be negligible in practical situations.

4. Spanning results: synthesizers and their derivatives and differences

First we deduce spanning results from our earlier approximation theorems.

Corollary 1. Assume $\psi \in W^{m,p}$ for some $1 \le p < \infty$, $m \in \mathbb{N}$, and suppose $P|\chi^{|\mu|} \psi^{(\mu)}| \in L^p_{loc}$ for all $|\mu| \le m$. Assume

$$D^\mu \widehat{\psi}(\ell b^{-1}) = 0 \qquad \text{for all row vectors } \ell \in \mathbb{Z}^d \setminus \{0\},$$

whenever $|\mu| < m$. If $\int_{\mathbb{R}^d} \psi\, dx \ne 0$, then $\{\psi_{j,k} : j > 0, k \in \mathbb{Z}^d\}$ spans $W^{m,p}$.

Spanning means the finite linear combinations of the functions $\psi_{j,k}$ are dense in $W^{m,p}$.

The analogous $L^p$ spanning result [5, Corollary 1] holds when $P|\psi| \in L^p_{loc}$.

Proof (of Corollary 1). The dilations $a_j$ can be taken to grow exponentially, by passing to a subsequence if necessary. And we can require $\int_{\mathbb{R}^d} \psi\, dx = 1$, since multiplying $\psi$ by a nonzero constant does not affect the span of the $\psi_{j,k}$. Let $\phi$ be the characteristic function of a unit cube. Then the $W^{m,p}$-span of the $\psi_{j,k}$ contains $C_c^m$, by Theorem 1(b)(d) case (i). By density of $C_c^m$, the $\psi_{j,k}$ therefore span all of $W^{m,p}$.

Next we conclude that a simple decay condition near infinity suffices for the $\psi_{j,k}$ to span $W^{m,p}$, in conjunction with the Strang–Fix vanishing of the Fourier transform at the lattice points, to order $m - 1$.

Corollary 2. Assume $\psi \in W^{m,p}$ for some $1 \le p < \infty$, $m \in \mathbb{N}$, and that $\psi$ decays according to

$$|\psi^{(\mu)}(x)| \le C |x|^{-d-m-\varepsilon} \qquad \text{for each } |\mu| = m \text{ and all large } |x|,$$

for some constants $C, \varepsilon > 0$. Suppose $D^\mu \widehat{\psi}(\ell b^{-1}) = 0$ for all $\ell \in \mathbb{Z}^d \setminus \{0\}$ and $|\mu| < m$. If $\int_{\mathbb{R}^d} \psi\, dx \ne 0$, then $\{\psi_{j,k} : j > 0, k \in \mathbb{Z}^d\}$ spans $W^{m,p}$.

To prove the corollary, just combine Corollary 1 with Lemma 1. The analogous $L^p$ result ($m = 0$) is in [5, Corollary 2]. We are not aware of any previous spanning results of this kind for Sobolev space.

Our next result spans by derivatives of a given spanning set.

Proposition 1. Let $H \subset W^{m,p}$ for some $1 < p < \infty$, $m \in \mathbb{N}$, and suppose $H$ spans $W^{m,p}$. If $\nu$ is a multiindex of order $0 < |\nu| \le m$, then the collection $\{D^\nu h : h \in H\}$ spans $W^{m-|\nu|,p}$.

The proposition is proved in Section 12. Clearly it fails for $p = 1$, since $\int_{\mathbb{R}^d} D^\nu h\, dx = 0$ always.

Example for Proposition 1. If $\psi$ satisfies the hypotheses of Corollary 1 or 2, and $1 < p < \infty$, then the $(D^\nu \psi)_{j,k}$ span $W^{m-|\nu|,p}$ by Proposition 1. In particular, they span $L^p$ when $|\nu| = m$. Note the Fourier transform of our new affine generator $D^\nu \psi$ vanishes at all lattice points, with

$$D^\mu \widehat{D^\nu \psi}(\ell b^{-1}) = 0$$

whenever $\ell \in \mathbb{Z}^d \setminus \{0\}$ and $|\mu| < m$, and also whenever $\ell = 0$ and $\mu < \nu$.

Our final result shows that in most cases, the span of an affine system is not changed by taking differences of the generator. Our notation for first differences is

$$\Delta_{c,z} \psi(x) = \psi(x) - c\, \psi(x - z), \qquad c \in \mathbb{C}, \quad x, z \in \mathbb{R}^d.$$

When $c = 1$ we simply write $\Delta_z \psi(x) = \psi(x) - \psi(x - z)$.
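The difference notation can be illustrated with a small numerical check in dimension one. The sketch below assumes the step function $\psi = 2 \cdot 1_{[0,1/2)}$ and the shift $z = 1/2$, and confirms on a grid that $\tfrac{1}{2} \Delta_{1/2} \psi$ coincides with the Haar wavelet $1_{[0,1/2)} - 1_{[1/2,1)}$; the helper names `indicator` and `difference` are hypothetical.

```python
import numpy as np

def indicator(x, lo, hi):
    """Characteristic function of the half-open interval [lo, hi)."""
    return ((x >= lo) & (x < hi)).astype(float)

def difference(psi, c, z):
    """Return the function x -> psi(x) - c * psi(x - z), i.e. the first difference with weight c and shift z."""
    return lambda x: psi(x) - c * psi(x - z)

psi = lambda x: 2.0 * indicator(x, 0.0, 0.5)                      # generator 2 * 1_[0,1/2)
haar = lambda x: indicator(x, 0.0, 0.5) - indicator(x, 0.5, 1.0)  # Haar wavelet

x = np.linspace(-1.0, 2.0, 3001)
half_difference = 0.5 * difference(psi, c=1.0, z=0.5)(x)
max_gap = np.max(np.abs(half_difference - haar(x)))               # largest pointwise discrepancy
```

Repeated application of `difference` gives higher-order differences such as $\psi(x) - 2\psi(x - z) + \psi(x - 2z)$.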


Theorem 4. Suppose $\psi \in W^{m,p}$ for some $1 \le p \le \infty$, $m \in \mathbb{N} \cup \{0\}$. Fix $j > 0$. Take $c \in \mathbb{C}$ and $\kappa \in \mathbb{Z}^d \setminus \{0\}$. If $1 < p < \infty$ or $|c| \ne 1$, then

$$W^{m,p}\text{-span}\{\psi_{j,k} : k \in \mathbb{Z}^d\} = W^{m,p}\text{-span}\{(\Delta_{c,b\kappa} \psi)_{j,k} : k \in \mathbb{Z}^d\}.$$

See Section 13 for the proof. Notice $L^p$-spaces are covered by the theorem (when $m = 0$).

Example for Theorem 4. Work in dimension $d = 1$ for simplicity. If $\psi \in L^\infty$ has compact support and $\int_{\mathbb{R}} \psi\, dx \ne 0$, then the small-scale affine system $\{\psi_{j,k} : j > 0, k \in \mathbb{Z}\}$ spans $L^p(\mathbb{R})$ for each $1 \le p < \infty$, by [5, Corollary 2]. Then Theorem 4 with $c = 1$ and $\kappa = 1$ implies that each $L^p(\mathbb{R})$, $1 < p < \infty$, is also spanned by the small-scale affine systems generated by each of

$$\Delta\psi(x) = \psi(x) - \psi(x - b), \qquad \Delta^2\psi(x) = \psi(x) - 2\psi(x - b) + \psi(x - 2b),$$

and so on. For example the Haar wavelet $H = 1_{[0,1/2)} - 1_{[1/2,1)}$ can be written as a difference $H = \frac{1}{2} \Delta\psi$ of the function $\psi = 2 \cdot 1_{[0,1/2)}$, provided $b = 1/2$, and so the oversampled, small-scale Haar system

$$\{H(2^{j+J} x - \tfrac{1}{2} k) : j > 0, k \in \mathbb{Z}\}$$

spans $L^p$ for $1 < p < \infty$, for each $J \in \mathbb{N}$, by taking $a_j = 2^{j+J}$ above. Recall that the Haar system $\{H(2^j x - k) : j \in \mathbb{Z}, k \in \mathbb{Z}\}$ with no oversampling also spans $L^p$ [19], though it needs all dilation scales $j \in \mathbb{Z}$ to do so.

Remark. The spanning by differences result in Theorem 4 is weaker (in the interesting case $c = 1$) than the spanning by derivatives result in Proposition 1. For suppose we want to span $L^p$. A difference of a function $\psi \in L^p$ will have Fourier transform vanishing on infinitely many hyperplanes (e.g. the unit difference $\psi(x) - \psi(x - e_1)$ in the $x_1$ direction has a factor of $1 - e^{-2\pi i \xi_1}$ in its Fourier transform, and this factor vanishes whenever $\xi_1 \in \mathbb{Z}$). If instead we started with $\psi \in W^{1,p}$ and then took a derivative such as $D_1 \psi$, we would introduce zeros only on the single hyperplane $\xi_1 = 0$ through the origin in Fourier space; we would also need to impose a Strang–Fix condition $\widehat{\psi} = 0$ at the nonzero lattice points, to ensure that the $\psi_{j,k}$ span $W^{1,p}$ by our results (like Corollary 2) and hence that their derivatives span $L^p$. The upshot, though, is that when spanning by differences one needs $\widehat{\psi}$ to vanish on infinitely many hyperplanes, whereas when spanning by derivatives one only needs $\widehat{\psi}$ to vanish on one hyperplane and infinitely many lattice points.

Of course in dimension $d = 1$ the two approaches are equivalent, because hyperplanes reduce to points. And anyway, differences can be more convenient to use than derivatives.

Spanning by molecular and wavelet affine systems

The work of Gilbert et al. [17], and earlier Frazier and Jawerth [16], gives an affine spanning result in the homogeneous Sobolev space $\dot{W}^{m,p}$, $1 < p < \infty$. In particular, the result [17, Theorem 1.5] proves a frame decomposition using the full affine system $\{\psi(a^j x - bk) : j \in \mathbb{Z}, k \in \mathbb{Z}^d\}$ provided $\psi$ satisfies certain “molecular” decay and smoothness conditions. Hence the system spans $\dot{W}^{m,p}$. Strang–Fix conditions are not imposed.

Unfortunately, [17, Theorem 1.5] holds only when the dilation step $a$ is sufficiently close to 1 and the translation step $b$ is sufficiently close to 0, depending on the synthesizer $\psi$. By contrast, in this paper our dilations and translations are independent of $\psi$.

In a different direction, orthonormal wavelet systems $\{\psi(2^j x - k) : j \in \mathbb{Z}, k \in \mathbb{Z}^d\}$ that satisfy some smoothness and decay conditions are known to provide unconditional bases for Sobolev space [19, p. 312], and hence span Sobolev space. See [12, 23] for recent developments.

These molecular and wavelet results employ all the scales $j \in \mathbb{Z}$, and assume $\widehat{\psi}(0) = 0$. In contrast, this paper uses just the small scales $j > 0$ and assumes $\widehat{\psi}(0) \ne 0$. (The only generators with $\widehat{\psi}(0) = 0$ in this paper are those resulting from Proposition 1 when spanning by derivatives, and from Theorem 4 when spanning by differences.)

5. Open problems: the Gaussian and the Mexican hat

Our work in this paper on Sobolev space, and our work on $L^p$ and $H^1$ in [5, 7, 8, 25], are motivated by Y. Meyer’s unsolved “Mexican hat” problem. To describe it, consider now dyadic dilations $a_j = 2^j$ in dimension $d = 1$, and for simplicity take $b = 1$ throughout this section.
Write $\theta(x) = (1 - x^2) e^{-x^2/2}$ for the Mexican hat function (whose graph resembles a sombrero). Meyer [28, p. 137] asked: does the full Mexican hat system $\{\theta_{j,k} : j, k \in \mathbb{Z}\}$ span $L^p$ for all $1 < p < \infty$? (It cannot span all of $L^1$ because the Mexican hat has integral zero.) The answer is Yes when $p = 2$, but the problem remains open for all other $p$-values. It is known that the Mexican hat system spans $L^p$ provided the translations are sufficiently oversampled [9], or the translations and dilations are both sufficiently oversampled [17].

We propose a different approach. The Mexican hat is the second derivative of the Gaussian $-e^{-x^2/2}$, and so we wonder whether the Gaussian system spans Sobolev space.

Conjecture 1. If $\psi(x) = e^{-x^2/2}$ then $\{\psi_{j,k} : j > 0, k \in \mathbb{Z}\}$ spans $W^{m,p}$ for each $1 \le p < \infty$, $m \in \mathbb{N}$.

If Conjecture 1 is true, then for all $m \in \mathbb{N}$, the $m$th derivative of the Gaussian $\psi$ would generate a small scale system spanning $L^p$, $1 < p < \infty$, by Proposition 1. In particular by taking $m = 2$, the small scale Mexican hat system $\{\theta_{j,k} : j > 0, k \in \mathbb{Z}\}$ would span $L^p$, answering Meyer’s question.

Notice Conjecture 1 is true for $m = 0$ (the $L^p$ case), by [5, Corollary 2] or earlier by [14, 35, 36]. Also note Corollary 2 fails to resolve the conjecture for $m > 0$, because the Fourier transform of the Gaussian vanishes nowhere and thus fails the Strang–Fix hypothesis $D^\mu \widehat{\psi}(\ell b^{-1}) = 0$ imposed in Corollary 2.

Conjecture 1 must be approached with caution, because not every reasonable $\psi$ generates a system that spans Sobolev space. For example the tent function $\psi(x) = 2x$ for $x \in [0, 1/2]$ and $\psi(x) = 2 - 2x$ for $x \in [1/2, 1]$ does not generate a small scale dyadic spanning set for $W^{1,2}(\mathbb{R})$, because if it did then $\psi' = 2H$ would generate a small scale dyadic spanning set for $L^2(\mathbb{R})$ by Proposition 1, whereas spanning $L^2(\mathbb{R})$ requires the full dyadic Haar system (involving $j \in \mathbb{Z}$ and not just $j > 0$), by orthonormality. We expect such counterexamples to be nongeneric, but they do show that small scales alone will not always suffice to span Sobolev space, or $L^p$.

The second difference of the Gaussian. Although we cannot so far resolve the Mexican hat spanning problem for the second derivative of the Gaussian, we can easily resolve the analogous problem for the second difference of the Gaussian. With $\psi(x) = e^{-x^2/2}$ being the Gaussian, write

$$\sigma(x) = \psi(x + 1) - 2\psi(x) + \psi(x - 1) = -\Delta_{-1} \Delta_1 \psi(x)$$

for the symmetric second difference of the Gaussian with step size 1. As remarked above, the Gaussian system $\{\psi_{j,k} : j > 0, k \in \mathbb{Z}\}$ spans $L^p(\mathbb{R})$ for $1 \le p < \infty$, and so the second difference system $\{\sigma_{j,k} : j > 0, k \in \mathbb{Z}\}$ spans $L^p(\mathbb{R})$ for each $1 < p < \infty$, by two applications of Theorem 4 with $m = 0$, $c = 1$.
Figure 1 shows that the second difference σ of the Gaussian and the second derivative θ (the Mexican hat) behave very much the same way, in both time and frequency domains.
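The comparison of the two functions in the time domain can be reproduced roughly as follows. This is a minimal sketch, assuming the Mexican hat $(1 - x^2) e^{-x^2/2}$, the symmetric second difference of the Gaussian with step size 1, and normalization of each function by its value at $x = 0$; the grid and the variable names are arbitrary choices. It also approximates the integrals of both functions by Riemann sums, which should be near zero since both have integral zero.

```python
import numpy as np

def gaussian(x):
    return np.exp(-x**2 / 2)

def mexican_hat(x):
    """(1 - x^2) exp(-x^2/2), which equals minus the second derivative of exp(-x^2/2)."""
    return (1 - x**2) * gaussian(x)

def second_difference(x):
    """Symmetric second difference of the Gaussian with step size 1."""
    return gaussian(x + 1) - 2 * gaussian(x) + gaussian(x - 1)

x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]
theta_n = mexican_hat(x) / mexican_hat(0.0)              # normalized to 1 at x = 0
sigma_n = second_difference(x) / second_difference(0.0)  # normalized to 1 at x = 0
mean_theta = np.sum(mexican_hat(x)) * dx                 # Riemann sum for the integral
mean_sigma = np.sum(second_difference(x)) * dx
max_gap = np.max(np.abs(theta_n - sigma_n))              # largest pointwise discrepancy
```

The value of `max_gap` quantifies how closely the two normalized curves track each other on this grid; the frequency-domain comparison is not reproduced here.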

Fig. 1. Left: The second derivative $\theta(x)$ and second difference $\sigma(x)$ of the Gaussian (solid and dashed curves, respectively), after normalization to 1 at $x = 0$. Right: Their Fourier transforms $\widehat{\theta}(\xi)$ and $\widehat{\sigma}(\xi)$.

Incidentally, the Mexican hat generates more than a spanning set for $L^2(\mathbb{R})$: it generates a dyadic frame by [13, p. 987] or [11, p. 264], meaning constants $0 < A \le B < \infty$ exist such that
\[
A\|f\|_2^2 \le \sum_{j\in\mathbb{Z}} \sum_{k\in\mathbb{Z}} |\langle f, \theta_{j,k}\rangle|^2 \le B\|f\|_2^2 \qquad \text{for all } f \in L^2(\mathbb{R}),
\]
when $a_j = 2^j$. The second difference function $\sigma(x)$ also generates a dyadic frame: by numerically evaluating Casazza and Christensen's frame criterion (see [10, Theorem 2.5], [11, Theorem 11.2.3]) we have obtained the estimate $B/A \le 1.088$ for the frame bounds, compared with $B/A \le 1.095$ for the Mexican hat.

6. Discretized approximations to the identity

The basic approximation results of the paper are developed in this section. The key object is an operator $I_j[\psi,\phi]$ that acts on functions $h(x,y)$ by
\[
(I_j[\psi,\phi]h)(x) = |\det b| \sum_{k\in\mathbb{Z}^d} \left( \int_{\mathbb{R}^d} h(x, a_j^{-1}y - x)\,\phi(y - bk)\,dy \right) \psi(a_j x - bk). \tag{11}
\]
Lemma 4 specifies properties of the synthesizer $\psi$ and analyzer $\phi$ under which $I_j$ is well defined, for $j > 0$. We will require $h(x,y)$ to belong to the mixed-norm space
\[
L^{(p,\infty)} = \{h : h \text{ is measurable on } \mathbb{R}^d \times \mathbb{R}^d \text{ and } \|h\|_{(p,\infty)} < \infty\}
\]
where $\|h\|_{(p,\infty)} = \operatorname{ess\,sup}_{y\in\mathbb{R}^d} \left( \int_{\mathbb{R}^d} |h(x,y)|^p\,dx \right)^{1/p}$. That is, $\|h\|_{(p,\infty)}$ takes the $L^p$ norm of $h$ with respect to $x$, and then the $L^\infty$ norm with respect to $y$.

For example if $h(x,y) = f(x+y)$ and $f \in L^p$ then $h \in L^{(p,\infty)}$ with $\|h\|_{(p,\infty)} = \|f\|_p$. This choice of $h$ yields $I_j[\psi,\phi]h = f_j$, by comparing the definitions (1) and (11). Hence we call $I_j$ a "discretized approximation to the identity" operator.

Lemma 4. Assume $\psi \in L^p$ for some $1 \le p < \infty$, and that one of the following conditions holds:
(i) $P|\psi| \in L^p_{\mathrm{loc}}$, $\phi \in L^1$, and $h(x,y) = \int_{[0,1]} f(x+ty)\,d\omega(t)$ for some $f \in C_c$ and some Borel probability measure $\omega$ on $[0,1]$;
(ii) $Q\psi \in L^1$, $\phi \in L^q$ with $\phi$ having compact support, and $h \in L^{(p,\infty)}$.
Then the series (11) defining $I_j[\psi,\phi]h$ converges pointwise absolutely a.e. to an $L^p$ function. The series further converges unconditionally in $L^p$. And in case (ii) we obtain a stability estimate that is independent of $j$:
\[
\|I_j[\psi,\phi]h\|_p \le C(p, \operatorname{spt}\phi)\,\|Q\psi\|_1\,\|\phi\|_q\,\|h\|_{(p,\infty)}. \tag{12}
\]
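To make the operator in (11) concrete, here is a small one-dimensional sketch for the special choice $h(x,y) = f(x+y)$, where $I_j h$ is a sum of local averages of $f$ times translated, dilated copies of $\psi$. Everything specific is an illustrative assumption rather than a choice made in the paper: $d = 1$, $b = 1$, the synthesizer is the hat function on $[-1,1]$, the analyzer is the indicator of $[0,1)$, and the local averages are computed by a midpoint rule.

```python
import math

def hat(x):
    # assumed synthesizer psi: hat function on [-1, 1], integral 1
    return max(0.0, 1.0 - abs(x))

def local_average(f, a, k, quad_points=200):
    # integral of f(y / a) * phi(y - k) dy with phi = indicator of [0, 1),
    # approximated by the midpoint rule on [k, k + 1]
    total = 0.0
    for i in range(quad_points):
        y = k + (i + 0.5) / quad_points
        total += f(y / a)
    return total / quad_points

def discretized_identity(f, a, x):
    # (I_j h)(x) for h(x, y) = f(x + y), with d = 1, b = 1 and dilation a = a_j
    center = math.floor(a * x)
    total = 0.0
    for k in range(center - 2, center + 3):  # hat(a x - k) vanishes for all other k
        total += local_average(f, a, k) * hat(a * x - k)
    return total

def f(x):
    # a smooth test function
    return math.exp(-x * x)

if __name__ == "__main__":
    for j in (2, 4, 6, 8):
        a = 2.0 ** j
        print(j, abs(discretized_identity(f, a, 0.3) - f(0.3)))
```

The printed errors should shrink as $j$ grows, which is the behaviour the name "discretized approximation to the identity" suggests for this choice of $h$.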

Remarks on Lemma 4.
1. Case (i) assumes less about $\psi$ than case (ii) does, but on the other hand it assumes a special form for $h$, and it does not yield a stability estimate.
2. The assumption $Q\psi \in L^1$ in case (ii) lets us bound the values of $\psi$ at nearby points, so that we can estimate certain Riemann sums involving $\psi$ with integrals involving $Q\psi$. See (18) below.
3. For $h(x,y) = f(x+y)$, Lemma 4 and also Lemma 5 below were proved in our $L^p$ paper [5, Lemmas 1 and 2]. (The hypotheses there are stronger on the analyzer $\phi$, but that matters little.) This special choice $h(x,y) = f(x+y)$ yields a stability estimate in both cases (i) and (ii). See that paper [5] for an account of earlier literature with $h(x,y) = f(x+y)$, such as di Guglielmo [18, p. 288].

Proof (of Lemma 4). The integral $\int_{\mathbb{R}^d} h(x, a_j^{-1}y - x)\,\phi(y - bk)\,dy$ occurring in the definition of $I_j$ is well defined, because in case (i) $h$ is bounded and $\phi \in L^1$, and in case (ii) we see $y \mapsto h(x,y)$ belongs to $L^p_{\mathrm{loc}}$ for almost every $x$ and $\phi \in L^q$ has compact support.

To start estimating $I_j$, notice
\[
\begin{aligned}
|(I_j[\psi,\phi]h)(x)|^p
&\le \Big( |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} |h(x, a_j^{-1}y - x)|\,|\phi(y - bk)|\,dy\;|\psi(a_j x - bk)| \Big)^p \\
&\le |\det b| \sum_{k\in\mathbb{Z}^d} \Big( \int_{\mathbb{R}^d} |h(x, a_j^{-1}y - x)|\,|\phi(y - bk)|\,dy \Big)^p |\psi(a_j x - bk)| \cdot \Big( |\det b| \sum_{k\in\mathbb{Z}^d} |\psi(a_j x - bk)| \Big)^{p-1}
\end{aligned}
\tag{13}
\]
by Hölder's inequality on the sum, when $p > 1$. (When $p = 1$ the last inequality is trivial.)

Case (i). By applying Hölder's inequality to the $y$-integral in (13) we find
\[
|(I_j[\psi,\phi]h)(x)|^p \le |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} |h(x, a_j^{-1}y - x)|^p\,|\phi(y - bk)|\,dy \cdot \|\phi\|_1^{p-1}\,|\psi(a_j x - bk)|\,\big(P|\psi|(a_j x)\big)^{p-1}. \tag{14}
\]

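After integrating (14) with respect to $x$ and then substituting $h(x,y) = \int_{[0,1]} f(x+ty)\,d\omega(t)$ and making the changes of variable $x \mapsto a_j^{-1}(x + bk)$ and $y \mapsto y + bk$, we deduce
\[
\|I_j[\psi,\phi]h\|_p^p \le \int_{[0,1]} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} R_j(x,y,t)\,|\psi(x)|\,\big(P|\psi|(x)\big)^{p-1}\,|\phi(y)|\,dx\,dy\,d\omega(t)\;\|\phi\|_1^{p-1} \tag{15}
\]
where
\[
R_j(x,y,t) = |\det a_j^{-1}b| \sum_{k\in\mathbb{Z}^d} \big|f\big(a_j^{-1}(x + bk) + t a_j^{-1}(y - x)\big)\big|^p. \tag{16}
\]
We claim $R_j$ is bounded, independently of $x$, $y$ and $t$. For if we write $w = a_j^{-1}x + t a_j^{-1}(y - x)$ and $K = \{k \in \mathbb{Z}^d : a_j^{-1}bk \in (\operatorname{spt} f) - w\}$, then
\[
\begin{aligned}
R_j(x,y,t) &\le |\det a_j^{-1}b| \cdot \#K \cdot \|f\|_\infty^p \\
&= \|f\|_\infty^p \cdot \Big| \bigcup_{k\in K} a_j^{-1}b(k + C) \Big| \\
&\le \|f\|_\infty^p \cdot \big|\{z \in \mathbb{R}^d : \operatorname{dist}(z, (\operatorname{spt} f) - w) \le \operatorname{diam}(a_j^{-1}bC)\}\big| \\
&= \|f\|_\infty^p \cdot \big|\{z \in \mathbb{R}^d : \operatorname{dist}(z, \operatorname{spt} f) \le \operatorname{diam}(a_j^{-1}bC)\}\big|,
\end{aligned}
\tag{17}
\]
which gives a bound on $R_j$ that is uniform in $x$, $y$ and $t$. Also $\phi \in L^1$ by hypothesis in case (i), and $|\psi|(P|\psi|)^{p-1} \in L^1$ because
\[
\int_{\mathbb{R}^d} |\psi|\,(P|\psi|)^{p-1}\,dx = \int_{bC} \sum_{k\in\mathbb{Z}^d} |\psi(x - bk)|\,\big(P|\psi|(x - bk)\big)^{p-1}\,dx = |\det b|^{-1}\,\|P|\psi|\|_{L^p(bC)}^p < \infty.
\]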

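Therefore $I_j[\psi,\phi]h$ belongs to $L^p$ by the estimate (15).

Case (ii). By using the compact support of $\phi$ in the $y$-integral in (13), and then applying Hölder's inequality, we find
\[
\begin{aligned}
|(I_j[\psi,\phi]h)(x)|^p
&\le |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} |h(x, a_j^{-1}y - x)|^p\,1_{\operatorname{spt}\phi}(y - bk)\,dy \cdot \|\phi\|_q^p\,|\psi(a_j x - bk)|\,\|P|\psi|\|_\infty^{p-1} \\
&\le |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} |h(x, a_j^{-1}y - x)|^p\,1_E(y - bk)\,Q_E\psi(a_j x - y)\,dy \cdot \|P|\psi|\|_\infty^{p-1}\,\|\phi\|_q^p
\end{aligned}
\tag{18}
\]
for almost every $x$, by Lemma 10 with $f = \psi$ and $E = \operatorname{spt}\phi$,
\[
\le \int_{\mathbb{R}^d} |h(x, -a_j^{-1}y)|^p\,Q_E\psi(y)\,dy \cdot \|P 1_E\|_\infty\,\|P|\psi|\|_\infty^{p-1}\,\|\phi\|_q^p
\]
by $y \mapsto a_j x - y$. Integrating with respect to $x$ gives the norm estimate
\[
\begin{aligned}
\|I_j[\psi,\phi]h\|_p^p
&\le \int_{\mathbb{R}^d} \|h(\cdot, -a_j^{-1}y)\|_p^p\,Q_E\psi(y)\,dy \cdot \|P 1_E\|_\infty\,\|P|\psi|\|_\infty^{p-1}\,\|\phi\|_q^p \\
&\le C(E)\,\|h\|_{(p,\infty)}^p\,\|Q\psi\|_1 \cdot \|P|\psi|\|_\infty^{p-1}\,\|\phi\|_q^p,
\end{aligned}
\tag{19}
\]
using here that $\|Q_E(\cdot)\|_1 \le C(E)\|Q(\cdot)\|_1$ by definition of $Q_E$ in (65). Finally note $\|P|\psi|\|_\infty \le C\|Q\psi\|_1$ by Lemma 11. Thus we have proved estimate (12) in case (ii).

Unconditional convergence. The series defining $I_j[\psi,\phi]h$ converges unconditionally in $L^p$, because
\[
\lim_{K\to\infty} \sum_{|k|\ge K} \left| \, |\det b| \left( \int_{\mathbb{R}^d} h(x, a_j^{-1}y - x)\,\phi(y - bk)\,dy \right) \psi(a_j x - bk) \right| = 0
\]
in $L^p$ by dominated convergence (using the pointwise absolute convergence proved above).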

The next lemma proves convergence properties of $I_j[\psi,\phi]$ as $j \to \infty$.

Lemma 5. Assume $\psi \in L^p$ for some $1 \le p < \infty$, and that one of the following conditions holds:
(i) $P|\psi| \in L^p_{\mathrm{loc}}$, $\phi \in L^1$, and $h(x,y) = \int_{[0,1]} f(x+ty)\,d\omega(t)$ for some $f \in C_c$ and some Borel probability measure $\omega$ on $[0,1]$;
(ii) $Q\psi \in L^1$, $\phi \in L^q$ with $\phi$ having compact support, and $h \in L^{(p,\infty)}$ with
\[
\lim_{y\to 0} h(\cdot, y) = h(\cdot, 0) \qquad \text{in } L^p. \tag{20}
\]
Then (a)–(c) hold:
(a) [Upper bound]
\[
\limsup_{j\to\infty} \|I_j[\psi,\phi]h\|_p \le \|h(\cdot,0)\|_p \cdot
\begin{cases}
C(p)\,\|P|\psi|\|_{L^p(bC)}\,\|\phi\|_1 & \text{in case (i)}, \\
C(p, \operatorname{spt}\phi)\,\|Q\psi\|_1\,\|\phi\|_q & \text{in case (ii)}.
\end{cases}
\tag{21}
\]
(b) [Constant periodization] If $P\psi(x) = \int_{\mathbb{R}^d} \psi(y)\,dy$ for almost every $x$, then
\[
\lim_{j\to\infty} (I_j[\psi,\phi]h)(x) = h(x,0) \int_{\mathbb{R}^d} \psi(y)\,dy \int_{\mathbb{R}^d} \phi(z)\,dz \qquad \text{in } L^p. \tag{22}
\]
(c) [Scale averaging] If the dilations $a_j$ grow exponentially, then
\[
\lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J (I_j[\psi,\phi]h)(x) = h(x,0) \int_{\mathbb{R}^d} \psi(y)\,dy \int_{\mathbb{R}^d} \phi(z)\,dz \qquad \text{in } L^p.
\]
Hypothesis (20) says that $y \mapsto h(\cdot, y)$ is continuous as a map $\mathbb{R}^d \to L^p$, at $y = 0$.

Proof (of Lemma 5). Observe $P|\psi| \in L^p_{\mathrm{loc}}$, by hypothesis in case (i) and from Lemma 11 in case (ii). Integrating $P|\psi|$ over the period cell $bC$ then shows $\psi \in L^1$. And the mean value of $P\psi$ equals
\[
\frac{1}{|bC|} \int_{bC} P\psi(y)\,dy = \int_{bC} \sum_{k\in\mathbb{Z}^d} \psi(y - bk)\,dy = \int_{\mathbb{R}^d} \psi(y)\,dy.
\]
Thus the $b\mathbb{Z}^d$-periodic function $g(x) = P\psi(x) - \int_{\mathbb{R}^d} \psi(y)\,dy$ has mean value zero and belongs to $L^p_{\mathrm{loc}}$. If the dilations $a_j$ grow exponentially then [5, Lemma 3] tells us that $\lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J g(a_j x) = 0$ in $L^p_{\mathrm{loc}}$, or
\[
\lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J P\psi(a_j x) = \int_{\mathbb{R}^d} \psi(y)\,dy \qquad \text{in } L^p_{\mathrm{loc}}. \tag{23}
\]
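The averaging effect behind (23) can be seen numerically in a toy case. The sketch below is an illustration under assumptions that are not the paper's: $d = 1$, $p = 2$, dyadic dilations $a_j = 2^j$, and the mean-zero periodic function $g(x) = \cos(2\pi x)$. By orthogonality of the dilates on $[0,1]$, the squared $L^2$ norm of the average of $J$ dilates equals $1/(2J)$, which tends to zero even though each $g(a_j x)$ has constant norm.

```python
import math

def averaged_dilates(x, J):
    # (1/J) * sum_{j=1}^{J} g(2^j x) with the assumed choice g(x) = cos(2 pi x)
    return sum(math.cos(2.0 * math.pi * (2 ** j) * x) for j in range(1, J + 1)) / J

def l2_norm_squared(J, num_points=4096):
    # Riemann sum on [0, 1]; for J <= 10 all frequencies in the squared
    # trigonometric polynomial stay below num_points, so the sum is exact
    # up to rounding error
    total = 0.0
    for i in range(num_points):
        value = averaged_dilates(i / num_points, J)
        total += value * value
    return total / num_points

if __name__ == "__main__":
    for J in (1, 2, 4, 8):
        print(J, l2_norm_squared(J), 1.0 / (2 * J))
```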

(Formula (23) is the source of all scale averaging in this paper. It is a concrete version of Mazur's theorem, which says that the weak convergence $g(a_j x) \rightharpoonup 0$ implies norm convergence of suitable convex combinations of the $g(a_j x)$.)

With these preliminaries taken care of, we begin to prove parts (a)–(c).

Part (a). Case (i). The estimate (17) implies that $R_j$ is bounded by a constant independent of $x$, $y$, $t$ and $j$, for all large $j$ (using that $a_j^{-1} \to 0$). Note also $R_j(x,y,t) \to \int_{\mathbb{R}^d} |f(z)|^p\,dz$ as $j \to \infty$, for each $x$, $y$, $t$ (by interpreting the definition of $R_j$ in (16) as a Riemann sum and using that $f \in C_c$). Thus we may apply dominated convergence to formula (15) to obtain that
\[
\limsup_{j\to\infty} \|I_j[\psi,\phi]h\|_p^p \le \int_{\mathbb{R}^d} |f(z)|^p\,dz \int_{[0,1]} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |\psi(x)|\,\big(P|\psi|(x)\big)^{p-1}\,|\phi(y)|\,dx\,dy\,d\omega(t)\;\|\phi\|_1^{p-1},
\]
which implies estimate (21).

Case (ii). By dominated convergence, as $j \to \infty$ the righthand side of (19) approaches the limiting value
\[
\|h(\cdot,0)\|_p^p \int_{\mathbb{R}^d} Q_E\psi(y)\,dy \cdot \|P 1_E\|_\infty\,\|P|\psi|\|_\infty^{p-1}\,\|\phi\|_q^p,
\]
because $Q_E\psi \in L^1$ and $\|h(\cdot,y)\|_p \in L^\infty$ while $h(\cdot,y) \to h(\cdot,0)$ in $L^p$ as $y \to 0$ by assumption (20). This proves (21) in case (ii), since we can now replace $Q_E$ with $Q$ like we did after (19).

Before considering parts (b) and (c) of the lemma, we prove (21) for a useful variant of $h$ from case (i).

Lemma 6. Assume $\psi \in L^p$ for some $1 \le p < \infty$, and that $P|\psi| \in L^p_{\mathrm{loc}}$, $\phi \in L^1$, and $h^*(x,y) = \int_{[0,1]} |f(x+ty) - f(x)|\,d\omega(t)$ for some $f \in C_c$ and some Borel probability measure $\omega$ on $[0,1]$. Then $h^*(x,0) \equiv 0$, and $\lim_{j\to\infty} \|I_j[\psi,\phi]h^*\|_p = 0$.

Proof (of Lemma 6). We have
\[
\|I_j[\psi,\phi]h^*\|_p^p \le \int_{[0,1]} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} R_j^*(x,y,t)\,|\psi(x)|\,\big(P|\psi|(x)\big)^{p-1}\,|\phi(y)|\,dx\,dy\,d\omega(t)\;\|\phi\|_1^{p-1} \tag{24}
\]

by applying (15) to $h^*$ instead of to $h$, where
\[
R_j^*(x,y,t) = |\det a_j^{-1}b| \sum_{k\in\mathbb{Z}^d} \big|f\big(a_j^{-1}(x + bk) + t a_j^{-1}(y - x)\big) - f\big(a_j^{-1}(x + bk)\big)\big|^p.
\]
Clearly $R_j^*(x,y,t)$ is a Riemann type sum, converging pointwise to $\int_{\mathbb{R}^d} |f(z) - f(z)|^p\,dz = 0$ as $j \to \infty$, since $f$ is continuous with compact support. And like in the proof of Lemma 5(a) in case (i), one finds $R_j^*(x,y,t)$ is bounded by a constant independent of $x$, $y$, $t$ and $j$, for all large $j$. Thus dominated convergence applied to (24) gives $\|I_j[\psi,\phi]h^*\|_p \to 0$ as $j \to \infty$.

Now we return to proving Lemma 5.

Parts (b) and (c). Define $H(x,y) = h(x,y) - h(x,0) \in L^{(p,\infty)}$. Then the definition of $I_j$ in (11) implies
\[
(I_j[\psi,\phi]h)(x) = (I_j[\psi,\phi]H)(x) + h(x,0)\,P\psi(a_j x) \int_{\mathbb{R}^d} \phi(z)\,dz. \tag{25}
\]
Case (i). Suppose $h(x,y) = \int_{[0,1]} f(x+ty)\,d\omega(t)$ for some $f \in C_c$ and some Borel probability measure $\omega$, so that $H(x,y) = \int_{[0,1]} (f(x+ty) - f(x))\,d\omega(t)$. Then
\[
\lim_{j\to\infty} I_j[\psi,\phi]H = 0 \qquad \text{in } L^p \tag{26}
\]
by Lemma 6, because $|I_j[\psi,\phi]H| \le I_j[|\psi|,|\phi|]h^*$ pointwise.

To prove part (b) of the lemma, observe if $P\psi(x) = \int_{\mathbb{R}^d} \psi(y)\,dy$ for almost every $x$ that the desired limit (22) follows immediately from (26) and decomposition (25). For part (c) we just use (26) and (25) and observe that
\[
\lim_{J\to\infty} h(x,0)\,\frac{1}{J} \sum_{j=1}^J P\psi(a_j x) = h(x,0) \int_{\mathbb{R}^d} \psi(y)\,dy \qquad \text{in } L^p,
\]
by the boundedness and compact support of $h(x,0) = f(x) \in C_c$ and using the $L^p_{\mathrm{loc}}$ convergence of the periodizations in (23).

Case (ii). In this case $\lim_{j\to\infty} I_j[\psi,\phi]H = 0$ in $L^p$ by part (a) of the lemma, because $H \in L^{(p,\infty)}$ and $H(\cdot,y) \to H(\cdot,0) = 0$ as $y \to 0$ by hypothesis (20). Hence part (b) of the lemma again follows from the decomposition (25).

Part (c) follows like in the proof of case (i) above when $h(x,0)$ is bounded with compact support. But we can reduce part (c) to this situation by the stability estimate $\|I_j[\psi,\phi]h\|_p \le C\|h\|_{(p,\infty)}$ (proved in Lemma 4, formula (12)) in conjunction with the following density argument. Given $\varepsilon > 0$, choose $\tilde h \in C_c(\mathbb{R}^d)$ with $\|h(\cdot,0) - \tilde h\|_p < \varepsilon$, and then define
\[
h_\varepsilon(x,y) =
\begin{cases}
\tilde h(x) & \text{if } |y| \le \varepsilon, \\
h(x,y) & \text{otherwise.}
\end{cases}
\]
Then trivially $\lim_{y\to 0} h_\varepsilon(\cdot,y) = h_\varepsilon(\cdot,0)$ in $L^p$, while
\[
\|h - h_\varepsilon\|_{(p,\infty)} \le \max_{|y|\le\varepsilon} \|h(\cdot,y) - h(\cdot,0)\|_p + \|h(\cdot,0) - \tilde h\|_p \to 0
\]
as $\varepsilon \to 0$. That is, we can approximate $h$ arbitrarily closely in $L^{(p,\infty)}$ by a function satisfying the same hypotheses as $h$ but which is also bounded with compact support when $y = 0$.

7. A preliminary result: Strang–Fix implies constant periodization

Here we establish a lemma explaining Theorem 1's hypotheses on the zeros of $\hat\psi$. Roughly, if the Fourier transform of $\psi$ vanishes at every nonzero lattice point, and so do its derivatives up to order $n_*$, then the moments of $\psi$ up to order $n_*$ must all have constant periodization.

Recall $X(x) = x$ is the identity function, and $\chi(x) = 1 + |x|$.

Lemma 7. Take integers $0 \le m_* \le m$, $0 \le n_* \le n$, and suppose $\psi \in W^{m,1}$ with $\chi^n \psi^{(\rho)} \in L^1$ for all multiindices of order $|\rho| = m_*$.
(a) Then $\hat\psi \in C^n(\mathbb{R}^d \setminus \{0\})$, and $\hat\psi \in C^n(\mathbb{R}^d)$ if $\chi^n \psi \in L^1$.
(b) If $D^\sigma \hat\psi(\ell b^{-1}) = 0$ for all $|\sigma| \le n_*$ and all row vectors $\ell \in \mathbb{Z}^d \setminus \{0\}$, then the periodization of $(-X)^\sigma \psi^{(\rho)}$ is constant for each $|\sigma| \le n_*$, $|\rho| = m_*$, with
\[
P\big((-X)^\sigma \psi^{(\rho)}\big)(x) =
\begin{cases}
\dfrac{\sigma!}{(\sigma-\rho)!} \displaystyle\int_{\mathbb{R}^d} (-y)^{\sigma-\rho}\,\psi(y)\,dy & \text{if } \sigma \ge \rho, \\
0 & \text{otherwise,}
\end{cases}
\qquad \text{a.e.}
\]

Proof (of Lemma 7). Part (a). Suppose $|\sigma| \le n$ and $|\rho| = m_*$. Then $(-2\pi i X)^\sigma \psi^{(\rho)}$ is integrable by the assumption $\chi^n \psi^{(\rho)} \in L^1$. So we can differentiate the transform $\widehat{\psi^{(\rho)}}(\xi) = \int_{\mathbb{R}^d} \psi^{(\rho)}(x)\,e^{-2\pi i \xi x}\,dx$ through the integral $\sigma$ times, obtaining that $\widehat{\psi^{(\rho)}} \in C^n(\mathbb{R}^d)$ since $\sigma$ was arbitrary. But $\widehat{\psi^{(\rho)}}(\xi) = (2\pi i \xi)^\rho\,\hat\psi(\xi)$, and so $\hat\psi$ has $n$ continuous derivatives away from the set $\{\xi : \xi^\rho = 0\}$. By considering all pure multiindices (meaning $\rho = (m_*, 0, \dots, 0)$ and so on) we deduce that $\hat\psi$ has $n$ continuous derivatives away from the origin.

Part (b). The periodization $x \mapsto P\big((2\pi i(-X))^\sigma \psi^{(\rho)}\big)(bx)$ is $\mathbb{Z}^d$-periodic and is locally integrable. Its $\ell$-th Fourier coefficient is
\[
\begin{aligned}
\int_C P\big((2\pi i(-X))^\sigma \psi^{(\rho)}\big)(bx)\,e^{-2\pi i \ell x}\,dx
&= \int_{bC} \sum_{k\in\mathbb{Z}^d} \big(2\pi i(-x + bk)\big)^\sigma\,\psi^{(\rho)}(x - bk)\,e^{-2\pi i \ell b^{-1}x}\,dx && \text{by } x \mapsto b^{-1}x \text{ and definition of } P \\
&= \int_{\mathbb{R}^d} (-2\pi i x)^\sigma\,\psi^{(\rho)}(x)\,e^{-2\pi i \ell b^{-1}x}\,dx && \text{by } x \mapsto x + bk \\
&= D_\xi^\sigma \int_{\mathbb{R}^d} \psi^{(\rho)}(x)\,e^{-2\pi i \xi x}\,dx \,\Big|_{\xi = \ell b^{-1}} \\
&= D_\xi^\sigma \big[(2\pi i \xi)^\rho\,\hat\psi(\xi)\big] \Big|_{\xi = \ell b^{-1}}
\end{aligned}
\tag{27}
\]
by parts. This last expression is zero when $|\sigma| \le n_*$ and $\ell \in \mathbb{Z}^d \setminus \{0\}$, by the hypothesis on the zeros of $\hat\psi$ and its derivatives. Thus all the Fourier coefficients of $P\big((2\pi i(-X))^\sigma \psi^{(\rho)}\big)$ vanish except possibly the zeroth one, and so $P\big((2\pi i(-X))^\sigma \psi^{(\rho)}\big)$ is a constant function. This constant value is given by the $\ell = 0$ Fourier coefficient, which by (27) equals
\[
\int_{\mathbb{R}^d} (-2\pi i x)^\sigma\,\psi^{(\rho)}(x)\,dx =
\begin{cases}
(2\pi i)^{|\sigma|}\,\dfrac{\sigma!}{(\sigma-\rho)!} \displaystyle\int_{\mathbb{R}^d} (-x)^{\sigma-\rho}\,\psi(x)\,dx & \text{if } \sigma \ge \rho, \\
0 & \text{otherwise,}
\end{cases}
\]
after integrating by parts $\rho$ times.

8. Proof of Theorem 1

First we show the hypotheses of the theorem make sense.

To start with we show $\hat\psi \in C^m(\mathbb{R}^d \setminus \{0\})$, so that $D^\mu \hat\psi(\ell b^{-1})$ makes sense whenever $|\mu| \le m$ and $\ell \ne 0$. So let $\mu$ be a multiindex of order $|\mu| \le m$. We have $P|\chi^{|\mu|}\psi^{(\mu)}| \in L^p_{\mathrm{loc}} \subset L^1_{\mathrm{loc}}$ (from hypothesis in case (i), and in case (ii) by using also Lemma 11). Hence $\chi^{|\mu|}\psi^{(\mu)} \in L^1$, so that $\psi \in W^{m,1}$. Lemma 7(a) with $n_* = n = m_* = m$ now tells us that $\hat\psi \in C^m(\mathbb{R}^d \setminus \{0\})$. Note also that $\chi^{|\mu|}\psi^{(\mu)} \in L^p$ by Lemma 9.
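As a concrete check of the constant periodization in Lemma 7(b) of the previous section, consider the hat function on $[-1,1]$ in dimension one. Its Fourier transform is $(\sin \pi\xi / \pi\xi)^2$, which vanishes together with its first derivative at every nonzero integer. The sketch below assumes $d = 1$, $b = 1$, $m_* = 0$, and takes the periodization to be $Pg(x) = |\det b| \sum_k g(x - bk)$, as the computations in these sections suggest; the function names are illustrative. It checks that $P\psi \equiv \int \psi = 1$ and $P((-X)\psi) \equiv \int (-y)\psi(y)\,dy = 0$.

```python
def hat(x):
    # assumed example psi: hat function on [-1, 1], with integral 1
    return max(0.0, 1.0 - abs(x))

def periodization(func, x, reach=3):
    # P func(x) = sum over k of func(x - k), for d = 1 and b = 1;
    # a small range of k suffices for x in [0, 1] because hat has support [-1, 1]
    return sum(func(x - k) for k in range(-reach, reach + 1))

def first_moment_function(x):
    # the function (-X) psi, that is x -> -x * psi(x)
    return -x * hat(x)

if __name__ == "__main__":
    for x in (0.0, 0.1, 0.37, 0.5, 0.99):
        print(x, periodization(hat, x), periodization(first_moment_function, x))
```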

Next, $\psi \in L^1$ by above, while $\phi \in L^1$ from the hypotheses in cases (i) and (ii). Thus the normalizations on the integrals of $\psi$ and $\phi$ (in the statement of Theorem 1) do make sense.

Now we commence the proof, by showing $f_j \in W^{m,p}$. Fix a multiindex $\rho$ of order $r := |\rho| \le m$. If we formally take the derivative through the sum over $k$ in the definition of $f_j$, in formula (1), we find that
\[
D^\rho f_j(x) = |\det b| \sum_{k\in\mathbb{Z}^d} \left( \int_{\mathbb{R}^d} f(a_j^{-1}y)\,\phi(y - bk)\,dy \right) a_j^r\,\psi^{(\rho)}(a_j x - bk). \tag{28}
\]
To make this rigorous, let $h(x,y) = f(x+y)$ and notice the righthand side of equation (28) equals $a_j^r I_j[\psi^{(\rho)},\phi]h$, which belongs to $L^p$ by Lemma 4. That lemma proves the sum over $k$ in (28) converges pointwise absolutely a.e. to an $L^p$ function. Then it is straightforward to show $D^\rho f_j$ exists weakly and is given by (28). Hence $f_j \in W^{m,p}$.

Part (d). In fact $f_j \in W^{m,p}$-span$\{\psi_{j,k} : k \in \mathbb{Z}^d\}$, because the sum over $k$ in (28) converges unconditionally in $L^p$ by Lemma 4.

Parts (a)(b)(c). Our first step is to add and subtract an appropriate Taylor polynomial inside the formula (28) for $D^\rho f_j$. Specifically, we will show $D^\rho f_j = \mathrm{Main}_j + \mathrm{Rem}_j$ where
\[
\mathrm{Main}_j(x) = \sum_{|\sigma|\le r} \frac{f^{(\sigma)}(x)}{\sigma!}\,a_j^{r-s} \sum_{\tau\le\sigma} \binom{\sigma}{\tau} \int_{\mathbb{R}^d} y^{\sigma-\tau}\,\phi(y)\,dy \cdot P\big((-X)^\tau \psi^{(\rho)}\big)(a_j x), \tag{29}
\]
\[
\mathrm{Rem}_j(x) = |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} \Big( f(a_j^{-1}y) - \sum_{|\sigma|\le r} \frac{f^{(\sigma)}(x)}{\sigma!}\,(a_j^{-1}y - x)^\sigma \Big)\,\phi(y - bk)\,dy \cdot a_j^r\,\psi^{(\rho)}(a_j x - bk), \tag{30}
\]
with $s = |\sigma|$ and with $\binom{\sigma}{\tau} = \binom{\sigma_1}{\tau_1} \cdots \binom{\sigma_d}{\tau_d}$ being a product of binomial coefficients. To see this, substitute the binomial identity
\[
(a_j^{-1}y - x)^\sigma = a_j^{-s} \sum_{\tau\le\sigma} \binom{\sigma}{\tau}\,(y - bk)^{\sigma-\tau}\,(bk - a_j x)^\tau
\]
into $\mathrm{Rem}_j(x)$, which leads to cancellation with all the terms in $\mathrm{Main}_j(x)$ and thereby reduces us back to the known formula (28) for $D^\rho f_j$.

Remainder term. We will show $\mathrm{Rem}_j \to 0$ in $L^p$ as $j \to \infty$. In fact we take absolute values in (30) and aim to show
\[
\lim_{j\to\infty} |\det b| \sum_{k\in\mathbb{Z}^d} \left( \int_{\mathbb{R}^d} h_r(x, a_j^{-1}y - x)\,|y - a_j x|^r\,|\phi(y - bk)|\,dy \right) \cdot |\psi^{(\rho)}(a_j x - bk)| = 0 \tag{31}
\]
in $L^p$, where
\[
h_r(x,y) =
\begin{cases}
\Big| f(x+y) - \displaystyle\sum_{|\sigma|\le r} \frac{f^{(\sigma)}(x)}{\sigma!}\,y^\sigma \Big| \Big/ |y|^r & \text{when } y \ne 0, \\
0 & \text{when } y = 0.
\end{cases}
\tag{32}
\]
Taylor's formula with integral remainder enables us to rewrite
\[
h_r(x,y) = \Big| \int_{[0,1]} \sum_{|\sigma|=r} \frac{1}{\sigma!}\,\big[f^{(\sigma)}(x+ty) - f^{(\sigma)}(x)\big]\,y^\sigma\,d\omega_r(t) \Big| \Big/ |y|^r
\]
for almost every $(x,y) \in \mathbb{R}^d \times \mathbb{R}^d$, where $\omega_r$ is the probability measure on $[0,1]$ defined by
\[
d\omega_r(t) =
\begin{cases}
r(1-t)^{r-1}\,dt & \text{if } r > 0, \\
d\delta_1(t) & \text{if } r = 0.
\end{cases}
\]
Hence
\[
h_r(x,y) \le H_r(x,y) := \int_{[0,1]} \sum_{|\sigma|=r} |f^{(\sigma)}(x+ty) - f^{(\sigma)}(x)|\,d\omega_r(t). \tag{33}
\]
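The rewriting of the Taylor remainder in terms of the probability measure $\omega_r$ can be checked numerically in one dimension. The sketch below is illustrative only: it assumes $d = 1$, $r = 2$ and $f = \sin$, and uses a midpoint rule for the $t$-integral. It compares $f(x+y) - \sum_{s \le 2} f^{(s)}(x) y^s / s!$ with $(y^2/2!) \int_{[0,1]} [f''(x+ty) - f''(x)]\,d\omega_2(t)$.

```python
import math

def omega_density(t, r):
    # density of omega_r for r > 0: r (1 - t)^(r - 1) on [0, 1]
    return r * (1.0 - t) ** (r - 1)

def omega_mass(r, quad_points=20000):
    # total mass of omega_r, which should be 1 for a probability measure
    return sum(omega_density((i + 0.5) / quad_points, r) for i in range(quad_points)) / quad_points

def second_derivative(x):
    # f'' for the assumed test function f = sin
    return -math.sin(x)

def remainder_direct(x, y):
    # f(x + y) minus the Taylor polynomial of order 2 at x, for f = sin
    return math.sin(x + y) - (math.sin(x) + math.cos(x) * y - math.sin(x) * y * y / 2.0)

def remainder_integral(x, y, quad_points=20000):
    # (y^r / r!) * integral of [f''(x + t y) - f''(x)] d omega_r(t), with r = 2
    r = 2
    total = 0.0
    for i in range(quad_points):
        t = (i + 0.5) / quad_points
        total += (second_derivative(x + t * y) - second_derivative(x)) * omega_density(t, r)
    return (y ** r / math.factorial(r)) * total / quad_points

if __name__ == "__main__":
    print(remainder_direct(0.3, 0.7), remainder_integral(0.3, 0.7))
```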

After putting $h_r \le H_r$ and the estimate
\[
|y - a_j x| \le |y - bk| + |bk - a_j x| \le (1 + |y - bk|)(1 + |bk - a_j x|) \tag{34}
\]
into (31), we see it's enough to prove
\[
\lim_{j\to\infty} I_j\big[|\chi^r \psi^{(\rho)}|, \phi_r\big] H_r = 0 \qquad \text{in } L^p \tag{35}
\]
where $\phi_r = |\chi^r \phi|$. Our hypotheses on $\phi$ guarantee in case (i) that $\phi_r \in L^1$, and in case (ii) that $\phi_r \in L^q$ with compact support. Hence in case (i), the desired limit (35) follows from Lemma 6, because $H_r$ has the form required of $h^*$ in that lemma and $f^{(\sigma)} \in C_c$.

In case (ii), we see that (35) follows from Lemma 5(a) provided we show $H_r \in L^{(p,\infty)}$ and $H_r(\cdot,y) \to H_r(\cdot,0) = 0$ in $L^p$ as $y \to 0$. But (33) implies
\[
\|H_r(\cdot,y)\|_p \le \sum_{|\sigma|=r} \int_{[0,1]} \|f^{(\sigma)}(\cdot + ty) - f^{(\sigma)}\|_p\,d\omega_r(t) \qquad \Big(\text{which is } \le 2 \sum_{|\sigma|=r} \|f^{(\sigma)}\|_p\Big) \tag{36}
\]
\[
\to 0 \qquad \text{as } y \to 0.
\]
This completes our proof that the remainder term $\mathrm{Rem}_j$ vanishes in $L^p$ in the limit as $j \to \infty$.

Main term. Next we examine $\mathrm{Main}_j(x)$. Since $|\tau| \le |\sigma| \le r = |\rho|$, if either $\tau < \sigma$ or $|\sigma| < r$ then $0 \le |\tau| < |\rho| = r$, and so
\[
P\big((-X)^\tau \psi^{(\rho)}\big) = 0 \qquad \text{a.e.} \tag{37}
\]
by Lemma 7(b) (with $m_* = n = r$ and $n_* = r - 1$). It is here in Lemma 7 that we employ the Strang–Fix hypothesis (2) on the zeros of $\hat\psi$.

Most terms in $\mathrm{Main}_j(x)$ vanish by (37). The ones that are left have $|\sigma| = r$ and $\tau = \sigma$, so that $s = r$ and $\binom{\sigma}{\tau} = 1$ and $\int_{\mathbb{R}^d} y^{\sigma-\tau}\phi(y)\,dy = \int_{\mathbb{R}^d} \phi\,dy = 1$. Thus
\[
\mathrm{Main}_j(x) = \sum_{|\sigma|=r} \frac{f^{(\sigma)}(x)}{\sigma!}\,P\big((-X)^\sigma \psi^{(\rho)}\big)(a_j x). \tag{38}
\]

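Proof of Part (c). Assume (ii) holds, in this part of the proof. Let $0 \le |\rho| = r \le m$. Then
\[
\begin{aligned}
\|\mathrm{Rem}_j\|_p &\le \big\|I_j\big[|\chi^r \psi^{(\rho)}|, \phi_r\big] H_r\big\|_p && \text{by the estimates leading up to (35)} \\
&\le C(\psi,\phi,p) \sum_{|\sigma|=r} \|f^{(\sigma)}\|_p && \text{by Lemma 4 and (36).}
\end{aligned}
\]
And
\[
\begin{aligned}
\|\mathrm{Main}_j\|_p &\le \sum_{|\sigma|=r} \|f^{(\sigma)}\|_p\,\big\|P\big((-X)^\sigma \psi^{(\rho)}\big)\big\|_\infty && \text{by (38)} \\
&\le C \sum_{|\sigma|=r} \|f^{(\sigma)}\|_p\,\big\|Q\big((-X)^\sigma \psi^{(\rho)}\big)\big\|_1 && \text{by Lemma 11.}
\end{aligned}
\]
Combining these two estimates and summing over $|\rho| = r$ gives the seminorm stability $|f_j|_{W^{r,p}} \le C(\psi,\phi,r,p)\,|f|_{W^{r,p}}$, and then summing over $r = 0, \dots, m$ gives the norm stability $\|f_j\|_{W^{m,p}} \le C(\psi,\phi,m,p)\,\|f\|_{W^{m,p}}$.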

Proof of Parts (a) and (b). We need only consider $f \in C_c^m$ when proving part (b): in case (i) we already assume $f \in C_c^m$, and in case (ii) we can reduce to $f \in C_c^m$ by the density of such functions in $W^{m,p}$ and the stability bound $\|f_j\|_{W^{m,p}} \le C\|f\|_{W^{m,p}}$ proved in part (c).

To prove parts (a) and (b), we will first show
\[
D^\rho f = \lim_{j\to\infty} D^\rho f_j \qquad \text{in } L^p \text{ if } |\rho| < m \text{ and } f \in W^{m,p}. \tag{39}
\]
Then to complete the approximation formula in (a) we will show that if the Strang–Fix hypothesis (2) holds for all multiindices of order $\le m$ (not just $< m$), then
\[
D^\rho f = \lim_{j\to\infty} D^\rho f_j \qquad \text{in } L^p \text{ if } |\rho| = m \text{ and } f \in W^{m,p}. \tag{40}
\]
To complete the approximation formula in (b) we will show (if the dilations $a_j$ grow exponentially) that
\[
D^\rho f = \lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J D^\rho f_j \qquad \text{in } L^p \text{ if } |\rho| = m \text{ and } f \in C_c^m. \tag{41}
\]

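Proof of limits (39) and (40). For proving the first limit (39) we suppose $|\sigma| = |\rho| = r < m$. Then
\[
P\big((-X)^\sigma \psi^{(\rho)}\big)(x) =
\begin{cases}
\rho! & \text{if } \sigma = \rho, \\
0 & \text{otherwise}
\end{cases}
\]
for almost every $x$, by Lemma 7(b) with $m_* = n_* = n = r$ (and recalling $\int_{\mathbb{R}^d} \psi\,dy = 1$). Hence (38) simplifies to $\mathrm{Main}_j = D^\rho f$, meaning (39) follows immediately from our remainder estimate $\mathrm{Rem}_j \to 0$.

To prove the next limit (40), just apply the same reasoning with $r = m$.

Proof of limit (41). To prove the third limit (41), suppose $|\sigma| = |\rho| = r = m$. Define the function
\[
g_{\sigma;\rho} =
\begin{cases}
P\big((-X)^\sigma \psi^{(\rho)}\big)/\sigma! & \text{if } \sigma \ne \rho, \\
\big[P\big((-X)^\rho \psi^{(\rho)}\big)/\rho!\big] - 1 & \text{if } \sigma = \rho,
\end{cases}
\]
so that $g_{\sigma;\rho} \in L^p_{\mathrm{loc}}$ is $b\mathbb{Z}^d$-periodic. Then
\[
\mathrm{Main}_j(x) = D^\rho f(x) + \sum_{|\sigma|=m} f^{(\sigma)}(x)\,g_{\sigma;\rho}(a_j x), \tag{42}
\]
by comparing with the expression (38) for $\mathrm{Main}_j(x)$. Each function $g_{\sigma;\rho}$ has mean value zero, because
\[
|bC|^{-1} \int_{bC} P\big((-X)^\sigma \psi^{(\rho)}\big)\,dx = \int_{\mathbb{R}^d} (-X)^\sigma \psi^{(\rho)}\,dx =
\begin{cases}
\rho! & \text{if } \sigma = \rho, \\
0 & \text{otherwise}
\end{cases}
\]
by parts, recalling $|\sigma| = |\rho|$. If the dilations $a_j$ grow exponentially, as assumed for (41), then [5, Lemma 3] applies to each $g_{\sigma;\rho}$ and says that
\[
\lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J g_{\sigma;\rho}(a_j x) = 0 \qquad \text{in } L^p_{\mathrm{loc}}.
\]
Then
\[
\lim_{J\to\infty} f^{(\sigma)}(x)\,\frac{1}{J} \sum_{j=1}^J g_{\sigma;\rho}(a_j x) = 0 \qquad \text{in } L^p,
\]
because each $f^{(\sigma)}$ is bounded and has compact support (recalling $f \in C_c^m$ for (41)). Hence
\[
\lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J \mathrm{Main}_j = D^\rho f \qquad \text{in } L^p \tag{43}
\]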

by (42). Combining (43) with $\mathrm{Rem}_j \to 0$, we deduce the limit (41).

9. Proof of Theorem 2

Our initial task is to show $f_j^\bullet \in W^{m,p}$. Fix a multiindex $\rho$ with $r := |\rho| \le m$. Like in Theorem 1, formally differentiating the definition (7) of $f_j^\bullet$ yields that
\[
D^\rho f_j^\bullet(x) = |\det b| \sum_{k\in\mathbb{Z}^d} f(a_j^{-1}bk)\,a_j^r\,\psi^{(\rho)}(a_j x - bk). \tag{44}
\]
The righthand side of this equation is exactly $a_j^r f_j^\bullet[\psi^{(\rho)}]$, where the temporary notation $f_j^\bullet[\psi^{(\rho)}]$ denotes the function obtained by replacing $\psi$ with $\psi^{(\rho)}$ in the definition of $f_j^\bullet$. Now to show rigorously that $f_j^\bullet$ is weakly differentiable with derivative given by (44), it is enough (like in the proof of Theorem 1) to observe that the series defining $f_j^\bullet[\psi^{(\rho)}]$ converges absolutely a.e. to an $L^p$ function, which it does by [5, Theorem 2(e) and its Remark 3]. Note $\psi^{(\rho)}$ does satisfy the hypotheses of [5, Theorem 2(e)], by using Lemma 11. Hence $f_j^\bullet \in W^{m,p}$.

Part (d). In fact $f_j^\bullet \in W^{m,p}$-span$\{\psi_{j,k} : k \in \mathbb{Z}^d\}$, because the sum over $k$ in (44) converges unconditionally in $L^p$, by [5, Theorem 2(e)] applied to $\psi^{(\rho)}$.

Parts (a)(b). We will first show
\[
D^\rho f = \lim_{j\to\infty} D^\rho f_j^\bullet \qquad \text{in } L^p \text{ if } |\rho| < m. \tag{45}
\]
Then to complete the approximation formula in (a) we will show that if hypothesis (8) holds for all multiindices of order $\le m$ (not just $< m$), then
\[
D^\rho f = \lim_{j\to\infty} D^\rho f_j^\bullet \qquad \text{in } L^p \text{ if } |\rho| = m. \tag{46}
\]
And to complete the proof of part (b) we will show (if the dilations $a_j$ grow exponentially) that
\[
D^\rho f = \lim_{J\to\infty} \frac{1}{J} \sum_{j=1}^J D^\rho f_j^\bullet \qquad \text{in } L^p \text{ if } |\rho| = m. \tag{47}
\]

To begin with, we calculate from (44) that $D^\rho f_j^\bullet = \mathrm{Main}_j + \mathrm{Rem}_j^\bullet$ where
\[
\mathrm{Main}_j(x) = \sum_{|\sigma|=r} \frac{f^{(\sigma)}(x)}{\sigma!}\,P\big((-X)^\sigma \psi^{(\rho)}\big)(a_j x),
\]
\[
\mathrm{Rem}_j^\bullet(x) = |\det b| \sum_{k\in\mathbb{Z}^d} \Big( f(a_j^{-1}bk) - \sum_{|\sigma|\le r} \frac{f^{(\sigma)}(x)}{\sigma!}\,(a_j^{-1}bk - x)^\sigma \Big) \cdot a_j^r\,\psi^{(\rho)}(a_j x - bk),
\]
noting in this calculation that if $|\sigma| < r$ then $P\big((-X)^\sigma \psi^{(\rho)}\big) \equiv 0$ by Lemma 7(b) with $m_* = n = r$ and $n_* = r - 1$.

In formulas (45) and (46) we have $\mathrm{Main}_j = D^\rho f$, as shown in the proof of (39) and (40) in the previous section. Thus for proving (45) and (46), we have only to show $\mathrm{Rem}_j^\bullet \to 0$ in $L^p$.

In formula (47) we can express $\mathrm{Main}_j$ as in (42), and
\[
f^{(\sigma)}(x)\,\frac{1}{J} \sum_{j=1}^J g_{\sigma;\rho}(a_j x) \to 0 \qquad \text{in } L^p \text{ as } J \to \infty
\]
by Lemma 12, since $Q(f^{(\sigma)}) \in L^p$. Therefore (43) holds, and so to prove (47) we again have only to show $\mathrm{Rem}_j^\bullet \to 0$ in $L^p$.

Remainder term. We will show $\mathrm{Rem}_j^\bullet \to 0$ in $L^p$ as $j \to \infty$. After taking absolute values inside $\mathrm{Rem}_j^\bullet(x)$, we would like to show
\[
\lim_{j\to\infty} |\det b| \sum_{k\in\mathbb{Z}^d} h_r(x, a_j^{-1}bk - x)\,|bk - a_j x|^r\,|\psi^{(\rho)}(a_j x - bk)| = 0
\]
in $L^p$, where the function $h_r$ was defined in (32). Since $h_r \le H_r$ by the Taylor remainder estimate (33), it suffices to prove
\[
\lim_{j\to\infty} |\det b| \sum_{k\in\mathbb{Z}^d} H_r(x, a_j^{-1}bk - x)\,\big|(\chi^r \psi^{(\rho)})(a_j x - bk)\big| = 0 \qquad \text{in } L^p. \tag{48}
\]
We will do this by comparing with the analogous limit that uses average rather than pointwise sampling. So let our analyzer be $\phi = 1_{bC}/|bC|$ and subtract from (48) the quantity $I_j[|\chi^r \psi^{(\rho)}|, \phi]H_r$. This quantity tends to zero in $L^p$ as $j \to \infty$ by Lemma 5(a), observing $H_r(\cdot,y) \to H_r(\cdot,0) = 0$ in $L^p$ as $y \to 0$. After performing the subtraction of $I_j[|\chi^r \psi^{(\rho)}|, \phi]H_r$ from (48) and then taking absolute values, we see it would be enough to prove (whenever $|\sigma| = |\rho| = r \le m$) that
\[
\lim_{j\to\infty} |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} \int_{[0,1]} \big| f^{(\sigma)}\big(x + t(a_j^{-1}bk - x)\big) - f^{(\sigma)}\big(x + t(a_j^{-1}y - x)\big) \big|\,d\omega_r(t)\,\phi(y - bk)\,dy\;\big|(\chi^r \psi^{(\rho)})(a_j x - bk)\big| = 0 \qquad \text{in } L^p.
\]
But $\phi(y - bk) \ne 0$ if and only if $y - bk \in bC$, in which case $|a_j^{-1}bk - a_j^{-1}y| \le \|a_j^{-1}b\|\sqrt{d}$. Therefore the last limit would follow from
\[
\lim_{j\to\infty} |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} \int_{[0,1]} \big(S_{a_j^{-1}b} f^{(\sigma)}\big)\big(x + t(a_j^{-1}y - x)\big)\,d\omega_r(t)\,\phi(y - bk)\,dy \cdot \big|(\chi^r \psi^{(\rho)})(a_j x - bk)\big| = 0
\]
in $L^p$, where the modulus of continuity operator $S$ is defined in Appendix A. Thus our goal is now to prove
\[
\lim_{j\to\infty} I_j\big[|\chi^r \psi^{(\rho)}|, \phi\big] T_j = 0 \qquad \text{in } L^p \tag{49}
\]
where $T_j(x,y) = \int_{[0,1]} \big(S_{a_j^{-1}b} f^{(\sigma)}\big)(x + ty)\,d\omega_r(t)$.

The stability estimate in Lemma 4 together with Minkowski's integral inequality implies that
\[
\big\|I_j\big[|\chi^r \psi^{(\rho)}|, \phi\big] T_j\big\|_p \le C(\psi,p,r)\,\|T_j\|_{(p,\infty)} \le C(\psi,p,r)\,\big\|S_{a_j^{-1}b} f^{(\sigma)}\big\|_p \tag{50}
\]
\[
\to 0 \qquad \text{as } j \to \infty
\]
by Lemma 13, which is valid since $Qf^{(\sigma)} \in L^p$ and $f^{(\sigma)} \in C$ by hypothesis. This proves (49), completing our proof that $\mathrm{Rem}_j^\bullet \to 0$ in $L^p$.

Part (c). We have $\|\mathrm{Main}_j\|_p \le C(\psi,m,p)\,\|f\|_{W^{m,p}}$ from the proof of Theorem 1(c). To get stability of the remainder term $\mathrm{Rem}_j^\bullet$, it suffices to show (in view of our proof above) that
\[
\big\|I_j\big[|\chi^r \psi^{(\rho)}|, \phi\big] H_r\big\|_p \le C(\psi,r,p) \sum_{|\sigma|=r} \|f^{(\sigma)}\|_p, \tag{51}
\]
\[
\big\|I_j\big[|\chi^r \psi^{(\rho)}|, \phi\big] T_j\big\|_p \le C(\psi,r,p,a_{\min}) \sum_{|\sigma|=r} \|Q(f^{(\sigma)})\|_p. \tag{52}
\]
The first inequality follows from Lemma 4 together with the estimate $\|H_r\|_{(p,\infty)} \le \sum_{|\sigma|=r} 2\|f^{(\sigma)}\|_p$ in (36), and the second inequality follows from (50) and the fact that $\|S_{a_j^{-1}b} f^{(\sigma)}\|_p \le C(a_{\min})\,\|Qf^{(\sigma)}\|_p$ for all $j > 0$ by Lemma 13.

10. Proof of Lemma 3. Examples.

Notice $\hat\phi \in C^m$ and $\hat\psi \in C^{m-1}$, since $\chi^m \phi \in L^1$ and $\chi^{m-1}\psi \in L^1$.

We adapt the reasoning in [34, p. 833] as follows. Let $K = \{k \in \mathbb{Z}^d : |k_1| + \cdots + |k_d| \le m\}$ and write $B(\xi) = \sum_{k\in K} \beta_k e^{2\pi i \xi k}$ for a trigonometric polynomial with coefficients $\beta_k$ to be determined later. After checking that
\[
\int_{\mathbb{R}^d} (-x)^\mu\,\Psi(x)\,dx = (2\pi i)^{-|\mu|}\,D^\mu\big( B(\xi b)\,\hat\psi(\xi) \big)\Big|_{\xi=0}, \qquad |\mu| \le m - 1,
\]
we see the task for $\Psi$ in (10) is to choose $B$ such that the derivatives of $B(\xi b)$ agree up to order $m - 1$ at $\xi = 0$ with the derivatives of $\hat\psi(\xi)^{-1}$. In other words the derivatives of $B(\xi)$ should agree with those of $\hat\psi(\xi b^{-1})^{-1}$ up to order $m - 1$, at $\xi = 0$. This is true if we take
\[
B(\xi) = \sum_{|\mu|\le m-1} D_\theta^\mu\big( \hat\psi(\theta b^{-1})^{-1} \big)\Big|_{\theta=0}\; p_\mu(\xi)
\]
where $\theta \in \mathbb{R}^d$ is regarded as a row vector and $p_0(\xi) \equiv 1$ and where for $0 < |\mu| \le m - 1$ we write $p_\mu(\xi)$ for the unique polynomial of degree $m - 1$ jointly in $e^{2\pi i \xi_1}, \dots, e^{2\pi i \xi_d}$ such that
\[
D^\sigma p_\mu(0) =
\begin{cases}
1 & \text{if } \sigma = \mu, \\
0 & \text{otherwise}
\end{cases}
\qquad \text{for all } |\sigma| \le m - 1.
\]

Then $B(\xi)$ has the desired form $\sum_{k\in K} \beta_k e^{2\pi i \xi k}$, and our coefficients $\beta_k$ are determined. Argue similarly to construct $\Phi$ satisfying (9).

Examples for Lemma 3. In special cases we can argue directly to construct $\Phi$ and $\Psi$, rather than following the method of the proof above. Take $b = I$ for simplicity, and $\int_{\mathbb{R}^d} \phi\,dx = \int_{\mathbb{R}^d} \psi\,dx = 1$.

1. Let $m = 1$. If $\phi(x)$ is even with respect to each component $x_i$, then we can take $\Phi = \phi$ and $\Psi = \psi$, in Lemma 3.

2. Let $m = 2$. If $\phi(x)$ and $\psi(x)$ are even with respect to each component $x_i$, and $\phi(x)$ is symmetric in $x_1, \dots, x_d$, then in Lemma 3 we can take $\Psi = \psi$ and
\[
\Phi(x) = \alpha_0\,\phi(x) - \alpha_1 \sum_{k : |k|_\infty = 1} \phi(x - k),
\]
where $|k|_\infty := \max_{1\le i\le d} |k_i|$ and
\[
\alpha_0 = 1 + (3^d - 1)\,\alpha_1, \qquad \alpha_1 = \frac{\int_{\mathbb{R}^d} x_1^2\,\phi(x)\,dx}{2 \cdot 3^{d-1}}.
\]
We leave the reader to verify that (9) and (10) hold with $m = 2$. Notice $3^d - 1 = \#\{k : |k|_\infty = 1\}$. In dimension $d = 1$ this construction reduces to $\Phi(x) = (1 + 2\alpha_1)\phi(x) - \alpha_1\phi(x - 1) - \alpha_1\phi(x + 1)$ with $\alpha_1 = \frac{1}{2}\int_{\mathbb{R}} x^2 \phi(x)\,dx$, provided $\phi$ is an even function of one variable.
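The one-dimensional construction can be checked numerically. The sketch below assumes that, for $m = 2$, the moment conditions (9) ask $\Phi$ to have integral 1 and vanishing moments of orders 1 and 2 (the precise statement of (9) lies outside this section, so this reading is an assumption). It also assumes the even analyzer $\phi$ is the hat function on $[-1,1]$, chosen purely for illustration, and integrates with a midpoint rule.

```python
def hat(x):
    # assumed even analyzer phi with integral 1: hat function on [-1, 1]
    return max(0.0, 1.0 - abs(x))

def integrate(func, lo=-2.0, hi=2.0, cells=40000):
    # midpoint rule; [-2, 2] contains the support of Phi for this choice of phi
    width = (hi - lo) / cells
    return sum(func(lo + (i + 0.5) * width) for i in range(cells)) * width

# alpha_1 = (1/2) * integral of x^2 phi(x) dx, as in the d = 1 formula
ALPHA_1 = 0.5 * integrate(lambda x: x * x * hat(x))

def big_phi(x):
    # Phi(x) = (1 + 2 alpha_1) phi(x) - alpha_1 phi(x - 1) - alpha_1 phi(x + 1)
    return (1.0 + 2.0 * ALPHA_1) * hat(x) - ALPHA_1 * hat(x - 1.0) - ALPHA_1 * hat(x + 1.0)

def moment(order):
    # integral of x^order * Phi(x) dx
    return integrate(lambda x: x ** order * big_phi(x))

if __name__ == "__main__":
    print("alpha_1 =", ALPHA_1)
    print("moments of Phi of order 0, 1, 2:", moment(0), moment(1), moment(2))
```

The second moment cancels because $\int x^2 \Phi = M - 2\alpha_1$ with $M = \int x^2 \phi$, which is exactly why $\alpha_1 = M/2$.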

11. Proof of Theorem 3

Fix a multiindex $\rho$ with $r := |\rho| \le m - 1$.

Part (a). We decompose $D^\rho F_j = \mathrm{Main}_j + \mathrm{Rem}_j$ where
\[
\mathrm{Main}_j(x) = \sum_{|\sigma|\le m} \frac{f^{(\sigma)}(x)}{\sigma!}\,a_j^{r-s} \cdot P\big((-X)^\sigma \Psi^{(\rho)}\big)(a_j x),
\]
\[
\mathrm{Rem}_j(x) = |\det b| \sum_{k\in\mathbb{Z}^d} \int_{\mathbb{R}^d} \Big( f(a_j^{-1}y) - \sum_{|\sigma|\le m} \frac{f^{(\sigma)}(x)}{\sigma!}\,(a_j^{-1}y - x)^\sigma \Big) \cdot \Phi(y - bk)\,dy\; a_j^r\,\Psi^{(\rho)}(a_j x - bk),
\]
and $s = |\sigma|$. These quantities are identical to $\mathrm{Main}_j$ and $\mathrm{Rem}_j$ in (29) and (30) (see the proof of Theorem 1) except that here we sum over $|\sigma| \le m$ instead of $|\sigma| \le r$ and we use the moment conditions (9) on $\Phi$ to evaluate the moments $\int_{\mathbb{R}^d} y^{\sigma-\tau}\,\Phi(y)\,dy$. Note that the periodization $P\big((-X)^\sigma \Psi^{(\rho)}\big)$ occurring in $\mathrm{Main}_j$ is bounded, by the hypothesis that $Q(\chi^m \Psi^{(\rho)}) \in L^1$ and Lemma 11.

Remainder term. We first show
\[
\|\mathrm{Rem}_j\|_p \le C(\psi,\phi,m,p)\,|f|_{W^{m,p}}\,|a_j|^{r-m} \qquad \text{for all } j > 0. \tag{53}
\]
Now, $\mathrm{Rem}_j$ is bounded pointwise by
\[
|\det b| \sum_{k\in\mathbb{Z}^d} \left( \int_{\mathbb{R}^d} h_m(x, a_j^{-1}y - x)\,|y - a_j x|^m\,|\Phi(y - bk)|\,dy \right) \cdot |\Psi^{(\rho)}(a_j x - bk)|\,|a_j|^{r-m} \tag{54}
\]
where $h_m$ is defined by taking "$r = m$" in (32). And after using (34) to estimate $|y - a_j x|$, we see that (54) is bounded by $|a_j|^{r-m}$ times $I_j[|\chi^m \Psi^{(\rho)}|, \Phi_m]h_m$ where $\Phi_m = |\chi^m \Phi|$. Hence (53) follows from
\[
\big\|I_j\big[|\chi^m \Psi^{(\rho)}|, \Phi_m\big] h_m\big\|_p \le C(\psi,\phi,m,p)\,|f|_{W^{m,p}} \qquad \text{for all } j > 0,
\]
which holds by the stability estimate in Lemma 4 in view of the following observations. First, $Q(\chi^m \psi^{(\rho)}) \in L^1$ by hypothesis, which implies $\chi^m \Psi^{(\rho)} \in L^1 \cap L^\infty \subset L^p$ by Lemma 11 with "$r = 1$". Second, $\phi \in L^q$ has compact support and so $\Phi$ does too, so that $\Phi_m \in L^q$ with compact support. Third, $h_m \in L^{(p,\infty)}$ with $\|h_m\|_{(p,\infty)} \le C(m,p)\,|f|_{W^{m,p}}$ by (33) and (36) (with $r$ changed to $m$).

Main term. Next we simplify $\mathrm{Main}_j$. Notice that $D^\rho \hat\Psi(\ell b^{-1}) = 0$ for all row vectors $\ell \in \mathbb{Z}^d \setminus \{0\}$, because the same is assumed for $\psi$, in this theorem (recalling $|\rho| = r \le m - 1$). Hence
\[
P\big((-X)^\sigma \Psi^{(\rho)}\big) =
\begin{cases}
\rho! & \text{if } \sigma = \rho, \\
0 & \text{otherwise}
\end{cases}
\qquad \text{whenever } |\sigma| < m,
\]
by Lemma 7(b) applied to $\Psi$ (with $n_* = n = m - 1$ and $m_* = r$ and with "$m$" in the lemma replaced by $m - 1$), using also here the moment condition (10) on $\Psi$. Thus the only terms in $\mathrm{Main}_j$ that can make a nonzero contribution are those either with $|\sigma| = m$ or else with $|\sigma| < m$ and $\sigma = \rho$. Hence
\[
\mathrm{Main}_j(x) = f^{(\rho)}(x) + \sum_{|\sigma|=m} \frac{f^{(\sigma)}(x)}{\sigma!}\,a_j^{r-m} \cdot P\big((-X)^\sigma \Psi^{(\rho)}\big)(a_j x),
\]
so that
\[
\begin{aligned}
\|\mathrm{Main}_j - D^\rho f\|_p &\le \sum_{|\sigma|=m} \|f^{(\sigma)}\|_p\,\big\|P\big((-X)^\sigma \Psi^{(\rho)}\big)\big\|_\infty\,|a_j|^{r-m} \\
&\le C(\psi,m,p)\,|f|_{W^{m,p}}\,|a_j|^{r-m}.
\end{aligned}
\tag{55}
\]
By putting together (53) and (55) we get
\[
\|D^\rho F_j - D^\rho f\|_p \le C(\psi,\phi,m,p)\,|f|_{W^{m,p}}\,|a_j|^{r-m},
\]
which proves part (a) of the theorem.

Part (b). Similar to part (a) we decompose $D^\rho F_j^\bullet = \mathrm{Main}_j + \mathrm{Rem}_j^\bullet$ where
\[
\mathrm{Rem}_j^\bullet(x) = |\det b| \sum_{k\in\mathbb{Z}^d} \Big( f(a_j^{-1}bk) - \sum_{|\sigma|\le m} \frac{f^{(\sigma)}(x)}{\sigma!}\,(a_j^{-1}bk - x)^\sigma \Big)\,a_j^r\,\Psi^{(\rho)}(a_j x - bk).
\]
The term $\mathrm{Main}_j$ was estimated already in part (a), leading to (55). Hence to prove part (b) it suffices to show the remainder estimate
\[
\|\mathrm{Rem}_j^\bullet\|_p \le C(\psi,m,p,a_{\min}) \sum_{|\sigma|=m} \|Qf^{(\sigma)}\|_p\,|a_j|^{r-m} \qquad \text{for all } j > 0.
\]
Notice $\mathrm{Rem}_j^\bullet$ is bounded pointwise by
\[
|\det b| \sum_{k\in\mathbb{Z}^d} h_m(x, a_j^{-1}bk - x)\,|bk - a_j x|^m\,|\Psi^{(\rho)}(a_j x - bk)| \cdot |a_j|^{r-m}.
\]

After using $h_m \le H_m$ like in (33), where $H_m(x,y) = \sum_{|\sigma|=m} \int_{[0,1]} |f^{(\sigma)}(x + ty) - f^{(\sigma)}(x)|\,d\omega_m(t)$, we reduce the remainder estimate to showing
\[
\Big\| \, |\det b| \sum_{k\in\mathbb{Z}^d} H_m(x, a_j^{-1}bk - x)\,\big|(\chi^m \Psi^{(\rho)})(a_j x - bk)\big| \, \Big\|_p \le C(\Psi,m,p,a_{\min}) \sum_{|\sigma|=m} \|Qf^{(\sigma)}\|_p \tag{56}
\]
for all $j > 0$.

Next we let $\phi = 1_{bC}/|bC|$, and subtract and add the quantity $I_j[|\chi^m \Psi^{(\rho)}|, \phi]H_m$ inside the $L^p$ norm on the left of (56). By reasoning like we did leading up to (49), we deduce (56) will follow once we verify
\[
\big\|I_j\big[|\chi^m \Psi^{(\rho)}|, \phi\big] H_m\big\|_p \le C(\Psi,m,p) \sum_{|\sigma|=m} \|f^{(\sigma)}\|_p, \tag{57}
\]
\[
\big\|I_j\big[|\chi^m \Psi^{(\rho)}|, \phi\big] T_j\big\|_p \le C(\Psi,m,p,a_{\min}) \sum_{|\sigma|=m} \|Qf^{(\sigma)}\|_p, \tag{58}
\]
where $T_j(x,y) = \int_{[0,1]} \big(S_{a_j^{-1}b} f^{(\sigma)}\big)(x + ty)\,d\omega_m(t)$, $|\sigma| = m$. But inequalities (57)–(58) are essentially the same as (51)–(52) except with $r = m$, and so they are proved already by the paragraph after (51)–(52).

12. Proof of Proposition 1

If $f \in W^{m,p}$ then $f$ can be approximated arbitrarily well in the $W^{m,p}$ norm by linear combinations of functions in $H$. In the process, $D^\nu f$ gets approximated arbitrarily well in the $W^{m-|\nu|,p}$-norm by linear combinations of functions in $D^\nu H$. Thus we need only prove that the collection $\{D^\nu f : f \in W^{m,p}\}$ is dense in $W^{m-|\nu|,p}$. The next lemma does this. Write $\mathcal{S}$ for the Schwartz class.

Lemma 8. $\{D^\sigma f : f \in \mathcal{S}\}$ is dense in $W^{n,p}$ for all $1 < p < \infty$, $n \in \mathbb{N} \cup \{0\}$ and multiindices $\sigma$.

Proof (of Lemma 8). For $\sigma = 0$, the claim is simply that the Schwartz class is dense in $W^{n,p}$, which is well known.

Now we use induction on $\sigma$. The task is to show that if $\{D^\sigma f : f \in \mathcal{S}\}$ is dense in $W^{n,p}$ for all $1 < p < \infty$, $n \in \mathbb{N} \cup \{0\}$, then the same is true for the multiindex $\sigma + e_t$ for each $t = 1, \dots, d$. Without loss of generality we can suppose $t = 1$, so that $e_1 = (1, 0, \dots, 0)$.


Let $1 < p < \infty$, $n \in \mathbb{N} \cup \{0\}$, and take $u \in \mathcal{S}$ and $\varepsilon > 0$. The induction hypothesis implies that $\|u - D^\sigma f\|_{W^{n+1,p}} < \varepsilon$ for some $f \in \mathcal{S}$. In particular, $\|D_1 u - D^{\sigma+e_1} f\|_{W^{n,p}} < \varepsilon$. Thus we have only to show that $\{D_1 u : u \in \mathcal{S}\}$ is dense in $W^{n,p}$.

Suppose to the contrary that it is not dense. Then by the Hahn–Banach theorem there exists a functional $g \in (W^{n,p})^* \setminus \{0\}$ such that $g[D_1 u] = 0$ for all $u \in \mathcal{S}$. The functional $g$ can be written as a sum of distributional derivatives, with $g = \sum_{|\tau| \le n} c_\tau D^\tau g_\tau$ for some functions $g_\tau \in L^q$, by the standard representation of the dual space $(W^{n,p})^*$ (see [1, Theorem 3.8]). Hence if $\eta$ is a mollifier then the mollified distribution
\[
g^{(\varepsilon)} = \eta_\varepsilon * g = \sum_{|\tau| \le n} c_\tau \varepsilon^{-|\tau|} (D^\tau \eta)_\varepsilon * g_\tau
\]
is a smooth function belonging to $L^q$, for each $\varepsilon > 0$. We know $D_1 g^{(\varepsilon)} = \eta_\varepsilon * D_1 g = 0$, because $D_1 g = 0$ as a distribution by construction above. Thus the function $g^{(\varepsilon)}$ is constant in the $x_1$-direction. Since $g^{(\varepsilon)}$ is also $L^q$-integrable, it must be identically zero. Letting $\varepsilon \to 0$ gives $g = 0$ as a distribution, and hence by density of $\mathcal{S}$ we see $g = 0$ as a functional on $W^{n,p}$. This contradicts the construction of $g$, completing the proof.

13. Proof of Theorem 4

Write $z = b\kappa$. Clearly
\[
W^{m,p}\text{-span}\{\psi_{j,k} : k \in \mathbb{Z}^d\} \supset W^{m,p}\text{-span}\{(\Delta_{c,z}\psi)_{j,k} : k \in \mathbb{Z}^d\} \tag{59}
\]
because
\[
\Delta_{c,z}\psi(a_j x - bk) = \psi(a_j x - bk) - c\,\psi(a_j x - b(k+\kappa)) \tag{60}
\]

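and $k + \kappa \in \mathbb{Z}^d$. To prove the reverse inclusion in (59), first consider the case $|c| < 1$. Temporarily fix $k \in \mathbb{Z}^d$. For $n \ge 1$, examine the linear combination
\begin{align*}
\sum_{\ell=0}^{n-1} c^\ell (\Delta_{c,z}\psi)_{j,k+\kappa\ell}
&= \sum_{\ell=0}^{n-1} c^\ell [\psi_{j,k+\kappa\ell} - c\,\psi_{j,k+\kappa(\ell+1)}] && \text{using (60)} \\
&= \psi_{j,k} - c^n \psi_{j,k+\kappa n} && \text{by telescoping} \\
&\to \psi_{j,k} && \text{in } W^{m,p} \text{ as } n \to \infty,
\end{align*}
because $|c| < 1$. Thus $\psi_{j,k}$ belongs to the right-hand side of (59), so that equality holds in (59).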
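The telescoping step can be checked on coefficient vectors. The following is a minimal sketch and not part of the original argument: it treats the functions $\psi_{j,k+\kappa\ell}$ as formal basis symbols $e_\ell$ (an assumption made purely for illustration), and the helper name `telescoped_coefficients` is hypothetical.

```python
from fractions import Fraction


def telescoped_coefficients(c, n):
    """Coefficients of sum_{l=0}^{n-1} c^l (e_l - c*e_{l+1}) in the symbols e_0..e_n.

    Here e_l is a formal stand-in for psi_{j,k+kappa*l}; by (60) each
    difference (Delta_{c,z} psi)_{j,k+kappa*l} equals e_l - c*e_{l+1}.
    """
    coeffs = [Fraction(0)] * (n + 1)
    for l in range(n):
        weight = c ** l
        coeffs[l] += weight          # contribution of +e_l
        coeffs[l + 1] -= weight * c  # contribution of -c*e_{l+1}
    return coeffs


if __name__ == "__main__":
    c = Fraction(1, 2)  # illustrative value with |c| < 1
    print(telescoped_coefficients(c, 5))
```

Only the coefficients of $e_0$ and $e_n$ survive, namely $1$ and $-c^n$, and the latter tends to zero when $|c| < 1$. Exact rational arithmetic is used so that the cancellation is visible without rounding error.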


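Next consider $|c| > 1$. (We will reduce to the case "$|c| < 1$".) Notice
\[
\Delta_{c,z}\psi(x) = -c[\psi(x - b\kappa) - c^{-1}\psi(x)] = -c\,\Delta_{c^{-1},-z}\psi(x - b\kappa)
\]
and hence $(\Delta_{c,z}\psi)_{j,k} = -c\,(\Delta_{c^{-1},-z}\psi)_{j,k+\kappa}$. Thus
\[
W^{m,p}\text{-span}\{(\Delta_{c,z}\psi)_{j,k} : k \in \mathbb{Z}^d\} = W^{m,p}\text{-span}\{(\Delta_{c^{-1},-z}\psi)_{j,k} : k \in \mathbb{Z}^d\}.
\]
Since $|c^{-1}| < 1$, the previous case now implies equality in (59).

We have handled all cases $|c| \ne 1$. To complete the proof of the theorem, we now suppose $1 < p < \infty$ and $|c| \le 1$. To show equality holds in (59), we take $n \ge 1$ and examine the linear combination
\begin{align*}
\sum_{\ell=0}^{n-1} \frac{n-\ell}{n}\, c^\ell (\Delta_{c,z}\psi)_{j,k+\kappa\ell}
&= \sum_{\ell=0}^{n-1} \frac{n-\ell}{n}\, c^\ell [\psi_{j,k+\kappa\ell} - c\,\psi_{j,k+\kappa(\ell+1)}] \\
&= \sum_{\ell=0}^{n-1} \frac{n-\ell}{n}\, c^\ell \psi_{j,k+\kappa\ell} - \sum_{\ell=1}^{n} \frac{n-\ell+1}{n}\, c^\ell \psi_{j,k+\kappa\ell} \\
&= \psi_{j,k} - \frac{1}{n} \sum_{\ell=1}^{n} c^\ell \psi_{j,k+\kappa\ell} .
\end{align*}
Thus to show that $\psi_{j,k}$ lies in the closed $W^{m,p}$-span of $\{(\Delta_{c,z}\psi)_{j,k'} : k' \in \mathbb{Z}^d\}$, we need only show
\[
\frac{1}{n} \sum_{\ell=1}^{n} c^\ell \psi_{j,k+\kappa\ell} \to 0 \qquad \text{in } W^{m,p} \text{ as } n \to \infty. \tag{61}
\]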
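The weighted identity with the factors $(n-\ell)/n$ can be verified on formal coefficients. This is a hedged sketch under the assumption that each $\psi_{j,k+\kappa\ell}$ is replaced by a formal symbol $e_\ell$; the function name `cesaro_coefficients` is hypothetical and does not come from the paper.

```python
from fractions import Fraction


def cesaro_coefficients(c, n):
    """Coefficients of sum_{l=0}^{n-1} ((n-l)/n) c^l (e_l - c*e_{l+1}) in e_0..e_n.

    e_l is a formal stand-in for psi_{j,k+kappa*l}.
    """
    coeffs = [Fraction(0)] * (n + 1)
    for l in range(n):
        weight = Fraction(n - l, n) * c ** l
        coeffs[l] += weight          # contribution of +e_l
        coeffs[l + 1] -= weight * c  # contribution of -c*e_{l+1}
    return coeffs


if __name__ == "__main__":
    c = Fraction(-1)  # illustrative value with |c| = 1
    print(cesaro_coefficients(c, 4))
```

The output has coefficient $1$ on $e_0$ and $-c^\ell/n$ on $e_\ell$ for $\ell = 1, \ldots, n$. The linearly decaying weights are what turn the boundary term $c^n \psi_{j,k+\kappa n}$ of plain telescoping, which need not vanish when $|c| = 1$, into an average with the factor $1/n$.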

Take $\varepsilon > 0$ and choose $u \in W^{m,p}$ with compact support and satisfying $\|\psi(a_j x) - u(a_j x)\|_{W^{m,p}} < \varepsilon |a_j|^{-d/p}$ (here we use that $p < \infty$ and $j$ is fixed). Then
\[
\Big\| \frac{1}{n} \sum_{\ell=1}^{n} c^\ell \psi_{j,k+\kappa\ell} - \frac{1}{n} \sum_{\ell=1}^{n} c^\ell u_{j,k+\kappa\ell} \Big\|_{W^{m,p}} \le \frac{1}{n} \sum_{\ell=1}^{n} \|\psi_{j,k+\kappa\ell} - u_{j,k+\kappa\ell}\|_{W^{m,p}} < \varepsilon
\]
for all $n$ (using $|c| \le 1$). Thus for (61) it remains only to show
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{\ell=1}^{n} c^\ell u_{j,k+\kappa\ell} = 0 \qquad \text{in } W^{m,p}
\]
whenever $u \in W^{m,p}$ has compact support.

We may further choose $u$ to be supported in a set of the form $y + bC$ for some $y \in \mathbb{R}^d$ (just by decomposing the original $u$ into a finite sum of functions with such supports, by a partition of unity). Then the functions $u_{j,k+\kappa\ell}$ for $\ell = 1, \ldots, n$ have disjoint supports, so that
\[
\Big\| \frac{1}{n} \sum_{\ell=1}^{n} c^\ell u_{j,k+\kappa\ell} \Big\|_{W^{m,p}} \le \frac{1}{n}\, n^{1/p}\, \|u(a_j x)\|_{W^{m,p}} \to 0
\]
as $n \to \infty$, because $p > 1$.
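The factor $n^{1/p}/n$ in the last estimate comes from the disjointness of the supports. The sketch below illustrates this numerically in a simplified setting that is an assumption of the example, not of the paper: one dimension, a discrete $L^p$ norm in place of the $W^{m,p}$ norm, and unimodular coefficients (so the inequality becomes an equality). The helper names `lp_norm` and `averaged_translates` are hypothetical.

```python
def lp_norm(values, p, h):
    """Discrete L^p norm of grid samples with spacing h."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)


def averaged_translates(u, coefficients):
    """Samples of (1/n) * sum_l c_l * u(. - l*L), where L is the cell length.

    u: samples of a function supported in one cell.
    coefficients: the numbers c_l, one per translate.
    Shifting by a full cell per step keeps the supports disjoint, so the
    translates are simply laid side by side on the grid.
    """
    n = len(coefficients)
    out = []
    for c_l in coefficients:
        out.extend(c_l * v / n for v in u)
    return out


if __name__ == "__main__":
    p, h, n = 3.0, 0.01, 50
    u = [x * (1 - x) for x in (i * h for i in range(100))]  # bump on [0, 1)
    avg = averaged_translates(u, [(-1) ** l for l in range(1, n + 1)])
    print(lp_norm(avg, p, h) / lp_norm(u, p, h), n ** (1.0 / p - 1.0))
```

The two printed numbers agree up to rounding, and both tend to zero as $n$ grows precisely because $p > 1$. Derivatives of disjointly supported translates are again disjointly supported, so the same count applies to each derivative term in the Sobolev norm.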

Acknowledgments

Part of this paper was researched at the Institute for Mathematical Sciences, National University of Singapore, during the program on "Mathematics and Computation in Imaging Science and Information Processing" in 2004. We thank the IMS for its support. Also we thank Ilya Krishtal for stimulating discussions on frames and the Mexican hat function. Laugesen was partially supported by N.S.F. Award DMS–0140481, a Maclaurin Fellowship from the New Zealand Institute of Mathematics and its Applications, and a Visiting Erskine Fellowship from the University of Canterbury.

A. The operators P, Q and S

Throughout this appendix, we take $f$ to be a measurable function on $\mathbb{R}^d$ that is finite a.e.

Lemma 9. If $P|f| \in L^p_{loc}$ for some $1 \le p \le \infty$, then $f \in L^p$.

Proof (of Lemma 9). The result is clear when $p = \infty$, because $|f(x)| \le \sum_{k \in \mathbb{Z}^d} |f(x - bk)|$. Suppose $1 \le p < \infty$, so that
\[
\sum_{k \in \mathbb{Z}^d} |f(x - bk)|^p \le \Big( \sum_{k \in \mathbb{Z}^d} |f(x - bk)| \Big)^p .
\]

Hence if P |f | ∈ Lploc then P (|f |p ) ∈ L1loc , which implies |f |p ∈ L1 or f ∈ Lp . Recall the local supremum operator Qf (x) = ess. sup|y−x| R, (62)

for some constant $C' > 0$. For this, it suffices by induction on (4) to show that if $f \in W^{1,1}_{loc}$ and $|Df(x)| \le C|x|^{-n-1}$ for almost every $x$ with $|x| > R$, for some constants $C, n > 0$, then $|f(x)| \le (C/n)|x|^{-n}$ for almost every $x$ with $|x| > R$. The proof goes by radial integration: for almost every direction $v \in S^{d-1}$, the function $F(r) = f(rv)$ belongs to $W^{1,1}_{loc}(0,\infty)$, with $|F(r)| \le \int_r^\infty |Df(sv)| \, ds \le (C/n) r^{-n}$ for almost every $r > R$.

The lemma will now follow once we prove
\[
f \in L^p_{loc} \text{ and } |f(x)| \le C|x|^{-d-\varepsilon} \text{ for almost all } |x| > R \quad \Longrightarrow \quad P|f| \in L^p_{loc}, \tag{63}
\]
because when $f = \chi^{|\mu|} \psi^{(\mu)}$ we know $f \in L^p_{loc}$ (since $\psi \in W^{m,p}$) while $|f(x)| \le C|x|^{-d-\varepsilon}$ for almost all $|x| > R$ by (62).

To prove (63), write $g = 1_{|x| \le R} f$ and $h = 1_{|x| > R} f$, so that $g + h = f$. It is easy to show $P|g| \in L^p_{loc}$, because $g \in L^p$ has compact support. And $h$ has a bounded radially decreasing $L^1$ majorant of the form $C(1+|x|)^{-d-\varepsilon}$, by construction, so that $P|h| \in L^\infty$ by a Riemann sum argument (see [5, Lemma A.2]). Therefore $P|f| \le P|g| + P|h| \in L^p_{loc}$, which proves (63).
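The radial integration step in the proof above rests on the elementary identity $\int_r^\infty C s^{-n-1} \, ds = (C/n) r^{-n}$. The following is a small numerical sketch of that identity, assuming the extremal profile $|Df(sv)| = C s^{-n-1}$ and illustrative values of $C$, $n$, $r$; the function name `tail_integral`, the truncation point `upper` and the step count are choices made for this example only.

```python
import math


def tail_integral(C, n, r, upper=1.0e4, steps=20_000):
    """Midpoint-rule value of the integral of C * s**(-n-1) for s in [r, upper].

    The substitution s = exp(t) turns the integrand into C * exp(-n*t),
    which is then sampled on a uniform grid in t.
    """
    t0, t1 = math.log(r), math.log(upper)
    dt = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * dt
        total += C * math.exp(-n * t) * dt
    return total


if __name__ == "__main__":
    C, n, r = 2.0, 1.5, 3.0  # illustrative constants
    print(tail_integral(C, n, r), (C / n) * r ** (-n))
```

The truncated integral stays below $(C/n) r^{-n}$ and approaches it as `upper` grows, which matches the decay gained in each induction step: one extra power of $|x|$ in the bound on $Df$ yields the stated bound on $f$.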

Proof (of Lemma 2). We need only show that
\[
f \in L^\infty_{loc} \text{ and } |f(x)| \le C|x|^{-d-\varepsilon} \text{ for almost all } |x| > R \quad \Longrightarrow \quad Qf \in L^1, \tag{64}
\]
because when $f = \chi^{|\mu|} \psi^{(\mu)}$ we know $f \in L^\infty_{loc}$ (since $\psi \in W^{m,\infty}$) while hypothesis (4) ensures (62) and so $|f(x)| \le C|x|^{-d-\varepsilon}$ for almost all $|x| > R$.

To prove (64), write $f = g + h$ like in the previous proof. Then $Qg \in L^1$ because $g$ is bounded with compact support, and $Qh \in L^1$ because $h$ has a bounded radially decreasing $L^1$ majorant (cf. [6, Lemma 21]). Therefore $Qf \le Qg + Qh \in L^1$.

Next we derive pointwise relations for $f$ and $Qf$.


Lemma 10. We have |f | ≤ Qf a.e. And if E is a bounded set in Rd then X |f (x)| ≤ QE f (y) := Qf (y + k) (65) √ k:|k| 0 be arbitrary. Then
\[
\int_{B(0,R)} \Big| f(x)\, \frac{1}{J} \sum_{j=1}^{J} g(a_j x) \Big|^p dx \le \|f\|_\infty^p \int_{B(0,R)} \Big| \frac{1}{J} \sum_{j=1}^{J} g(a_j x) \Big|^p dx \to 0
\]
as $J \to \infty$, by (66).


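Furthermore, $|f(x)| \le Qf(k)$ for almost every $x \in k + C \subset B(k, \sqrt{d})$, by definition of $Q$. Thus for each $J$,
\begin{align*}
\int_{\mathbb{R}^d \setminus B(0,R)} \Big| f(x)\, \frac{1}{J} \sum_{j=1}^{J} g(a_j x) \Big|^p dx
&\le \sum_{|k| > R - \sqrt{d}} Qf(k)^p \int_{k+C} \Big| \frac{1}{J} \sum_{j=1}^{J} g(a_j x) \Big|^p dx \\
&\le \sum_{|k| > R - \sqrt{d}} Qf(k)^p \bigg[ \frac{1}{J} \sum_{j=1}^{J} \Big( \int_{k+C} |g(a_j x)|^p \, dx \Big)^{1/p} \bigg]^p \\
&\le \sum_{|k| > R - \sqrt{d}} Qf(k)^p \bigg[ \frac{1}{J} \sum_{j=1}^{J} \Big( |a_j|^{-d} \int_{a_j(k+C)} |g(x)|^p \, dx \Big)^{1/p} \bigg]^p \\
&\le \sum_{|k| > R - \sqrt{d}} Qf(k)^p \cdot C \|g\|^p_{L^p(bC)} \tag{67}
\end{align*}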

since the mean value of the $b\mathbb{Z}^d$-periodic function $|g|^p$ over the set $a_j(k+C)$ is bounded by a constant times its mean value over the period cell $bC$ (see for example [6, Lemma 25]; the constant $C$ depends on $\min_{j>0} |a_j|$). The expression (67) can be made as small as we like by choosing $R$ sufficiently large, because $\sum_{k \in \mathbb{Z}^d} Qf(k)^p < \infty$ as explained below. Letting $R \to \infty$ then proves the lemma. We have √ √ √ √ B(x, d) ⊂ B(0, 2 d) ⊂ ∪|`|