NONPARAMETRIC DENSITY ESTIMATION FOR RANDOMLY PERTURBED ELLIPTIC PROBLEMS I: COMPUTATIONAL METHODS, A POSTERIORI ANALYSIS, AND ADAPTIVE ERROR CONTROL

D. ESTEP∗, A. MÅLQVIST†, AND S. TAVENER‡

Abstract. We consider the nonparametric density estimation problem for a quantity of interest computed from solutions of an elliptic partial differential equation with randomly perturbed coefficients and data. Our particular interest is in problems for which only limited knowledge of the random perturbations is available. We derive an efficient method for computing samples and generating an approximate probability distribution based on Lions' domain decomposition method and the Neumann series. We then derive an a posteriori error estimate for the computed probability distribution reflecting all sources of deterministic and statistical errors. Finally, we develop an adaptive error control algorithm based on the a posteriori estimate.

Key words. a posteriori error analysis, adjoint problem, density estimation, domain decomposition, elliptic problem, Neumann series, nonparametric density estimation, random perturbation, sensitivity analysis

AMS subject classifications. 65N15, 65N30, 65N55, 65C05

1. Introduction. The practical application of differential equations to model physical phenomena presents problems in both computational mathematics and statistics. The mathematical issues arise because of the need to compute approximate solutions of difficult problems, while the statistical issues arise because of the need to incorporate experimental data and model uncertainty. The consequence is that significant error and uncertainty attends any information computed from a model applied to a concrete situation. The problem of quantifying that error and uncertainty is critically important.

We consider the nonparametric density estimation problem for a quantity of interest computed from the solutions of an elliptic partial differential equation with randomly perturbed coefficients and data. The ideal problem is to compute a quantity of interest Q(U), expressed as a linear functional, of the solution U of

    -∇ · (A(x)∇U) = G(x),  x ∈ Ω,
    U = 0,                 x ∈ ∂Ω,        (1.1)

where Ω is a convex polygonal domain with boundary ∂Ω and A(x), G(x) are stochastic functions that vary randomly according to some given probability structure.

∗ Department of Mathematics and Department of Statistics, Colorado State University, Fort Collins, CO 80523 ([email protected]). D. Estep's work is supported in part by the Department of Energy (DE-FG02-04ER25620, DE-FG02-05ER25699, DE-FC02-07ER54909), Lawrence Livermore National Laboratory (B573139), the National Aeronautics and Space Administration (NNG04GH63G), the National Science Foundation (DMS-0107832, DMS-0715135, DGE-0221595003, MSPA-CSE-0434354, ECCS-0700559), Idaho National Laboratory (00069249), and the Sandia Corporation (PO299784).
† Department of Information Technology, Uppsala University, SE-751 05 Uppsala, Sweden ([email protected]). A. Målqvist's work was supported in part by the Department of Energy (DE-FG02-04ER25620).
‡ Department of Mathematics, Colorado State University, Fort Collins, CO 80523 ([email protected]). S. Tavener's work is supported in part by the Department of Energy (DE-FG02-04ER25620).

The problem (1.1) is interpreted to hold almost surely (a.s.), i.e., with probability 1. Under suitable assumptions, e.g., A, G are uniformly bounded and have piecewise smooth dependence on their inputs (a.s.) with continuous and bounded covariance functions and A is uniformly coercive, Q(U) is a random variable. The density estimation problem is: given probability distributions describing the stochastic nature of A and G, determine the probability distribution of Q. The approach we use extends in a straightforward way to problems with more general Dirichlet or Robin boundary conditions in which the data for the boundary conditions are randomly perturbed, as well as to problems with more general elliptic operators.

The parametric density estimation problem assumes that the output distribution is one of the standard distributions, so that the problem involves determining values for the parameters defining the distribution. The nonparametric density estimation problem is relevant when the output distribution is unknown and/or complicated, e.g., multi-modal. In this case, we seek to compute an approximate distribution for the output random variable using sample solutions of the problem. A limited version of this problem is to seek only to compute one or two moments, e.g., the expected value. However, this is of limited utility when the output distribution is complicated, as it tends to be for outputs computed from (1.1) under general conditions.

Nonparametric density estimation problems are generally approached using a Monte Carlo sampling method. Samples {A^n, G^n} are drawn from their distributions, solutions {U^n} are computed to produce samples {Q(U^n)}, and the output distribution is approximated using a binning strategy coupled with smoothing. This ideal density estimation problem poses several computational issues, including:
1. We have only limited information about the stochastic nature of A and G;
2. We can compute only a finite number N of sample solutions;
3. The solution of (1.1) has to be computed numerically, which is both expensive and leads to significant variation in the numerical error as the coefficients and data vary;
4. The output distribution is an approximation affected by the binning and smoothing strategies.

In this paper, we consider the first three issues. Our goals are to construct an efficient numerical method for approximating the cumulative distribution function of the output distribution and to derive computable a posteriori error estimates that account for the significant effects of error and uncertainty in the approximation. We extend the analysis to include adaptive modeling and apply the method to an interesting problem in [5]. In [3], we present convergence proofs for the method described in this paper. There are many papers addressing the fourth issue in the statistics literature, e.g., on kernel density estimation.

Our main goal is to treat the effects of stochastic variation in the diffusion coefficient A. The treatment of a problem in which just the right-hand side and data vary stochastically is somewhat easier because there is just one differential operator to be inverted. When the elliptic coefficient varies stochastically, we are dealing with a family of differential operators. For completeness, we include a brief treatment of stochastic variation in the right-hand side and data.

1.1. Notation. The notation is cumbersome since we are dealing with two discretizations: solution of the differential equation and approximation of a probability distribution by finite sampling. Generally, capital letters denote random variables, or samples, and lower case letters represent deterministic variables or functions. When this assignment is violated, we use italics to denote deterministic quantities. We let Ω ⊂ R^d, d = 2, 3, denote the piecewise polygonal computational domain with boundary ∂Ω. For an arbitrary domain ω ⊂ Ω, we denote the L2(ω) scalar product by (v, w)_ω = ∫_ω vw dx in the domain and ⟨v, w⟩_∂ω = ∫_∂ω vw ds on the boundary, with associated norms ‖·‖_ω and |·|_ω. When ω = Ω, we drop the index in the scalar products. We let H^s(ω) denote the standard Sobolev space of smoothness s for s ≥ 0. In particular, H^1_0(Ω) denotes the space of functions in H^1(Ω) for which the trace is 0 on the boundary. If ω = Ω, we drop ω, and if s = 0 we drop s, i.e., ‖·‖ denotes the L2(Ω)-norm.

We use the finite element method to compute numerical solutions. Let T_h = {τ} be a quasiuniform partition into elements such that ∪τ = Ω. Associated to T_h, we define the discrete finite element space V_h consisting of continuous, piecewise linear functions on T_h satisfying the Dirichlet boundary conditions, with mesh size function h_τ = diam(τ) for x ∈ τ and h = max_{τ∈T_h} h_τ. In some situations, we use a more accurate finite element space V_h̃ comprising either the space of continuous, piecewise quadratic functions V_h^2 or a refinement T_h̃ of T_h with h̃ ≪ h.

We assume that any random vector X is associated with a probability space (Λ, B, P) in the usual way. We let {X^n, n = 1, ..., N} denote a collection of samples. We assume it is understood how to draw these samples. We let E(X) denote the expected value, Var(X) the variance, and F(t) = P(X < t) the cumulative distribution function. We compute approximate cumulative distribution functions in order to determine the probability distribution of a random variable.

1.2. A modeling assumption. The first step in developing a numerical method for the density estimation problem is to characterize the stochastic nature of the random variations affecting the problem. One powerful approach is based on the use of Karhunen-Loève, Polynomial Chaos, or other orthogonal expansions of the random vectors [7, 1].
This approach requires detailed knowledge of the probability distributions for the input variables that is often not available. In many situations, we have only the ability to sample the random inputs at a relatively small set of points in the domain Ω. One such example is oil reservoir simulation, see e.g. [6], where data from the SPE10 comparative solution project is used.

We assume that the stochastic diffusion coefficient can be written A = a + 𝒜, where the uniformly coercive, bounded deterministic function a may have multiscale behavior, and the perturbation 𝒜 is a piecewise constant function with random coefficients. We assume that a(x) ≥ a_0 > 0 for x ∈ Ω and that |𝒜(x)| ≤ δa(x) for some 0 < δ < 1. To describe 𝒜, we let K be a finite polygonal partition of Ω, where Ω = ∪_{κ∈K} κ and κ_1 and κ_2 are either disjoint or intersect only along a common boundary when κ_1 ≠ κ_2. We let χ_κ denote the characteristic function of the set κ ∈ K. We assume that

    𝒜(x) = Σ_{κ∈K} 𝒜_κ χ_κ(x),  x ∈ Ω,        (1.2)

where (𝒜_κ) is a random vector and each coefficient 𝒜_κ is associated with a given probability distribution. We illustrate such a representation in Fig. 1.1. The coefficients 𝒜_κ are quantities that may be determined experimentally, e.g., by measurements taken at specific points in the domain Ω. We denote a finite set of samples by {𝒜_{n,κ}, n = 1, ..., N}.
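The modeling assumption (1.2) is straightforward to realize computationally: one random number per cell κ of the coarse partition K, scaled by the local value of a. The following sketch draws one sample A^n = a + 𝒜^n of the type shown in Fig. 1.1; the grid resolution, the cross geometry, and all function names are our own illustrative choices, not prescribed by the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_coefficient(a, partition_shape=(9, 9), delta=0.10, grid=(72, 72)):
    """Draw one sample A^n = a + perturbation per assumption (1.2):
    the perturbation is constant on each cell kappa of a coarse
    partition K and uniform on +/- delta * a(x)."""
    ny, nx = grid
    # evaluate a at fine-grid midpoints (a may vary on a finer scale)
    x = (np.arange(nx) + 0.5) / nx
    y = (np.arange(ny) + 0.5) / ny
    X, Y = np.meshgrid(x, y)
    a_vals = a(X, Y)
    # one uniform random number per coarse cell kappa
    U = rng.uniform(-delta, delta, size=partition_shape)
    # chi_kappa: map each fine-grid point to its coarse cell
    iy = (Y * partition_shape[0]).astype(int)
    ix = (X * partition_shape[1]).astype(int)
    return a_vals + U[iy, ix] * a_vals     # |perturbation| <= delta * a(x)

# illustrative coefficient of Fig. 1.1: 0.1 on a one-cell-wide cross, 1 elsewhere
def a_cross(X, Y):
    on_cross = (np.abs(X - 0.5) < 1 / 18) | (np.abs(Y - 0.5) < 1 / 18)
    return np.where(on_cross, 0.1, 1.0)

An = sample_coefficient(a_cross)
```

Each call produces an independent sample of the piecewise-constant random field; the uniform ±10% perturbation matches the example used in Sec. 2.3.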

Fig. 1.1. Illustration of the modeling assumption (1.2). The unit square is partitioned into 9 × 9 identical squares. The diffusion coefficient a is 0.1 on the "cross"-shaped region in the center of the domain and 1 elsewhere. The random perturbations are uniform on the interval determined by ±10% of the value of a.

Remark 1.1. We assume that 𝒜_κ describes a relatively small stochastic perturbation of the diffusion coefficient a on the associated domain κ. This is a modeling assumption. Improving the model requires choosing a finer partition and taking more measurements 𝒜_κ. We also assume that the finite element discretization T_h is obtained by refinement of K. This is natural when the diffusion coefficient a and the data vary on a scale finer than the partition K.

In [3], we give a theoretical convergence analysis of a generalization of the method presented in this paper. In particular, we relax the assumption that 𝒜 is piecewise constant and instead assume that it is a piecewise polynomial function, which makes it possible to use continuous perturbations.

2. The case of a randomly perturbed diffusion coefficient. We begin by studying the Poisson equation with a randomly perturbed diffusion coefficient. We let U ∈ H^1_0(Ω) (a.s.) solve

    -∇ · (A∇U) = f,  x ∈ Ω,
    U = 0,           x ∈ ∂Ω,        (2.1)

where f ∈ L2(Ω) is a given deterministic function and A = a + 𝒜 satisfies the conditions described in Sec. 1.2. We construct an efficient numerical method for computing sample solutions and then provide an a posteriori analysis of the error of the method.

2.1. The computational method. Our approach uses Lions' nonoverlapping domain decomposition method [12]. We let {Ω_d, d = 1, ..., D} be a decomposition of Ω into a finite set of non-overlapping polygonal subdomains with ∪Ω_d = Ω. We denote the boundaries by ∂Ω_d and the outward normals by n_d. For a function A^n on Ω, A^{n,d} means A^n restricted to Ω_d. For d = 1, ..., D, d' denotes the set of indices in {1, 2, ..., D}\{d} for which the corresponding domains Ω_{d'} share a common boundary with Ω_d. The method is iterative, so for a function U involved in the iteration, U_i denotes the value at the ith iteration. Let A^n = a + 𝒜^n be a particular sample of the diffusion coefficient with corresponding solution U^n. We let {U_0^{n,d}, d = 1, ..., D} denote a set of initial guesses for solutions in the subdomains. Given the initial conditions, for each i ≥ 1, we solve the D problems

    -∇ · (A^n ∇U_i^{n,d}) = f,                                x ∈ Ω_d,
    U_i^{n,d} = 0,                                            x ∈ ∂Ω_d ∩ ∂Ω,
    (1/λ) U_i^{n,d} + n_d · A^n ∇U_i^{n,d}
        = (1/λ) U_{i-1}^{n,d̃} − n_{d̃} · A^n ∇U_{i-1}^{n,d̃},   x ∈ ∂Ω_d ∩ ∂Ω_{d̃}, d̃ ∈ d',

where the parameter λ ∈ R is chosen to minimize the number of iterations. In practice, we compute I iterations. Note that the D problems can be solved independently.

To discretize, we let V_{h,d} ⊂ H^1_{0,∂Ω}(Ω_d) be a finite element approximation space corresponding to the mesh T_d on Ω_d, where

    H^1_{0,∂Ω}(Ω_d) = {v ∈ H^1(Ω_d) : v|_{∂Ω_d ∩ ∂Ω} = 0}.

We let (·, ·)_d denote the L2(Ω_d) scalar product, and ⟨·, ·⟩_{d∩d̃} denote the L2(∂Ω_d ∩ ∂Ω_{d̃}) scalar product for d̃ ∈ d'. These are associated with the norms ‖·‖_d and |·|_d, respectively. For each i ≥ 1, we compute U_i^{n,d} ∈ V_{h,d}, d = 1, ..., D, solving

    (A^n ∇U_i^{n,d}, ∇v)_d + Σ_{d̃∈d'} ( (1/λ) ⟨U_i^{n,d}, v⟩_{d∩d̃} − ⟨n_{d̃} · A^n ∇U_i^{n,d}, v⟩_{d∩d̃} )
        = (f, v)_d + Σ_{d̃∈d'} ( (1/λ) ⟨U_{i-1}^{n,d̃}, v⟩_{d∩d̃} − ⟨n_{d̃} · A^n ∇U_{i-1}^{n,d̃}, v⟩_{d∩d̃} ),   all v ∈ V_{h,d}.        (2.2)
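The Robin-transmission iteration behind (2.2) can be seen in action on a minimal case. The sketch below applies Lions' method to the 1D model problem −u'' = 1 on (0,1) with u(0) = u(1) = 0, split into two subdomains at x = 0.5. The mesh size, the Robin parameter λ, the iteration count, and the use of variationally consistent interface fluxes are our own illustrative choices; in 1D the converged iterate reproduces the global piecewise linear finite element solution, which is nodally exact for this problem.

```python
import numpy as np

# Lions' nonoverlapping domain decomposition for -u'' = 1 on (0,1),
# u(0) = u(1) = 0, split at x = 0.5.  Hypothetical 1D sketch of the
# Robin transmission iteration; m and lam are illustrative choices.
m, lam = 50, 0.2
h = 0.5 / m

def robin_stiffness(robin_at):
    """P1 stiffness matrix on one subdomain with a Robin term 1/lam
    added at the interface node (index robin_at)."""
    K = (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    K[robin_at, robin_at] = 1.0 / h + 1.0 / lam
    return K

K1 = robin_stiffness(-1)   # domain 1: unknowns at x_1..x_m, interface is last node
K2 = robin_stiffness(0)    # domain 2: interface is first node, u(1) = 0 eliminated
b_int = np.full(m, h)      # consistent load vector for f = 1
g1 = g2 = 0.0              # initial Robin data on the interface
for i in range(100):
    b1 = b_int.copy(); b1[-1] = h / 2 + g1
    b2 = b_int.copy(); b2[0] = h / 2 + g2
    u1 = np.linalg.solve(K1, b1)
    u2 = np.linalg.solve(K2, b2)
    # variationally consistent interface fluxes u'(0.5) seen from each side
    flux1 = (u1[-1] - u1[-2]) / h - h / 2
    flux2 = (u2[1] - u2[0]) / h + h / 2
    # Robin updates: (1/lam) U^d + n_d u'^d  <-  (1/lam) U^dt - n_dt u'^dt
    g1 = u2[0] / lam + flux2
    g2 = u1[-1] / lam - flux1

# exact solution is u = x(1 - x)/2, so u(0.5) = 0.125
print(abs(u1[-1] - 0.125))
```

The two subdomain solves inside the loop are independent, which is the property the method exploits in parallel.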

ª © It is convenient to use the matrix form of (2.2). We let ϕdm , m = 1, · · · , nd be ~ n,d denote the finite element basis functions for the space Vh,d , d = 1, · · · , D. We let U i n,d d the vector of basis coefficients of Ui with respect to {ϕm }. On each domain Ωd , ¡ 0¢ ~ n,d = ~bd (f ) + ~bn,d An , U n,d , (ka,d + kn,d )U i i−1 where a,d

(k

)lk =

(a∇ϕd,l , ∇ϕdk )d

+

X µ1 ˜ 0 d∈d

λ

¶ hϕdl , ϕdk id∩d˜ −

hnd˜ · a∇ϕdl , ϕdk id∩d˜

,

(kn,d )lk = (An,d ∇ϕdl , ∇ϕdk )d , (~bd )k = (f, ϕdk )d , ¶ X µ 1 ­ n,d˜ ® n,d˜ n,d d n d ~ (b )k = U ,ϕ − hnd˜ · A ∇Ui−1 , ϕk id∩d˜ , λ i−1 k d∩d˜ 0 ˜ d∈d

for 1 ≤ l, k ≤ nd . We abuse notation mildly by denoting the dependence of the data ¡ 0¢ ~bn,d An , U n,d on the values of U n,d˜ for d˜ ∈ d0 . We summarize this approach in i−1 i−1 Alg. 1. Unfortunately, this algorithm is expensive for a large number of realizations since each solution U n requires the solution of a discrete set of equations. To construct a more efficient method, we impose a restriction on the domains in the decomposition. We assume that each domain Ωd is contained in a domain κ in the partition K used in the modeling assumption (1.2). This implies that the random perturbation An,d is

6

D. ESTEP, A. M˚ ALQVIST, AND S. TAVENER

Algorithm 1 Monte-Carlo Domain Decomposition Finite Element Method for n = 1, · · · , N (number of samples) do for i = 1, · · · , I (number of iterations) do for d = 1, · · · , D (number of domains) do 0 ~ n,d = ~bd (f ) + ~bn,d (An , U n,d ). Solve (ka,d + kn,d )U i i−1 end for end for end for constant on each Ωd , i.e. it is a random number. Consequently, the matrix kn,d has coefficients (kn,d )lk = (An,d ∇ϕdl , ∇ϕdk )d = An,d (∇ϕdl , ∇ϕdk )d = An,d (kd )lk , where kd is the standard stiffness matrix with coefficients (kd )lk = (∇ϕdl , ∇ϕdk )d . We now use the fact that An,d is relatively small to motivate the introduction of the Neumann series. Formally, the Neumann series for the inverse of a perturbation of the identity matrix provides ¡

ka,d + An,d kd

¢−1

¡ ¡ ¢¢−1 = ka,d id + An,d (ka,d )−1 kd ¡ ¢−1 a,d −1 = id + An,d (ka,d )−1 kd (k ) ∞ X ¡ ¢p = (−An,d )p (ka,d )−1 kd (ka,d )−1 , p=0

where id is the identity matrix. We compute only P terms in the Neumann expansion to generate the approximation, ~ n,d = U P,i

P−1 X

¡

¢ ¢ ¡ n,d0 ) . (−An,d )p ((ka,d )−1 kd )p (ka,d )−1 ~bd (f ) + ~bn,d (An , UP,i−1

(2.3)

p=0
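The computational gain in (2.3) is that the factorization of k^{a,d} serves every sample: each new 𝒜^{n,d} only reweights precomputed vectors. A small sketch with hypothetical matrices; we deliberately take the perturbation stiffness k = 0.5 k_a so that the series reduces to a plain geometric series with ratio 0.5|𝒜^{n,d}| and the truncation error is easy to anticipate.

```python
import numpy as np

# Truncated Neumann series (2.3): approximate (k_a + A_nd * k)^{-1} b by
# sum_{p=0}^{P-1} (-A_nd)^p (k_a^{-1} k)^p k_a^{-1} b, reusing the solves
# with k_a for every sample of the random number A_nd.
rng = np.random.default_rng(1)
m, P = 40, 5
k_a = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # SPD "stiffness"
k = 0.5 * k_a            # illustrative perturbation stiffness k^d
b = np.ones(m)

# precompute (k_a^{-1} k)^p k_a^{-1} b once -- independent of the samples
terms = [np.linalg.solve(k_a, b)]
for p in range(1, P):
    terms.append(np.linalg.solve(k_a, k @ terms[-1]))

worst = 0.0
for A_nd in rng.uniform(-0.1, 0.1, size=20):   # samples of A^{n,d}
    u_P = sum((-A_nd) ** p * terms[p] for p in range(P))
    u_exact = np.linalg.solve(k_a + A_nd * k, b)
    worst = max(worst, np.linalg.norm(u_P - u_exact) / np.linalg.norm(u_exact))
print(worst)
```

With |𝒜^{n,d}| ≤ 0.1 and this choice of k, the relative remainder after P = 5 terms is on the order of 0.05^5, mirroring the geometric decay established in Theorem 2.1 below in the general case.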

We discuss the convergence as P → ∞ in detail below. Note that the n-dependent part of b⃗^{n,d} is nonzero only at boundary nodes. If W_{h,d} denotes the set of vectors determined by the finite element basis functions associated with the boundary nodes on Ω_d, then b⃗^{n,d} is in the span of W_{h,d}. We can pre-compute ((k^{a,d})^{-1} k^d)^p (k^{a,d})^{-1} W_{h,d} efficiently, e.g., using a PLU factorization. This computation is independent of n. Combining this with Alg. 1, we obtain the computational method given in Alg. 2. We let U_{P,I}^{n,d} denote the finite element functions determined by U⃗_{P,I}^{n,d} for n = 1, ..., N and d = 1, ..., D. We let U_{P,I}^n denote the finite element function that is equal to U_{P,I}^{n,d} on Ω_d.

Remark 2.1. Note that the number of linear systems that have to be solved in Alg. 2 is independent of N. Hence, there is potential for enormous savings when the number of samples is large.

2.2. Convergence of the Neumann series approximation. It is crucial for the method that the Neumann series converges. The following theorem shows

Algorithm 2: Monte Carlo Domain Decomposition Finite Element Method using a Truncated Neumann Series
for d = 1, ..., D (number of domains) do
  for p = 1, ..., P (number of terms) do
    Compute y⃗ = ((k^{a,d})^{-1} k^d)^p (k^{a,d})^{-1} b⃗^d(f)
    Compute y_p = ((k^{a,d})^{-1} k^d)^p (k^{a,d})^{-1} W_{h,d}
  end for
end for
for i = 1, ..., I (number of iterations) do
  for d = 1, ..., D (number of domains) do
    for p = 1, ..., P (number of terms) do
      for n = 1, ..., N (number of samples) do
        Compute U⃗_{P,i}^{n,d} = Σ_{p=0}^{P−1} (−𝒜^{n,d})^p y_p b⃗^{n,d}(A^n, U_{P,i−1}^{n,d'}) + y⃗
      end for
    end for
  end for
end for

convergence under the reasonable assumption that the random perturbations 𝒜^{n,d} to the diffusion coefficient are smaller than the coefficient. We let |||v|||_d^2 = ‖∇v‖_d^2 + ε|v|_d^2 for some ε > 0. We define the matrices c^{n,d} = −𝒜^{n,d} (k^{a,d})^{-1} k^d and denote the corresponding operators on the finite element spaces by c^{n,d} : V_{h,d} → V_{h,d}.

Theorem 2.1. If η = |max{𝒜^{n,d}}/a_0| < 1, then

(a)
    (id − c^{n,d})^{-1} = Σ_{p=0}^∞ (c^{n,d})^p,        (2.4)

(b)
    ||| (id − c^{n,d})^{-1} v − Σ_{p=0}^{P−1} (c^{n,d})^p v |||_d ≤ (η^P / (1 − η^P)) ||| Σ_{p=0}^{P−1} (c^{n,d})^p v |||_d,        (2.5)

for any v ∈ V_{h,d}.

Proof. Let z = c^{n,d} w for an arbitrary w ∈ V_{h,d}. From the definition of c^{n,d}, z ∈ V_{h,d} satisfies

    (a∇z, ∇v)_d + (1/λ) ⟨z, v⟩_d = −𝒜^{n,d} (∇w, ∇v)_d,        (2.6)

for all v ∈ V_{h,d}. Choosing v = z in (2.6) and using the Cauchy-Schwarz inequality yields

    ‖∇z‖_d^2 + (1/(a_0 λ)) |z|_d^2 ≤ η ‖∇w‖_d ‖∇z‖_d.

Choosing ε < 2/(λa_0) in the definition of the norm |||v|||_d^2 = ‖∇v‖_d^2 + ε|v|_d^2 and making standard estimates gives |||z|||_d^2 ≤ η^2 ‖∇w‖_d^2 ≤ η^2 |||w|||_d^2. By induction,

    ||| (c^{n,d})^p w |||_d ≤ η^p |||w|||_d.        (2.7)

In particular, (c^{n,d})^p → 0 as p → ∞. We take the limit as P tends to infinity in the identity

    (id − c^{n,d}) Σ_{p=0}^{P−1} (c^{n,d})^p = id − (c^{n,d})^P

to obtain (2.4). In order to prove (b), we note that

    (id − c^{n,d})^{-1} − Σ_{p=0}^{P−1} (c^{n,d})^p = Σ_{p=P}^∞ (c^{n,d})^p = (c^{n,d})^P (id − c^{n,d})^{-1}.

In the finite element context, for v ∈ V_{h,d},

    ||| (id − c^{n,d})^{-1} v − Σ_{p=0}^{P−1} (c^{n,d})^p v |||_d = ||| (c^{n,d})^P (id − c^{n,d})^{-1} v |||_d ≤ η^P ||| (id − c^{n,d})^{-1} v |||_d
        ≤ η^P ||| Σ_{p=0}^{P−1} (c^{n,d})^p v |||_d + η^P ||| (id − c^{n,d})^{-1} v − Σ_{p=0}^{P−1} (c^{n,d})^p v |||_d.

The theorem follows immediately.

2.3. A numerical example. We present a numerical example that illustrates the convergence properties of the proposed method. In this example, we partition the unit square into 9 × 9 equal square subdomains for the domain decomposition algorithm. The coefficient A^n = a + 𝒜^n, where a and 𝒜^n are piecewise constant on the 9 × 9 subdomains. The diffusion coefficient a is equal to 1 except on a cross in the center of the domain Ω, where it is equal to 0.1. The random perturbations are uniform on the interval determined by ±10% of the value of a. We illustrate a typical sample in Fig. 1.1. The data is f ≡ 1. We compute the error in the average of the solution by choosing ψ ≡ 1.

2.3.1. Convergence for a single realization. First we consider a single sample A^n in order to focus on the convergence with respect to mesh size, number of iterations, and number of terms. We study how a linear functional of the solution (U_{P,I}^{1,d}, ψ) depends on the three parameters h, P, and I. To compute approximate errors, we use a reference solution with h = 1/72, I = 300, and P = 5. We start by letting I = 300 and P = 5 and let the number of elements in each direction (1/h) vary from 18 to 72, i.e., the total number of nodes in the mesh varies from (18 + 1)^2 = 361 to (72 + 1)^2 = 5329. In Fig. 2.1, we plot an approximate relative error as the mesh size decreases. Next we fix 1/h = 72, keep P = 5, and vary I from 10 to 300. In Fig. 2.1, we plot the relative error as the number of iterations increases. Finally we let 1/h = 72, I = 300, and let P vary over 1, 3, and 5. Here, the reference solution is computed using P = 7. We plot the results in Fig. 2.1.

2.3.2. Convergence with respect to number of samples. We now fix h = 1/72, I = 300, and P = 5, vary N from 30 to 480, and compute the distribution function F̃_N(t). We present the result in Fig. 2.2. We observe that the distribution function becomes smoother as the number of samples increases and appears to converge.
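The density-estimation step that produces F̃_N(t) from the computed sample values can be sketched directly. The normal samples below are hypothetical stand-ins for functionals of PDE solutions; the location and scale are chosen only to mimic the range of values seen in Fig. 2.2.

```python
import numpy as np

# Sample distribution function F_N(t) = (1/N) #{n : Q^n <= t} built from
# N computed sample values Q^n of the quantity of interest.
rng = np.random.default_rng(2)

def empirical_cdf(samples):
    """Return the function t -> F_N(t) for the given 1D samples."""
    s = np.sort(np.asarray(samples, dtype=float))
    return lambda t: np.searchsorted(s, t, side="right") / len(s)

for N in (30, 60, 120, 480):       # sample sizes used in Fig. 2.2
    Q = rng.normal(loc=0.045, scale=0.001, size=N)   # illustrative samples
    F = empirical_cdf(Q)
    assert 0.0 <= F(0.045) <= 1.0  # F_N is a CDF with values in [0, 1]
```

As N grows, F̃_N becomes smoother and settles down, which is the qualitative behavior reported for Fig. 2.2; the slow O(N^{-1/2}) Monte Carlo rate is visible in the error plots of Fig. 2.3.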

Fig. 2.1. Left: Convergence in mesh size as numbers of domain decomposition iterations and terms in the Neumann series are held constant. Middle: Convergence with respect to number of domain decomposition iterations as the mesh size and number of terms in the Neumann series are held constant. Right: Convergence with respect to number of terms in the Neumann series as the mesh size and number of domain decomposition iterations are held constant.

Fig. 2.2. Convergence in number of samples as the mesh, the number of domain decomposition iterations, and the number of terms in the Neumann series are held constant. Plots from left to right, top to bottom, N = 30, 60, 120, 480

To approximate the error as the number of samples increases, we compute a reference solution using N = 480. We plot the errors in Fig. 2.3. The error decreases significantly between N = 30 and N = 240, but the convergence is fairly slow.

2.4. A posteriori error analysis of sample values. We next derive an a posteriori error estimate for each sample linear functional value (U^n, ψ). We introduce a corresponding adjoint problem,

    -∇ · (A∇Φ) = ψ,  x ∈ Ω,
    Φ = 0,           x ∈ ∂Ω.        (2.8)

We compute N sample solutions {Φ^n, n = 1, ..., N} of (2.8) corresponding to the samples {A^n, n = 1, ..., N}. To obtain computable estimates, we compute numerical solutions of (2.8) using Alg. 2. We obtain numerical approximations {Φ_{P̃,Ĩ}^{n,d}, d = 1, ..., D} using a more

Fig. 2.3. Error in distribution function compared to a reference solution as the mesh, the number of iterations, and the number of terms are held constant. Plots from left to right, top to bottom, N = 30, 60, 120, 240

accurate finite element discretization computed using either the space of continuous, piecewise quadratic functions V_h^2 or a refinement T_h̃ of T_h with h̃ ≪ h. We denote the approximation on Ω by Φ_{P̃,Ĩ}^n.

Theorem 2.2. For each n ∈ 1, ..., N,

    |(U^n − U_{P,I}^n, ψ)| ≲ |(f, Φ_{P̃,Ĩ}^n) − (A^n ∇U_{P,I}^n, ∇Φ_{P̃,Ĩ}^n)| + R(h̃, P̃, Ĩ),        (2.9)

where

    R(h̃, P̃, Ĩ) = O(h^3) if V_h̃ = V_h^2,  and  R(h̃, P̃, Ĩ) = O(h̃^2) if T_h̃ is a refinement of T_h.

Proof. With Φ solving (2.8), the standard Green's function argument yields the representation

    (U^n − U_{P,I}^n, ψ) = (f, Φ^n) − (A^n ∇U_{P,I}^n, ∇Φ^n).

We write this as

    (U^n − U_{P,I}^n, ψ) = (f, Φ_{P̃,Ĩ}^n) − (A^n ∇U_{P,I}^n, ∇Φ_{P̃,Ĩ}^n)
        + ( (f, Φ^n − Φ_{P̃,Ĩ}^n) − (A^n ∇U_{P,I}^n, ∇(Φ^n − Φ_{P̃,Ĩ}^n)) )

and define

    R(h̃, P̃, Ĩ) = (f, Φ^n − Φ_{P̃,Ĩ}^n) − (A^n ∇U_{P,I}^n, ∇(Φ^n − Φ_{P̃,Ĩ}^n)).        (2.10)

We introduce auxiliary adjoint problems for the purpose of analysis. Let Υ^n ∈ V solve

    (A^n ∇Υ^n, ∇v) = (f, v) − (A^n ∇U_{P,I}^n, ∇v),  all v ∈ V,        (2.11)

corresponding to the quantity of interest (f, Φ^n) − (A^n ∇U_{P,I}^n, ∇Φ^n). The standard Green's function argument yields

    (f, Φ^n − Φ_{P̃,Ĩ}^n) − (A^n ∇U_{P,I}^n, ∇(Φ^n − Φ_{P̃,Ĩ}^n)) = (Υ^n, ψ) − (A^n ∇Υ^n, ∇Φ_{P̃,Ĩ}^n).

Using Φ_{∞,Ĩ}^n to denote the approximate solution obtained by using the full Neumann series, i.e., by solving the problem with the full diffusion coefficient, we have

    (f, Φ^n − Φ_{P̃,Ĩ}^n) − (A^n ∇U_{P,I}^n, ∇(Φ^n − Φ_{P̃,Ĩ}^n))
        = (Υ^n, ψ) − (A^n ∇Υ^n, ∇Φ_{∞,Ĩ}^n) + (A^n ∇Υ^n, ∇(Φ_{P̃,Ĩ}^n − Φ_{∞,Ĩ}^n)).        (2.12)

By Theorem 2.1, the last term on the right-hand side can be made arbitrarily small by taking P large. We can use Galerkin orthogonality on the first two terms on the right-hand side of (2.12) by introducing a projection π_h̃ into V_h̃. Decomposing into a sum of integrals over elements and integrating by parts on each element, we have

    (Υ^n, ψ) − (A^n ∇Υ^n, ∇Φ_{∞,Ĩ}^n) = (Υ^n − π_h̃ Υ^n, ψ) − (A^n ∇(Υ^n − π_h̃ Υ^n), ∇Φ_{∞,Ĩ}^n)
        = Σ_{τ∈T_h̃} ( (Υ^n − π_h̃ Υ^n, ψ)_τ + (Υ^n − π_h̃ Υ^n, ∇ · (A^n ∇Φ_{∞,Ĩ}^n))_τ + ⟨Υ^n − π_h̃ Υ^n, A^n ∂_n Φ_{∞,Ĩ}^n⟩_{∂τ} ),

where ∂_n denotes the normal derivative on ∂τ. The standard argument involving interpolation estimates now yields the bounds on (2.10).

2.4.1. Numerical example. We present a brief example illustrating the accuracy of the a posteriori estimate (2.9). We consider just one sample diffusion value A^1 = a + 𝒜^1 with a = 0.9 and 𝒜^1 = 0.1 on the unit square. We compute the error in the average value by choosing ψ = 1. We set f = 2x(1 − x) + 2y(1 − y), so that the exact solution is U^1 = x(1 − x)y(1 − y) and (U^1, ψ) = 1/36.

We divide the computational domain into 8 × 8 equally sized squares on which we compute the numerical approximation to U^1 using Lions' domain decomposition algorithm with I iterations, using the approximate local solver involving a truncated Neumann series of P terms. We let h = 1/32 so that each subdomain has a 5 × 5 node discretization. To solve the adjoint problem, we refine the mesh to obtain h̃ = h/2 and use γP terms in the truncated Neumann series and γI iterations in the domain decomposition algorithm, where γ > 0. To evaluate the accuracy of the estimate, we use the efficiency index

    η_{P,I}^γ = |(f, Φ_{γP,γI}^1) − (A^1 ∇U_{P,I}^1, ∇Φ_{γP,γI}^1)| / |(U^1 − U_{P,I}^1, ψ)|.

We start by letting γ = 2, i.e., we put a lot of effort into solving the adjoint problem. We present results for varying I and P in Table 2.1.

Table 2.1: Efficiency index results for γ = 2

    P   I    Ratio        P   I    Ratio
    1   50   0.992        3   10   2.78
    2   50   1.03         3   25   1.02
    3   50   1.01         3   50   1.01

Table 2.2: Efficiency index results for γ = 0.5

    P   I    Ratio        P   I    Ratio
    1   50   0.908        3   10   8.21
    2   50   1.00         3   25   1.08
    3   50   0.933        3   50   0.933
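The mechanism behind these efficiency indices is visible already at the linear algebra level: for A u = b with quantity of interest (u, ψ), the adjoint solution φ with A^T φ = ψ gives the exact error representation (u − u_h, ψ) = (b − A u_h, φ), so the index equals one up to the accuracy of the computed adjoint. A sketch with purely illustrative matrices and data:

```python
import numpy as np

# Adjoint-based error estimation for a linear system A u = b with
# quantity of interest (u, psi): (u - u_h, psi) = (b - A u_h, phi),
# where A^T phi = psi.  All matrices and vectors here are hypothetical.
rng = np.random.default_rng(3)
m = 30
A = np.eye(m) + 0.01 * rng.standard_normal((m, m))   # well-conditioned
b = rng.standard_normal(m)
psi = np.ones(m)                                     # "average value" functional

u = np.linalg.solve(A, b)
u_h = u + 1e-3 * rng.standard_normal(m)   # stand-in for a numerical solution
phi = np.linalg.solve(A.T, psi)           # adjoint solved accurately

true_err = (u - u_h) @ psi
estimate = (b - A @ u_h) @ phi
print(estimate / true_err)                # efficiency index, close to 1
```

Replacing `phi` by a crude approximation corresponds to small γ in Tables 2.1 and 2.2, and moves the index away from one in the same way.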

Next we let γ = 0.5, i.e., we use much poorer resolution for the adjoint solution. We present the results in Table 2.2. The efficiency indexes are close to one except when the number of domain decomposition iterations for the adjoint problem is very low. In general, it appears that as long as the number of domain decomposition iterations is sufficiently large, the adjoint problem can be solved with rather poor resolution, yet we still obtain a reasonably accurate error estimate.

3. The case of random perturbation in data. For the sake of completeness, we treat the case in which the data G in (1.1) is randomly perturbed. It is straightforward to combine the cases of randomly perturbed diffusion coefficient and data. We present a fast method for computing samples of a linear functional of the solution given samples of the right-hand side data. It is straightforward to deal with a more general elliptic operator, so we let U ∈ H^1_0(Ω) (a.s.) solve

    a(U, v) = (G(x), v),  v ∈ H^1_0(Ω),        (3.1)

where

    a(w, v) = (a∇w, ∇v) + (b · ∇w, v) + (cw, v)

for w, v ∈ H^1_0(Ω), G(x) ∈ L2(Ω) (a.s.), G(·) has a continuous and bounded covariance function, and a, b, c are deterministic functions chosen such that equation (3.1) has a unique weak solution in H^1_0(Ω). In particular, a(x) ≥ a_0 > 0 for all x. We let {G^n(x), n = 1, ..., N} denote a finite collection of samples.

3.1. Computational method. In the case of a randomly perturbed right-hand side and data, we can use the method of Green's functions to construct an efficient method for density estimation. We introduce a deterministic adjoint problem. We let the quantity of interest be a linear functional Q(v) = (v, ψ) determined by a function ψ ∈ L2(Ω) and construct the corresponding adjoint problem for the generalized Green's function φ ∈ H^1_0(Ω),

    a*(φ, v) = (ψ, v),  v ∈ H^1_0(Ω),        (3.2)

where

    a*(w, v) = (a∇w, ∇v) − (∇ · (bw), v) + (cw, v).

It immediately follows that

    (U^n, ψ) = a*(φ, U^n) = a(U^n, φ) = (G^n, φ),

for n = 1, ..., N. By linearity, we see that E(U) ∈ H^1_0(Ω) solves

    a(E(U), v) = (E(G), v),  v ∈ H^1_0(Ω),

and we can obtain an analogous representation. We conclude that the classic Green's representation holds.

Theorem 3.1. For samples {G^n, n = 1, ..., N}, we have

    (U^n, ψ) = (G^n, φ),        (3.3)

for n = 1, ..., N. We also have

    E((U, ψ)) = (E(G), φ).        (3.4)

The point is that, theoretically, instead of solving a partial differential equation for each sample in order to build the distribution of (U^n, ψ), we can solve one deterministic problem to get φ and then calculate the values of (U^n, ψ) using relatively inexpensive inner products. Indeed, we never approximate U^n in order to estimate the samples of the quantity of interest in this approach. In order to make this approach practical, we introduce a finite element approximation φ_h ∈ V_h satisfying

    a*(φ_h, v) = (ψ, v),  v ∈ V_h.        (3.5)

We obtain the computable approximations

    (U^n, ψ) ≈ (φ_h, G^n),        (3.6)
    E((U, ψ)) ≈ (E(G), φ_h).        (3.7)
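In discrete form, the representation (3.3) says the expensive route (one solve per sample) and the cheap route (one adjoint solve plus an inner product per sample) give identical sample values. A sketch with hypothetical matrices and data:

```python
import numpy as np

# Green's function shortcut (3.3): with A^T phi = psi solved once,
# each sample value (U^n, psi) = (G^n, phi) is just an inner product.
# The matrix A and the right-hand-side samples G^n are illustrative.
rng = np.random.default_rng(4)
m, N = 50, 200
A = np.eye(m) + 0.01 * rng.standard_normal((m, m))
psi = rng.standard_normal(m)

phi = np.linalg.solve(A.T, psi)   # one deterministic adjoint solve
G = rng.standard_normal((N, m))   # N samples of the random data

q_fast = G @ phi                  # (G^n, phi) for all n at once
q_slow = np.array([np.linalg.solve(A, g) @ psi for g in G])  # N forward solves
print(np.max(np.abs(q_fast - q_slow)))
```

The cost of the fast route grows with N only through inner products, which is why the approach scales to very large sample counts.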

3.2. A posteriori error estimate for samples. We next present an a posteriori error analysis for the approximate value of each sample (U^n, ψ) and for E((U, ψ)). For each sample, the error satisfies

    (G^n, φ_h) − (U^n, ψ) = (G^n, φ_h) − (G^n, φ) = (G^n, φ_h − φ).

For each sample, this is a quantity of interest for φ corresponding to the data G^n. To avoid confusion, we let Θ^n ∈ H^1_0(Ω) denote the forward adjoint solution solving

    a(Θ^n, v) = (G^n, v),  v ∈ H^1_0(Ω).        (3.8)

Note that because we treat a linear problem, Θ^n = U^n. Similarly, we let E(Θ) ∈ H^1_0(Ω) solve

    a(E(Θ), v) = (E(G), v),  v ∈ H^1_0(Ω).        (3.9)

The standard analysis gives

    (G^n, φ_h − φ) = a(Θ^n, φ_h − φ) = a*(φ_h − φ, Θ^n) = a*(φ_h, Θ^n) − a*(φ, Θ^n)
        = a*(φ_h, Θ^n) − (ψ, Θ^n) = a*(φ_h, (I − π_h)Θ^n) − (ψ, (I − π_h)Θ^n),


D. ESTEP, A. MÅLQVIST, AND S. TAVENER

where the last step follows from Galerkin orthogonality. We can argue similarly for E((G^n, φ_h − φ)). To use this representation, we need to solve the forward adjoint problem (3.8) in V_h̃. Unfortunately, in the case of the error of the samples, this requires computing N approximate forward adjoint solutions Θ^n using a more expensive finite element computation. Another approach is simply to approximate (G^n, φ) by (G^n, φ_h̃), where φ_h̃ is a finite element approximation of φ in V_h̃. An analysis of the accuracy of this replacement follows easily from the relation

    (G^n, φ_h − φ) = (G^n, φ_h − φ_h̃) + (G^n, φ_h̃ − φ).

In either case, arguing as for Theorem 2.2 yields

Theorem 3.2. The solution error in each sample is estimated by

    (G^n, φ_h − φ) ≈ a^*(φ_h, (I − π_h)Θ^n) − (ψ, (I − π_h)Θ^n),          (3.10)

where Θ^n is a finite element approximation of the solution of the adjoint problem (3.8) in V_h̃ and π_h denotes a projection into the finite element space V_h. We also have

    (G^n, φ_h − φ) ≈ (G^n, φ_h − φ_h̃),                                   (3.11)

where φ_h̃ is a finite element solution of the adjoint problem (3.2) computed in V_h̃. We also have the estimates

    E((G^n, φ_h − φ)) ≈ a^*(φ_h, (I − π_h)E(Θ)) − (ψ, (I − π_h)E(Θ)),     (3.12)

where E(Θ) is a finite element approximation for the adjoint problem (3.9) computed on a finer mesh T_h̃ or using V_h^2, and

    E((G, φ_h − φ)) ≈ (E(G), φ_h − φ_h̃).                                 (3.13)

The error of these estimates is bounded by Ch^3 or Ch̃^2. In both cases, these bounds are asymptotically smaller than the estimates themselves.

4. A posteriori error analysis for an approximate distribution. We now present an a posteriori error analysis for the approximate cumulative distribution function obtained from N approximate sample values of a linear functional of a solution of a partial differential equation with random perturbations. We let U = U(X) be a solution of an elliptic problem that is randomly perturbed by a random variable X on a probability space (Ω, B, P) and Q(U) = (U, ψ) be a quantity of interest, for some ψ ∈ L^2(Ω). We want to approximate the probability distribution function of Q = Q(X),

    F(t) = P({X : Q(U(X)) ≤ t}) = P(Q ≤ t).

We use the sample distribution function computed from a finite collection of approximate sample values {Q̃^n, n = 1, ..., N} = {(Ũ^n, ψ), n = 1, ..., N},

    F̃_N(t) = (1/N) Σ_{n=1}^N I(Q̃^n ≤ t),

where I is the indicator function. Here, Ũ^n is a numerical approximation of the true solution U^n corresponding to a sample X^n. We assume that there is an error estimate

    Q̃^n − Q^n ≈ E^n,


with Q^n = (U^n, ψ); we use Theorem 2.2 or Theorem 3.2, for example. There are two sources of error:
1. finite sampling,
2. numerical approximation of the differential equation solutions.
We define the sample distribution function

    F_N(t) = (1/N) Σ_{n=1}^N I(Q^n ≤ t),

and decompose the error

    |F(t) − F̃_N(t)| ≤ |F(t) − F_N(t)| + |F_N(t) − F̃_N(t)| = I + II.     (4.1)

There is an extensive statistics literature treating I, e.g. see [13]. We note that F_N has very desirable properties, e.g.
• As a function of t, F_N(t) is a distribution function, and for each fixed t, F_N(t) is a random variable corresponding to the sample;
• It is an unbiased estimator, i.e. E(F_N(t)) = F(t);
• N F_N(t) has an exact binomial distribution for N trials with probability of success F(t);
• Var(F_N(t)) = F(t)(1 − F(t))/N → 0 as N → ∞, and F_N converges in mean square to F as N → ∞.

The approximation properties of F_N(t) can be studied in various ways, all of which have the flavor of bounding the error with high probability in the limit of large N. One useful measure is the Kolmogorov–Smirnov distance

    sup_{t∈R} |F_N(t) − F(t)|.

A result that is useful for being uniform in t is that there is a constant C > 0 such that

    P( sup_{t∈R} |F_N(t) − F(t)| > ε ) ≤ C e^{−2ε²N},  all ε > 0,

see [13]. We rewrite this as: for any ε > 0,

    sup_{t∈R} |F_N(t) − F(t)| ≤ (log(ε^{−1})/(2N))^{1/2}                  (4.2)

with probability greater than 1 − ε. Another standard measure is the mean square error (MSE),

    MSE(Θ̃) = E((Θ̃ − Θ)²),

where Θ̃ is an estimator for Θ. We define

    X_n(t) = 1 if Q^n ≤ t and 0 otherwise,  X(t) = 1 if Q ≤ t and 0 otherwise.

We have

    F_N(t) = (1/N) Σ_{n=1}^N X_n(t),  F(t) = E(X(t)).

For all t,

    MSE( (1/N) Σ_{n=1}^N X_n(t) ) = σ²/N,  σ² = F(t)(1 − F(t)).           (4.3)
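The identity (4.3) is simply the variance of a mean of i.i.d. Bernoulli variables and can be checked by simulation; the uniform distribution for Q and the point t below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N, t, reps = 200, 0.5, 20000

# True F(t) for Q ~ Uniform(0,1) is t, so sigma^2 = t*(1 - t).
F_t = t
sigma2 = F_t * (1.0 - F_t)

# Empirical MSE of F_N(t) over many independent replications.
Q = rng.uniform(size=(reps, N))
F_N = (Q <= t).mean(axis=1)
mse = ((F_N - F_t) ** 2).mean()

print(mse, sigma2 / N)  # the two values should nearly agree
```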

We can also estimate the (unknown) variance by defining

    S_N² = (1/N) Σ_{n=1}^N ( X_n(t) − (1/N) Σ_{m=1}^N X_m(t) )².

S_N² is a computable estimator for σ² and

    MSE(S_N²) = ((2N + 1)/N²) σ⁴.                                          (4.4)

Another useful result follows from the observation that the {X_n} are i.i.d. Bernoulli variables. The Chebyshev inequality implies that for ε > 0,

    P( |E(X(t)) − (1/N) Σ_{n=1}^N X_n(t)| ≤ (F(t)(1 − F(t))/(Nε))^{1/2} ) > 1 − ε.   (4.5)

To obtain a computable estimate, we note that

    F(t)(1 − F(t)) = F_N(t)(1 − F_N(t)) + (F(t) − F_N(t))(1 − F(t) − F_N(t)).

Therefore, using (4.5) along with the fact that 0 ≤ F, F_N ≤ 1,

    (F(t)(1 − F(t))/(Nε))^{1/2} ≤ (F_N(t)(1 − F_N(t))/(Nε))^{1/2}
        + (|F(t) − F_N(t)| |1 − F(t) − F_N(t)|/(Nε))^{1/2}
        ≤ (F_N(t)(1 − F_N(t))/(Nε))^{1/2} + 1/(2Nε).

We conclude

    P( |E(X(t)) − (1/N) Σ_{n=1}^N X_n(t)| ≤ (F_N(t)(1 − F_N(t))/(Nε))^{1/2} + 1/(2Nε) ) > 1 − ε.   (4.6)
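A minimal sketch of the computable band (4.6): draw N samples from a known distribution, form F_N, and compare the band against the exact distribution function. The standard normal samples and the test points t are illustrative assumptions.

```python
import math
import numpy as np

rng = np.random.default_rng(2)
N, eps = 2000, 0.05

Q = rng.standard_normal(N)                                  # samples of Q
F = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))    # exact c.d.f.

covered = []
for t in (-1.0, 0.0, 0.5, 1.5):
    F_N = np.mean(Q <= t)
    # Computable Chebyshev band from (4.6).
    band = math.sqrt(F_N * (1.0 - F_N) / (N * eps)) + 1.0 / (2.0 * N * eps)
    # The band holds at each fixed t with probability at least 1 - eps.
    covered.append(abs(F(t) - F_N) <= band)

print(covered)
```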

Next, we consider II in (4.1). Since Q̃^n = Q^n + E^n,

    II = | (1/N) Σ_{n=1}^N ( I(Q̃^n ≤ t) − I(Q^n ≤ t) ) |
       = | (1/N) Σ_{n=1}^N ( I(Q^n + E^n ≤ t) − I(Q^n ≤ t) ) |
       = | (1/N) Σ_{E^n ≤ 0} I(Q^n − |E^n| ≤ t ≤ Q^n) − (1/N) Σ_{E^n > 0} I(Q^n ≤ t ≤ Q^n + |E^n|) |.

We estimate

    II ≤ (1/N) Σ_{n=1}^N I(Q^n − |E^n| ≤ t ≤ Q^n + |E^n|).                (4.7)

If instead we expand using Q^n = Q̃^n − E^n, we obtain the computable estimate

    |F_N(t) − F̃_N(t)| ≤ (1/N) Σ_{n=1}^N I(Q̃^n − |E^n| ≤ t ≤ Q̃^n + |E^n|).   (4.8)

Setting E = max_n |E^n| in (4.7), we obtain

    |F_N(t) − F̃_N(t)| ≤ |F_N(t + E) − F_N(t − E)|.                       (4.9)

Now

    |F_N(t + E) − F_N(t − E)| ≤ |F(t + E) − F(t − E)|
        + |F(t + E) − F_N(t + E)| + |F(t − E) − F_N(t − E)|.

Using (4.2), for any ε > 0,

    |F(t ± E) − F_N(t ± E)| ≤ (log(ε^{−1})/(2N))^{1/2}

with probability greater than 1 − ε. Therefore, for any ε > 0,

    |F_N(t) − F̃_N(t)| ≤ |F(t + E) − F(t − E)| + 2 (log(ε^{−1})/(2N))^{1/2}   (4.10)

with probability greater than 1 − ε. Note that

    F_N(t)(1 − F_N(t)) = F̃_N(t)(1 − F̃_N(t)) + (F_N(t) − F̃_N(t))(1 − F_N(t) − F̃_N(t)).

We can bound the second expression on the right-hand side using (4.9) or (4.10) to obtain a computable estimator for the variance of F. We summarize the most useful results.

Theorem 4.1. For any ε > 0,

    |F(t) − F̃_N(t)| ≤ ( F̃_N(t)(1 − F̃_N(t))/(Nε) )^{1/2}
        + 2 (1/N) Σ_{n=1}^N I(Q̃^n − |E^n| ≤ t ≤ Q̃^n + |E^n|) + 1/(2Nε)   (4.11)

with probability greater than 1 − ε. With L denoting the Lipschitz constant of F, for any ε > 0,

    |F(t) − F̃_N(t)| ≤ ( F(t)(1 − F(t))/(Nε) )^{1/2} + L max_{1≤n≤N} |E^n| + 2 (log(ε^{−1})/(2N))^{1/2}   (4.12)

with probability greater than 1 − ε.

Remark 4.1. The leading order bounding terms in the a posteriori bound (4.11) are computable, while the remainder tends to zero more rapidly in the limit of large N. The bound (4.12) is useful for the design of adaptive algorithms, among other


things. Assuming that the solutions of the elliptic problems are in H², it indicates that the error in the computed distribution function is bounded by an expression that to leading order is proportional to

    1/(εN)^{1/2} + Lh²

with probability 1 − ε. This suggests that, in order to balance the error arising from finite sampling against the error in each computed sample, we typically should choose N ∼ h^{−4}. This presents a compelling argument for seeking efficient ways to compute samples and control their accuracy.
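The balancing heuristic can be checked arithmetically: with N = h^{−4}, the statistical term 1/(εN)^{1/2} and the deterministic term Lh² decrease at the same rate under mesh refinement, so their ratio is constant. The values of L and ε below are illustrative.

```python
import math

L, eps = 1.0, 0.05
ratios = []
for h in (0.1, 0.05, 0.025):
    N = round(h ** -4)                 # balance sampling against discretization
    sampling = 1.0 / math.sqrt(eps * N)
    discretization = L * h ** 2
    print(h, N, sampling, discretization)
    ratios.append(sampling / discretization)   # constant = 1/(L*sqrt(eps))
```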

Remark 4.2. The expression

    (1/N) Σ_{n=1}^N I(Q̃^n − |E^n| ≤ t ≤ Q̃^n + |E^n|)

is itself an expected value. If M < N and N′ = {n_1 < ... < n_M} is a set of integers chosen at random from {1, ..., N}, we can use the unbiased estimator

    (1/M) Σ_{n∈N′} I(Q̃^n − |E^n| ≤ t ≤ Q̃^n + |E^n|),                     (4.13)

which has an error that decreases as O(1/M^{1/2}). This is reasonable when N is large, since we are likely to require less accuracy in the error estimate than in the primary quantity of interest.
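A quick check of the subsampling idea, using an illustrative model for the values Q̃^n and the error estimates E^n: the M-term estimator (4.13) tracks the full N-term average at a fraction of the cost.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, t = 10000, 1000, 1.0

Q_tilde = rng.exponential(size=N)        # approximate sample values (illustrative)
E = rng.uniform(-0.05, 0.05, size=N)     # error estimates E^n (illustrative)

hit = (Q_tilde - np.abs(E) <= t) & (t <= Q_tilde + np.abs(E))
full = hit.mean()                        # the expression in Remark 4.2

sub = rng.choice(N, size=M, replace=False)   # random subset N'
estimate = hit[sub].mean()               # unbiased estimator (4.13)

print(full, estimate)  # close, since the estimator's error is O(1/sqrt(M))
```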

Remark 4.3. A similar error analysis can be carried out for an arbitrary stochastic moment q with an unbiased estimator Q using N samples. We let X̃ be an approximation to X and decompose the error as

    |q(X) − Q(X̃)| ≤ |q(X) − Q(X)| + |Q(X) − Q(X̃)|.

The first term can be estimated using the Chebyshev inequality: for ε > 0,

    P( |q(X) − Q(X)| ≤ (Var(Q(X))/(εN))^{1/2} ) > 1 − ε.

Since the variance of Q(X) decreases as N increases, we obtain estimates for this term analogous to the expressions above. We can estimate the numerical error Q(X) − Q(X̃) by computing a solution on a finer mesh, Q(X̂) − Q(X̃). In the particular case that q(X) = E[X], we can compute this quantity very efficiently; see Sec. 3.


4.1. A numerical example. We illustrate the accuracy of the computable bound (4.11) using some simple experiments. We emphasize that (4.11) is a bound; in particular, we trade accuracy in estimating the size of the error for an increase in the probability that the bound is larger than the error. In this case, we desire that the degree of overestimation does not depend strongly on the discretization parameters. To carry out the test, we specify a true c.d.f. and sample N points at random from the distribution. To each sample value, we add an error drawn at random from another distribution. We use the Kaplan–Meier estimate in the Matlab statistics toolbox for the approximate c.d.f., then compute the difference with the true c.d.f. values at the sample points. We also compute the difference divided by the true c.d.f. values. The experiments we report include:

First Computation: Sample Distribution: Normal, mean 1, variance 2; Error Distribution: Uniform on [−δ, δ].
Second Computation: Sample Distribution: Exponential, parameter 1; Error Distribution: Uniform on [−δ, δ].
Third Computation: Sample Distribution: Exponential, parameter 1; Error Distribution: Uniform on [−δX, δX], X = sample value.

We obtained similar results for a variety of examples. In Fig. 4.1, we present three examples of approximate c.d.f. functions. In all cases, we bound the error with probability greater than 95%. In Fig. 4.2, we plot the

Fig. 4.1. Plots of approximate (solid line) and true (dashed lines) c.d.f. functions. Left: First Computation with N = 5000, δ = .001. Middle: Second Computation with N = 2000, δ = .0001. Right: Third Computation with N = 500, δ = .05.

bounds and the corresponding errors.

Fig. 4.2. Errors in approximate c.d.f. functions and corresponding bounds shown in Fig. 4.1. Left: First Computation with N = 5000, δ = .001. Middle: Second Computation with N = 2000, δ = .0001. Right: Third Computation with N = 500, δ = .05.
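The experiments above can be reproduced in outline with a short script; here we use Python rather than Matlab, a plain empirical c.d.f. rather than the Kaplan–Meier estimate, and the distributions of the Second Computation, with ε = 0.05 corresponding to the 95% bounds.

```python
import math
import numpy as np

rng = np.random.default_rng(4)
N, delta, eps = 2000, 1e-4, 0.05

Q = rng.exponential(scale=1.0, size=N)     # exact samples, parameter 1
E = rng.uniform(-delta, delta, size=N)     # added errors, Uniform[-delta, delta]
Q_tilde = Q + E                            # perturbed "computed" sample values

F = lambda t: 1.0 - math.exp(-t)           # true exponential c.d.f.

covered = []
for t in (0.5, 1.0, 2.0):
    F_tilde = np.mean(Q_tilde <= t)
    hit = np.mean((Q_tilde - delta <= t) & (t <= Q_tilde + delta))
    # Bound (4.11): statistical band + indicator term + remainder,
    # with delta serving as a uniform bound on |E^n|.
    bound = (math.sqrt(F_tilde * (1.0 - F_tilde) / (N * eps))
             + 2.0 * hit + 1.0 / (2.0 * N * eps))
    covered.append(abs(F(t) - F_tilde) <= bound)

print(covered)
```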

In Fig. 4.3, we plot the performance of the bound with respect to estimating the

size of the error. In all three cases, the bound is asymptotically around 5 times too large.

Fig. 4.3. Performance of the bound (4.11) for the three examples shown in Fig. 4.1. Left: Plot of the difference between the estimate and the bound versus the error. Right: Plot of the relative difference versus the error.

5. Adaptive error control. We now use Theorems 2.2, 3.2, and 4.1 to construct an adaptive error control algorithm. The computational parameters we wish to optimize are the mesh size h, the number of terms P in the truncated Neumann series, the number of iterations I in the domain decomposition algorithm, and the number of samples N. The first task is to express the error E as a sum of three terms corresponding respectively to discretization error, error from the incomplete domain decomposition iteration, and error from the truncated Neumann series. Considering the problem with randomly perturbed diffusion coefficient, we bound the leading expression in the error estimate (2.9) in one sample value as

    E^n = |(f, Φ^n_{P̃,Ĩ}) − (A^n ∇U^n_{P,I}, ∇Φ^n_{P̃,Ĩ})|
        ≤ |(f, Φ^n_{P̃,Ĩ}) − (A^n ∇U^n_{P,I}, ∇Φ^n_{P̃,Ĩ}) − (A^n ∇(U^n_{∞,∞} − U^n_{P,I}), ∇Φ^n_{P̃,Ĩ})|
        + |(A^n ∇(U^n_{P,∞} − U^n_{P,I}), ∇Φ^n_{P̃,Ĩ})|
        + |(A^n ∇(U^n_{∞,∞} − U^n_{P,∞}), ∇Φ^n_{P̃,Ĩ})|                  (5.1)
        = E^n_I + E^n_II + E^n_III,

where we use the obvious notation to denote the quantities obtained by taking I, P → ∞.

The goal is to estimate E^n_I, E^n_II, E^n_III in terms of computable quantities. To do this, we introduce positive integers ∆I and ∆P and use the approximation

    U^n_{P+∆P, I+∆I} ≈ U^n_{∞,∞}.

The accuracy of the estimates below improves as ∆I and ∆P increase. We have

    E^n_I ≈ |(f, Φ^n_{P̃,Ĩ}) − (A^n ∇U^n_{P,I}, ∇Φ^n_{P̃,Ĩ}) − (A^n ∇(U^n_{P+∆P,I+∆I} − U^n_{P,I}), ∇Φ^n_{P̃,Ĩ})|.   (5.2)

Likewise, we estimate

    E^n_II ≈ |(A^n ∇(U^n_{P,I+∆I} − U^n_{P,I}), ∇Φ^n_{P̃,Ĩ})|,            (5.3)
    E^n_III ≈ |(A^n ∇(U^n_{P+∆P,I} − U^n_{P,I}), ∇Φ^n_{P̃,Ĩ})|.           (5.4)


We can find other expressions for E^n_III by passing to the limit in (2.3) on each subdomain d to write

    U⃗^{n,d}_{∞,∞} = Σ_{p=0}^{∞} (−A^{n,d})^p ((k^{a,d})^{−1} k^d)^p (k^{a,d})^{−1} ( b⃗^d(f) + b⃗^{n,d}(A^n, U^{n,d′}_{∞,∞}) ),

while

    U⃗^{n,d}_{P,∞} = Σ_{p=0}^{P−1} (−A^{n,d})^p ((k^{a,d})^{−1} k^d)^p (k^{a,d})^{−1} ( b⃗^d(f) + b⃗^{n,d}(A^n, U^{n,d′}_{P,∞}) ).

Subtracting, we find

    U⃗^{n,d}_{∞,∞} − U⃗^{n,d}_{P,∞} = Σ_{p=P}^{∞} (−A^{n,d})^p ((k^{a,d})^{−1} k^d)^p (k^{a,d})^{−1} ( b⃗^d(f) + b⃗^{n,d}(A^n, U^{n,d′}_{∞,∞}) )
        + Σ_{p=0}^{P−1} (−A^{n,d})^p ((k^{a,d})^{−1} k^d)^p (k^{a,d})^{−1} ( b⃗^{n,d}(A^n, U^{n,d′}_{∞,∞}) − b⃗^{n,d}(A^n, U^{n,d′}_{P,∞}) ).

Summing the tail yields

    U⃗^{n,d}_{∞,∞} − U⃗^{n,d}_{P,∞} = (−A^{n,d})^P ((k^{a,d})^{−1} k^d)^P U⃗^{n,d}_{∞,∞}
        + Σ_{p=0}^{P−1} (−A^{n,d})^p ((k^{a,d})^{−1} k^d)^p (k^{a,d})^{−1} ( b⃗^{n,d}(A^n, U^{n,d′}_{∞,∞}) − b⃗^{n,d}(A^n, U^{n,d′}_{P,∞}) ).

Finally, approximating yields

    U⃗^{n,d}_{∞,∞} − U⃗^{n,d}_{P,∞} ≈ (−A^{n,d})^P ((k^{a,d})^{−1} k^d)^P U⃗^{n,d}_{P,I}
        + Σ_{p=0}^{P−1} (−A^{n,d})^p ((k^{a,d})^{−1} k^d)^p (k^{a,d})^{−1} ( b⃗^{n,d}(A^n, U^{n,d′}_{P+∆P,I}) − b⃗^{n,d}(A^n, U^{n,d′}_{P,I}) ).

We denote the operators corresponding to k^a and k on V_{h,d} by k̄^a and k̄. We have

    E^n_III ≈ Σ_{d=1}^{D} ( (A^n ∇((−A^{n,d})^P ((k̄^{a,d})^{−1} k̄^d)^P) U^{n,d}_{P,I}, ∇Φ^n_{P̃,Ĩ})_d
        + Σ_{p=0}^{P−1} (−A^{n,d})^p ((k̄^{a,d})^{−1} k̄^d)^p (k̄^{a,d})^{−1} ( b⃗^{n,d}(A^n, U^{n,d′}_{P+∆P,I}) − b⃗^{n,d}(A^n, U^{n,d′}_{P,I}) ) ).   (5.5)

We now present an adaptive error control strategy in Alg. 3 based on Theorem 4.1 and the approximations

    E^n ≈ Ê^n = E^n_I + E^n_II + E^n_III.
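The structure of the truncation estimate, in which the tail of the Neumann series is approximated by its first omitted term applied to the computed iterate, can be illustrated on a generic fixed point problem u = Mu + b with a symmetric contraction M; the matrix, the value of P, and the scaling below are illustrative stand-ins for the operators (−A^{n,d})^p ((k^{a,d})^{−1} k^d)^p.

```python
import numpy as np

rng = np.random.default_rng(5)
n, P = 8, 4

# A symmetric contraction M, scaled so its spectral radius is 0.5, and data b.
B = rng.standard_normal((n, n))
M = (B + B.T) / 2
M *= 0.5 / np.abs(np.linalg.eigvalsh(M)).max()
b = rng.standard_normal(n)

u_exact = np.linalg.solve(np.eye(n) - M, b)       # u = sum_{p >= 0} M^p b
u_P = sum(np.linalg.matrix_power(M, p) @ b for p in range(P))

# Truncation estimate in the spirit of (5.5): the series tail collapses to
# M^P applied to the computed truncated iterate.
estimate = np.linalg.matrix_power(M, P) @ u_P
true_err = u_exact - u_P

# The discrepancy is O(||M^P||) relative to the true truncation error.
print(np.linalg.norm(true_err - estimate) / np.linalg.norm(true_err))
```

Because the true tail equals M^P u_exact, replacing u_exact by the computed u_P changes the estimate only at relative order ‖M^P‖, which is the sense in which the approximations above improve as P grows.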


Fig. 5.1. Computational parameters chosen adaptively according to the adaptive algorithm.

We set

    E_I = max_n E^n_I,  E_II = max_n E^n_II,  E_III = max_n E^n_III.

In addition, we define

    E_IV = ( F̃_N(t)(1 − F̃_N(t))/(Nε) )^{1/2},                            (5.6)

for a given ε > 0.

5.1. A numerical example. We apply the adaptive algorithm to the problem given in Sec. 2.3. We start with a coarse mesh and a small number of iterations, terms, and samples, and let the adaptive algorithm choose the parameter values in order to make the error bound for F(t) smaller than 15% with 95% likelihood, i.e. we set TOL = 0.15 and ε = .05. We set σ_1 = 0.5, σ_2 = σ_3 = 0.125, and σ_4 = 0.25. Initially, we let h = 1/18 determine a uniform initial mesh, I = 40, P = 1, and N = 60. We set ∆I = 0.3I and ∆P = 1. We compute the adjoint solution using a refined mesh with h̃ = h/2 but using the same number of iterations, terms, and samples as the forward problem. To refine, we set h_i = 1/(9(i − 1)), with i = 3 initially, and then for each refinement we increment i by 2. This means that we get 3, 5, 7, etc. nodes in the x-direction and y-direction on each subdomain.

In Fig. 5.1, we present the parameter values for each of the iterates. The tolerance was reached after three iterations with h = 1/54, I = 160, P = 3, and N = 240. In Fig. 5.2, we plot the error bound indicators after each iteration in the adaptive algorithm and the total error bound. We compute an approximate error using a reference solution (with h = 1/72, I = 300, P = 5, and N = 480) and show the result in Fig. 5.3. The error decreases from almost 100% initially, with a distribution function that fails to detect critical behavior, to an error of around 30%, to finally an error of less than 3%.


Algorithm 3 Adaptive Algorithm
  Choose ε in Theorem 4.1, which determines the reliability of the error control
  Let TOL be the desired tolerance for the error |F(t) − F̃_N(t)|
  Let σ_1 + σ_2 + σ_3 + σ_4 = 1 be positive numbers that apportion the tolerance TOL between the four contributions to the error, with values chosen based on the computational cost associated with changing the four discretization parameters
  Choose initial meshes T_h, T_h̃ and P, I, and N
  Compute {U^n_{P,I}, n = 1, ..., N} in the space V_h and the sample quantity of interest values
  Compute {Φ^n_{P̃,Ĩ}, n = 1, ..., N} in the space V_h̃
  Compute E^n_I, E^n_II, E^n_III for n = 1, ..., N
  Compute F̃_N(t)
  Estimate the Lipschitz constant L of F using F̃_N
  Compute E_IV
  while E_IV + (1/N) Σ_{n=1}^N I(Q̃^n − |Ê^n| ≤ t ≤ Q̃^n + |Ê^n|) ≥ TOL do
    if L E_I > σ_1 TOL then
      Refine T_h and T_h̃ to meet the prediction that E_I ≈ σ_1 TOL on the new mesh
    end if
    if L E_II > σ_2 TOL then
      Increase I to meet the prediction E_II ≈ σ_2 TOL
    end if
    if L E_III > σ_3 TOL then
      Increase P to meet the prediction E_III ≈ σ_3 TOL
    end if
    if E_IV > σ_4 TOL then
      Increase N to meet the prediction E_IV ≈ σ_4 TOL
    end if
    Compute {U^n_{P,I}, n = 1, ..., N} in the space V_h and the sample quantity of interest values
    Compute {Φ^n_{P̃,Ĩ}, n = 1, ..., N} in the space V_h̃
    Compute E^n_I, E^n_II, E^n_III for n = 1, ..., N
    Compute F̃_N(t)
    Estimate the Lipschitz constant L of F using F̃_N
    Compute E_IV
  end while
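The control flow of Algorithm 3 can be sketched abstractly. The error indicator models below (algebraic decay in h, geometric decay in I and P, and the worst case of (5.6) for E_IV) are illustrative stand-ins for the actual computations, which require the full solver; none of the numerical values are taken from the paper.

```python
import math

def adaptive(TOL=0.15, eps=0.05, sig=(0.5, 0.125, 0.125, 0.25)):
    """Sketch of Algorithm 3 with toy error indicator models."""
    h, I, P, N = 1.0 / 18, 40, 1, 60   # initial discretization parameters
    L = 1.0                            # assumed Lipschitz constant of F
    while True:
        # Stand-ins for E_I, E_II, E_III and the worst case of (5.6).
        E1, E2, E3 = h ** 2, 0.9 ** I, 0.3 ** P
        E4 = math.sqrt(0.25 / (N * eps))
        if L * (E1 + E2 + E3) + E4 < TOL:
            return h, I, P, N
        if L * E1 > sig[0] * TOL:
            h /= 2                         # refine the mesh
        if L * E2 > sig[1] * TOL:
            I = math.ceil(I * 1.3)         # more domain decomposition iterations
        if L * E3 > sig[2] * TOL:
            P += 1                         # more Neumann series terms
        if E4 > sig[3] * TOL:
            N *= 2                         # more samples

print(adaptive())
```

Each parameter is increased only when its own indicator exceeds its share σ_i TOL of the tolerance, which is the mechanism by which the algorithm balances the four error sources.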

Fig. 5.2. The error estimators computed after each iteration in the adaptive algorithm.

Fig. 5.3. Approximate error in the solutions produced by the adaptive algorithm for iteration 1, 2, and 3.

6. Conclusion. In this paper, we consider the nonparametric density estimation problem for a quantity of interest computed from solutions of an elliptic partial differential equation with randomly perturbed coefficients and data. We focus on problems for which only limited knowledge of the random perturbations is available. In particular, we assume that the random perturbation to the diffusion coefficient is described by a piecewise constant function. We derive an efficient method for computing samples and generating an approximate probability distribution based on Lions' domain decomposition method and the Neumann series. We then derive an a posteriori error estimate for the computed probability distribution reflecting all sources of deterministic and statistical errors, including discretization of the domain, finite iteration of the domain decomposition algorithm, finite truncation of the Neumann series, and the effect of using a finite number of random samples. Finally, we develop an adaptive error control algorithm based on the a posteriori estimate.

REFERENCES

[1] I. Babuška, R. Tempone, and G. E. Zouraris, Solving elliptic boundary value problems with uncertain coefficients by the finite element method: the stochastic formulation, Comput. Methods Appl. Mech. Engrg., 194 (2005), pp. 1251–1294.
[2] W. Bangerth and R. Rannacher, Adaptive Finite Element Methods for Differential Equations, Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, 2003.
[3] D. Estep, M. J. Holst, and A. Målqvist, Nonparametric density estimation for randomly perturbed elliptic problems III: Convergence and a priori analysis, in preparation, 2008.
[4] D. Estep, M. G. Larson, and R. D. Williams, Estimating the error of numerical solutions of systems of reaction-diffusion equations, Mem. Amer. Math. Soc., 146 (2000), pp. viii+109.
[5] D. Estep, A. Målqvist, and S. Tavener, Nonparametric density estimation for randomly perturbed elliptic problems II: Applications and adaptive modeling, in preparation, 2008.
[6] C. L. Farmer, Upscaling: a review, Internat. J. Numer. Methods Fluids, 40 (2002), pp. 63–78. ICFD Conference on Numerical Methods for Fluid Dynamics (Oxford, 2001).
[7] R. Ghanem and P. Spanos, Stochastic Finite Elements: A Spectral Approach, Dover, New York, 2003.
[8] W. Guo and L. S. Hou, Generalizations and accelerations of Lions' nonoverlapping domain decomposition method for linear elliptic PDE, SIAM J. Numer. Anal., 41 (2003), pp. 2056–2080 (electronic).
[9] T. Y. Hou and X.-H. Wu, A multiscale finite element method for elliptic problems in composite


materials and porous media, J. Comput. Phys., 134 (1997), pp. 169–189.
[10] T. J. R. Hughes, Multiscale phenomena: Green's functions, the Dirichlet-to-Neumann formulation, subgrid scale models, bubbles and the origins of stabilized methods, Comput. Methods Appl. Mech. Engrg., 127 (1995), pp. 387–401.
[11] M. G. Larson and A. Målqvist, Adaptive variational multiscale methods based on a posteriori error estimation: Energy norm estimates for elliptic problems, Comput. Methods Appl. Mech. Engrg., 196 (2007), pp. 2313–2324.
[12] P.-L. Lions, On the Schwarz alternating method. III. A variant for nonoverlapping subdomains, in Third International Symposium on Domain Decomposition Methods for Partial Differential Equations (Houston, TX, 1989), SIAM, Philadelphia, PA, 1990, pp. 202–223.
[13] R. Serfling, Approximation Theorems of Mathematical Statistics, John Wiley & Sons, Inc., New York, 1980.