SIAM J. SCI. COMPUT. Vol. 28, No. 1, pp. 24–46
© 2006 Society for Industrial and Applied Mathematics
A MULTILEVEL NONLINEAR METHOD∗

IRAD YAVNEH† AND GREGORY DARDYK†

Abstract. A multilevel nonlinear method (MNM) for the numerical solution of nonlinear partial differential equations is developed. The MNM algorithm is motivated and analyzed using a simplified model which retains the essential features of the new approach. It is thereby shown to combine the advantages of the two classical multigrid approaches to nonlinear problems. The analysis is supported by numerical tests for nonlinear differential equations in one and two dimensions.

Key words. multigrid, nonlinear partial differential equations, numerical methods for partial differential equations

AMS subject classifications. 65N55, 65N22, 65H10

DOI. 10.1137/040613809
1. Introduction. Many problems in the sciences and engineering are modeled by nonlinear partial differential equations (PDEs), usually requiring numerical solution. Most commonly, the PDE is approximated by a finite difference, element, or volume discretization on a grid, which results in a large but sparse nonlinear algebraic system of equations. Reliable and computationally efficient solvers for such systems are essential for achieving relevant and accurate simulations. One of the most efficient techniques for solving such large systems is the multigrid iterative method (see, for example, the classical [1, 14, 15], the expository [5], and the comprehensive [25]). In this approach, a sequence of progressively coarser grids is used to accelerate the convergence of some basic iterative process on the target grid. The basic iterative method is referred to as the “relaxation” or “smoothing” operator, as its objective is to smooth the error in the current approximation relative to the computational grid.

There are two general approaches for solving discretized nonlinear PDEs by multigrid methods. One is to perform a global linearization (GL), usually by Newton’s method or some inexact variant thereof, and to solve the resulting linear system approximately by a linear multigrid algorithm; this is then repeated iteratively. The second approach, represented by the full approximation scheme (FAS) of [1, 2] and the closely related nonlinear multigrid method (NLMG) of [14], is to perform only local linearization (LL), in the error-smoothing process. Convergence acceleration is then provided by nonlinear coarse-grid operators. Over the years, variants of these two general categories have been proposed and developed, usually in the ad hoc framework of a particular problem. Some recent examples of developments and implementations of GL multigrid solvers appear in [16, 17, 23, 27, 28], while LL applications can be found in, e.g., [4, 7, 11, 12, 13, 18, 19, 20, 21].
For “friendly” problems, both approaches work well and the difference in efficiency is usually not large (e.g., [25, section 5.3]). However, for more difficult problems, the two approaches often exhibit distinct behavior, with GL holding an advantage in some situations and LL in others. We are therefore motivated to develop

∗Received by the editors August 24, 2004; accepted for publication (in revised form) June 27, 2005; published electronically March 3, 2006. This research was supported by the Israeli Science Foundation under grant 48/02. http://www.siam.org/journals/sisc/28-1/61380.html
†Department of Computer Science, Technion—Israel Institute of Technology, Haifa 32000, Israel ([email protected], [email protected]).
methods which will be at least as good as the more suitable of these two approaches, and often better than both. The algorithm we propose is first motivated and analyzed in section 2, employing a simplified scalar analogy, and then presented in detail in section 3. The algorithm has recently been applied in [6] to the problem of two-dimensional phase unwrapping. However, that was an application-focused paper comparing different methods, with no analysis and only a very brief description of the algorithm.

Multigrid methods for linear problems have been studied very extensively. For many problems, robust multigrid methods (such as the classical “algebraic” [3, 22] or “black-box” [8, 9] multigrid) are well known to be effective even in the presence of discontinuous coefficients and irregular domain boundaries. This represents an advantage for GL methods, where the problem solved per iteration is linear. LL approaches, on the other hand, generally require “direct” rediscretization of the nonlinear operator on the coarse grids, which is less robust. On the other hand, the attraction basin of efficient GL methods is usually small, which means that slow global search methods must often be applied before the fast GL method can be effective. This may represent a significant disadvantage.

1.1. Classical approaches. We first review two classical GL and LL methods. A two-grid framework is used, which is later generalized to a complete multigrid algorithm by recursion. Indices $h$ and $H$ denote fine- and coarse-grid operators and functions, respectively. The fine grid $h$ is the target grid, while grid $H$ is employed to accelerate convergence. We assume that, for linear fine-grid operators, we can construct robust coarse-grid approximations on grid $H$. By “robust” we mean that, given an appropriate smoother, the multigrid solver exhibits the typical fast and mesh-independent convergence. Let the fine-grid problem be (1)
$N_h u_h = f_h$,
where $N_h$ is some discrete nonlinear matrix operator, $u_h$ is the sought solution vector, and $f_h$ is a given vector. Since we use iterative methods, it is assumed that we have some current approximation to the solution, denoted $\tilde u_h$, which we are trying to improve. It is convenient to subtract $N_h \tilde u_h$ from both sides of (1), obtaining the defect equation (2)
$N_h u_h - N_h \tilde u_h = f_h - N_h \tilde u_h \equiv r_h$.
Let $K_h$ denote a linearization of $N_h$. For example, using Newton’s method we define $K_h$ to be the Jacobian matrix of $N_h$, computed at the current approximation: $K_h = N_h'[\tilde u_h]$. Then, Newton’s method approximates (2) by (3)
$K_h (u_h - \tilde u_h) \approx r_h$.
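To make the linearized correction (3) concrete, the following minimal sketch applies one Newton step to a hypothetical two-dimensional system (the operator, right-hand side, and starting guess are our own illustrative choices, not the discretizations considered in this paper):

```python
# Hypothetical 2x2 toy system (not the paper's discretization):
# N_h(u) = (u0 + u1^2, u1 + u0^2), with K_h its Jacobian at the current iterate.
def N(u):
    return (u[0] + u[1] ** 2, u[1] + u[0] ** 2)

def jacobian(u):
    # K_h = Jacobian matrix of N_h at the current approximation
    return ((1.0, 2.0 * u[1]),
            (2.0 * u[0], 1.0))

def solve2(A, b):
    # Direct solve of a 2x2 linear system by Cramer's rule
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det)

u_star = (0.5, 0.25)
f = N(u_star)                  # manufactured right-hand side: exact solution known
u = (0.3, 0.1)                 # current approximation

# One Newton correction: solve K_h (u_h - u~_h) = r_h, cf. (3)
r = (f[0] - N(u)[0], f[1] - N(u)[1])
delta = solve2(jacobian(u), r)
u_new = (u[0] + delta[0], u[1] + delta[1])

res_old = max(abs(r[0]), abs(r[1]))
res_new = max(abs(f[0] - N(u_new)[0]), abs(f[1] - N(u_new)[1]))
print(res_old, res_new)        # the residual shrinks after one step
```

In the full GL method, of course, the linear solve would itself be performed approximately by a linear multigrid cycle rather than directly.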
Let $K_H$ denote the coarse-grid approximation to the linear operator $K_h$, which is obtained using a robust linear coarse-grid correction method. Indeed, in our simple analysis below, it is assumed that this coarse-grid operator provides a “perfect” approximation of the linear fine-grid operator (as would be the case if total reduction is used). That is, the two-grid algorithm, with suitable error-smoothing, solves the linear fine-grid problem exactly. Let $I_h^H$ and $I_H^h$ denote appropriate fine-to-coarse
(restriction) and coarse-to-fine (prolongation) operators, respectively. Then, after we smooth the error in (3) with a suitable relaxation, we solve the following coarse-grid problem: (4)
$K_H (u_H - \bar I_h^H \tilde u_h) = I_h^H r_h$,
where $\bar I_h^H$ is some restriction operator¹ which may be different from $I_h^H$. Once $u_H$ is computed, the correction is interpolated and added to $\tilde u_h$: (5)
$\tilde u_h^{\mathrm{new}} = \tilde u_h + I_H^h (u_H - \bar I_h^H \tilde u_h)$.
We can optionally follow up with additional relaxation of (3), and this completes one Newton iteration. Next, we review the classical FAS. First, the error in the fine-grid solution is smoothed using a suitable nonlinear relaxation method (which employs only local linearization). Let $\hat N_H$ denote a nonlinear coarse-grid approximation to $N_h$, obtained by rediscretizing the nonlinear problem on the coarse grid (normally using the same discretization as that of the fine grid; this is called direct discretization). The coarse-grid problem is then (6)
$\hat N_H u_H - \hat N_H \bar I_h^H \tilde u_h = I_h^H r_h$.
The correction to the fine-grid approximation is added as in (5). The two approaches are substantially distinct. In the GL method we approximate $N_h$ by its linearization, which we then proceed to approximate on the coarse grid. Using suitable linear multigrid methods, the latter approximation is excellent, and the overall robustness of the method is therefore determined, loosely speaking, by the accuracy of the linearization, which depends in turn on the properties of the Jacobian and on the closeness of $\tilde u_h$ to $u_h$. In the LL approach, on the other hand, we use a nonlinear approximation to $N_h$ on the coarse grid. If this approximation is sufficiently accurate, the LL algorithm can yield fast convergence even with a rather poor initial guess for the fine-grid solution, thus having an advantage over the GL approach. However, if $\hat N_H$ approximates $N_h$ poorly, for example due to some nonsmoothness of the coefficients, then the method might not converge even if the initial error is very small and the nonlinearity very weak.

2. A scalar analogy. To motivate and analyze our approach, we present an analogy in the form of a simple scalar problem, which contains the essential elements we are studying. This analysis can be generalized to the multivariate vector-valued case, but doing so would add complication without yielding additional insight. Consider the problem of finding a real number $u$ satisfying (7)
$N(u) = f$,
where $N$ is some twice continuously differentiable nonlinear real-valued function, and $f$ is a constant which serves to help follow the analogy. For any given $\rho > 0$, denote $D_\rho = \{z : |u - z| < \rho\}$, and assume $|N'(z)| \ge K > 0$ for all $z \in D_\rho$.

¹Since this is a linear problem, the coarse-grid correction is independent of this transfer operator; the usual choice for linear problems is $\bar I_h^H = 0$, which yields the so-called correction scheme.
2.1. FAS. Denote the current approximation to $u$ by $u_n \in D_\rho$, and the error by $e_n = u_n - u$. Suppose that there exists some nonlinear function $\hat N$ which approximates $N$ and which is easy to invert ($\hat N$ corresponds to $\hat N_H$ in our analogy). Then we can define an iteration process, which is analogous to FAS, by (8)
$\hat N(u_{n+1}) - \hat N(u_n) = f - N(u_n) \equiv r_n$.
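As a minimal numerical sketch of iteration (8), consider a hypothetical instance (the choices of $N$, $\hat N$, and $f$ below are our own, made so that each step is explicit rather than requiring an inner solve):

```python
import math

# Hypothetical scalar instance of iteration (8), not from the paper's tests:
# N(u) = u + 0.1*sin(u), with the easy-to-invert approximation N^(u) = u
# (the identity map), so each step is explicit.
def N(u):
    return u + 0.1 * math.sin(u)

u_exact = 0.7
f = N(u_exact)          # manufactured right-hand side, so the solution is known

u = 3.0                 # deliberately poor initial guess: the basin is large
errors = []
for _ in range(8):
    r = f - N(u)        # defect r_n
    u = u + r           # (8) with N^ = identity: u_{n+1} = u_n + r_n
    errors.append(abs(u - u_exact))

# Observed linear contraction factor vs. |1 - N'(u)| = 0.1*|cos(u)|
observed = errors[-1] / errors[-2]
predicted = 0.1 * abs(math.cos(u_exact))
print(observed, predicted)
```

Even from a far-off starting point the iteration contracts, and the measured factor settles at $|1 - N'(u)| = 0.1|\cos u|$, consistent with the linear convergence analysis that follows.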
Suppose that $\hat N$ is an $O(\varepsilon)$ approximation to $N$, where $\varepsilon$ is a small positive parameter. That is, (9)
$\hat N(z) \equiv N(z) + \varepsilon\,\phi(z)$,
for some twice continuously differentiable function $\phi$. Substituting (9) into (8), we obtain after rearrangement

$\varepsilon\,[\phi(u_{n+1}) - \phi(u_n)] = f - N(u_{n+1}) \equiv r_{n+1}$.

Using the mean-value theorem and the fact that $f = N(u)$, we get

$\varepsilon\,\phi'(\xi_1)(e_{n+1} - e_n) = -N'(\xi_2)\,e_{n+1}$,

for some $\xi_1 \in (u_n, u_{n+1})$ and $\xi_2 \in (u, u_{n+1})$. Assume $\varepsilon|\phi'| \le P < K/2$ in $D_\rho$. Then

(10) $\left|\dfrac{e_{n+1}}{e_n}\right| = \left|\dfrac{\varepsilon\,\phi'(\xi_1)}{N'(\xi_2) + \varepsilon\,\phi'(\xi_1)}\right| \le \dfrac{P}{K - P} < 1$.

Thus, the iteration converges to $u$, and the asymptotic convergence behavior is given by

(11) $\lim\limits_{n\to\infty} \dfrac{e_{n+1}}{e_n} = \dfrac{\varepsilon\,\phi'(u)}{N'(u) + \varepsilon\,\phi'(u)} = \dfrac{\varepsilon\,\phi'(u)}{N'(u)} + O(\varepsilon^2) \quad \text{for } \varepsilon \to 0$.

2.2. Newton’s method. The Newton iteration for this problem can be written as (12)
$N'(u_n)(u_{n+1} - u_n) = r_n$.
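A minimal sketch of iteration (12) on a hypothetical scalar instance (our own choice of $N$ and $f$), exhibiting the quadratic error decay quantified below:

```python
import math

# Hypothetical scalar instance (our own choice, not from the paper's tests):
# N(u) = u + 0.1*sin(u), iterated with the Newton step (12).
def N(u):
    return u + 0.1 * math.sin(u)

def N_prime(u):
    return 1.0 + 0.1 * math.cos(u)

u_exact = 0.7
f = N(u_exact)

u = 1.5                             # reasonable initial guess
errors = []
for _ in range(3):
    r = f - N(u)
    u = u + r / N_prime(u)          # (12): N'(u_n)(u_{n+1} - u_n) = r_n
    errors.append(abs(u - u_exact))

# Quadratic rate: |e_{n+1}| / e_n^2 approaches |N''(u)| / (2 N'(u)), cf. (14)
observed = errors[-1] / errors[-2] ** 2
predicted = abs(-0.1 * math.sin(u_exact)) / (2.0 * N_prime(u_exact))
print(observed, predicted)
```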
For a sufficiently good initial approximation Newton’s method converges. In particular, if $|N''| \le R$ in $D_\rho$, then

(13) $\left|\dfrac{e_{n+1}}{e_n^2}\right| \le \dfrac{R}{2K}$,

and the iteration converges to $u$ if $|e_n| < 2K/R$. The well-known quadratic asymptotic convergence behavior is given by

(14) $\lim\limits_{n\to\infty} \dfrac{e_{n+1}}{e_n^2} = \dfrac{N''(u)}{2N'(u)}$.

2.3. Multilevel nonlinear method (MNM). Equations (10)–(11) and (13)–(14) demonstrate the advantage and disadvantage of each approach. Newton’s method converges extremely fast when we have a good initial approximation ($|e_n|$ sufficiently small). FAS, on the other hand, will have a large attraction basin if $\hat N$ is a good
approximation to $N$, though it will not be useful when $\hat N$ approximates $N$ poorly. It is generally far less dependent on a good initial approximation than Newton’s method. We now devise a method that has the advantages of both: fast asymptotic convergence and a large attraction basin. First, we subtract $N(u_n)$ from both sides of (7) to obtain the defect equation, (15)
$N(u) - N(u_n) = r_n$.
The key idea is to split the left side of (15) into a relatively large linear part and a small nonlinear part, and to use the $\hat N$ approximation only for the nonlinear part. Accordingly, we add and subtract $N'(u_n)(u - u_n)$ on the left side of (15). This yields (16)
$N'(u_n)(u - u_n) + F[\,N(u),\, N(u_n),\, N'(u_n),\, u,\, u_n\,] = r_n$,
where (17)
$F[\,N(u),\, N(u_n),\, N'(u_n),\, u,\, u_n\,] = N(u) - N(u_n) - N'(u_n)(u - u_n)$.
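The quadratic smallness of this remainder is easy to verify numerically; in the following sketch $N$ is a hypothetical choice of our own:

```python
import math

# Quick numerical check that F in (17) is O(|u - u_n|^2), for the hypothetical
# choice N(u) = u + 0.1*sin(u) (not the paper's test problem).
def N(u):
    return u + 0.1 * math.sin(u)

def N_prime(u):
    return 1.0 + 0.1 * math.cos(u)

def F(u, u_n):
    # Remainder of the first-order Taylor expansion of N about u_n
    return N(u) - N(u_n) - N_prime(u_n) * (u - u_n)

u_n = 0.7
ratios = [F(u_n + h, u_n) / h**2 for h in (1e-1, 1e-2, 1e-3)]
print(ratios)   # all near N''(u_n)/2 = -0.05*sin(0.7)
```

The ratio $F/h^2$ settles at $N''(u_n)/2$ as $h \to 0$, as the Taylor remainder form of $F$ predicts.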
Note that (16) is equivalent to the original problem (7), but the nonlinear term $F$ (which is nothing but the remainder term of a truncated Taylor series) is only of size $O(|u - u_n|^2)$, while the first term and the right side are $O(|u - u_n|)$. We use the nonlinear approximation only for the small nonlinear part. That is, we define the iteration by

(18) $N'(u_n)(u_{n+1} - u_n) + F[\,\hat N(u_{n+1}),\, \hat N(u_n),\, \hat N'(u_n),\, u_{n+1},\, u_n\,] = r_n$.

Leaving only $u_{n+1}$ terms on the left side we obtain (19)
$N'(u_n)\,u_{n+1} + \hat N(u_{n+1}) - \hat N'(u_n)\,u_{n+1} = r_n + N'(u_n)\,u_n + \hat N(u_n) - \hat N'(u_n)\,u_n$.
Assume that we can easily invert the left side of (19) and solve for $u_{n+1}$. To analyze the convergence behavior of this iteration, we first use (9) to obtain, after rearrangement, (20)
$\varepsilon\,\{\,[\phi(u_{n+1}) - \phi(u_n)] - \phi'(u_n)(u_{n+1} - u_n)\,\} = r_{n+1}$.
Taylor series expansions now yield (21)
$\dfrac{\varepsilon\,\phi''(\xi_1)}{2}\,(e_{n+1} - e_n)^2 = -\,e_{n+1}\,N'(\xi_2)$
for some $\xi_1 \in (u_n, u_{n+1})$ and $\xi_2 \in (u_{n+1}, u)$. Denote

$\mu = -\,\dfrac{N'(\xi_2)}{\varepsilon\, e_n\, \phi''(\xi_1)}$.
(If the denominator here is zero, then we have immediate convergence by (21), so we exclude this case.) Solving the quadratic equation (21) for the error reduction factor we obtain two solutions, the relevant one being (22)
$\dfrac{e_{n+1}}{e_n} = \mu + 1 - \operatorname{sign}(\mu)\,\sqrt{\mu^2 + 2\mu}$.
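The following sketch exercises iteration (19) on a hypothetical scalar instance ($N$, $\hat N$, and $f$ are our own choices; the easy inversion assumed above is realized here by simple bisection, since the left side of (19) is monotone for this choice):

```python
import math

# Hypothetical scalar instance (our own choice, not from the paper's tests):
# N(u) = u + 0.5*u^3, approximated by N^(u) = N(u) + 0.1*sin(u),
# i.e. eps*phi(u) = 0.1*sin(u) in the notation of (9).
def N(u):        return u + 0.5 * u**3
def N_p(u):      return 1.0 + 1.5 * u**2           # N'
def N_hat(u):    return N(u) + 0.1 * math.sin(u)
def N_hat_p(u):  return N_p(u) + 0.1 * math.cos(u)

u_exact = 0.7
f = N(u_exact)

def mnm_step(u_n):
    """One step of (19): solve for u_{n+1} by bisection (g is monotone here)."""
    r = f - N(u_n)
    rhs = r + N_p(u_n) * u_n + N_hat(u_n) - N_hat_p(u_n) * u_n
    def g(v):
        return N_p(u_n) * v + N_hat(v) - N_hat_p(u_n) * v
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = 1.5
errors = []
for _ in range(3):
    u = mnm_step(u)
    errors.append(abs(u - u_exact))

# For small errors, (21)-(22) give |e_{n+1}|/e_n^2 -> eps*|phi''(u)| / (2 N'(u))
observed = errors[-1] / errors[-2] ** 2
predicted = 0.1 * math.sin(u_exact) / (2.0 * N_p(u_exact))
print(observed, predicted)
```

The measured ratio matches the quadratic factor $\varepsilon|\phi''(u)|/(2N'(u))$: the iteration converges quadratically like Newton, but with an $O(\varepsilon)$ constant, while retaining the explicit use of the nonlinear approximation $\hat N$.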
Table 1: Comparison of convergence behavior for the scalar problem, under the assumptions $|N'| \ge K > 0$, $|N''| \le R$, $\varepsilon|\phi'| \le P$, $\varepsilon|\phi''| \le Q$ in $D_\rho$.
Method | $\lim_{n\to\infty} |e_{n+1}|/|e_n|^p$ | $p$ | Sufficient for convergence
FAS | $\varepsilon\,\phi'(u)/N'(u)$ | 1 | $2P < K$