Glacier Terminus Estimation from Landsat Image Intensity Profiles ...

Report 4 Downloads 46 Views
Glacier Terminus Estimation from Landsat Image Intensity Profiles Joseph Usset , Arnab Maity, Ana-Maria Staicu, and Armin Schwartzman Mountain glacier retreat is an important problem related to temperature increase caused by global climate change. The retreat of mountain glaciers has been studied from the ground, but there exists a need for automated methods to catalog glacial change with a wider scope. A viable approach is to extract intensity profiles from Landsat images along the glacial flowline and follow the terminus location over time. We propose a new robust and accurate statistical algorithm to estimate the movement of glacial termini over time from these extracted image intensity profiles. The method we propose first uses regression splines to smooth the image intensity profiles. For each profile, the glacial terminus location is assumed to lie near a point of high negative change in the smoothed profiles. An approximate path of termini locations over time is obtained by an algorithm that seeks to minimize the cumulative first derivative value across the profiles. Spline smoothing is applied to this pilot path for estimation of long-term terminus movement. The predictions from the method are evaluated on simulated data and compared to available ground measurements for the Nigardsbreen, Gorner, Rhone, and Franz Josef glaciers. Key Words: Change-point estimation; Cross-validation; Nonparametric regression; Satellite imagery; Spline smoothing.

1. INTRODUCTION Mountain glacier retreat is an important problem for several reasons. Glaciers can be used as a proxy indicator for global climate change, and in particular temperature change (Oerlemans 2000). Also, glacial retreat affects water supply in communities where glacial melt is a source of freshwater (Bintanja et al. 2005; Seidel and Martinec 2004). Over the past several decades, there has been a widely reported retreat of mountain glaciers, and this retreat has accelerated in the past decade (Oerlemans and van de Wal 1995; Bradley et al. 1999; Oerlemans 2000; Seidel and Martinec 2004; Bintanja et al. 2005). For these reasons, there is an interest in tracking mountain glacier retreat worldwide. One method for tracking

Joseph Usset (B) Department of Biostatistics, University of Kansas, Kansas city, USA (E-mail: [email protected]). Arnab Maity (E-mail: [email protected]), Ana-Maria Staicu (E-mail: [email protected]) and Armin Schwartzman (E-mail: [email protected]) Department of Statistics, North Carolina State, Raleigh, USA. © 2015 International Biometric Society Journal of Agricultural, Biological, and Environmental Statistics, Volume 20, Number 2, Pages 279–298 DOI: 10.1007/s13253-015-0207-4

279

Joseph Usset et al.

280

200 150 100 50

Profile Intensity

250

glacial retreat has been ground measurements, but these can be difficult to obtain. Instead, (Kachouie et al. 2012, 2014) have proposed to locate and track glacier termini over time using Landsat images, which have been available to the public since 2009. This approach is based on image intensity profiles extracted from the Landsat images along the glacier flowline. In this paper, we present an improved statistical algorithm used to locate and track glacial termini over time from the extracted image intensity profiles. Figure 1 demonstrates the basic elements of our analysis, using the Nigardsbreen mountain glacier as an example. The top left plot shows a 2-dimensional (2-D) false color composite image of the glacier cropped from the Landsat archive at a given time frame. Drawn on it in yellow is a 1-dimensional (1-D) outline of the glacier flowline, produced semi-manually by the method of Kachouie et al. (2012). The red arrow indicates the terminus location, at the boundary of the ice and a sub-glacial lake. The top right plot displays the extracted intensity profiles along the flowline from 15 spatially registered Landsat images (thermal band B62)

0

500

1000

1500

2000

2500

2010

200

150

100

2002

2002

100

2008

150

Year

2006

200

2004

Year

2008

MIAE = 7.7 MAD = 12.9 COV = .90

250

2006

250

2004

2010

Space Along Glacier Path

0

500

1000

1500

2000

2500

Space along the Glacial Path (Meters)

1300

1400

1500

1600

1700

Space along the Glacial Path (Meters)

Figure 1. Top left 2-D cropped Landsat image of Nigardsbreen glacier with marked flowline (yellow) and terminus (red arrow). Top right extracted 1-D intensity profiles along the flowline. Bottom left extracted 1-D intensity profiles laid out over time; along with the estimated terminus path (yellow), confidence intervals (black), and ground measurements (red dots), with evaluated metrics in the legend. Obstructed profiles appear as light blue stripes. Bottom right zoomed in display of estimated path, confidence intervals, and ground measurements. Time points where the image profiles were observed are marked by purple hash marks on the left of the image.

Glacier Terminus Estimation...

281

between 2000 and 2012. The terminus location corresponds to the high-to-low transition in each profile. The bottom plots present the image profiles over time, where the terminus location corresponds to the blue-to-brown transition. The estimated glacier terminus path from our proposed statistical algorithm is shown in yellow, with confidence bands shown in black and compared to ground measurements represented by the red dots. Note that the ground measurements were not used at all in the estimation, indicating how remarkably accurate the proposed method can be based on the Landsat images alone. In this paper, we describe the statistical algorithm that produced the estimated glacier terminus path and confidence bands. The algorithm we propose receives as input the time series of 1-D spatial intensity profiles and gives as output the estimated terminus path over time. The proposed algorithm contains three main steps: 1. Smooth the intensity profiles; and obtain the first derivative estimates of the smoothed profiles. 2. Input the estimated first derivative profiles into an algorithm that outputs a pilot path of the terminus location over time by minimizing the integrated first derivative. 3. Globally smooth the pilot path to estimate the long-term advance or retreat of the terminus. The above algorithm, beginning with Step 1, is motivated by the fact that the terminus of the glacier is often marked by a sharp high-to-low transition of the intensity profile, which corresponds to a highly negative local minimum of the first derivative after smoothing (Fig. 1). However, as we shall see in other glaciers, sharp local minima can also occur due to obstructive elements such as shadows from nearby mountains and debris. In addition, the glacier terminus itself may be occluded by clouds and seasonal snowfall, producing flat profiles such as those observed in Fig. 1 (top right panel) and as blue horizontal stripes in Fig. 1 (bottom panels). These sources of error, which we call systematic, are mostly deterministic but hard to model because not enough information about them is available. Step 2 overcomes the systematic error by finding a controlled sequence of locations over time that captures persistent local minima. Step 3 removes local noise. The proposed algorithm follows roughly the same framework as Kachouie et al. (2014), although there are some important differences in the approaches. In Step 1 of Kachouie et al. (2014), first derivatives of the mean intensity profiles are estimated directly with local kernel smoothing. A drawback of this approach is the difficulty of determining a fixed local smoothing bandwidth that will provide a good fit globally over the entire spatial domain of the intensity profiles. We correct this by using global spline smoothing instead, fitted by penalized weighted least squares with generalized cross-validation. In Step 2 of Kachouie et al. (2014), a tracking algorithm is applied to connect local minima of the first derivative between consecutive time points. However, because it forces the pilot path to pass through those local minima, it is sensitive to systematic noise. We fix this by globally minimizing the integrated first derivative over time. The proposed pilot path algorithm robustly finds the terminus path despite systematic error in the derivative profiles. Finally, in Step 3, (Kachouie et al. 2014) estimates the long-term terminus location change by local kernel smoothing of the pilot path. In addition to the bandwidth selection problem mentioned before, local

282

Joseph Usset et al.

smoothing gives imprecise or non-existent estimates over time periods where the image data are sparse. Instead, we gain more precise estimates of long-term glacial retreat by smoothing the pilot path with global spline smoothing. From a broader statistical viewpoint, the work presented here relates to the problem of change-point detection, which has been well studied in the context of nonparametric regression (Muller 1992; Loader 1996; Joo and Qiu 2009). However, the specific data analysis problem treated here presents a unique methodological challenge; as opposed to identifying change points in an individual continuous curve, we seek a path of change points from a time series of continuous curves. Moreover, while multiple change points may exist in each intensity profile (e.g., due to shadows in the original images), we are interested in a sequence of change points whose location changes smoothly over time. The 3-step algorithm presented above solves this problem. The paper is organized as follows. Section 2 provides a description of the data and how it was obtained. Section 3 discusses smoothing of the intensity profiles with penalized regression splines (Step 1), and shows how this facilitates smooth derivative estimates. Section 4 describes the pilot path algorithm that gives an initial estimate of the terminus path (Step 2), and compares this method to the tracking algorithm of Kachouie et al. (2014). Section 5 discusses temporal spline smoothing of the pilot path (Step 3), and evaluates several weighting schemes that can be used to estimate the long-term trend of the terminus location. Sections 6 and 7 include a numerical study and real data analyses of the Nigardsbreen, Gorner, Rhone, and Franz Josef glaciers. Section 8 summarizes our method and presents future directions for research.

2. DATA DESCRIPTION For a given mountain glacier, the proposed method takes as inputs Landsat image intensity profiles extracted along the glacier flowline. The image profiles used here were obtained using a Matlab graphical user interface (GUI) designed for this purpose; see Kachouie et al. (2012). Briefly, for every glacier considered, Landsat scenes taken at different time points were spatially registered and cropped around the glacier using user-defined geographical coordinates. The glacier flowline was outlined manually in the GUI on an arbitrary image of good quality and automatically smoothed; see Fig. 1, top left. The flowline was then applied to all the spatially registered images to extract the intensity profiles at each time frame. To reduce the influence of shadows, it was chosen for each glacier to use images corresponding to the thermal frequency band B62, or alternatively the Normalized Difference Snow Index (NDSI) processed band (defined as the difference between the visual frequency band B20 and the infrared frequency band B50, divided by their sum Irish 2000; Salomonson and Appel 2004; Kachouie et al. 2012). For each glacier, the intensity profiles were extracted by sampling the registered images at n s locations si , i = 1, . . . , n s along the flowline at constant arclength intervals, measured in meters. For n t time frames, this produced a data array Yi j , i = 1, . . . , n s , j = 1, . . . , n t , where each column j corresponds to the intensity profile at time t j . While the locations s j are equally spaced, the times t j are not. According to the Landsat orbit, the satellite passes

Glacier Terminus Estimation...

283

over the same location on the Earth every 16 days, but most of these images are not available to the public in the Landsat catalog, possibly because of quality issues. The proposed method is applied here to four glaciers: Nigardsbreen (Norway), Gorner (Switzerland), Rhone (Switzerland), and Franz Josef (New Zealand). For each glacier, ground measurements for the location of the terminus were obtained from the World Glacier Monitoring Service database, and are used to evaluate the performance of the proposed method.

3. SPATIAL SMOOTHING 3.1. M ODEL In this section, we discuss the modeling approach. At each time point t j , the intensity profiles (si , Yi j ), i = 1, . . . , n s , are modeled non-parametrically as Yi j = ξ(si , t j ) + i j ,

i j ∼ (0, σ j2 ),

(1)

where ξ(si , t j ) represents the profile’s mean function at spatial location si and time t j , and i j is independently distributed error with variance σ j2 . To facilitate the estimation of derivatives, we model the mean function of each profile as ξ(s, t j ) =

K 

ψk (s)a jk ,

k=1

where ψ1 (s), . . . , ψ K (s) are smooth pre-specified basis functions and a j1 , . . . , a j K are unknown scalar coefficients: the dependence on j is to emphasize that the coefficients are allowed to vary with t j . The basis expansion approach is ideal because it allows us to calculate derivatives of the profile intensities by simply taking derivatives of the basis functions; i.e., for l ≥ 0, ξ (l) (s, t j ) =

∂ l ξ(s, t j )  (l) = ψk (s)a jk , ∂s l K

(2)

k=1

where the superscript (l) denotes the lth derivative. For the basis functions, we use B-spline bases of order 6 with knots placed at equally spaced quantiles. The specification of order 6 facilitates smooth estimates of the third derivatives, which, we show in Sect. 5, can be used to improve the precision of the estimation procedure in the time domain. Next we discuss estimation of model components. 3.2. E STIMATION For profile j, define Y j = (Y1 j , . . . , Yn s j )T to be a column vector of observations, and  to be an n s × K matrix whose entry (i, k) equals ψk (si ). For each profile, the coefficients

Joseph Usset et al.

284

a j = (a j1 , . . . , a j K ) are found by the penalized weighted least squares criterion   argmin (Y j − a j )T (Y j − a j ) + λ j a Tj a j , aj

where λ j is a roughness penalty parameter that controls the smoothness of the regressor function. The solution to this criterion is the well-known ridge regression estimator aˆ j = ( T  + λ j I)T  T Y j . The coefficients provide estimates of the derivatives (2) via ξˆ (1) (s, t j ) =

K 

(1)

ψk (s)aˆ jk ,

k=1

ξˆ (3) (s, t j ) =

K 

(3)

ψk (s)aˆ jk .

(3)

k=1

Note that only the derivatives of order 1 and 3 are needed in what follows. 3.3. BASIS AND P ENALTY S PECIFICATION An important aspect of the spatial smoothing is the specification of the number of knots determining the regression matrix  and the penalty parameter λ j for each profile j = 1, . . . , n t . These model components determine the trade-off between the regression fit to the data and the smoothness of the fitted functions. Because the spatial sampling is the same for all the profiles, we use the same number of knots (and thus the same regression matrix ) for all profiles and let the penalty parameter λ j adjust the fit to each profile as needed. To specify the bases and penalty, we follow the established framework of Eilers and Marx (1996) and Ruppert (2002), and set  to be a basis capable of capturing a flexible enough fit to the data so that the amount of smoothness is driven by the selection of λ j . Specifically, we follow a rule of thumb set by Ruppert (2002) and let K = min(35, n4s ), selecting the smoothing parameters λ j for each profile with the generalized cross-validation criterion (GCV) (Golub et al. 1979). The GCV smoothing parameter tends to fit a regression function whose residuals resemble white noise, in agreement with our modeling assumptions in (1).

4. PILOT PATH ALGORITHM In this section, we describe a pilot path algorithm that takes as input the time series of first derivative profile estimates, and outputs a rough estimate of the glacier terminus location over time. We first give the motivation behind the algorithm and then describe the algorithm itself. 4.1. A G LOBAL O PTIMIZATION C RITERION BASED ON F IRST D ERIVATIVES We are interested in the profile means ξ(s, t j ) in (1) only insofar as they provide knowledge about the location of glacier terminus over time. An essential assumption is that underlying each profile is a transition from glacial ice to land or water, causing a sharp downward

50 150 250

285

0

500

1000

1500

2000

2500

3000

0

500

1000

1500

2000

2500

3000

-1.5 -0.5 0.5

First Derivative

Intensity

Glacier Terminus Estimation...

Space in Meters Figure 2. Top individual profile for Gorner from June 1987 (black circles), superimposed penalized regression spline fit (orange) and ideal mean function (blue). Bottom estimated first derivative (orange) and ideal first derivative (blue); hypothesized terminus location is marked in red.

trend in the intensity values over the terminus location. Therefore, in an ideal situation where the image profile is not obstructed by systematic error, the terminus location g(t j ) at time t j is marked by the location where the first spatial derivative reaches its apex, i.e., g(t j ) = argmin ξ (1) (s, t j ). s

(4)

An example of such an ideal situation is shown in Fig. 2. The top plot displays an intensity profile obtained from the Gorner glacier in June 1987. The ideal profile shape is illustrated by the blue sigmoid, fitted here with a logistic function with four parameters (left height, right height, transition location, and transition slope). The nonparametric penalized spline fit ξˆ (s, t) in orange follows this general shape. More importantly, the estimated first derivative ξˆ (1) (s, t) has its minimum nearly at the same location as the inflection point of the sigmoid, finding agreement in the estimation of the terminus location. Unfortunately, as seen in the top right panel of Fig. 1 and other examples below, most intensity profiles do not conform to the ideal sigmoidal shape because of systematic noise. Thus, (4) cannot be taken as a hard modeling assumption. Nevertheless, the intensity transition corresponding to the terminus location does appear often as a local minimum of the first derivative and tends to persist over time (except for occasional occlusions), while other intensity transitions do not. Based on these observations, the pilot path algorithm is guided by two principles. The first stems from our main assumption in (4) and states that for a given time t j , the terminus location will tend to lie near the point in the profile where the first derivative is minimal. Second, the location of the glacier terminus moves at a finite speed over time. This second principle helps rule out spurious intensity transitions (e.g., caused by shadows or debris) that are too far from the true terminus location in order to be reached by actual glacial retreat or advance. Based on the two principles, we define the pilot path as a sequence of locations over time that result in a minimal cumulative first derivative value across profiles, and move at a finite speed. Mathematically, this pilot path is defined as a function g(t) ˜ from the temporal to the spatial domain that minimizes

Joseph Usset et al.

286

nt 

ξˆ (1) (g(t j ), t j )

(5)

g(t ˜ j+1 ) − g(t ˜ j) < ρ for all j. t j+1 − t j

(6)

g(·) ˜ = argmin g

j=1

subject to the constraint

The rate ρ represents the maximum possible annual advance or retreat of the terminus location, and can be based on prior knowledge. Here we use a conservative value of ρ = 2 km/year, substantially higher than the speed of advance or retreat of most mountain glaciers. 4.2. T HE A LGORITHM To minimize criterion (5) subject to constraint (6), we take the following approach. We run n s paths from unique start locations in space (s1 , . . . , sn s ) in both forward and backward manner in time. From each unique starting point, we seek a path that takes a minimum cumulative first derivative value over time. Each individual is path found with the following greedy approach: 1

2 3

Initialize g Fi (t1 ) = si for j ∈ {1, . . . , n t−1 } do Choose a set {k} such that |si+{k} − g Fi (t j )|/(t j+1 − t j ) < ρ. For s ∗ ∈ (si + {k}) define g Fi (t j+1 ) = argmin{ξˆ (1) (s ∗ , t j+1 )}. s∗

Store g Fi (t). The backward algorithm is run similarly to the forward algorithm. Then, the forward nS nS {g Fi (t)}i=1 and backward {g Bi (t)}i=1 paths collected together are evaluated over the first derivative profiles at time points (t1 , . . . , tn t ), and their cumulative first derivative values are recorded. The path with the lowest cumulative first derivative score is set to be the pilot path: g(·) ˜ =

argmin

nt 

g∈(g F1 ,...,g Fn s ;g B1 ,...,g Bn s ) j=1

ξˆ (1) (g(t j ), t j ).

The algorithm is illustrated in Fig. 3. 4.3. P ILOT PATH V ERSUS T RACKING This pilot path algorithm is a robust alternative to the tracking algorithm of Kachouie et al. (2014). The key difference is that we base estimation off the global criterion in (5). This leads to two specific differences between the approaches: (1) the starting points of the algorithm, and (2) the movement of the pilot path across time. 1) Starting Points The pilot path algorithm begins at 2n s unique start locations that correspond to combinations across spatial values (s1 , . . . , sn s ) and the time endpoints

Glacier Terminus Estimation...

287

250 2000

2000

1.0 0.5

200

1.0

Pilot Path Start Location Gates

0

500

1500

Year

1995

150 100

1990

0.5

1990

Year

1995

0.0

50

1.5 2500

Space along the Glacial Path (Meters)

0

500

1500

2500

Space along the Glacial Path (Meters)

Figure 3. Left the estimated profile of first derivatives displayed over time for the Gorner glacier. From each start point, a path moves across time to the subsequent point with the the lowest first derivatives, subject to the rate constraint (shown by the gates). The red dots represent the estimated pilot path. Right observed intensity profiles for the Gorner glacier laid out over time, along with the corresponding pilot path in red.

t1 and tn t . In contrast, the tracking algorithm of Kachouie et al. (2014) begins at the individual point that takes the lowest first derivative value across time and space for all the profiles. For tracking, heavy systematic error can cause the individual point with lowest estimated first derivative value to lie far from the true terminus location, and lead the path astray. 2) Movement Across Time Points The pilot path algorithm moves across adjacent time points towards the point within the gated window that takes a lowest first derivative value. Tracking moves across adjacent time points to the nearest inflection point within the gated window; and if no inflection point exists, the path maintains the same spatial location. A drawback of the tracking algorithm is that a spurious inflection within the gated window may lead the path off-track with an inability to recover. The pilot path algorithm, on the other hand, is not forced to go through inflection points and recovery from spurious inflection points is possible because many alternative paths are compared according to a global optimization criterion. In summary, the pilot path algorithm is a robust alternative to tracking that requires negligible extra computation. The main innovation is that we seek a minimum cumulative first derivative value in the profiles over time.

5. TEMPORAL SMOOTHING 5.1. M ODEL The pilot path locations directly relate to the change in glacier length over time. However, these locations are observed at discrete times and estimated with error. Furthermore, there is not enough image data in the Landsat catalog (a few images a year at best) to track seasonal variability. However, we have enough data to estimate the long-term trend. To estimate the long-term trend, we propose a nonparametric regression fit with global spline smoothing. In contrast to the spatial noise model, we assume independent het-

Joseph Usset et al.

288

eroscedastic temporal noise between time frames. We propose the model g(t ˜ j ) = g(t j ) + θ j ,

j = 1, . . . , n t

(7)

where g(t) is a smoothed representation of the glacial terminus location over time, g(t) ˜ is the pilot path obtained earlier, and each θ j is an independent noise term distributed (0, δ j σ 2 ). The independence assumption in the errors is reasonable given that the time points are often months apart. The heterogenous variance weights δ j reflect the fact that some pilot path points are more accurate than others in their estimation of the terminus location. For example, pilot path points with high negative derivative correspond to sharper intensity transitions and thus tend to be closer to the glacier terminus location than points with first derivatives that are closer to zero. 5.2. W EIGHTS To consider the heteroscedasticity, we consider three approaches to estimating proportional variances in the temporal smoothing model. 5.2.1. Equal Weights The simplest approach is to give equal weights to each of the pilot path points in the least squares fitting: δ j = 1 for all j. In this scenario, the long-term trend is estimated with ordinary least squares. 5.2.2. Third Derivative Weights An alternative approach is obtained as a local approximation to the optimization criterion (5). Consider a quadratic approximation of the estimated first derivative profile ξˆ (1) (s, t j ) about s = g(t ˜ j ) (e.g., in the neighborhood of the red mark in the bottom panel of Fig. 2). Evaluated at s = g(t j ), this approximation states (1) (2) ξˆ (1) (g(t j ), t j ) ≈ ξˆ0 (t j ) + ξˆ0 (t j )(g(t j ) − g(t ˜ j )) +

(3) ξˆ0 (t j ) (g(t j ) − g(t ˜ j ))2 , 2

(8)

where ξˆ0(1) (t j ) = ξˆ (1) (g(t ˜ j ), t j ), ξˆ0(2) (t j ) = ξˆ (2) (g(t ˜ j ), t j ), and ξˆ0(3) (t j ) = ξˆ (3) (g(t ˜ j ), t j ). Since the pilot path g(t ˜ j ) is usually near a local minimum of the first derivative, we can (2) further approximate ξˆ0 (t j ) ≈ 0. Plugging (8) into (5), we obtain that the optimization criterion (5) becomes g(·) ˆ ≈ argmin g

nt 

(3) ξˆ0 (t j )(g(t ˜ j ) − g(t j ))2 .

j=1

This corresponds to the weighted least squares fitting criterion in the proposed temporal smoothing model (7) with δ j = 1/ξˆ0(3) (t j ). Intuitively, the third derivative weights are desirable because they pull the smooth path g(t j ) more strongly towards pilot path points g(t ˜ j ) that are near local minima with sharp

Glacier Terminus Estimation...

289

inflections and are thereby more accurate. Note that approximation (8) is representative of the terminus location only if g(t ˜ j ) is near a local minimum rather than a local maximum, (3) i.e., if ξˆ0 (t j ) > 0. Therefore, pilot path points whose third derivative is negative (due to systematic noise) must be dropped from the model. 5.2.3. Propagation The final weights considered are adapted from Kachouie et al. (2012) as another way of incorporating the accuracy of the pilot path estimates. This weighting scheme assumes the location of the pilot path is an inflection point in the estimated spatial intensity profile. By considering local minima of the first derivative as zero crossings of the second derivative, (Kachouie et al. 2012) shows that the location of an estimated inflection point g(t ˜ j ) has an associated standard error of the form SE(ξˆ (2) (g(t ˜ j ), t j ))/ξˆ (3) (g(t ˜ j ), t j ). For spatial model (1) with independent and identically distributed errors,  ξˆ (2) (g(t Var{ ˜ j ), t j )} = ψ (2) (t j )2 σˆ j2 , where ψ  (t j ) = (ψ1(2) (g(t j ), t j ), . . . , ψ K(2) (g(t j ), t j ))T is the vector of spatial spline derivatives used in (2) and (3). In model (7), the propagation weight at each time t j is set equal to the variance of the  ξˆ (2) (g(t inflection point location, δ j = Var[ ˜ j ), t j )]/[ξˆ (3) (g(t ˜ j ), t j )]2 . 5.3. E STIMATION Similar to spatial smoothing of the image profiles described in Sect. 3, we use a spline basis expansion to facilitate long-term estimation of the terminus location. From equation (7), we assume g(t) =

L 

φl (t)νl ,

l=1

where φl (t) are cubic spline basis functions and ν = [ν1 , . . . , ν L ] are scalar coefficients. Define  to be an n t × L matrix whose entry ( j, l) equals φl (t j ), let V be a diagonal matrix filled with weights [δ1 , . . . , δn t ] from Sect. 5.2, and g˜ = [g(t ˜ 1 ), . . . , g(t ˜ n t )]. The coefficients are estimated with penalized weighted least squares as νˆ = (T V−1  + λI)−1 T V−1 g˜ , and the estimated path of glacier termini is g(t) ˆ = ˆν . The specification of the number of basis functions L, and selection of λ, follows the framework from Sect. 3.3.

Joseph Usset et al.

290

5.4. C ONFIDENCE I NTERVALS To account for the bias induced by the smoothing parameter, we follow the standard approach to quantify uncertainty with penalized splines and use the Bayesian covariance matrix (Wahba 1983; Ruppert 2002; Wood 2006). The estimated Bayesian covariance matrix takes the form  Cov(ˆν ) = (T V−1  + λˆ I)−1 σˆ 2 , where we use the residual variance estimate n t (g(t ˆ j ) − g(t ˜ j ))2 2 σˆ = 1 . nt − L At time t j , we find the standard error of the path estimate via  g(t SE( ˆ j )) =



(t j ) Cov(ˆν ) (t j )T ,

where (t j ) is the j-th row of . The level 1 − α confidence intervals are set as  E(g(t ˆ j )), g(t ˆ j ) ± z α/2 S where z α/2 represents the α/2 quantile of a standard normal. Confidence intervals fit with penalized regression splines are best interpreted in the “across-the-function” sense described by Wahba (1983), Nychka (1988), Marra and Wood (2012). This interpretation states that the average coverage probability should be approximately 1 − α, but due to the bias in the fitting procedure, the pointwise coverage maybe be uneven across time. These intervals are applied and evaluated in the simulation and real data analyses presented in the following sections.

6. SIMULATION STUDY This simulation study serves to evaluate the viability of our method, and to guide the specification of weights on the pilot path points. 6.1. DATA G ENERATION We consider profiles observed at times t j , j ∈ {1, 2, . . . , 50} and at equally spaced points si , i ∈ {1, 2, . . . , 200}. The time intervals between the sampled profiles, t j+1 − t j , follow a Poisson distribution with mean 12 months, while the spatial intervals are non-random and equal to 200 m. As described in Sect. 2, the images in the real data are unequally spaced in time, and Poisson sampling is used as a simple way to simulate this unknown process. Within each profile, intensity values are evaluated over the same spatial coordinates. Each profile at time t j is generated independently according to model (1) with σ j2 = 1. The signal ξ is generated as the sum of a sigmoidal function μ representing an underlying transition

Glacier Terminus Estimation...

291

from glacial ice to land or water, and a systematic noise component ζ . For each time period t j , the sigmoidal function μ is generated independently as 1 − zi j , μ(si , t j ) = a j + b j  1 + z i2j

zi j =

si − g(t j ) , 200

where a j is a random intercept distributed N (200, 30) and b j is a random slope distributed N (60, 20). The location of the glacier terminus g(t j ) corresponds to the point of lowest first derivative in the sigmoid. The locations of the glacier terminus as a function of time take two forms: 1. g1 (t j ) = 2400 − 2t j . 2. g2 (t j ) = 2100 − 2t j − 75 sin(t j /30). To generate systematic error, we consider two settings: 6 1. ζ1 (si , t j ) = δ j1 + q=1 δ j (2q+1) cos(qsi ) + δ j (2q) sin(qsi ) • δ jq ∼ gamma(shape = 2, rate = 250) − 500  ζ1 (si , t j ), with probability 0.9 2. ζ2 (si , t j ) = 100, with probability 0.1.

100 50

Months

150

100 200 300 400 500

Months

100 200 300 400 500

300 200 100

200

200 150 100 50 0

0

0

Profile Intensity

The cosine and sine waves that generate the systematic noise 1 correspond to Fourier basis functions. The weights for the noise are generated from a centered gamma distribution with mean 0. The specified asymmetry of the weights, and number of basis functions are somewhat arbitrary choices, but were made to approximate visually the profiles from the real data analyses. The flat profiles introduced with probability 0.1 reflect the fact that some satellite images are cloudy and do not show the desired downward transition in the intensity profile. Examples of profiles generated by the above recipe are shown in Fig. 4. The profiles exhibit visual resemblance to the real data profiles of Figs. 1, 5, 6, and 7 below. At each combination of path and systematic error, we generate 200 Monte Carlo samples. For each sample, we apply the versions of our algorithm where temporal smoothing uses the three types of weights—equal, third derivative, and propagation—detailed in Sects.

0

1000

2000

3000

Space along Path

4000

0

1000 2000 3000 4000

Space in Meters

0

1000 2000 3000 4000

Space in Meters

Figure 4. Left simulated profiles for systematic noise ζ1 . Middle profiles over time with terminus location g1 (t) (red). Right profiles when terminus location is g2 (t) (red).

Joseph Usset et al.

292

5.2.1–5.2.3. Results are compared with the tracking procedure (as described in Sect. 4) in combination with the propagation weights, as done in Kachouie et al. (2014). 6.2. S IMULATION R ESULTS The results of the simulation are assessed in terms of three criteria: 1. Integrated mean absolute errors: MIAE(g) ˆ =

nt 200  

abs(gˆi (t j ) − g(t j ))/(200 · n t ),

i=1 j=1

2. Mean maximum absolute deviation over the path: MMAD(g) ˆ =

200 

max j {abs[gˆi (t j ) − g(t j )]}/(200),

i=1

3. Mean confidence coverage over the path: MCI(g) ˆ =

nt 200  

 gˆi (t j ))} /(200 · n t ). I g(t j ) ∈ {gˆi (t j ) ± 2SE( i=1 j=1

Table 1 displays the results for the case where the terminus location change is linear, g1 (t). The pilot path algorithm outperforms tracking for each application of least squares weights and according to all metrics. Given that the pilot path algorithm is applied, when no flat profiles are present (error ζ1 ), temporal smoothing with third derivative weights performs best across all metrics, followed by the propagation weights. These methods outperform the equal weighting procedure because the third derivatives’ weights pull the paths toward points of high inflection, which on average, correspond to the true glacial terminus location. Flat profiles (error ζ2 ) adversely affect all methods, but to varying degrees. The temporal smoothing procedure that uses third derivative weights is the most robust to flat profiles, as indicated by the small MIAE and MMAD, and it is the only method to maintain 95 % Table 1. Simulation results for g1 (t). Path

Error

g1 (t)

ζ1

ζ2

Method

Weights

MIAE

MMAD

MCI

Pilot path Pilot path Pilot path Tracking Pilot path Pilot path Pilot path Tracking

Equal Third derivative Propagation Propagation Equal Third derivative Propagation Propagation

9.8 6.0 7.7 45.9 21.0 8.9 87.6 114.4

18.8 11.5 14.7 100.1 40.6 16.6 303.5 360.6

96.1 97.5 92.5 83.2 93.2 96.7 78.8 86.8

All standard errors are less than 1.

Glacier Terminus Estimation...

293

Table 2. Simulation results for g2 (t). Path

Error

g2 (t)

ζ1

ζ2

Method

Weights

MIAE

MMAD

MCI

Pilot path Pilot path Pilot path Tracking Pilot Path Pilot path Pilot path Tracking

Equal Third derivative Propagation Propagation Equal Third derivative Propagation Propagation

17.2 12.4 14.5 50.5 36.1 22.2 88.0 137.7

52.6 39.1 42.9 129.4 100.2 67.4 297.9 417.2

95.4 97.4 91.5 91.7 91.3 95.5 79.2 85.0

All standard errors are less than 1.

confidence coverage on the average. The equal weighting procedure is also fairly robust to the flat profiles, but not to the same extent. The fitting procedure based on propagation weights lacks robustness, and performs poorly with flat profiles. Further investigation found that for flat profiles, the propagation weights tended to be very large. This gives undesirable high leverage to the flat profiles, which contain no information on the terminus location. Finally, the tracking method performs poorly due to the propagation weights, and a substantial number of pilot paths being led astray by systematic error. For all the other simulations (Table 2), when the terminus location is given by g2 (t), with systematic error ζ1 and ζ2 , the performance of the methods is all slightly worse than with g1 (t)—it is more difficult to estimate a glacial path that contains short-term variability. However, the overall story is similar. The pilot path algorithm and temporal fitting with third derivative weights provide the most robust and precise fits that maintain the desired 95 % confidence coverage on the average. These results guide our real data analyses in the discussed in the following section.

7. REAL DATA ANALYSES The results from the simulation study guide our real data analyses. The version of our fitting method with third derivative weights is applied to image intensity data collected for the Nigardsbreen, Gorner, Rhone, and Franz Josef glaciers. The results are compared to independent ground measurements sampled over time. The results are displayed in Figs. 1, 5, 6, and 7. All four figures are presented in the same format as Fig. 1, described before in the Introduction: the top left plot displays a 2-D Landsat image at an individual time point, with the coordinates of the 1-D extraction; the time series of image profiles from the 1-D extraction are shown in the top right plot; the bottom plots show the profiles laid out over time, the estimated glacial path, the estimated confidence bands, and the ground measurements for comparison. The ground measurements are originally given as relative changes in the terminus location between adjacent time points; to facilitate comparison, these measurements are shifted by a constant offset so that their average is equal to the average path estimate calculated over the years of available data. The plot legends contain the mean integrated absolute error (MIAE) over the path

Joseph Usset et al.

150 100 50 0

Profile Intensity

200

250

294

0

500 1000 1500 2000 2500 3000

100 50

0

500 1000

2000

3000

Space along the Glacial Path (Meters)

2005

Year

1995

150

200 150

1995

200

1990

Year

2000

MIAE = 12.2 MAD = 35.7 COV = .79

250

2000

250

100

1990

2005

Space Along Glacier Path

1800

50

1900

2000

2100

2200

2300

Space along the Glacial Path (Meters)

Figure 5. Gorner results. Top left specified 1-D extraction from an individual 2-D Landsat image of Gorner. Top right extracted 1-D intensity profiles. Bottom left: Estimated path (yellow), confidence intervals (black), ground measurements (red dots), with evaluated metrics in the legend. Bottom right zoomed in display of estimated path, confidence intervals, and ground measurements. The purple hash marks on the left show the time points where the image profiles were observed.

and maximum absolute deviation (MAD) in meters of the estimated path with respect to the ground measurements, along with the empirical coverage (COV) of the confidence bands. Broadly, the estimates for each data set follow the ground measurements well: the estimated terminus locations for the Nigardsbreen, Gorner, and Rhone glaciers retreated over the measured time period, while the estimated Franz Josef terminus retreated and then advanced in equal measure. The performance of the method varies across the four glaciers, depending on the image quality, density of the profile samples over time, and the amount of systematic error in the profiles. For the Nigardsbreen, Gorner, and Rhone glaciers, the performance is best, with an accuracy of about 8–20 m. This is remarkable given that the resolution of the Landsat images is 30 m. Note that the Gorner and Rhone glaciers have periods of time of several years with no image data, yet the temporal spline model is able to interpolate in these periods. For Franz Josef, the results are less accurate; here the data suffer from heavy systematic noise due to shadows from an adjacent mountain and cloud cover.

Glacier Terminus Estimation...

150 100 0

50

Profile Intensity

200

250

295

0

500

1000 1500 2000 2500 3000

250

250 2005

MIAE = 17.9 MAD = 30.4 COV = .50

200

200

150

0

500 1000

2000

3000

Space along Glacial Path (Meters)

Year

1995

100

150 100

50

50

0

0

1985

Year

1985 1990 1995 2000 2005 2010

Space Along Glacier Path

1700

1800

1900

2000

2100

2200

Space along Glacial Path (Meters)

Figure 6. Rhone results. Top left specified 1-D extraction from an individual 2-D Landsat image of Rhone. Top right extracted 1-D intensity profiles. Bottom left: estimated path (yellow), confidence intervals (black), ground measurements (red dots), with evaluated metrics in the legend. Bottom right zoomed in display of estimated path, confidence intervals, and ground measurements. The purple hash marks on the left show the time points where the image profiles were observed.

8. DISCUSSION In this paper, we have proposed a statistical algorithm that estimates glacial recession as a path of near-inflection points from a time series of image intensity profiles. The proposed method involves three stages. First, the intensity profiles are spatially smoothed using penalized spline smoothing, and the first derivatives of the smoothed profiles are extracted. Second, a fast and robust algorithm is applied to the collection of first derivative profiles that identifies a rough estimate of the glacial terminus location over time. The guiding framework for the proposed algorithm is that the temporal terminus location corresponds to a path that takes the lowest summed first derivative values, and that the terminus changes at a limited speed over time. Third, the rough pilot path of terminus locations is globally smoothed to obtain an estimate of long-term terminus trend. The viability of our method was shown through simulation and real data analyses.

Joseph Usset et al.

200 150 100 50

Profile Intensity

250

296

0

2000

4000

6000

8000

2008

200

150

200

150

2004

MIAE = 64.4 MAD = 137.8 CI = .38

Year

100

100 2000

Year

2000 2002 2004 2006 2008

Space Along Glacier Path

0

2000

4000

6000

8000

Space along the Glacial Path (Meters)

6000

6200

6400

6600

6800

7000

Space along Glacial Path (Meters)

Figure 7. Franz Josef results. Top left specified 1-D extraction from an individual 2-D Landsat image of Franz Josef. Top right Extracted 1-D intensity profiles. Bottom left Estimated path (yellow), confidence intervals (black), ground measurements (red dots), with evaluated metrics in the legend. Bottom right zoomed in display of estimated path, confidence intervals, and ground measurements. The purple hash marks on the left show the time points where the image profiles were observed.

Through simulation, we showed that our approach is more robust to systematic error in the intensity profiles than tracking. Also, we considered different weighting schemes for the pilot path points. Weighting by third derivatives gave the best results, since pilot path points near the true terminus location have higher inflection and are more accurate. The data analyses showed that the performance of the method can vary depending on the glacier of interest. For glaciers with high image quality and densely sampled intensity profiles, such as Nigardsbreen and Gorner, the estimates from our algorithm matched very closely to those obtained from ground measurements. For the Rhone and Franz Josef glaciers, the image quality was corrupted by more systematic noise, and the profiles were observed less densely. Therefore, the estimates were somewhat less accurate. However, in these situations, the method qualitatively showed fidelity to empirical measurements of terminus location. A limitation of our approach is that the pilot path algorithm is conditional on the smooth first derivative estimates of the intensity profiles, and also, the temporal smoothing is conditional on the output from the path algorithm, thereby making it difficult to properly estimate standard errors. While we showed the viability of our method through simulation, it would

Glacier Terminus Estimation...

297

be ideal to incorporate and propagate the uncertainty that exists in the initial stages of estimation. To address this limitation, we considered bootstrap and jackknife procedures to account for estimation uncertainty. For the bootstrap, the data are difficult to effectively resample due to the systematic error in the profiles, whose underlying distribution is unknown. For jackknife estimation, the leave-one-out path estimates are highly correlated, but this correlation is challenging to quantify due to the algorithmic nature of the fitting procedure. Without knowledge of this correlation of the leave-one-out estimates, estimating the inflation factor for the jackknife variance is difficult. An alternative direction we attempted for estimation of the target terminus location is to use maximum likelihood. In this framework, each profile is represented by a multivariate normal distribution whose mean is a sigmoid function with an inflection point that corresponds to the true glacial terminus location. By building a likelihood of the profiles observed over time, the terminus path can estimated. An advantage of this approach is that standard likelihood-based tools for uncertainty quantification are available. However, we decided against this approach because the systematic error in the profiles is difficult to specify in a general way.

[Received June 2014. Accepted April 2015. Published Online May 2015.]

REFERENCES Bintanja, R., van de Wal, R. S., and Oerlemans, J. (2005) “Modelled atmospheric temperatures and global sea levels over the past million years,” Nature, 437(7055):125–128. Bradley, R. S., Mann, M., and Hughes, M. K. (1999) “Northern hemisphere temperatures during the past millennium: inferences, uncertainties, and limitations,” Geophysical Research Letters, 26(6):759–762. Eilers, P. H. and Marx, B. D. (1996) “Flexible smoothing with b-splines and penalties,” Statistical Science, 11:89– 102. Golub, G. H., Heath, M., and Wahba, G. (1979) “Generalized cross-validation as a method for choosing a good ridge parameter,” Technometrics, 21(2):215–223. Irish, R. R. (2000) “Landsat 7 automatic cloud cover assessment,” In AeroSense 2000. International Society for Optics and Photonics. pp. 348–355. Joo, J.-H., Qiu, P. (2009) “Jump detection in a regression curve and its derivative,” Technometrics, 51(3):289–305. Kachouie, N. N., Gerke, T., Huybers, P., and Schwartzman, A. (2014) “Non-parametric regression for estimation of spatiotemporal mountain glacier retreat from satellite images,” IEEE Transactions on Geosciences and Remote Sensing, 53:1135–1145. Kachouie, N. N., Huybers, P., and Schwartzman, A. (2012) “Localization of mountain glacier termini in landsat multi-spectral images,” Pattern Recognition Letters, 34:94–104. Landsat.org. http://landat.org/. Landsat-US National Aeronautics and Space Administration (NASA). http://landsat.gsfc.nasa.gov/. Loader, C. R. (1996) “Change point estimation using non-parametric regression,” The Annals of Statistics, 24(4):1667–1678. Marra, G. and Wood, S. N. (2012) “Coverage properties of confidence intervals for generalized additive model components,” Scandinavian Journal of Statistics, 39(1):53–74. Muller, H.-G. (1992) “Change-points in non-parametric regression analysis,” The Annals of Statistics, 20(2):737– 761.

298

Joseph Usset et al.

Nychka, D. (1988) “Bayesian confidence intervals for smoothing splines,” Journal of the American Statistical Association, 83(404):1134–1143. Oerlemans, J. (2000) “Holocene glacier fluctuations: Is the current rate of retreat exceptional?,” Annals of Glaciology, 31(1):39–44. Oerlemans, J. and van de Wal, R. (1995) “Response of valley glaciers to climate change and kinematic waves: A study with a numerical ice-flow model,” Journal of Glaciology, 41(137):142–152. Ruppert, D. (2002) “Selecting the number of knots for penalized splines, ” Journal of Computational and Graphical Statistics, 11(4):735–757. Salomonson, V. and Appel, I. (2004) “Estimating fractional snow cover from modis using the normalized difference snow index,” Remote Sensing of Environment, 89(3):351–360. Seidel, K. and Martinec, J. (2004) Remote Sensing in Snow Hydrology: Runoff Modelling, Effect of Climate Change, Springer: Berlin. The Swiss Glacier Inventory 2000 (SGI 2000). http://www.geo.unizh.ch/fpaul/SGI2000/. US Geographical Survey (USGS). http://landat.usgs.gov/. Wahba, G. (1983) “Bayesian “confidence intervals” for the cross-validated smoothing spline,” Journal of the Royal Statistical Society. Series B (Methodological), 45:133–150. Wood, S. N. (2006) Generalized Additive Models: An Introduction with R, Chapman and Hall/CRC: Boca Raton, Florida. World Glacier Monitoring Service. http://www.wgms.ch/.