Click Here
WATER RESOURCES RESEARCH, VOL. 43, W05407, doi:10.1029/2006WR005258, 2007
for
Full Article
Stochastic simulation model for nonstationary time series using an autoregressive wavelet decomposition: Applications to rainfall and temperature Hyun-Han Kwon,1 Upmanu Lall,1 and Abedalrazq F. Khalil1 Received 19 June 2006; revised 2 February 2007; accepted 15 February 2007; published 2 May 2007.
[1] A time series simulation scheme based on wavelet decomposition coupled to an
autoregressive model is presented for hydroclimatic series that exhibit band-limited lowfrequency variability. Many nonlinear dynamical systems generate time series that appear to have amplitude- and frequency-modulated oscillations that may correspond to the recurrence of different solution regimes. The use of wavelet decomposition followed by an autoregressive model of each leading component is explored as a model for such time series. The first example considered is the Lorenz-84 low-order model of extratropical circulation, which has been used to illustrate how chaos and intransitivity (multiple stable solutions) can lead to low-frequency variability. The central England temperature (CET) time series, the NINO3.4 series that is a surrogate for El Nino–Southern Oscillation, and seasonal rainfall from Everglades National Park, Florida, are then modeled with this approach. The proposed simulation model yields better results than a traditional linear autoregressive (AR) time series model in terms of reproducing the time-frequency properties of the observed rainfall, while preserving the statistics usually reproduced by the AR models. Citation: Kwon, H.-H., U. Lall, and A. F. Khalil (2007), Stochastic simulation model for nonstationary time series using an autoregressive wavelet decomposition: Applications to rainfall and temperature, Water Resour. Res., 43, W05407, doi:10.1029/2006WR005258.
1. Introduction [2] Stochastic hydrologic methods have been very useful for a variety of water resources problems where temporal uncertainty needs to be quantified. The time series models that were developed extensively since the 1960s have typically assumed that the series modeled comes from a stationary or cyclostationary process. Thus the literature [Box and Jenkins, 1970; Thomas and Fiering, 1962; Yevjevich, 1972] has developed around autoregressive moving average models and their extensions to consider seasonality through periodic terms. Multisite models and space-time disaggregation approaches have also been considered [Koutsoyiannis, 1994; Stedinger and Vogel, 1984; Valencia and Schaake, 1973]. [3] However, as record lengths have increased, hydrologists have become aware of the low-frequency structure (e.g., oscillations such ENSO, PDO, NAO) of climate and associated hydrologic time series [Dettinger et al., 1995; Ghil and Vautard, 1991; Keppenne and Ghil, 1992a; Keppenne and Lall, 1996; Lall and Mann, 1995; Mann and Park, 1993, 1994, 1996]. Traditionally, an ARMA ( p, q) model is considered for such a time series. Such a model is capable of generating linear oscillations, even with relatively low values of p and q. However, applica1 Department of Earth and Environmental Engineering, Columbia University, New York, New York, USA.
Copyright 2007 by the American Geophysical Union. 0043-1397/07/2006WR005258$09.00
tions of such models often do not reproduce the spectral signature of the time series, specifically the amplitudefrequency modulation over time that is seen in moving window spectra or wavelet spectra. Such temporal variations could be generated by (1) nonlinear autoregressive models [e.g., Tong, 1990], (2) time varying AR [e.g., Huang and Chalabi, 1995], (3) autoregressive conditionally heteroscedastic (ARCH) models [e.g., Ahn and Kim, 2005], (4) nonparametric approximations to nonlinear autoregressive models [e.g., Lall and Sharma, 1996], (5) hidden Markov models [e.g., Hughes and Guttorp, 1999; Robertson et al., 2004], and (6) empirical decompositions of the low-frequency signals and noise using spectral methods followed by linear autoregressive models [e.g., Jiang et al., 1995; Keppenne and Ghil, 1992a, 1992b]. The idea of decomposing the time series into components that correspond to simpler dynamical systems with characteristic dynamics at particular scales is pragmatically attractive, even though a low-order nonlinear autoregressive model [e.g., Abarbanel, 1996; Abarbanel and Lall, 1996] that automatically simulates the desired behavior may be more intellectually satisfying. A multitaper (MTM) based approach along with AR modeling in a forecasting context was developed by Rajagopalan et al. [1998]. [4] An objective of this study is to explore the use of autoregressive models with wavelet decomposition as a time series simulator for systems with (1) quasiperiodic long memory behavior or (2) nonlinear dynamics that may lead to persistent regime like behavior or (3) stochastic
W05407
1 of 15
W05407
KWON ET AL.: REGIONAL CALIBRATION OF CATCHMENT MODELS
W05407
Figure 1. WARM simulation procedure. The time series is decomposed using a continuous wavelet transform using the Morlet wavelet basis, and the components whose global wavelet power exceeds the 95% of the spectrum associated with a red noise background are retained. Optimum lag orders of each dominant component are estimated by the Akaike information criteria, and autoregression parameters are estimated via maximum likelihood under the assumption that the residuals are normally distributed.
regime transitions, without a priori specifying any of these modeling structures. The continuous wavelet transform is applied to decompose a univariate time series into several statistically significant components and then a linear AR model is employed to simulate each component extracted from wavelet transform analysis, as well as the residual ‘‘noise’’ term. The intent of this study was exploratory procedures; to attain the best possible fit to a data set were not an explicit goal. [5] The general approach is described in the next section. Four applications follow. The first considers time series data from a deterministic, numerical model that provides some motivation for the physical context of the problem. The next three test the approach with some relatively long hydroclimatic time series where the ability to reproduce interannual variability may be of interest. A discussion of the
applicability of the method and of potential future directions concludes the paper.
2. Wavelet Autoregressive Model [6] Consider a time series xt, t = 1,. . ., N, recorded at monthly intervals, that exhibits quasi-oscillatory, lowfrequency variations at intraseasonal, interannual and longer timescales, as seen in many hydroclimatic time series. Consider the decomposition (see Figure 1) of this series into K component series Rkt that represent ‘‘signal’’ and a residual term et. xt ¼
K X
Rkt þ et
ð1Þ
k¼1
[7] The decomposition in (1) considers that there are K orthogonal or independent series that carry the low-frequency
2 of 15
KWON ET AL.: REGIONAL CALIBRATION OF CATCHMENT MODELS
W05407
information, and the residual, et, is a stochastic process. The notion is that the dynamics of each of these terms (Rkt and et) is simpler to model using an autoregressive model than an autoregressive model for the composite dynamics of all the components. In general, each could be modeled using an appropriate time series technique (e.g., linear of nonlinear). Here we consider a linear autoregressive model for each term, leading to the following model structure: xt ¼
K X
functions have been proposed [Foufoula-Georgiou and Kumar, 1995; Torrence and Compo, 1998]. Here we have 2 used the Morlet wavelet, defined as 8 (t) = p1/4eiwotet /2, where w0 is a frequency. To estimate the continuous wavelet transform, an N times convolution of function (1) is done for each scale, where N is the number of points in the time series [Kaiser, 1994]. Numerically, we are able to estimate the wavelet power spectrum in Fourier space, at N given points, using a discrete Fourier transform (xn) as done by Torrence and Compo [1998].
ARðRkt ; pk ÞþARðet ; pÞ
k¼1
¼
pk K X X k¼1
W05407
! ak;i Rti;k þ vt;k
þ
p X
^xj ¼ b j etj þ wt
[8] Here pk is the order of an autoregressive model fit to the kth signal, ak,i are the corresponding autoregression coefficients; p is the order of an autoregressive model fit to the stochastic process et, with bj as the associated autoregression coefficients, and vt,k and wt are independent, identically distributed, noise processes. Note that this is a very simple additive model with no interactions across the noise or the signals. The question is whether such a model can capture the low- and high-frequency dynamics of hydroclimatic time series better than a traditional autoregressive model applied directly to the time series xt. Noting that the order ( pk or p) of each autoregressive model and the coefficients of each autoregression in (2) can be estimated using standard time series methods (e.g., maximum likelihood and Akaike information criteria (AIC) [Akaike, 1974]), the procedures for the selection of the ‘‘signals’’ Rkt in the decomposition in equation (1) are described next. 2.1. Wavelet Transform for Time Series Decomposition [9] The wavelet transform has been applied to many geophysical time series [e.g., Farge, 1992; FoufoulaGeorgiou and Kumar, 1995; Hubbard, 1996; Kulkarni, 2000; Torrence and Compo, 1998; Wang and Wang, 1996; Weng and Lau, 1994]. The main advantage cited for wavelet transforms over other spectral methods is that they permit an orthogonal decomposition of the original signal in the time and in the frequency domain. This is an attractive property in the context of the additive decomposition suggested in equation (1). We briefly summarize the presentation of the wavelet transform by Torrence and Compo [1998] for our problem context. [10] The term wavelets refers to sets of functions of the form 8b,a (t) = jaj1/28 ((t b)/a), i.e., sets of functions formed by dilation and translation of a single function 8 (t), called the mother wavelet. The continuous wavelet transform of a real time series x(t) is defined by [Chui, 1992]: X ðb; aÞ ¼ jaj1=2
Z
þ1
xðt Þ8*ððt bÞ=aÞdt
ð4Þ
ð2Þ
j¼1
i¼1
N 1 1 X xn expð2pijn=N Þ N n¼0
ð3Þ
1
where X(b, a) is a wavelet spectrum, 8 (t) is a wavelet function, the (*) indicates the complex conjugate, b is the translation (shift) parameter and a 6¼ 0 is the scale parameter. By localizing the wavelet function at t b = 0 and then by computing the coefficients X(b, a) we can explore the behavior of x(t) near t = b. A variety of wavelet
where j = 0,. . ., N 1 is the frequency index. In the continuous limit, the Fourier transform of a function 8 (t/a) ^ (aw). By the convolution theorem, the wavelet is given by 8 transform is the inverse Fourier transform of the product: Xn ðaÞ ¼
N 1 X
^ * awj exp iwj nd t ^xj 8
ð5Þ
j¼0
[11] We can interpret the information in the wavelet power spectrum at each time and scale in two ways, as suggested by Torrence and Compo [1998]. One is the timeintegrated variance of energy coefficients at every scale to construct the global wavelet power and another one is the scale-integrated variance of energy coefficients over time to compute the scale-averaged wavelet power (SAWP) as [12] GWP N 1 1 X Xn2 ðaÞ ¼ jXn ðaÞj2 N n¼0
ð6Þ
[13] SAWP dj dt Xn2 ðtÞ ¼ Cd
j 2 2 X Xn aj j¼j1
aj
ð7Þ
[14] The Cd is a reconstruction coefficient and is a constant for each wavelet function; j1 and j2 are scales over which the averaging takes place; dj and dt are the scale averaging coefficient, sampling period respectively. [15] The GWP for xt is computed as a function of the scale a. A ‘‘red noise’’ significance test for the GWP(x) is then developed following the procedure described by Torrence and Compo [1998]. It entails first fitting an AR(1) model to xt, and then computing its Fourier spectrum, and the associated one-sided 95% confidence limits as a function of frequency. The scales aj for which the GWP(x) spectrum is higher than the red noise significance level are then selected as candidates for reconstruction. Conceptually, this decomposition is similar to what a Hidden Markov Model (HMM) may try to do. A HMM introduces latent variables that cluster the state-space in a way such that regime transitions from one state or cluster to another follow a Markov process. Given the transition probability matrix across such states, low-frequency, quasiperiodic behavior could be generated. The scale decomposition is explicit in the wavelet approach while it is implicit in the HMM.
3 of 15
KWON ET AL.: REGIONAL CALIBRATION OF CATCHMENT MODELS
W05407
[16] Following the reconstructed component series Rkt can be estimated by the sum of the real part of the wavelet transform over the scale associated with the kth component as 1=2
Rkn ¼
dk dt