Robust Autocorrelation Estimation

Christopher C. Chang and Dimitris N. Politis
Department of Mathematics, University of California at San Diego, La Jolla, CA 92093-0112, USA.
E-mails: [email protected], [email protected]

24 Apr 2014
Abstract

In this paper, we introduce a new class of robust autocorrelation estimators based on interpreting the sample autocorrelation function as a linear regression. We investigate the efficiency and robustness properties of the estimators that result from employing three common robust regression techniques. Construction of robust autocovariance and positive definite autocorrelation estimates is discussed, as well as application to AR model fitting. Simulation studies with various outlier configurations are performed in order to compare the different estimators.

Keywords: autocorrelation; regression; robustness.
1 Introduction

The estimation of the autocorrelation function plays a central role in time series analysis. For example, when a time series is modeled as an AutoRegressive (AR) process, the model coefficient estimates are straightforward functions of the estimated autocorrelations [1]. Recall that the autocovariance function (acvf for short) of a wide-sense stationary time series $\{X_t\}$ is defined as $\gamma(h) := E[(X_{t+h} - \mu)(X_t - \mu)]$ where $\mu := E[X_t]$; similarly, the autocorrelation function (acf for short) is $\rho(h) := \gamma(h)/\gamma(0)$. Given data $X_1, \ldots, X_n$, the classical estimator of the acvf is the sample acvf

$$\hat\gamma(h) := n^{-1} \sum_{j=1}^{n-h} (X_{j+h} - \bar X)(X_j - \bar X)$$

for $|h| < n$, where $\bar X := n^{-1} \sum_{j=1}^{n} X_j$. The classical estimator of the acf is simply $\hat\rho(h) := \hat\gamma(h)/\hat\gamma(0)$.
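For concreteness, the classical estimators can be computed directly from these definitions; the following minimal R sketch (the helper names are ours) reproduces what R's built-in acf() already provides.

```r
# Classical (non-robust) sample acvf and acf, directly from the definitions.
sample_acvf <- function(x, h) {
  n    <- length(x)
  xbar <- mean(x)
  sum((x[(1 + h):n] - xbar) * (x[1:(n - h)] - xbar)) / n
}
sample_acf <- function(x, h) sample_acvf(x, h) / sample_acvf(x, 0)

# Example: agrees with R's built-in acf(), which uses the same definition.
set.seed(1)
x <- arima.sim(model = list(ar = 0.8), n = 200)
c(sample_acf(x, 1), acf(x, lag.max = 1, plot = FALSE)$acf[2])
```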
However, the classical estimator is not robust: just one contaminated point is enough to corrupt it and mask the real dependence structure. Since it is not uncommon for 10% or more of measured time series values to be outliers [4], this is a serious problem. To address it, several robust autocorrelation estimators have been proposed [3] [9] [12] [13]. Due to the limitations of older computers, these techniques were not widely adopted; instead, explicit detection and removal of outliers was typically employed. However, today’s hardware is powerful enough to support robust estimation in most contexts, and it is far from clear that classical estimation with explicit outlier elimination is more effective in practice than a well-chosen robust estimator. In the paper at hand, we propose a new class of robust autocorrelation estimators, based on constructing an autoregressive (AR) scatterplot, and applying robust regression to it. The remainder of this paper is structured as follows: In section 2, we introduce the new class of robust autocorrelation estimators. Next, in section 3, we analyze the estimators that result from using three common robust regression techniques, and compare their performance to that of the sample acf. Then, in sections 4 and 5, we discuss the derivation of autocovariance and positive definite autocorrelation estimates from our initial estimator. We apply our method to robust AR model fitting in section 6. Finally, we present the results of our simulation study (including one real data example) in section 7.
2 Scatterplot-based robust autocorrelation estimation

Assume we have data $X_1, \ldots, X_n$ from the wide-sense stationary time series $\{X_t\}$ discussed in the Introduction. Fix a lag $h < n$ where $h \in \mathbb{Z}^+$, and consider the scatterplot associated with the pairs $\{(X_t - \bar X,\, X_{t+h} - \bar X) \ \text{for}\ t \in \{1, \ldots, n-h\}\}$; see Figure 1 for an example with $h = 1$. [Here, and in what follows, we use the R convention of scatterplots, i.e., the pairs $(x_i, y_i)$ correspond to a regression of $y$ on $x$.] If the time series $\{X_t\}$ satisfies the causal AR(p) model

$$X_t = \phi_1 X_{t-1} + \ldots + \phi_p X_{t-p} + z_t \qquad (1)$$

with $z_t$ being iid $(0, \sigma^2)$, then $E[X_{t+h} - \mu \mid X_t] = (X_t - \mu)\rho(h)$; this includes the case when $\{X_t\}$ is Gaussian. In the absence of a causal AR(p) (or AR($\infty$)) model with respect to iid errors, the above equation is modified to read

$$\bar E[X_{t+h} - \mu \mid X_t] = (X_t - \mu)\rho(h)$$

provided the spectral density of $\{X_t\}$ exists and is strictly positive. In the above, $\bar E[Y \mid X]$ denotes the orthogonal projection of $Y$ onto the linear space spanned by $X$; see e.g. Kreiss, Paparoditis, and Politis [11].
Figure 1: Scatterplot of $(X_t, X_{t+1})$ for a realization of length 50 from the AR(1) time series $X_t = 0.8X_{t-1} + Z_t$, $Z_t$ iid $N(0, 1)$. Regression line is $y = 0.82375x + 0.01289$.

In either case, it is apparent that the point $(X_t - \bar X,\, X_{t+h} - \bar X)$ on the scatterplot should tend to be close to the line $y = \rho(h)x$, and one would expect the slope of a regression line through the points to be a reasonable estimate of the autocorrelation $\rho(h)$. This works well, as Figure 1 shows. Indeed, it is well known that the (Ordinary) Least Squares (OLS) fit is almost identical to the sample acf for $h/n$ small.
To elaborate, if the points on the scatterplot are denoted $(x_i, y_i)$, i.e., letting $y_i = X_{i+h}$ and $x_i = X_i$, we have

$$\hat\rho_{OLS}(h) = \frac{\sum_{j=1}^{n-h} (x_j - \bar x)(y_j - \bar y)}{\sum_{j=1}^{n-h} (x_j - \bar x)^2}
= \frac{\sum_{j=1}^{n-h} (X_{j+h} - \bar X_{(h+1)\ldots n})(X_j - \bar X_{1\ldots(n-h)})}{\sum_{j=1}^{n-h} (X_j - \bar X_{1\ldots(n-h)})^2}
\approx \frac{\sum_{j=1}^{n-h} (X_{j+h} - \bar X)(X_j - \bar X)}{\frac{n-h}{n}\sum_{j=1}^{n} (X_j - \bar X)^2}
= \frac{n}{n-h}\,\hat\rho(h)$$

where the notation $\bar x_{a\ldots b} := (b - a + 1)^{-1} \sum_{j=a}^{b} x_j$ and $\bar x := \bar x_{1\ldots n}$ has been used. The additional $\frac{n}{n-h}$ factor is expected, since the regression slope is supposed to be an unbiased estimator while the sample acf is biased by construction. The only other difference between $\hat\rho_{OLS}(h)$ and $\hat\rho(h)$ is the inclusion/exclusion of the first and last time series points in computing sample mean and variance; the impact of that is negligible.

Since $\hat\rho_{OLS}(h)$ is based on simple OLS regression, the implication is that if we run a robust linear regression on the pairs $\{(X_t, X_{t+h})\}$, we should get a robust estimate of autocorrelation. We therefore define $\hat\rho_{ROBUST}(h)$ to be the estimator of slope $\beta_1$ in the straight-line regression

$$X_t = \beta_0 + \beta_1 X_{t-h} + \text{error} \qquad (2)$$

using any robust method to fit the regression.
In what follows, we investigate in more detail three possibilities for the aforementioned robust regression fitting that result in three different robust acf estimators. Note, however, that other robust regression techniques could alternatively be used in the context of producing $\hat\rho_{ROBUST}(h)$. (A brief R sketch of all three fits appears after this list.)

1. $\hat\rho_{L1}$. Recall that a residual of a linear regression is the vertical distance between the point $(x_i, y_i)$ and the regression line; i.e., given the regression line $y = \beta_1 x + \beta_0$, we have $r_i(\beta) = (\beta_1 x_i + \beta_0) - y_i$. The simplest robust regression technique, L1 regression, selects $\beta$ to minimize $\sum_i |r_i(\beta)|$ instead of the usual $\sum_i r_i(\beta)^2$; the effect is to find a "median regression line".

2. $\hat\rho_{LTS}$. Least trimmed squares regression, or LTS for short, takes a different approach: instead of changing the pointwise loss function, we use the usual squared residuals but throw the largest values out of the sum. More precisely, define $|r|_{(1)} \leq \ldots \leq |r|_{(n-h)}$ to be the ordered residual absolute values. $\alpha$-trimmed squares minimizes

$$\hat\sigma := \left( \sum_{j=1}^{\lceil (1-\alpha)(n-h) \rceil} |r|_{(j)}^2 \right)^{1/2}.$$

We look at $\alpha$-trimmed squares for $\alpha = \frac{1}{2}$ (i.e., we minimize the sum of the smallest 50% of the squared residuals).
3. $\hat\rho_{MM}$. An M-estimate [5] minimizes

$$L(\beta) := \sum_{i=1}^{n} \ell\!\left(\frac{r_i(\beta)}{\hat\sigma}\right)$$

for some pointwise loss function $\ell$, where $\hat\sigma$ is an estimate of the scale of the errors. It is efficient, but not resistant to outliers in the $x$ values. A "redescending" M-estimate utilizes a loss function whose derivative decreases to zero at the tails (as opposed to a monotone increasing derivative). In contrast, an S-estimate (S for "scale") minimizes a robust estimate of the scale of the residuals:

$$\hat\beta := \arg\min_\beta \hat\sigma(r(\beta))$$

where $r(\beta)$ denotes the vector of residuals and $\hat\sigma$ satisfies

$$\frac{1}{n} \sum_{j=1}^{n-h} \ell\!\left(\frac{r_j}{\hat\sigma}\right) = \delta.$$

($\delta$ is usually chosen to be $\frac{1}{2}$.) It has superior robustness, but is inefficient. MM-estimates, pioneered by Yohai [20], combine these two techniques in a way intended to retain the robustness of S-estimation while gaining the asymptotic efficiency of M-estimation. Specifically, an initial robust-but-inefficient estimate $\hat\beta_0$ is computed, then a scale M-estimate of the residuals, and finally the iteratively reweighted least squares algorithm is used to identify a nearby $\hat\beta$ that satisfies the redescending M-estimate equation. For further discussion on the above three robust regression techniques, see Maronna et al. [13].
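To make the three options concrete, here is a minimal R sketch using the CRAN packages quantreg (for L1) and robustbase (for LTS and MM). The wrapper name rho_robust is ours, and the MM fit below uses lmrob's default efficiency tuning rather than the 85% normal efficiency used in our simulations (which would be configured through lmrob.control).

```r
# Sketch: rho_hat_ROBUST(h) as the slope of a robust straight-line fit
# to the lag-h scatterplot (X_{t-h}, X_t).
library(quantreg)    # rq():     L1 (median) regression
library(robustbase)  # ltsReg(): least trimmed squares;  lmrob(): MM-estimation

rho_robust <- function(x, h, method = c("L1", "LTS", "MM")) {
  method <- match.arg(method)
  n  <- length(x)
  xx <- x[1:(n - h)]         # regressor: X_{t-h}
  yy <- x[(1 + h):n]         # response:  X_t
  fit <- switch(method,
    L1  = rq(yy ~ xx),                   # minimizes sum |r_i|
    LTS = ltsReg(yy ~ xx, alpha = 0.5),  # keeps the smallest 50% of squared residuals
    MM  = lmrob(yy ~ xx))                # S-estimate refined by a redescending M-step
  unname(coef(fit)[2])       # slope = rho_hat_ROBUST(h); clip to [-1, 1] if desired (cf. Remark 3.1)
}

# Example usage on a simulated AR(1) series:
# set.seed(42); x <- arima.sim(model = list(ar = 0.8), n = 200)
# sapply(c("L1", "LTS", "MM"), function(m) rho_robust(x, h = 1, method = m))
```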
3 Theoretical Properties

3.1 General

We focus our attention on normal efficiency and two measures of robustness (breakdown point and influence function). Relative normal efficiency is the ratio between the asymptotic variance of the classical estimator and that of another estimator under consideration, assuming Gaussian residuals and no contamination. This is a measure of the price we are paying for any robustness gains. The breakdown point (BP) is the asymptotic fraction of points that can be contaminated without entirely masking the original relation. Now, in the case of time series modeled by ARMA (AutoRegressive Moving Average) processes, we distinguish two types of outliers [3]:

1. Innovation outliers affect all subsequent observations, and can be observed in a pure ARMA process with a heavy-tailed innovation distribution.

2. Additive outliers or replacement outliers exist outside the ARMA process and do not affect other observations.

For second-order stationary data, the difference between the two notions is minimal (a replacement outlier functions like a slightly variable additive outlier), so for brevity we just concern ourselves with additive outliers.
Figure 2: Degenerate regression line from 50 N(0,1) points contaminated by one outlier at 1000.

For additive outliers, the classical autocorrelation estimator has a breakdown point of zero, since a single very large outlier is enough to force the estimate to a neighborhood of $\frac{-1}{n-h}$; see Figure 2 for an illustration.
Since one additive outlier influences the position of at most two points in the regression, our robust autocorrelation estimators will exhibit BPs at least half that of the robust regression techniques they are built on; see Ma and Genton [12] for a more exhaustive discussion of the "temporal breakdown point". The impact of an innovation outlier on the regression line varies. For instance, only one point is moved off the regression line in the AR(1) case, but three points are affected in the MA(1) case. So in the former scenario, our robust autocorrelation estimators can be expected to fully inherit the BPs of the robust regressors with respect to innovation outliers, but we cannot expect as much reliability with MA models; for more details, see Genton and Lucas [6] [7], who analyze in detail how dependence can adversely affect the BP. Perhaps surprisingly, infinite-variance symmetric alpha-stable innovation distributions result in a faster sample acf convergence rate than the finite-variance innovation case [2]. We will perform simulations to investigate whether our robust regression estimates are able to keep up.

Next, the influence function (IF) describes the impact on the autocorrelation estimate $\hat\rho_{ROBUST}$ of adding an infinitesimal probability of an innovation or additive outlier. Given an innovation or residual distribution $F$, let $\hat\rho_{ROBUST}(F)$ denote the value the robust autocorrelation estimate converges to, as the number of observations increases without bound. Then the influence function can be described as

$$IF(x, \hat\rho_{ROBUST}, F) = \lim_{\epsilon \to 0^+} \frac{\hat\rho_{ROBUST}\big((1-\epsilon)F + \epsilon\Delta_x\big) - \hat\rho_{ROBUST}(F)}{\epsilon}$$

for $x$ such that this limit exists, where $\Delta_x$ denotes a probability point mass at $x$. This is a measure of the asymptotic bias caused by observation contamination [12]. For example, the value of the classical estimator's influence function increases without bound as $|x| \to \infty$, since the numerator in the limit converges to a nonzero constant while the denominator goes to zero.

Remark 3.1. As a final note, our robust estimator $\hat\rho_{ROBUST}(h)$ is not guaranteed to be in the range $[-1, 1]$. As an example, consider the $n = 3$ time series dataset $\{1, 2, 0\}$; for $h = 1$, the slope of the straight-line regression equals $-2$. If an estimator of $\rho(h)$ for a fixed $h$ is sought, then an easy fix is to 'clip' the estimator $\hat\rho_{ROBUST}(h)$ to the values $-1$ and $1$. In other words, estimate $\rho(h)$ by $\hat\rho_{ROBUST}(h)$ when the latter is in the interval $[-1, 1]$; else, estimate $\rho(h)$ by $1$ when $\hat\rho_{ROBUST}(h) > 1$, and by $-1$ when $\hat\rho_{ROBUST}(h) < -1$. On the other hand, if it is desired to estimate $\rho(k)$ for many lags, e.g. $k = 1, \ldots, h$, a more elaborate fix is required; in that respect, see Section 5, and in particular, Remark 5.1.
3.2 L1
Because the x-coordinates are not fixed, ρˆL1 does not inherit all the asymptotic robustness advantages normally enjoyed by L1 regression. Any outlier in the middle of the time series appears as both an x- and a y-coordinate, and while L1 regression shrugs off the y outlier, the x outlier point can have an extreme influence on it. Therefore, the BP is zero in the additive outliers case and the influence function increases without bound again. Since, if the underlying process is AR(1), an additive outlier can have an effect similar to that of two adjacent innovation outliers, the same results hold in the innovation outliers case.
3.3 LTS

LTS regression exhibits the highest possible breakdown point ($\frac{1}{2}$). It is robust with respect to both $x$- and $y$-outliers, so $\hat\rho_{LTS}$ retains the $\frac{1}{2}$ BP in the AR(1) innovation outliers case and has a BP of at least $\frac{1}{4}$ with respect to additive outliers. The influence function decreases to zero at the tails, since the probability of mistaking the outlier for a "real" point declines exponentially in $n$. It also exhibits the optimal convergence rate, but has a very low normal efficiency of around 7% [18]; this weakness is clearly visible in our simulations.
3.4 MM

MM-estimates also have an asymptotic breakdown point of $\frac{1}{2}$ and are resistant to both $x$- and $y$-outliers, so $\hat\rho_{MM}$ has a BP of $\frac{1}{2}$ in the innovation outliers case and at least $\frac{1}{4}$ in the additive outliers case. The normal efficiency is actually a user-adjustable parameter. In practice, it is usually chosen to be between 0.7 and 0.95; aiming for an even higher normal efficiency results in too large a region where the MM-estimate tracks the performance of the classical estimator rather than exhibiting the S-estimate's robustness. We use 0.85 in our simulations.
4 Robust autocovariance estimation

Our robust acf estimators can be converted into autocovariance estimators via multiplication by a robust estimate of variance. This could proceed as follows (a minimal R sketch of the whole procedure is given after the list):

1. First, obtain a robust estimate of location. Building on equation (2), from each autocorrelation regression we perform, we can derive an estimate of the process mean $\mu$: since the regression line through the centered data should have zero intercept,

$$X_t - \mu = \beta_1 (X_{t-h} - \mu) + \text{error}, \qquad (3)$$

and combining eq. (2) and (3) gives $\beta_0 = \mu(1 - \beta_1)$. Finally, let

$$\hat\mu_h := \frac{\hat\beta_0}{1 - \hat\beta_1}$$

where $\hat\beta_0, \hat\beta_1$ are robust estimates of $\beta_0, \beta_1$ from the robust fit of the straight-line regression (2).

2. Each value of $h > 0$ used in the above step will yield a distinct estimator of location denoted by $\hat\mu_h$. We can now use the median of the distinct values $\hat\mu_k$ for $k = 1, \ldots,$ (some) $p$ to arrive at a single, better estimator for $\mu$ denoted by $\hat\mu$. For example, we can use L1 notions and take $\hat\mu$ as the median of the values $\{\hat\mu_k$ for $k = 1, \ldots, p\}$. Alternatively, LTS regression can be applied to $\hat\mu_k$ for $k = 1, \ldots, p$ in order to boil them down to a single estimate.

3. Since $(X_t - \mu)^2 = \gamma(0) + \text{error}$, we can robustly estimate $\gamma(0)$ by using L1 (or LTS) on the centered sample values $(X_t - \hat\mu)^2$ for $t = 1, \ldots, n$. E.g., let $\hat\gamma_{ROBUST}(0)$ be the median of $(X_t - \hat\mu)^2$ for $t = 1, \ldots, n$.

4. Finally, we define our robust autocovariance estimator for general $h$ as $\hat\gamma_{ROBUST}(h) = \hat\rho_{ROBUST}(h) \cdot \hat\gamma_{ROBUST}(0)$.
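A minimal R sketch of steps 1-4 follows; it uses the MM fit (robustbase::lmrob) for every lag-k regression, and the function name gamma_robust as well as the default of p = 5 location lags are our own choices.

```r
# Sketch of the robust acvf construction (steps 1-4 above), for h >= 1.
library(robustbase)

gamma_robust <- function(x, h, p = 5) {
  n <- length(x)
  # Step 1: location estimate mu_hat_k from each lag-k regression (2),
  #         via mu_hat_k = beta0_hat / (1 - beta1_hat).
  mu_k <- sapply(1:p, function(k) {
    b <- coef(lmrob(x[(1 + k):n] ~ x[1:(n - k)]))
    b[1] / (1 - b[2])
  })
  # Step 2: combine the p location estimates via their median (an L1 notion).
  mu_hat <- median(mu_k)
  # Step 3: robust estimate of gamma(0): median of the squared centered values.
  gamma0 <- median((x - mu_hat)^2)
  # Step 4: robust acvf at lag h = rho_hat_ROBUST(h) * gamma_hat_ROBUST(0).
  rho_h <- coef(lmrob(x[(1 + h):n] ~ x[1:(n - h)]))[2]
  unname(rho_h * gamma0)
}
```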
Remark 4.1. The estimators $\hat\rho_{ROBUST}(h)$ and $\hat\gamma_{ROBUST}(h)$ are robust point estimators of the values $\rho(h)$ and $\gamma(h)$. To construct confidence intervals and hypothesis tests based on these estimators, an approximation to their sampling distribution would be required. Such an approximation appears analytically intractable at the moment without imposing assumptions that are so strong as to make the whole setting uninteresting, e.g., assuming that the time series $\{X_t\}$ is Gaussian. To avoid such unrealistic assumptions, the practitioner may use a bootstrap approximation to the sampling distribution of the estimators $\hat\rho_{ROBUST}(h)$ and/or $\hat\gamma_{ROBUST}(h)$. Several powerful resampling methods for time series data have been developed in the last 25 years, e.g., blocking methods, AR-sieve bootstrap, frequency domain methods, etc.; see the recent review by Kreiss and Paparoditis [10] and the references therein. It is unclear at the moment which of these methods would be preferable in the context of $\hat\rho_{ROBUST}(h)$ and $\hat\gamma_{ROBUST}(h)$. A benchmark method that typically works under the weakest assumptions is Subsampling; see Politis, Romano, and Wolf [17]. There are other approaches to robust autocovariance estimation in the literature; see e.g. Ma and Genton [12] and the references therein.
5 Robust positive definite estimation of autocorrelation matrix

Let $\Sigma$ denote the $n \times n$ matrix with $i, j$ element $\Sigma_{i,j} := \rho(|i - j|)$; in other words, $\Sigma$ is the autocorrelation matrix of the data $(X_1, \ldots, X_n)$ viewed as a vector. An immediate way to robustly estimate the autocorrelation matrix $\Sigma$ is by plugging in our robust correlation estimates. For example, define a matrix $\hat\Sigma$ that has $i, j$ element given by $\hat\rho_{ROBUST}(|i - j|)$. Although intuitive, $\hat\Sigma$ is neither consistent for $\Sigma$ as $n \to \infty$, nor is it positive definite; see Wu and Pourahmadi [19] and the references therein.

Following Politis [16], we may define a 'flat-top' taper as the function $\kappa$ satisfying

$$\kappa(x) = \begin{cases} 1 & \text{if } |x| \leq 1 \\ g(x) & \text{if } 1 < |x| \leq c_\kappa \\ 0 & \text{if } |x| > c_\kappa, \end{cases}$$

where $c_\kappa \geq 1$ is a constant, and $g(x)$ is some function such that $|g(x)| \leq 1$. Let the taper's $l$-scaled version be denoted as $\kappa_l(x) := \kappa(x/l)$. Taking $\kappa$ of trapezoidal shape, i.e., letting $g(x) = 2 - |x|$ and $c_\kappa = 2$, yields a simple taper that has been shown to work well in practice. McMurry and Politis [14] introduced a consistent estimator of $\Sigma$ defined as the $n \times n$ matrix with $i, j$ element given by $\kappa_l(i - j)\hat\rho(|i - j|)$; here, $l$ serves the role of a bandwidth parameter satisfying $l \to \infty$ but $l/n \to 0$ as $n \to \infty$.

In order to "robustify" the tapered estimator of McMurry and Politis [14], we propose $\hat\Sigma_{\kappa,l}$ as an estimator of $\Sigma$, where the $i, j$ element of $\hat\Sigma_{\kappa,l}$ is given by $\kappa_l(i - j)\hat\rho_{ROBUST}(|i - j|)$. Note, however, that $\hat\Sigma_{\kappa,l}$ is not guaranteed to be positive definite. To address this problem, let $\hat\Sigma_{\kappa,l} = T_n D T_n^t$ be the spectral decomposition of $\hat\Sigma_{\kappa,l}$. Since $\hat\Sigma_{\kappa,l}$ is symmetric, $T_n$ will be an orthogonal matrix, and $D = \mathrm{diag}(d_1, \ldots, d_n)$ where $d_1, \ldots, d_n$ are the eigenvalues of $\hat\Sigma_{\kappa,l}$. Define

$$D^{(\epsilon)} := \mathrm{diag}(d_1^{(\epsilon)}, \ldots, d_n^{(\epsilon)}), \quad \text{with } d_i^{(\epsilon)} := \max(d_i, \epsilon/n^\zeta) \qquad (4)$$

where $\epsilon \geq 0$ and $\zeta > 1/2$ are two constants. The choices $\zeta = 1$ and $\epsilon = 1$ work well in practice, as in McMurry and Politis [14]. Finally, we define

$$\hat\Sigma_{\kappa,l}^{(\epsilon)} := T_n D^{(\epsilon)} T_n^t. \qquad (5)$$

By construction, $\hat\Sigma_{\kappa,l}^{(\epsilon)}$ is nonnegative definite when $\epsilon = 0$, and strictly positive definite when $\epsilon > 0$. Furthermore, $\hat\Sigma_{\kappa,l}^{(\epsilon)}$ inherits the robustness and consistency of $\hat\Sigma_{\kappa,l}$. Finally, note that the matrix $\hat\gamma_{ROBUST}(0)\hat\Sigma_{\kappa,l}^{(\epsilon)}$ is a positive definite, robust and consistent estimator of the autocovariance matrix of the data $(X_1, \ldots, X_n)$ viewed as a vector.

Remark 5.1. As mentioned at the end of Section 3.1, the estimator $\hat\rho_{ROBUST}(h)$ is not necessarily in the interval $[-1, 1]$. The construction of the nonnegative definite estimator $\hat\Sigma_{\kappa,l}^{(\epsilon)}$ fixes this problem. In particular, the whole vector of autocorrelations $(\rho(0), \rho(1), \ldots, \rho(n-1))$, where of course $\rho(0) = 1$, is estimated by the first row of the matrix $\hat\Sigma_{\kappa,l}^{(\epsilon)}$. As long as $\epsilon \geq 0$, this vector estimator can be considered as the first part of a nonnegative definite sequence, and hence it can be considered as the first part of the autocorrelation sequence of some stationary time series; see e.g. Brockwell and Davis [1]. Hence, the vector estimator enjoys all the properties of an autocorrelation sequence, including the property of belonging to the interval $[-1, 1]$.
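For illustration, a minimal R sketch of the construction of $\hat\Sigma_{\kappa,l}^{(\epsilon)}$ is given below; rho_robust(x, h) is the lag-h estimator sketched in Section 2, the bandwidth l and the constants eps and zeta are user inputs, and the function names are ours.

```r
# Sketch: robust, tapered, positive (semi-)definite autocorrelation matrix,
# following eqs. (4)-(5) with the trapezoidal flat-top taper (c_kappa = 2).
kappa_trap <- function(x) pmax(pmin(2 - abs(x), 1), 0)

Sigma_robust <- function(x, l, eps = 1, zeta = 1) {
  n   <- length(x)
  # robust correlations rho_hat_ROBUST(h); the taper zeroes out lags beyond 2*l
  rho <- c(1, sapply(1:(n - 1), function(h) if (h <= 2 * l) rho_robust(x, h) else 0))
  S   <- outer(0:(n - 1), 0:(n - 1),
               function(i, j) kappa_trap(abs(i - j) / l) * rho[abs(i - j) + 1])
  # eigenvalue correction: floor the eigenvalues at eps / n^zeta, then reassemble
  ev  <- eigen(S, symmetric = TRUE)
  ev$vectors %*% diag(pmax(ev$values, eps / n^zeta)) %*% t(ev$vectors)
}
```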
6 Robust AR Model Fitting

6.1 Robust Yule-Walker Estimates

Consider again the causal AR(p) model of eq. (1), i.e., $X_t = \phi_1 X_{t-1} + \ldots + \phi_p X_{t-p} + z_t$. Now we do not need to assume that the driving noise $z_t$ is iid; it is sufficient that $z_t$ is mean-zero white noise with variance $\sigma^2$. In this context, autocovariance estimates can be directly used to derive AR coefficient estimates via the Yule-Walker equations:

$$\hat\gamma(0) = \phi_1 \hat\gamma(-1) + \ldots + \phi_p \hat\gamma(-p) + \sigma^2 \quad \text{and} \quad \hat\rho(k) = \phi_1 \hat\rho(k-1) + \ldots + \phi_p \hat\rho(k-p) \ \text{ for } k = 1, \ldots, p. \qquad (6)$$

However, if the classical autocovariance estimates are used in the above, a single outlier of size $B$ perturbs the $\phi$ coefficient estimates by $O(B/n)$; a pair of such outliers can perturb $\hat\phi_1$ by $O(B^2/n)$.

A simple way to attempt to address this vulnerability is to plug robust autocovariance estimates into the linear system (6), i.e., to use $\hat\gamma_{ROBUST}(0)$ in place of $\hat\gamma(0)$, and $\hat\rho_{ROBUST}(k)$ instead of $\hat\rho(k)$. The resulting robust estimate of the coefficient vector $\phi_p := (\phi_1, \ldots, \phi_p)'$ is given by

$$\hat\phi_{p,ROBUST} := S_p^{-1} \hat\rho_p \qquad (7)$$

where for any $m \leq n$ we let $\hat\rho_m := (\hat\rho_{ROBUST}(1), \ldots, \hat\rho_{ROBUST}(m))'$ and $\rho_m := (\rho(1), \ldots, \rho(m))'$. Furthermore, in the above, $S_p$ is the upper left $p \times p$ submatrix of the autocorrelation matrix $\hat\Sigma_{\kappa,l}^{(\epsilon)}$ defined in the previous section. Since $S_p$ is a Toeplitz matrix, its inverse $S_p^{-1}$ can be found via fast algorithms such as the Durbin-Levinson algorithm.
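As an illustration of eq. (7), a minimal R sketch follows; it builds $S_p$ directly from the robust autocorrelations (without the tapering of Section 5) and uses solve() in place of the Durbin-Levinson algorithm. The function names are ours, and rho_robust is the estimator sketched in Section 2.

```r
# Sketch: robust Yule-Walker fit of an AR(p), eq. (7).
ar_yw_robust <- function(x, p) {
  rho <- sapply(1:p, function(h) rho_robust(x, h))   # rho_hat_p
  Sp  <- toeplitz(c(1, rho))[1:p, 1:p]               # p x p matrix with entries rho_hat(|i - j|)
  solve(Sp, rho)                                     # phi_hat_{p,ROBUST}
}
```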
6.2 Extended Yule-Walker

The 'extended' Yule-Walker equations are identical to the usual ones of eq. (6) except letting $k = 1, \ldots, p'$ for some value $p' \geq p$, i.e., they are an overdetermined system: $p'$ equations with $p$ unknowns $\phi_p = (\phi_1, \ldots, \phi_p)'$. Politis [15] noted that the extended Yule-Walker equations can be used to provide a more robust estimator of $\phi_p$. For example, in the AR(1) case with $p = 1$, letting $p' = 2$ suggests that $\hat\gamma(1)/\hat\gamma(0)$ and $\hat\gamma(2)/\hat\gamma(1)$ are equally valid as estimators for $\phi_1$ that could be combined to yield an improved one.

Generalizing this idea, fix $p' \geq p$, and let $S_{p',p}$ be the $p' \times p$ matrix with $j$th column equal to $(\hat\rho_{ROBUST}(1-j), \hat\rho_{ROBUST}(2-j), \ldots, \hat\rho_{ROBUST}(p'-j))'$; alternatively, we could define $S_{p',p}$ as the upper left $p' \times p$ submatrix of the autocorrelation matrix $\hat\Sigma_{\kappa,l}^{(\epsilon)}$ defined in Section 5.

As mentioned above, the extended Yule-Walker equations with $k = 1, \ldots, p' > p$ form an overdetermined system. Thus, we cannot expect that $\hat\rho_{p'}$ exactly equals $S_{p',p}\phi_p$. Define the discrepancy $\eta = \hat\rho_{p'} - S_{p',p}\phi_p$, from which it follows that

$$\hat\rho_{p'} = S_{p',p}\phi_p + \eta. \qquad (8)$$

As suggested by Politis [15], equation (8) can be viewed as a linear regression (with 'errors-in-variables') having response $\hat\rho_{p'}$, design matrix $S_{p',p}$, parameter vector $\phi_p$, and error vector $\eta$. Finding OLS estimates of $\phi_p$ in the regression model (8) gives doubly robust estimates: robust because $\hat\rho_{ROBUST}(h)$ were used, but also because of the use of the extended Yule-Walker equations. One could alternatively use a robust regression technique to fit regression (8); the details are obvious and are omitted.
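A minimal R sketch of the extended Yule-Walker regression (8), fitted here by OLS, is given below; the design matrix is filled using $\hat\rho_{ROBUST}(0) = 1$ and $\hat\rho_{ROBUST}(-h) = \hat\rho_{ROBUST}(h)$, rho_robust is the estimator sketched in Section 2, and the function names are ours. A robust fit (e.g., quantreg::rq) could be substituted for lm().

```r
# Sketch: extended Yule-Walker fit of an AR(p) via the regression (8).
ar_xyw_robust <- function(x, p, p_prime) {
  stopifnot(p_prime >= p)
  rho      <- sapply(1:p_prime, function(h) rho_robust(x, h))   # response rho_hat_{p'}
  rho_full <- c(1, rho)                      # rho_full[h + 1] = rho_hat(h) for h >= 0
  S        <- outer(1:p_prime, 1:p,
                    function(k, j) rho_full[abs(k - j) + 1])    # S_{p',p}: S[k, j] = rho_hat(k - j)
  unname(coef(lm(rho ~ S - 1)))              # OLS estimate of phi_p (no intercept)
}
```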
7 Numerical Work

7.1 Simulation without Outliers

First, we generated time series data $X_1, \ldots, X_n$ according to the MA(1) model $X_t = Z_t + \phi Z_{t-1}$ (with no outliers) with $\phi \in \{0.2, 0.5, 0.8\}$, $n \in \{50, 200, 800\}$, and $Z_t$ i.i.d. $N(0, 1)$. We estimated the lag-1 and lag-2 autocorrelations in different ways, and compared them to the true values ($\frac{\phi}{1+\phi^2}$ and 0, respectively).
We did the same thing for the AR(1) model Xt = φXt−1 + Zt ; the true autocorrelations are φ and φ2 in this case. The estimators that were constructed were ρˆROBU ST (h) using the three aforementioned options for robust regression: L1, LTS and MM. As baselines for comparison, we included Ordinary Least Squares (OLS) regression, which as discussed above is nearly identical to the sample acf, and Ma and Genton’s [12] robust autocorrelation estimator (denoted MG).
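For reproducibility, one cell of the MA(1) experiment could be coded as in the following sketch; the helper names are ours, rho_robust is the wrapper sketched in Section 2, and the full study used 200 replications per cell as reported in the tables.

```r
# Sketch: one cell of the uncontaminated MA(1) experiment (lag-1 MSEs).
set.seed(123)
phi <- 0.5; n <- 200; n_rep <- 200
true_rho1 <- phi / (1 + phi^2)

est <- replicate(n_rep, {
  x <- arima.sim(model = list(ma = phi), n = n)       # X_t = Z_t + phi * Z_{t-1}
  c(OLS = unname(coef(lm(x[-1] ~ x[-n]))[2]),         # lag-1 regression slope, ~ sample acf
    L1  = rho_robust(x, 1, "L1"),
    LTS = rho_robust(x, 1, "LTS"),
    MM  = rho_robust(x, 1, "MM"))
})
rowMeans((est - true_rho1)^2)                          # Monte Carlo MSE for rho_hat(1)
```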
φ
n
50
0.2
200
800
50
0.5
200
800
50
0.8
200
800
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) .16815 .17428 .15741 .12148 .16728 .18238 .18174 .18875 .18659 .18328 .19202 .19266 .19457 .20289 .19253 .35834 .36166 .35859 .38351 .35940 .39859 .39992 .39810 .40652 .39731 .39746 .39809 .39897 .39555 .39780 .45355 .45369 .44862 .46865 .45345 .48315 .48289 .48248 .49077 .48340 .48415 .48349 .48356 .47204 .48402
MSE .01669 .02465 .02938 .11283 .01731 .00458 .00629 .00827 .04622 .00489 .00120 .00127 .00190 .01614 .00122 .01685 .02319 .02194 .07726 .01748 .00216 .00308 .00520 .03394 .00252 .00094 .00111 .00175 .01439 .00100 .01053 .01663 .01992 .08112 .01106 .00242 .00322 .00470 .02759 .00256 .00055 .00067 .00121 .01296 .00059
Avg. ρˆ(2) -.04035 -.03364 -.04618 -.06980 -.04223 -.01316 -.01753 -.01559 -.02034 -.01330 .00173 .00152 .00080 .00342 .00154 -.03677 -.02692 -.01190 .00142 -.02757 -.00520 -.00571 -.00163 .01994 -.00528 -.00465 -.00344 -.00113 .00574 -.00395 -.05546 -.06168 -.06792 -.06159 -.05628 -.00775 -.00604 -.00235 .02308 -.00730 -.00434 -.00541 -.00320 .00645 -.00436
MSE .02312 .03676 .03622 .13513 .02533 .00546 .00714 .00861 .04300 .00574 .00108 .00135 .00213 .01447 .00123 .02702 .03660 .03290 .10233 .02745 .00516 .00707 .00862 .04868 .00560 .00183 .00239 .00258 .01894 .00199 .03023 .04081 .04046 .12601 .03074 .00667 .00877 .00847 .03534 .00663 .00166 .00186 .00202 .01402 .00166
Table 1: Uncontaminated MA(1) simulation results; averages of 200 trials.
φ
n
50
0.2
200
800
50
0.5
200
800
50
0.8
200
800
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) .16358 .15360 .17565 .18526 .16702 .20110 .20064 .19851 .19576 .20009 .19193 .19286 .19139 .19555 .19191 .44600 .44176 .45312 .46085 .44471 .48241 .47893 .48157 .48630 .48229 .49777 .49708 .49994 .50000 .49796 .72894 .70482 .72172 .69385 .72896 .78556 .78135 .78586 .78713 .78498 .79622 .79563 .79702 .80020 .79634
MSE .02592 .03837 .03564 .11907 .02758 .00439 .00550 .00733 .04101 .00459 .00125 .00162 .00206 .01551 .00137 .01603 .02597 .02454 .09045 .01738 .00417 .00494 .00635 .03007 .00429 .00100 .00125 .00147 .00983 .00105 .01682 .02413 .02256 .06811 .01790 .00191 .00235 .00291 .01646 .00193 .00045 .00052 .00076 .00573 .00048
Avg. ρˆ(2) .02875 .02700 .01710 -.00201 .02804 .02818 .02512 .02330 .02079 .02691 .04054 .04009 .04056 .05124 .04066 .18352 .18445 .19821 .21308 .18691 .23662 .23194 .23560 .22803 .23674 .24495 .24396 .24465 .24269 .24512 .52273 .48780 .51311 .49295 .51800 .61795 .61327 .61878 .61040 .61847 .63450 .63324 .63717 .64809 .63522
MSE .01956 .03465 .03553 .12429 .02197 .00552 .00688 .00762 .03917 .00562 .00123 .00146 .00211 .01590 .00124 .02630 .03796 .03591 .11105 .02687 .00681 .00776 .00937 .03912 .00699 .00157 .00202 .00210 .01308 .00165 .04186 .05783 .05671 .15527 .04563 .00502 .00565 .00646 .03228 .00489 .00142 .00166 .00185 .00765 .00149
Table 2: Uncontaminated AR(1) simulation results; averages of 200 trials.
As expected, the OLS (classical) estimator performed best in the no contamination case; see Tables 1 and 2. However, the MM estimator's performance was nearly indistinguishable from OLS's. The L1 and Ma-Genton estimators were somewhat less efficient, with MSEs roughly 1.5x to 2x that of the OLS estimator; LTS's known terrible normal efficiency was clearly in evidence. Sample size did not affect the performance of the estimators relative to each other, but a larger sample size reduced the downward bias of them all.
7.2 Simulation with Innovation Outliers

Next, we investigated estimator performance when faced with innovation outliers, modifying $Z_t$ to be distributed according to a Gaussian mixture: 90 or 96 percent $N(0, 1)$ and 10 or 4 percent $N(0, 625)$. From Table 3 we can see that for $\phi = -0.2$, the Ma-Genton, L1, and MM estimators do a substantially better job of handling the innovation outliers than the sample acf. However, for larger values of $\phi$ and large sample sizes, our robust estimates of $\rho(1)$ cluster toward $\phi$ instead of $\frac{\phi}{1+\phi^2}$. The reason for that is that any innovation outlier not immediately followed by a second one creates a point on the scatterplot of the form $(x + \epsilon_1, \phi x + \epsilon_2)$ where $|x| \gg |\epsilon_i|$; all of these high-magnitude points trace a single line of slope $\phi$, which the robust estimators pick up as the primary signal, while the other high-magnitude outlier points (which bring the OLS estimate in line) are ignored. See Figure 3 for an illustration. The Ma-Genton estimator, not being based on linear regression, is not affected by this pattern.
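The pattern described above (and shown in Figure 3) is easy to reproduce; the following sketch simulates an MA(1) series under the 10 percent, N(0, 625) contamination setting and plots the lag-1 scatterplot, with the slope-$\phi$ diagonal superimposed.

```r
# Sketch: MA(1) with Gaussian-mixture innovations, reproducing the
# lag-1 scatterplot pattern of Figure 3.
set.seed(7)
n   <- 800
phi <- 0.8
z   <- ifelse(runif(n + 1) < 0.10, rnorm(n + 1, sd = 25), rnorm(n + 1))  # 10% N(0, 625)
x   <- z[-1] + phi * z[-(n + 1)]                  # X_t = Z_t + phi * Z_{t-1}
plot(x[-n], x[-1], xlab = "x[i]", ylab = "x[i+1]",
     main = "MA(1) with innovation outliers")
abline(0, phi, lty = 2)                           # the slope-phi line the robust fits latch onto
```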
φ
Contam. %
n
50 4 800 -0.2 50 10 800
50 4 800 0.5 50 10 800
50 4 800 0.8 50 10 800
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) -.19785 -.18534 -.17678 -.16145 -.18117 -.19071 -.18134 -.19446 -.18800 -.19424 -.19866 -.18570 -.18540 -.16230 -.18483 -.19148 -.17732 -.19368 -.19485 -.19312 .34683 .36554 .42316 .35159 .38470 .39890 .39308 .46748 .45444 .48818 .37823 .34596 .43980 .33623 .36369 .39977 .39091 .47008 .49193 .48947 .46616 .46974 .55699 .43306 .49038 .48720 .49182 .59438 .55985 .68805 .45878 .48799 .61845 .46685 .50545 .48400 .51178 .63333 .71091 .76902
MSE .01077 .02753 .00886 .07133 .00742 .00112 .00169 .00010 .00164 .00006 .01171 .02871 .00228 .03224 .00187 .00112 .00167 .00004 .00022 .00002 .04265 .02180 .01562 .08751 .02550 .00067 .00119 .00475 .01428 .00786 .00939 .02506 .01020 .06072 .03569 .00083 .00120 .00501 .01064 .00805 .01131 .01749 .03682 .10956 .07561 .00054 .00087 .01447 .02922 .06670 .00836 .01891 .04446 .12663 .11626 .00063 .00147 .02528 .08715 .09312
Avg. ρˆ(2) -.02147 -.03046 -.01231 -.01445 -.00452 -.00615 -.00437 .00032 .00205 .00028 -.01778 -.06054 -.00367 -.00381 -.00340 -.00318 -.00538 -.00017 -.00025 -.00011 -.05107 -.05204 -.02221 -.04056 -.02562 -.00156 -.00587 -.00097 -.00032 -.00121 -.03809 -.06730 -.00796 -.00761 .00016 -.00338 -.00774 .00072 .00257 .00006 -.04233 -.05979 -.00934 -.03247 -.01341 -.00442 -.01179 -.00013 .00078 -.00008 -.04923 -.06083 -.00939 -.01234 -.00938 -.00768 -.00867 -.00110 -.00049 -.00084
MSE .01086 .03823 .00707 .05180 .00842 .00154 .00191 .00011 .00058 .00007 .01596 .03894 .00309 .02583 .00314 .00155 .00205 .00006 .00013 .00004 .01870 .04099 .01304 .07510 .01032 .00148 .00252 .00014 .00088 .00010 .01739 04132 .00501 .01634 .00302 .00181 .00246 .00007 .00022 .00004 .03611 .03702 .01905 .05134 .01176 .00168 .00261 .00013 .00066 .00010 .01955 .04586 .00426 .01295 .00443 .00170 .00247 .00005 .00014 .00004
Table 3: MA(1) simulation results with innovation outliers; averages of 200 trials.
Figure 3: $X_t$ vs. $X_{t+1}$ plot for the MA(1) model $X_t = Z_t + 0.8Z_{t-1}$ with innovation outliers. With an innovation outlier at $Z_t$, $(X_{t-1}, X_t)$ usually lies on the vertical line, $(X_t, X_{t+1})$ on the diagonal, and $(X_{t+1}, X_{t+2})$ on the horizontal. The robust estimators tend to fit the diagonal line.

φ
Contam. %
n
50 4 800 -0.2 50 10 800
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) -.22576 -.20588 -.20750 -.18120 -.20344 -.19775 -.20515 -.19949 -.19582 -.19983 -.20993 -.22251 -.20522 -.19185 -.20467 -.19992 -.21774 -.20048 -.19996 -.20039
MSE .02227 .03829 .01018 .06381 .00958 .00145 .00160 .00009 .00053 .00005 .01148 .03201 .00131 .01821 .00070 .00125 .00209 .00004 .00016 .00002
Avg. ρˆ(2) .02086 .00309 .02877 .01468 .03090 .03814 .04088 .03948 .03674 .03962 .02600 .03557 .04086 .04927 .04440 .04020 .03777 .03959 .03800 .03967
MSE .01577 .03341 .01126 .05601 .00692 .00091 .00161 .00010 .00062 .00007 .00906 .03274 .00157 .01699 .00126 .00173 .00178 .00005 .00017 .00003
φ
Contam. %
n
50 4 800 0.5 50 10 800
50 4 800 0.8 50 10 800
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) .46520 .48690 .49198 .48511 .49183 .49840 .53888 .49969 .50038 .49966 .43619 .55736 .48964 .48506 .49613 .49832 .59379 .49924 .49902 .49941 .74099 .81219 .77752 .76933 .77987 .79691 .88504 .80011 .80013 .80001 .72992 .89090 .79232 .79450 .79677 .79714 .93719 .79990 .79943 .80009
MSE .00956 .02355 .00532 .02905 .00377 .00097 .00282 .00006 .00039 .00004 .04085 .02541 .00309 .00814 .00106 .00086 .00993 .00003 .00012 .00002 .01184 .01316 .00572 .02006 .00425 .63449 .00760 .00003 .00016 .00002 .01731 .01489 .00165 .00806 .00068 .00046 .01892 .00001 .00008 .00001
Avg. ρˆ(2) .19043 .19364 .22763 .19989 .23505 .24964 .26705 .25023 .25076 .24984 .17919 .23512 .23227 .24911 .24566 .24440 .28994 .24811 .24713 .24885 .53776 .61055 .59431 .59469 .59493 .00037 .74106 .63971 .64059 .64043 .53105 .70164 .61721 .61635 .62704 .63659 .80715 .63971 .64034 .64031
MSE .02019 .04024 .00791 .04247 .00649 .00152 .00255 .00009 .00039 .00007 .02531 .03739 .00662 .01338 .00249 .00151 .00367 .00007 .00018 .00004 .03219 .03626 .01536 .03543 .01619 .00104 .01129 .00008 .00040 .00006 .03459 .02350 .00596 .01611 .00465 .00137 .02847 .00004 .00012 .00003
Table 4: AR(1) simulation results with innovation outliers; averages of 200 trials.
From Table 4, we can see that our robust regression estimators ρˆROBU ST all shine in the AR(1) case with innovation outliers. This is unsurprising, since an AR(1) innovation outlier only pulls one point off the appropriate regression line, while generating several other high-leverage points right on the true line; see Figure 4. The Ma-Genton estimator performs about as poorly as OLS here; interestingly, their errors tend to be in opposite directions.
Figure 4: $(x_t, x_{t+1})$ plot for a realization of length $n = 50$ from the AR(1) time series $X_t = 0.8X_{t-1} + Z_t$ with one innovation outlier.

φ1 , φ2
Contam. %
n
50 4 800 0.5, 0.1 50 10 800
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) .49805 .58988 .53168 .54746 .54553 .55232 .64522 .55813 .61018 .57101 .50255 .70744 .53896 .59708 .56102 .55514 .74715 .55795 .63746 .55693
MSE .01063 .02912 .00797 .03708 .00746 .00110 .00940 .00034 .00749 .00151 .01400 .04561 .00265 .01678 .00395 .00106 .03765 .00021 .00934 .00030
Avg. ρˆ(2) .30211 .36280 .34237 .37782 .35229 .37241 .43532 .37576 .38777 .37671 .30139 .44201 .35993 .37594 .36632 .37448 .50552 .37708 .38582 .37730
MSE .05545 .03815 .01120 .04575 .00849 .00147 .00545 .00018 .00166 .00010 .04159 .03910 .00599 .01517 .00395 .00143 .01800 .00011 .00082 .00005
φ1 , φ 2
Contam. %
n
50 4 800 0.6, 0.3 50 10 800
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) .73869 .85771 .83786 .85036 .85324 .85081 .96729 .90582 .91584 .91882 .70326 .91934 .84315 .88486 .88306 .84891 .98441 .90649 .92019 .92185
MSE .02945 .01835 .01150 .02913 .01226 .00060 .01227 .00242 .00372 .00383 .04351 .01062 .00793 .01410 .00714 .00076 .01621 .00248 .00409 .00421
Avg. ρˆ(2) .66437 .79481 .75807 .77909 .77049 .80749 .94209 .83797 .84984 .84796 .62912 .86392 .76536 .81898 .78694 .80323 .96065 .83758 .85299 .84226
MSE .04485 .03038 .02043 .03403 .01785 .00101 .01663 .00066 .00160 .00121 .05943 .01527 .01284 .01243 .00987 .00119 .02146 .00062 .00166 .00084
Table 5: AR(2) simulation results with innovation outliers, averages of 200 trials. True $(\rho(1), \rho(2))$ is $(\frac{5}{9}, \frac{17}{45})$ in the $(\phi_1, \phi_2) = (0.5, 0.1)$ case, and $(\frac{6}{7}, \frac{57}{70})$ in the $(\phi_1, \phi_2) = (0.6, 0.3)$ case.
Moving on to the AR(2) case of Table 5, we see that with innovation outliers, the L1 and MM robust estimators exhibit much better performance than OLS given a small (50) sample size, but the difference fades with a larger sample size. The Ma-Genton estimator performs relatively poorly across the board.
7.3 Simulation with Additive Outliers
Next, we investigated the performance of our estimators in the additive outlier case by perturbing one or two elements in the middle of the time series by a large number (where, as before, innovations are i.i.d. N(0, 1)). The Ma-Genton and MM-based estimators do the best here; the MM estimator seems to have a slight edge over MG although this is not uniformly evident in the entries of Table 6. The OLS estimator performed especially badly in the φ = 0.8 case, L1 was fairly good but failed the φ = 0.8 with n = 50 case, and LTS generally acted as a much less efficient MM.
φ
n
Contam. Pattern
25, 25
50
25, 0
25, -25 -0.2 25, 25
800
25, 0
25, -25
25, 25
50
25, 0
25, -25 0.8 25, 25
800
25, 0
25, -25
Estimator OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM OLS MG L1 LTS MM
Avg. ρˆ(1) .45142 -.15163 .00949 -.16946 -.11206 -.03535 -.23079 -.00350 -.22194 -.12271 -.48932 -.27360 -.22172 -.18993 -.11939 .21900 -.20102 -.12892 -.21102 -.18993 -.11684 -.20590 -.16603 -.20086 -.19773 -.38096 -.20358 -.19913 -.19461 -.18644 .49677 .73375 .71784 .76085 .80826 .08162 .69233 .34407 .71216 .70241 -.40196 .69580 .07515 .73795 .73087 .68855 .79444 .79499 .79494 .79541 .61925 .79651 .78514 .80580 .79848 .31915 .79172 .76533 .78979 .79527
MSE .42544 .03272 .04707 .09823 .03400 .02868 .03396 .03885 .11734 .03509 .08450 .03155 .02783 .09892 .04269 .17617 .00130 .00658 .01470 .00220 .00753 .00128 .00271 .01327 .00208 .03324 .00136 .00164 .01404 .00255 .09497 .01678 .02180 .05544 .03660 .52154 .02827 .25730 .07760 .02940 1.44785 .02682 .55373 .06227 .01986 .01329 .00052 .00071 .00556 .00047 .03378 .00048 .00097 .00493 .00046 .23308 .00065 .00208 .00514 .00050
Avg. ρˆ(2) -.04117 .01986 -.00093 .02005 .01052 -.02372 .03604 -.00308 .04899 .01925 .00144 .01328 .00269 .00917 .00924 .01230 .03990 .01522 .04796 .01481 .02222 .03766 .01995 .02622 .02053 .01549 .03339 .01230 .02058 .01274 -.00211 .48080 .02922 .41964 .40465 .04809 .46751 .13002 .44343 .42005 .03820 .50972 .05980 .45480 .44154 .40271 .62901 .59307 .62680 .63165 .49346 .63460 .61991 .63532 .63659 .39943 .62866 .59569 .63881 .63117
MSE .00796 .03361 .00337 .06239 .00573 .00719 .03247 .00348 .09225 .01019 .00328 .03401 .00268 .06258 .00569 .00140 .00146 .00135 .01365 .00112 .00138 .00138 .00165 .01303 .00118 .00167 .00181 .00178 .01134 .00143 .42308 .05831 .38034 .17343 .13308 .35910 .06854 .29226 .15611 .10891 .36352 .04958 .34270 .13595 .11602 .05906 .00148 .00407 .00936 .00126 .02354 .00134 .00197 .00849 .00125 .05927 .00160 .00394 .00996 .00136
Table 6: AR(1) simulation results with additive outliers; averages of 200 trials. In a length-$n$ time series, an "a, b" contamination pattern means that $a$ was added to the $(\frac{n}{2})$th element and $b$ was added to the $(\frac{n}{2}+1)$th element.
Figure 5: Austrian bank data (annual % rate by month).
7.4 Real Data Experiment: Austrian Bank Interest Rates

We also applied our robust estimators to some real-world data, namely monthly interest rates of an Austrian bank over a 91-month period; this data set has been previously discussed and analyzed by Künsch [9], Ma and Genton [12], and Genton and Ruiz-Gazen [8]. The outliers around months 20 and 30 are clearly notable in Figure 5. Table 7 summarizes our findings.
Estimator   Outliers replaced?   ρ̂(1)     ρ̂(3)     ρ̂(4)     ρ̂(5)     ρ̂(6)     ρ̂(12)
OLS         no                   .79184   .58923   .51249   .44414   .40440   .08583
OLS         yes                  .93911   .77912   .67222   .58084   .49877   .07229
MG          no                   .96571   .82703   .73727   .65968   .55046   -.18033
L1          no                   .97222   .83459   .78351   .72603   .65957   -.02786
LTS         no                   .99451   .95588   .87975   .85556   .36749   -.94203
MM          no                   .97194   .81113   .49292   .40119   .34198   .04550
Table 7: Numerical results with Austrian bank data.
The L1 and MM estimates of typical month-to-month and season-to-season autocorrelation are more reasonable than the OLS estimate which is overly affected by the outliers. However, the LTS estimator was erratic, overestimating the low lag autocorrelations and yielding a bizarre value of -.94203 for the 12-month autocorrelation.
7.5 Simulation: Robust AR Model Fitting

Finally, we ran a small simulation to illustrate the performance of Politis' [15] robust AR model fitting method when combined with our robust autocorrelation estimators $\hat\rho_{ROBUST}$, i.e., the methodology from Section 6.2. The model was an AR(2) contaminated with innovation outliers.
φ1, φ2     n     Estimator   Avg. φ̂(1)   MSE       Avg. φ̂(2)   MSE
0.5, 0.1   50    OLS         .58223      .04698    -.03085      .04182
0.5, 0.1   50    L1          .53079      .01876    .05283       .01510
0.5, 0.1   50    LTS         .53561      .04073    .04985       .02820
0.5, 0.1   50    MM          .55800      .01390    .03558       .01270
0.5, 0.1   800   OLS         .62381      .01950    .03664       .00769
0.5, 0.1   800   L1          .51531      .00071    .09386       .00004
0.5, 0.1   800   LTS         .61376      .02175    .01312       .01310
0.5, 0.1   800   MM          .52039      .00074    .08972       .00035
Table 8: AR(2) simulation results with innovation outliers (10 percent frequency, scale 25x normal), averages of 50 (with n = 800) or 200 (with n = 50) trials.
As we can see in Table 8, the robust AR model fitting method yields reasonable results even when it is applied to the raw sample acf. However, performance was noticeably better with n = 50 when combining it with the L1 or MM robust autocorrelation estimators; furthermore, in the n = 800 case, the performance advantage was overwhelming. Thus, these two methods (Politis’ [15] robust AR fitting and ρˆROBU ST ) are not redundant; they complement each other very well.
7.6 Computational Cost

When performing these simulations, the relative computational cost of the robust estimators (including Ma-Genton) tended to exceed that of the classical estimator by one to two orders of magnitude. (The exact multiplier depends on the type of robust regression employed, and how it was implemented.) However, the absolute computation time for each full set of simulations was never more than a few days (in R, on a five-year-old MacBook Pro). Thus, unless the data set needs to be analyzed in real time and/or is much larger than what we have simulated, the extra time requirement is not a problem. L1 regression is especially tractable. The rq() function in Koenker et al.'s R quantreg package came within 25 percent of lm() (and was actually faster than glm()) for length-80 and length-800 time series; this is close enough that the choice of implementation language has a much greater impact (see Table 9 below). For very long time series, the difference between L1 regression's $O(n^2)$ time complexity and OLS's $O(n)$ makes itself felt, but L1 remains realistic until $n$ is in the millions. A sketch of the timing loop is given after Table 9.
n       Method   Language   Total time
80      OLS      R          0.69s
80      OLS      C          0.0011s
80      L1       R          0.84s
80      L1       C          0.0069s
800     OLS      R          1.2s
800     OLS      C          0.01s
800     L1       R          1.5s
800     L1       C          0.14s
8000    OLS      R          5.1s
8000    OLS      C          0.11s
8000    L1       R          14s
8000    L1       C          6.6s
80000   OLS      R          70.8s
80000   OLS      C          1.0s
80000   L1       R          730s
80000   L1       C          590s
Table 9: Total time to compute ρˆ(1) and ρˆ(2) for 200 AR(2) (φ1 = 0.6, φ2 = 0.3) length-n time series. For the n = 80 case, we processed 20000 series and divided the final times by 100. R and C source code is posted at [url].
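The R timings of Table 9 can be approximated with a simple loop such as the sketch below; the exact numbers depend on hardware and package versions, and the helper name time_fits is ours.

```r
# Sketch: timing OLS vs. L1 slope estimation in R, as in Table 9.
library(quantreg)
time_fits <- function(n, n_series = 200) {
  xs <- replicate(n_series,
                  as.numeric(arima.sim(model = list(ar = c(0.6, 0.3)), n = n)),
                  simplify = FALSE)
  lag_pairs <- function(x, h) data.frame(y = x[-(1:h)], x = head(x, -h))
  ols <- system.time(for (x in xs) for (h in 1:2) coef(lm(y ~ x, data = lag_pairs(x, h))))
  l1  <- system.time(for (x in xs) for (h in 1:2) coef(rq(y ~ x, data = lag_pairs(x, h))))
  c(OLS = ols[["elapsed"]], L1 = l1[["elapsed"]])
}
# Example: time_fits(800)
```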
8 Conclusions

A class of robust autocorrelation estimators $\hat\rho_{ROBUST}$ was proposed, based on interpreting the sample acf as a linear (auto)regression to be fitted via robust regression techniques. Three members of this class, based on L1, LTS, and MM robust regression, respectively, were compared with the sample acf and the Ma-Genton robust estimator in a simulation study, after a brief discussion of their theoretical properties. MM regression exhibited an excellent combination of efficiency and robustness. Interestingly, L1 also performed well despite its inferior theoretical properties. However, the version of LTS we used was too inefficient to outperform the sample acf even in the presence of contamination.

There was one contamination pattern we simulated that none of our robust estimators handled well, namely innovation outliers in the context of an MA(1) model, where many of the high-magnitude elements of the time series arguably exhibit a well-defined but different autocorrelation than the rest of the series. If this type of outlier may be present, we recommend that the Ma-Genton estimator be used instead. (It would be useful to develop automatic diagnostics to guard against this case.) Conversely, our MM and L1-based estimators clearly outperform the Ma-Genton estimator in the context of AR(p) data with innovation outliers.
8.1 Acknowledgements
Many thanks are due to Tim McMurry for helpful R tips, and to Marc Genton for sharing with us the Austrian bank data.
References

[1] Brockwell, P.J. and Davis, R.A., Time Series: Theory and Methods, Springer, New York, 1991.
[2] Davis, R.A. and Mikosch, T., "The sample autocorrelations of financial time series models," in Nonlinear and Nonstationary Signal Processing, Cambridge University Press, Cambridge, pp. 247–274, 2000.
[3] Denby, L. and Martin, R.D., "Robust estimation of the first-order autoregressive parameter," Journal of the American Statistical Association, vol. 74, no. 365, pp. 140–146, 1979.
[4] Hampel, F.R., "Robust estimation, a condensed partial survey," Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 27, pp. 87–104, 1973.
[5] Huber, P.J., "Robust regression: Asymptotics, conjectures and Monte Carlo," Annals of Statistics, vol. 1, pp. 799–821, 1973.
[6] Genton, M.G. and Lucas, A., "Comprehensive definitions of breakdown-points for independent and dependent observations," Journal of the Royal Statistical Society, Series B, vol. 65, pp. 81–94, 2003.
[7] Genton, M.G. and Lucas, A., "Discussion of the paper 'Breakdown and groups' by L. Davies and U. Gather," Annals of Statistics, vol. 33, pp. 988–993, 2005.
[8] Genton, M.G. and Ruiz-Gazen, A., "Visualizing influential observations in dependent data," Journal of Computational and Graphical Statistics, vol. 19, pp. 808–825, 2010.
[9] Künsch, H., "Infinitesimal robustness for autoregressive processes," Annals of Statistics, vol. 12, pp. 843–863, 1984.
[10] Kreiss, J.-P. and Paparoditis, E., "Bootstrap methods for dependent data: A review," Journal of the Korean Statistical Society, vol. 40, no. 4, pp. 357–378, 2011.
[11] Kreiss, J.-P., Paparoditis, E. and Politis, D.N., "On the Range of Validity of the Autoregressive Sieve Bootstrap," Annals of Statistics, vol. 39, no. 4, pp. 2103–2130, 2011.
[12] Ma, Y. and Genton, M.G., "Highly robust estimation of the autocovariance function," Journal of Time Series Analysis, vol. 21, no. 6, pp. 663–684, 2000.
[13] Maronna, R.A., Martin, R.D. and Yohai, V.J., Robust Statistics: Theory and Methods, Wiley, 2006.
[14] McMurry, T. and Politis, D.N., "Banded and tapered estimates of autocovariance matrices and the linear process bootstrap," Journal of Time Series Analysis, vol. 31, pp. 471–482, 2010. [Corrigendum: vol. 33, 2012.]
[15] Politis, D.N., "An algorithm for robust fitting of autoregressive models," Economics Letters, vol. 102, no. 2, pp. 128–131, 2009.
[16] Politis, D.N., "Higher-order accurate, positive semi-definite estimation of large-sample covariance and spectral density matrices," Econometric Theory, vol. 27, no. 4, pp. 703–744, 2011.
[17] Politis, D.N., Romano, J.P. and Wolf, M., Subsampling, Springer, New York, 1999.
[18] Rousseeuw, P.J. and Leroy, A.M., Robust Regression and Outlier Detection, John Wiley & Sons, New York, 1987.
[19] Wu, W.B. and Pourahmadi, M., "Banding sample autocovariance matrices of stationary processes," Statistica Sinica, vol. 19, no. 4, pp. 1755–1768, 2009.
[20] Yohai, V.J., "High breakdown-point and high efficiency estimates for regression," The Annals of Statistics, vol. 15, pp. 642–656, 1987.