
TESTING THE MARTINGALE DIFFERENCE HYPOTHESIS USING INTEGRATED REGRESSION FUNCTIONS∗

By Carlos J. Escanciano and Carlos Velasco†
Universidad Carlos III de Madrid
August 30, 2003

Abstract

This paper proposes an omnibus test of a generalized version of the martingale difference hypothesis (MDH). The generalized hypothesis includes the usual MDH as well as the constancy of other conditional moments, such as conditional homoskedasticity (absence of ARCH effects). Here we propose a unified approach for dealing with all of them. These hypotheses are long-standing problems in econometric time series analysis, and have typically been tested using the sample autocorrelations or, in the spectral domain, using the periodogram. Since these hypotheses are not only about linear predictability, tests based on these statistics are inconsistent against uncorrelated processes in the alternative hypothesis. To circumvent this problem we use the pairwise integrated regression functions as measures of linear and nonlinear dependence. Our test is consistent against general pairwise Pitman local alternatives converging at the parametric rate and has optimal power properties. There is no need to choose a lag order depending on sample size, to smooth the data, or to formulate a parametric alternative model. Moreover, our test is robust to higher order dependence, in particular to conditional heteroskedasticity. Under general dependence the asymptotic null distribution depends on the data generating process, so a bootstrap procedure is proposed and theoretically justified. A Monte Carlo study examines its finite sample performance, and a final section investigates the martingale and conditional heteroskedasticity properties of the Pound/Dollar exchange rate.

Keywords and Phrases: Martingale Difference Hypothesis; biparameter empirical processes; exchange rates; nonlinear dependence; convergence in Hilbert spaces.

∗ Research funded by the Spanish Dirección General de Enseñanza Superior (DGES), reference number BEC2001-1270.
† We would like to thank W. Stute and M.A. Delgado for helpful comments and I. Lobato for his help with his program and the data.


1. INTRODUCTION

The concept of martingale or martingale difference sequence (MDS) is central in many areas of economics and finance. Examples are the martingale model of consumption of Hall (1978), the optimal taxation model of Barro (1981) or, perhaps the oldest example, the martingale theory of stock prices, e.g. Lo (1997), to mention only a few. The martingale difference hypothesis (MDH) states that the best predictor, in the sense of least mean square error, of a real time series given some information set is just the unconditional mean of the time series. In this paper we are concerned with the case in which the time series to be predicted is a measurable transformation of a real stationary time series X_t and the information set is just the past values of the time series. That is, we would like to test

    E[Y_t | X_{t-1}, X_{t-2}, ...] = \mu \quad a.s., \quad \mu \in R,    (1)

where Y_t is a measurable transformation of X_t and \mu = E[Y_t]. This hypothesis, referred to as the generalized MDH or simply the MDH, contains interesting testing problems as special cases. For instance, when Y_t is a power transformation of X_t, we are testing for constancy of conditional moments. The usual MDH (when Y_t coincides with X_t), testing for ARCH effects, and testing for conditional symmetry are examples of this case. All these hypotheses are important in economics and finance; for instance, testing for conditional symmetry has implications for portfolio selection. The debate on whether the dynamics of economic and financial time series are determined by the conditional mean or the conditional variance can also have important implications in many other applications, including portfolio selection and asset pricing. These hypotheses have typically been tested using the autocorrelations or autocovariances or, in the spectral domain, using the periodogram. For instance, Cochrane (1988) proposed a variance ratio test for uncorrelatedness that has been widely used in finance. Durlauf (1991) proposed a spectral distribution based test, using the fact that under the MDH the standardized spectral distribution function is a straight line. Recently, Deo (2000) has robustified Durlauf's (1991) test against certain forms of conditional heteroskedasticity. However, all these tests are suitable for testing lack of serial correlation but not the MDH. In fact, they are not consistent against non-martingale difference sequences with zero autocorrelations, that is, when only nonlinear dependence is present, as commonly happens with economic and financial data; see e.g. the application to exchange rate dynamics. These tests are inconsistent because they only employ information contained in the second sample moments of the process. To circumvent this problem we could take into account higher moments, as in Hinich and Patterson (1992).
They proposed to use the bispectrum, i.e., the Fourier transform of the third order cumulants of the process. But again, this test is not consistent against non-martingale difference sequences with zero third order cumulants, see Section 6. Roughly speaking, there have been two main approaches in the literature for designing consistent tests of (1). The first approach is based on checking an infinite number of orthogonality conditions, see for instance Bierens (1984, 1990), Stute (1997), Bierens and Ploberger (1997), Koul and Stute (1999) or Whang (2000). The second line of research employs smoothed nonparametric estimates of the conditional expectation function, see for instance Härdle and Mammen (1993), Zheng (1996) or Li (1999). Tests based on the second methodology have standard asymptotic null distributions, but they usually require strong assumptions on the data generating process (DGP), see e.g. Li (1999). In addition, these smoothed tests have local asymptotic power against local alternatives tending to the null at the parametric rate n^{-1/2} only in some special cases, see Härdle and Mammen (1993). More importantly, they require subjective choices of smoothing parameters and kernel functions, and statistical inferences can be sensitive to these selections. On the other hand, test statistics based on the first methodology do not in general demand the selection of any user-chosen parameters. They are consistent against Pitman alternatives converging at the parametric rate and do not require strong conditions on the DGP, and although they have non-standard asymptotic null distributions, these can be well approximated by bootstrap methods, see Section 5. We propose a MDH test based on this methodology that maintains these desirable properties. If I_t = (X_t, X_{t-1}, ...) is the information set at time t and F_t is the \sigma-field generated by I_t, the first methodology exploits the following equivalence principle:

    E[Y_t | I_{t-1}] = \mu \; a.s., \; \mu \in R \iff E[(Y_t - \mu) f(I_{t-1})] = 0,

for all bounded F_{t-1}-measurable weighting functions f(\cdot). Tests are usually based on the discrepancy of the sample analog of E[(Y_t - \mu) f(I_{t-1})] from zero.
The problem of testing over all possible weighting functions can be reduced to testing the orthogonality condition over a parametric family of functions indexed by an auxiliary nuisance parameter. That is,

    E[(Y_t - \mu) f(I_{t-1})] = 0 \; \forall f(\cdot) \iff E[(Y_t - \mu) W(I_{t-1}, x)] = 0 \; \forall x \in \Pi,

where the family W(I_{t-1}, x) is such that the linear span of \{W(I_{t-1}, x)\} is dense in the weak topology in the space of all bounded F_{t-1}-measurable functions, see Stinchcombe and White (1998). Usually, two types of weighting functions have been considered in the literature: characteristic or exponential functions, see e.g. Bierens (1984), and indicator functions, for instance Stute (1997). The former have the advantage of analyticity, but in order to be consistent against all alternatives, tests have to be based on a particular measure on the space \Pi of the auxiliary parameter. In the


second, the auxiliary parameter lives on the variables' space and hence the natural measure is the empirical distribution function, although the family of indicator functions is not analytic. Most of the above references test the MDH conditioning on a finite information set, and therefore they test a particular Markov property and not the MDH, which involves an infinite number of lags. This solution is unsatisfactory because there could be structure in the conditional mean at omitted lags. Often, the maximum power could be achieved by using the correct lag order of the alternative; however, prior information on the conditional mean structure is usually not available. In addition, when a large number of conditioning variables is considered, the empirical power of those tests can be seriously affected by the curse of dimensionality, see Section 5 below. To the best of our knowledge, only De Jong (1996) and Hong (1999) present consistent tests for the MDH, both based on exponential functions which take into account possible dependence at all lags. In the former, the Monte Carlo results are unsatisfactory and the test has very low empirical power, whereas in the latter case the test is not robust to conditional heteroskedasticity and higher order dependence because it is based on the independence assumption. However, it is a well accepted fact that most financial and economic series which are hypothesized to be martingale differences display conditional heteroskedasticity and higher order dependence. Hong's (1999) test also depends on a kernel, a bandwidth parameter and an integrating measure, and in general statistical inferences are not robust to these choices. The aim of this paper is to develop a methodology to test the MDH that does not involve the choice of any lag order, kernel or weighting function, avoids the curse of dimensionality, and is consistent against all pairwise deviations from (1).
In particular, our testing procedure is robust to higher order dependence, such as conditional heteroskedasticity, is simple to compute and performs quite well in finite samples, as will be shown below. The layout of the article is as follows. In Section 2 we define the Integrated Pairwise Regression Functions and the Integrated Pairwise Autoregression Functions as our measures of dependence. In Section 3 we use these new dependence measures to set out a general methodology to test (1). In Section 4 we study the asymptotic distribution of our tests under the null and under fixed and local alternatives. We also compare our test with a parametric t-test against a particular class of "large" local alternatives and show the asymptotic admissibility of the test under Gaussianity. Directional optimal tests are also proposed. In Section 5 we propose and justify a bootstrap approach and present a simulation exercise comparing our test with competing tests. In Section 6 we describe the empirical application of our tests to exchange rates, and we conclude in Section 7. All proofs are gathered in an appendix. In the sequel, C is a generic constant that may change from one expression to another. Let \wedge denote the minimum, i.e., a \wedge b = \min\{a, b\}. Unless indicated, all convergences are taken as the sample size n \longrightarrow \infty.

2. GENERALIZED DEPENDENCE MEASURES

In this section we define a generalization of the usual autocovariances and cross-covariances to a nonlinear framework. It is well known that in the presence of nonlinearity (or non-Gaussianity) these measures do not characterize the dependence in the conditional mean, and the practitioner needs more reliable measures such as the pairwise regression functions E[Y_t | X_{t-j} = x]. In general, estimation of these functions involves smoothing with subjective bandwidth choices. Robinson (1983) has studied the large sample properties of kernel estimators of lagged conditional means E[X_t | X_{t-j}] for various lags j, see also Auestad and Tjøstheim (1990). A natural way to avoid the smoothing approach is to consider cumulative measures. Assume that the random variable Y is integrable, so that the regression function m(x) = E[Y - \mu | X = x] is well defined (up to a null set). By a measure-theoretic argument, the regression function m(\cdot) can be characterized by the integrated regression function \gamma(x) given by

    \gamma(x) = E[(Y - \mu) I(X \le x)] = \int_{-\infty}^{x} E[Y - \mu | X = z] F(dz),

where the second equality follows by the law of iterated expectations and F(\cdot) is the probability distribution function of X. In a time series context, the case Y = Y_t and X = Y_{t-j} is particularly interesting, because the measures \gamma_j(x) = E[(Y_t - \mu) I(Y_{t-j} \le x)] can be called the Integrated Pairwise Autoregression Functions (IPAF). In the general case in which Y_t is different from X_t, the measures

    \gamma_j(x) = E[(Y_t - \mu) I(X_{t-j} \le x)]    (2)

are the Integrated Pairwise Regression Functions (IPRF). These measures are useful for testing interesting hypotheses in a nonlinear time series framework, especially when one is interested in conditional mean dependence such as the hypothesis (1). They are able to pick out both linear and nonlinear dependence in the conditional mean. The natural estimator of \gamma_j(x) based on a sample \{Y_t, X_t\}_{t=1}^{n} is

    \hat\gamma_j(x) = (n-j)^{-1} \sum_{t=1+j}^{n} (Y_t - \bar Y_{n-j}) I(X_{t-j} \le x),    (3)

with

    \bar Y_{n-j} = (n-j)^{-1} \sum_{t=1+j}^{n} Y_t.
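The estimator (3) is simple to compute. As an illustration only, here is a minimal Python sketch; the function name `iprf` and all variable names are ours, not the paper's:

```python
import numpy as np

def iprf(Y, X, j, x):
    """Sample IPRF of equation (3): the average of
    (Y_t - Ybar_{n-j}) * I(X_{t-j} <= x) over t = 1+j, ..., n."""
    Y, X = np.asarray(Y), np.asarray(X)
    n = len(Y)
    Yt = Y[j:]            # Y_{1+j}, ..., Y_n
    Xlag = X[:n - j]      # the corresponding lagged values X_{t-j}
    Ybar = Yt.mean()      # Ybar_{n-j} of (3)
    return np.mean((Yt - Ybar) * (Xlag <= x))
```

Under the null (1), \hat\gamma_j(x) should be close to zero for every lag j and every threshold x.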

Define also \hat\gamma_j(\infty) = \hat\gamma_j(-\infty) := 0, \forall j \ge 1. Then \hat\gamma_j(x) can be viewed as a random element in the Skorohod space D[-\infty, \infty] of all functions on [-\infty, \infty] which are right-continuous and have left limits, the so-called cadlag functions, see Billingsley (1968). The next proposition shows the asymptotic distribution of \hat\gamma_j(x) under the null hypothesis (1).

Proposition 1: Suppose that \{X_t\}_{t=1}^{\infty} is a stationary ergodic process with probability distribution function F(\cdot) absolutely continuous with respect to Lebesgue measure. Assume that Y_t is a measurable function of X_t, with E|Y_1|^{4(1+\delta)} < C and E[|Y_1|^4 |X_{1-j}|^{1+\delta}] < C for some \delta > 0, and that the density function f_j(y) of Y_t | X_{t-j}, j \ge 1, is continuous and uniformly bounded, i.e. \sup_y |f_j(y)| < C. Then, under (1), the process (n-j)^{1/2} \hat\gamma_j(\cdot) converges weakly to B_j(\cdot) on the Skorohod space D[-\infty, \infty], where B_j(\cdot) is a continuous Gaussian process with mean zero and covariance function

    K_j(x, x') = E[(Y_1 - \mu)^2 w_{1-j}(x) w_{1-j}(x')],

where w_t(x) := I(X_t \le x) - F(x).

The assumptions of Proposition 1 are mild; in particular, when Y_t = X_t, the moment conditions are satisfied under a finite sixth moment and are necessary for the tightness of \hat\gamma_j(\cdot). As a consequence of Proposition 1 and the Continuous Mapping Theorem, we have that

    KS_{YX}(j) := \sup_{x \in [-\infty, \infty]} |(n-j)^{1/2} \hat\gamma_j(x)| = \max_{1 \le t \le n} |(n-j)^{1/2} \hat\gamma_j(X_t)|

converges to the supremum of a Gaussian process, where the subscript YX in KS_{YX}(j) indicates that Y is the dependent variable and X the conditioning variable at lag j. In particular, under homoskedasticity, a standardized version of \hat\gamma_j(x) has a standard Brownian bridge as a limiting distribution and asymptotic inference is possible because the asymptotic quantiles are readily available, see Shorack and Wellner (1986). In the general case, the quantiles can be approximated via a bootstrap approach, see Section 5. With the bootstrap critical values we can calculate uniform confidence bands for \hat\gamma_j(x), and the significance of \hat\gamma_j(x) can be tested, see Section 6 below. Note that \hat\gamma_j(x) can be useful for detecting nonlinearities graphically (see Tong (1990), p. 12). In contrast with kernel estimators, \hat\gamma_j(x) does not depend on kernel and bandwidth choices, so the tests are straightforward to implement. Our test statistic for the MDH uses all the measures \hat\gamma_j(x) simultaneously to test a restricted version of (1).
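Since the supremum defining KS_{YX}(j) is attained at the sample points, the statistic only requires evaluating \hat\gamma_j at X_1, ..., X_n. A minimal sketch in the same spirit as the previous one (again, the naming is ours):

```python
import numpy as np

def ks_stat(Y, X, j):
    """KS_{YX}(j) = max_{1<=t<=n} |(n-j)^{1/2} gamma_hat_j(X_t)|,
    with gamma_hat_j the sample IPRF of equation (3)."""
    Y, X = np.asarray(Y), np.asarray(X)
    n = len(Y)
    dev = Y[j:] - Y[j:].mean()          # Y_t - Ybar_{n-j}
    Xlag = X[:n - j]                    # X_{t-j}
    # gamma_hat_j evaluated at every sample point X_t
    g = np.array([np.mean(dev * (Xlag <= x)) for x in X])
    return np.sqrt(n - j) * np.abs(g).max()
```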


3. PAIRWISE CONSISTENT HYPOTHESIS TESTING

Given a time series \{Y_t, X_t\}_{t=1}^{n}, one way of testing (1) is to consider the hypotheses E[Y_t | X_{t-1}, X_{t-2}, ..., X_{t-P}] = \mu with P tending to infinity with the sample size. This approach raises some problems, especially when using indicator functions, because of the curse of dimensionality. Alternatively, we could consider P fixed, as for instance in Koul and Stute (1999), Park and Whang (1999) or Dominguez and Lobato (2000), but this is unsatisfactory from a theoretical point of view because there could be dependence at higher lags, and from a practical point of view the results when P is large might be deficient, as is illustrated in Section 5. We propose instead to use all the IPRF defined in Section 2 to test (1). This pairwise approach leads to computationally feasible tests which are consistent against a broad class of alternatives. We test that all the IPRF are identically zero, i.e.

    H_0 : \gamma_j(x) = 0 \quad \forall j \ge 1, \; \forall x \in R, \; a.s.    (4)

The alternative H_1 is the negation of the null (4). If we define \gamma_{-j}(\cdot) = \gamma_j(\cdot) for j \ge 1, we can consider the Fourier transform of the IPRF \gamma_j(x),

    f(\lambda, x) = (2\pi)^{-1} \sum_{j=-\infty}^{\infty} \gamma_j(x) e^{-ij\lambda} \quad \forall \lambda \in [-\pi, \pi], \; x \in R,    (5)

which contains all the information about the null hypothesis (4). Because we are taking into account linear and nonlinear dependence, f(\lambda, x) is a generalization of the spectral (or cross-spectral) density function. We can also consider the generalized spectral distribution function as the integral of f(\lambda, x),

    H(\lambda, x) = 2 \int_{0}^{\lambda\pi} f(w, x) dw \quad \forall \lambda \in [0, 1], \; x \in R,

that is,

    H(\lambda, x) = \gamma_0(x)\lambda + 2 \sum_{j=1}^{\infty} \gamma_j(x) \frac{\sin j\pi\lambda}{j\pi}.    (6)

Note that both f(\lambda, x) and H(\lambda, x) exist as functions in an appropriate Hilbert space L^2(\Pi, \nu) defined below. The generalized spectral distribution function contains all the information about the pairwise regression functions and can be viewed as a generalization of the statistic used in Durlauf (1991) and Deo (2000), because they considered the autocorrelations instead of the pairwise regression functions and only took into account the second moment implications of the MDH. With the use of H(\lambda, x) we consider all the pairwise implications of the MDH, including both linear and nonlinear conditional dependencies. Notice that a more flexible weighting of the measures \gamma_j(x) is possible via a kernel function and a lag-bandwidth parameter, but this approach would introduce some arbitrariness in the test via the kernel and bandwidth choices. Our test is based on the sample analogue of (6),

    \hat H(\lambda, x) = \hat\gamma_0(x)\lambda + 2 \sum_{j=1}^{n-1} (1 - j/n)^{1/2} \hat\gamma_j(x) \frac{\sin j\pi\lambda}{j\pi},

with (1 - j/n)^{1/2} a finite sample correction factor which delivers a better finite sample performance, and where \{\hat\gamma_j(x)\}_{j=1}^{n-1} are given by (3). Notice that all the n-1 lags in the sample are used, and therefore there is no need to choose a lag order. Hong (2000) has proposed a related statistic to test for serial independence. Because (4) is equivalent to H(\lambda, x) = \gamma_0(x)\lambda, the test is based on the discrepancy between \hat H(\lambda, x) and \hat H_0(\lambda, x) := \hat\gamma_0(x)\lambda. That is, we consider the process

    S_n(\lambda, x) = (n/2)^{1/2} \{\hat H(\lambda, x) - \hat H_0(\lambda, x)\} = \sum_{j=1}^{n-1} (n-j)^{1/2} \hat\gamma_j(x) \frac{\sqrt{2} \sin j\pi\lambda}{j\pi}

to test H_0. Informally speaking, under the null (1) the generalized sample spectral distribution \hat H(\lambda, x) will be approximately equal to \hat H_0(\lambda, x), and then the process S_n(\lambda, x) will converge in distribution as n increases, while under the alternative \hat H(\lambda, x) will differ from \hat H_0(\lambda, x), and hence S_n(\lambda, x) will diverge to infinity as n increases.

In order to evaluate the distance of S_n(\lambda, x) from zero, a norm has to be chosen. In this context the natural norm is the Cramér-von Mises (CvM) norm. If F_n(x) is the usual empirical distribution function based on \{X_t\}_{t=1}^{n}, the CvM norm is

    D_n^2 := \int (S_n(\lambda, x))^2 F_n(dx) d\lambda = \sum_{j=1}^{n-1} \frac{(n-j)}{n(j\pi)^2} \sum_{t=1}^{n} \hat\gamma_j^2(X_t).    (7)

Our tests reject the null hypothesis for large values of the test statistic D_n^2.
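The statistic (7) needs no numerical integration: the \lambda-integral is available in closed form and the x-integral is a sum over the sample points. A direct, illustration-only Python sketch (O(n^3) as written, so meant for moderate n; the naming is ours):

```python
import numpy as np

def cvm_stat(Y, X):
    """Cramer-von Mises statistic D_n^2 of equation (7):
    sum_{j=1}^{n-1} (n-j)/(n (j pi)^2) * sum_t gamma_hat_j(X_t)^2."""
    Y, X = np.asarray(Y), np.asarray(X)
    n = len(Y)
    total = 0.0
    for j in range(1, n):                # all n-1 lags: no lag order to choose
        dev = Y[j:] - Y[j:].mean()       # Y_t - Ybar_{n-j}
        Xlag = X[:n - j]                 # X_{t-j}
        # gamma_hat_j evaluated at every sample point X_t
        gj = np.array([np.mean(dev * (Xlag <= x)) for x in X])
        total += (n - j) / (n * (j * np.pi) ** 2) * np.sum(gj ** 2)
    return total
```

Under a fixed alternative with conditional mean dependence at some lag, D_n^2 grows with n, while under the null it stays stochastically bounded.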

For simplicity we have restricted ourselves to the case of pairwise measures, but we could also employ a higher number of cross-moments. That is, in the same way that the bispectrum generalizes the spectral density function, we can consider the generalized bispectrum as the Fourier transform of the double sequence \gamma_{j,k}(x, y) = E[(Y_t - \mu) I(X_{t-j} \le x) I(X_{t-k} \le y)] to test the MDH. Also, we have considered only the univariate case, but all our methodology and results hold in the multivariate case, that is, when X_t is a p-dimensional random vector, this being one of the advantages of considering the empirical process S_n(\lambda, x) as a random element with values in a suitable Hilbert space.


4. ASYMPTOTIC THEORY

4.1. Asymptotic null distribution

In this section we first establish the null limit distribution of the biparameter process S_n(\eta) under (1). The null limit distribution of the new test is the limit distribution of a functional of S_n(\eta). To further elaborate the foregoing points we need some notation. Let \nu be the product measure of the probability distribution F and the Lebesgue measure on [0, 1]. Also, \Pi = [0, 1] \times [-\infty, \infty] and \eta := (\lambda, x) \in \Pi. We consider S_n(\eta) as a random element on the Hilbert space L^2(\Pi, \nu) of all square integrable functions (with respect to the measure \nu) with the inner product

    \langle f, g \rangle := \int_{\Pi} f(\eta) g(\eta) d\nu(\eta) = \int_{\Pi} f(\lambda, x) g(\lambda, x) F(dx) d\lambda.

L^2(\Pi, \nu) is endowed with the natural Borel \sigma-field induced by the norm \|f\| = \langle f, f \rangle^{1/2}, see Chapter VI in Parthasarathy (1967) for convergence results on Hilbert spaces. For recent applications in the econometric literature see Politis and Romano (1994), Chen and White (1996, 1998) or Chen and Fan (1999). If Z is an L^2(\Pi, \nu)-valued random variable with probability distribution \mu_Z, we say that Z has mean m \in L^2(\Pi, \nu) if E[\langle Z, f \rangle] = \langle m, f \rangle \; \forall f \in L^2(\Pi, \nu). If E\|Z\|^2 < \infty, then the covariance operator of Z (or \mu_Z), C_Z(\cdot) say, is a continuous, linear, symmetric, positive definite operator from L^2(\Pi, \nu) to L^2(\Pi, \nu) defined by C_Z(f) = E[\langle Z, f \rangle Z]. If \sum_{j=1}^{\infty} |\langle C_Z(f_j), f_j \rangle| < \infty for all complete orthonormal systems \{f_j\}, so that the sum \sum_{j=1}^{\infty} \langle C_Z(f_j), f_j \rangle is independent of the choice of \{f_j\}, we call C_Z a nuclear operator, and tr(C_Z) = \sum_{j=1}^{\infty} \langle C_Z(f_j), f_j \rangle is the trace of C_Z. Alternatively, we can define the trace in terms of the covariance function K(\eta, \eta') := Cov(Z(\eta), Z(\eta')) as tr(C_Z) = \int_{\Pi} K(\eta, \eta) F(dx) d\lambda whenever this integral is finite. Let \Longrightarrow denote weak convergence in the Hilbert space L^2(\Pi, \nu) endowed with the norm metric. Consider the following assumptions.

Assumption A1:
A1(a): \{X_t\}_{t=1}^{\infty} is a stationary ergodic process with probability distribution function F(\cdot) absolutely continuous with respect to Lebesgue measure. Also, Y_t is a measurable function of X_t.
A1(b): E|Y_1|^{2(1+\delta)} < C for some \delta > 0.

Note that Assumption A1 is mild; in particular, it allows us to consider conditionally heteroskedastic processes. Assumption A1(a) is standard in the independence testing literature, see e.g. Delgado (1996) or Koul and Stute (1999). Assumption A1(b) is much weaker than the moment assumptions used in Durlauf (1991) and Deo (2000) for testing the MDH, who assumed finite eighth moments.

The next theorem shows the null limit distribution of the L^2(\Pi, \nu)-valued random element S_n(\eta) and is one of the key results of this paper.

Theorem 1: Under Assumption A1 and (1), the process S_n(\eta) converges weakly to S(\eta) on L^2(\Pi, \nu), where S(\eta) is a continuous Gaussian process with mean zero and covariance operator

    C_Z(f) = \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} E\left[(Y_1 - \mu)^2 \int_{\Pi} \int_{\Pi} f(\eta) f(\eta') w_{1-j}(x) \Psi_j(\lambda) w_{1-k}(x') \Psi_k(\lambda') d\nu(\eta) d\nu(\eta')\right],

where f \in L^2(\Pi, \nu), w_t(x) is defined in Proposition 1, \Psi_j(\lambda) = (\sqrt{2} \sin j\pi\lambda)/(j\pi), \eta = (\lambda, x) and \eta' = (\lambda', x').

Remark 4.1. It is easy to show that, under independence, using the equality

    \sum_{j=1}^{\infty} \Psi_j(\lambda) \Psi_j(\lambda') = (\lambda \wedge \lambda' - \lambda\lambda'),    (8)

the covariance function of the process S(\eta) is equal to

    \sigma^2 (F(x \wedge x') - F(x) F(x'))(\lambda \wedge \lambda' - \lambda\lambda'),

where \sigma^2 = E[(Y_1 - \mu)^2]. Therefore, using the classical quantile transformation u = F(x) and S(\lambda, u) := S(\lambda, F^{-1}(u)), continuous functionals based on the limit process \sigma^{-1} S(\lambda, u) have the same distribution as continuous functionals based on the biparameter Brownian bridge on [0, 1]^2. Thus, in this case the asymptotic null distribution is nuisance parameter-free after standardization by a consistent estimate of \sigma^{-1}, and the quantiles of the asymptotic distribution of norms of S(\eta) can be tabulated.

Remark 4.2. We have considered convergence in the Hilbert space L^2(\Pi, \nu) instead of convergence in the generalization of the Skorohod space D[\Pi] (see Bickel and Wichura 1971) because of the spectral structure of our test statistic. In D[\Pi], the utilization of all lags j = 1, ..., n-1 requires more restrictive assumptions than those used in Theorem 1; compare Proposition 1 with Theorem 1. Proofs of all our asymptotic results on D[\Pi] can be obtained under the independence assumption from results similar to Hong (2000), or equivalently under m-dependence for an arbitrary m using the ideas of the proof of Theorem 1.

The next corollary shows the asymptotic distribution of our test statistic and follows from the Continuous Mapping Theorem (Billingsley's (1968) Theorem 5.1), Theorem 1 and Lemma 3.1 in Chang (1990).

Corollary 1: Under (1) and Assumption A1,

    D_n^2 \longrightarrow_d D_\infty^2 := \int (S(\eta))^2 F(dx) d\lambda.


Now we shall show that the asymptotic distribution of D_n^2 can be expressed as a weighted sum of independent \chi_1^2 random variables with weights depending on the DGP. This can serve as a basis to approximate the asymptotic distribution of D_n^2 and obtain critical values for the tests. Under A1,

    tr(C_Z) = \int_{\Pi} K(\eta, \eta) F(dx) d\lambda < \infty;

then the solutions \{l_i\} and \{\psi_i(\cdot)\} of the eigenvalue problem

    \int_{\Pi} K(\eta_1, \eta_2) \psi_i(\eta_2) d\nu(\eta_2) = l_i \psi_i(\eta_1)

are such that \{l_i\} are nonnegative and the corresponding eigenfunctions \{\psi_i(\cdot)\} form a complete orthonormal basis for L^2(\Pi, \nu). Hence any L^2(\Pi, \nu)-valued random element has a Fourier expansion in terms of \{\psi_i(\cdot)\}. In particular, we have

    S_n(\eta) = \sum_{i=1}^{\infty} \sqrt{l_i} \epsilon_{ni} \psi_i(\eta) \text{ in distribution}, \qquad S(\eta) = \sum_{i=1}^{\infty} \sqrt{l_i} \epsilon_i \psi_i(\eta) \text{ in distribution},

where

    \epsilon_i = \int_{\Pi} S(\eta) \psi_i(\eta) d\nu(\eta) \quad \text{and} \quad \epsilon_{ni} = \int_{\Pi} S_n(\eta) \psi_i(\eta) d\nu(\eta).

Note that by Theorem 1, \{\epsilon_i\}_{i=1}^{\infty} are i.i.d. N(0,1) random variables and \{\epsilon_{ni}\}_{i=1}^{\infty} are at least uncorrelated with unit variance. Then, by Parseval's identity,

    D_\infty^2 = \sum_{i=1}^{\infty} (\epsilon_i \sqrt{l_i})^2.    (9)
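The representation (9) suggests a simple Monte Carlo approximation of the null critical values once (estimates of) the eigenvalues are available. The sketch below uses a purely hypothetical truncated eigenvalue sequence for illustration; in practice the l_i depend on the DGP and would have to be estimated, or the bootstrap of Section 5 used instead:

```python
import numpy as np

def weighted_chi2_draws(eigs, size, rng):
    """Draw from the (truncated) limit law of (9): sum_i l_i * eps_i^2
    with eps_i i.i.d. N(0,1)."""
    eps = rng.standard_normal((size, len(eigs)))
    return (eps ** 2) @ np.asarray(eigs)

# Hypothetical truncated eigenvalue sequence, for illustration only
eigs = 1.0 / (np.pi ** 2 * np.arange(1, 51) ** 2)
rng = np.random.default_rng(0)
draws = weighted_chi2_draws(eigs, 100000, rng)
crit = np.quantile(draws, 0.95)  # approximate 5% critical value
```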

Only in certain special cases, such as independence of \{X_t\}, can the eigenvalues l_i and eigenfunctions \psi_i(\cdot) be calculated explicitly.

4.2. Consistency, Local Alternatives and Optimal Directional Tests

The consistency properties of the Cramér-von Mises test D_n^2 are considered in the following theorems. Note that our testing framework avoids the need to possess prior information about the alternative hypothesis because, as we shall show below, the test is consistent against any fixed alternative to the null hypothesis (4) satisfying A1.

Theorem 2: Under Assumption A1,

    \frac{2}{n} D_n^2 \longrightarrow_P \int_{\Pi} (H(\eta) - H_0(\eta))^2 F(dx) d\lambda = 2 \sum_{j=1}^{\infty} \frac{1}{(j\pi)^2} \int_{R} \gamma_j^2(x) F(dx).

Under the alternative hypothesis, there exists at least one j \ge 1 such that \gamma_j(x) \ne 0 on some subset of R with positive Lebesgue measure, and therefore, because F(x) is absolutely continuous with respect to Lebesgue measure, we have \int_{R} |\gamma_j(x)|^2 dF(x) > 0. Hence plim (2/n) D_n^2 > 0, i.e., the test is consistent against any fixed alternative to (4). In other words, under the alternative, \Pr(D_n^2 > C_n) \longrightarrow 1 for any nonstochastic sequence C_n = O(n), a property not attainable by the MDH tests of Durlauf (1991) and Deo (2000). To gain insight into the consistency properties of the test, the next theorem describes the behavior of our statistic under a sequence of alternative hypotheses tending to the null at the parametric rate n^{-1/2}. Consider the sequence of pairwise local alternatives

    H_{A,n} : E[Y_t - \mu | F_{t-1}] = \frac{g_t}{n^{1/2}}, \; a.s.,    (10)

where the random variables g_t satisfy the following assumption.

Assumption A2: \{g_t\} is F_{t-1}-measurable, zero mean, square integrable, and there exists a j \ge 1 such that, if \mu_j(x) := E[g_t I(X_{t-j} \le x)], then \Pr(\mu_j(X_t) = 0) < 1.

We call these pairwise local alternatives. A change from H_0 to H_{A,n} results in a nonrandom shift function in the asymptotic distribution of D_n^2, as is shown in the next theorem.

Theorem 3: Under the sequence of local alternatives (10) satisfying A2 and Assumption A1,

    S_n(\eta) \Longrightarrow S(\eta) + G(\eta),

where S(\eta) is the process defined in Theorem 1 and G(\eta) is the deterministic function

    G(\eta) = \sum_{j=1}^{\infty} \mu_j(x) \frac{\sqrt{2} \sin j\pi\lambda}{j\pi}.

Under the local alternatives (10), there exists at least one j \ne 0 such that \mu_j(x) \ne 0 on some subset of R with positive Lebesgue measure. In particular,

    \|G(\eta)\|^2 = \sum_{j=1}^{\infty} \frac{1}{(j\pi)^2} \int_{R} |\mu_j(x)|^2 dF(x) > 0.

Therefore, tests based on continuous functionals of S_n(\eta) have nontrivial power against the local alternatives (10), which converge at the parametric rate n^{-1/2}, a property not attainable by tests

that use lag-bandwidth parameters, see e.g. Li (1999), or which use a fixed number of lags, as those of Park and Whang (1999) or Koul and Stute (1999). Nevertheless, D_n^2 cannot detect alternatives that are pairwise MDS. The next corollary is an immediate consequence of the Continuous Mapping Theorem and Theorem 3.

Corollary 2: Under the local alternatives (10) satisfying A2, and Assumption A1,

    D_n^2 \longrightarrow_d \int (S(\eta) + G(\eta))^2 F(dx) d\lambda.

So far, we have considered general local alternatives satisfying A2 and we have shown that our test is able to detect such alternatives. This property is important because usually one does not possess prior information about the alternative hypothesis at hand. When the researcher has a particular type of alternative in mind, it can be desirable to use this information in the test procedure. We now propose a directional test for this situation by means of the principal components decomposition of the limit process S(\eta). Since H_{A,n} delivers a nonrandom shift in the mean function of the Gaussian process S(\eta), tests of H_0 against H_{A,n} can be viewed as tests of \tilde H_0 : E[S(\eta)] = 0 against \tilde H_{A,n} : E[S(\eta)] = G(\eta). In a fundamental work, Grenander (1952) generalized the optimal Neyman-Pearson theory to this framework, see also Stute (1997) and references therein. In particular, we can deduce optimal directional tests of (4) against (10) by means of the Neyman-Pearson Lemma in its functional form. To begin with, applying again Parseval's identity we have

    \int_{\Pi} (S(\eta) + G(\eta))^2 d\nu(\eta) = \sum_{i=1}^{\infty} (\tau_i + \epsilon_i \sqrt{l_i})^2,

where \epsilon_i, l_i and \psi_i are as before and \{\tau_i\} are the Fourier coefficients of G(\eta), that is,

    \tau_i = \int_{\Pi} G(\eta) \psi_i(\eta) d\nu(\eta).

Under the mild assumption

    \sum_{i=1}^{\infty} \frac{\tau_i^2}{l_i} < \infty,    (11)

we have that the distribution of S(\cdot) under the alternative H_{A,n}, say \mu_{1n}, is absolutely continuous with respect to the distribution of S(\cdot) under the null, see Grenander (1952). The Radon-Nikodym derivative equals

    g \longrightarrow \exp\left( \int_{\Pi} A(\eta) \left[ g(\eta) - \frac{G(\eta)}{2} \right] d\nu(\eta) \right),    (12)

where A(\eta) = \sum_{i=1}^{\infty} (\tau_i / l_i) \psi_i(\eta). The Neyman-Pearson Lemma and expression (12) immediately yield

that the optimal directional test consists in rejecting (4) in favor of (10) if and only if

    \sum_{i=1}^{\infty} \frac{\tau_i \epsilon_i}{l_i} \ge c_\alpha,

where c_\alpha is such that the type I error is equal to \alpha. Therefore, for a finite sample size, an approximate Neyman-Pearson test of (4) against (10) is given by the critical region

    \sum_{i=1}^{m} \frac{\hat\tau_i \hat\epsilon_i}{\hat l_i} \ge c_\alpha,

where

    \hat\epsilon_i = \int_{\Pi} S_n(\eta) \hat\psi_i(\eta) F_n(dx) d\lambda, \qquad \hat\tau_i = \int_{\Pi} G_n(\eta) \hat\psi_i(\eta) F_n(dx) d\lambda,

    G_n(\eta) = \sum_{j=1}^{n-1} \left( \frac{1}{n-j} \sum_{t=1+j}^{n} g_t I(X_{t-j} \le x) \right) \frac{\sqrt{2} \sin j\pi\lambda}{j\pi},

and \hat\psi_i and \hat l_i are estimates of the eigenfunctions \psi_i(\cdot) and eigenvalues l_i, see e.g. Dauxois, Pousse and Romain (1982). Note that c_\alpha can be approximated via a bootstrap procedure, for instance the wild bootstrap procedure defined in Section 5.

4.3. Power analysis

Now we consider the parametrized local alternatives

    H_{A,n}(c) : E[Y_t - \mu | F_{t-1}] = \frac{c\sigma g_t}{\sqrt{n}}, \; a.s.,    (13)

with g_t verifying A2, E[g_t^2] = 1 and E[(Y_t - \mu)^2 \mid \mathcal F_{t-1}] = \sigma^2 a.s. To gain insight into the asymptotic power properties of the Cramér-von Mises test D_n^2, we study its asymptotic power as a function of c, i.e.

    \Pi_{CvM}(c) := \Pr\big( D_{\infty}^2 \text{ rejects } H_0 \mid H_{A,n}(c) \big).

In particular, we are interested in the behavior of this function both as c \to \infty and as c \to 0. Following the arguments of Theorem 4 of Bierens and Ploberger (1997), we obtain the following result on the rate of the asymptotic power function of the Cramér-von Mises test as c \to \infty.

Corollary 3: Under the sequence of alternative hypotheses (13) and Assumptions A1-A2, for any positive constant K,

    \lim_{c\to\infty} c^{-2} \ln \Pr\big( D_{\infty}^2(c) \le K \big) = -\frac{1}{2}.

This result implies that if the test has nontrivial local power, then \Pi_{CvM}(c) approaches 1 at an exponential rate as c \to \infty. This result is even stronger when compared with the asymptotic behaviour of the t-statistic for \delta = 0 in the regression

    Y_t = \mu + \delta g_t^* + u_t,

with u_t = Y_t - E[Y_t \mid \mathcal F_{t-1}], E[u_t^2 \mid \mathcal F_{t-1}] = \sigma^2 a.s., and where g_t^* is some "guess" of g_t which also satisfies A2. If \rho = Corr(g_t, g_t^*) is the correlation coefficient between g_t and g_t^*, and

    \Pi_t(c) := \lim_{n\to\infty} \Pr\big( t\text{-test rejects } H_0 \mid H_{A,n}(c) \big),

then it is proved in Theorem 5 of Bierens and Ploberger (1997) that

    \lim_{c\to\infty} c^{-2} \ln\big( 1 - \Pi_t(c) \big) = -\frac{\rho^2}{2}.

This result implies that if the correlation coefficient \rho is not equal to 1 or -1, then there exists a c_0 such that \Pi_{CvM}(c) > \Pi_t(c) for c > c_0; that is, as long as the correlation between g_t^* and g_t is not perfect, our Cramér-von Mises test is more powerful than the t-test uniformly for large c. Alternatively, we also consider the case c \to 0. Let P_0 be the probability measure associated to S(\eta) under the null hypothesis and K_{\alpha} such that \Pr( D_{\infty}^2 \le K_{\alpha} ) = 1 - \alpha. Then, from a Taylor expansion of (12) around c = 0, see Stute (1997) p. 31, we obtain the following result.

Corollary 4: Under Assumptions A1-A2 and assuming \sigma^2 = 1,

    \Pi_{CvM}(c) = \alpha + \frac{c^2}{2} \sum_{i=1}^{\infty} l_i^{-1}\tau_i^2 \Big( 1 - \alpha - \int_{\{D_{\infty}^2 \le K_{\alpha}\}} \varepsilon_i^2 \, dP_0 \Big) + o(c^2),    (14)

where \{\varepsilon_i\}_{i=1}^{\infty} is a sequence of i.i.d. N(0,1) random variables.

The coefficient of c^2 in (14) constitutes the curvature of the asymptotic power function of the Cramér-von Mises test D_{\infty}^2(c). Since in the case of an arbitrary (unknown) \sigma^2 the test is based on S_n(\eta)/\sigma_n rather than S_n(\eta), with \sigma_n a consistent estimate of \sigma, we have to replace c^2 in (14) by c^2/\sigma^2. This reflects the loss of power as the noise variance increases.

In a preceding section we showed how to construct the (approximate) most powerful region for testing (4) against (10) by means of the Neyman-Pearson Lemma. Now, suppose we want to test (4) against the family of simple alternatives (13) parametrized by c. When c takes values of one sign, one can obtain uniformly most powerful tests. Applying ideas from Grenander's (1952) Section 4.5, we obtain that under (11) the test defined by the critical region

    \int_{\Pi} A(\eta) S(\eta)\, d\nu(\eta) \ge c_{\alpha}

if c > 0 (and with the \le sign if c < 0) is the one-sided uniformly most powerful test for (4) against the alternatives (13). Similarly, assuming that |c| < M for some positive constant M, the uniformly most powerful unbiased test has critical region

    \Big| \int_{\Pi} A(\eta) S(\eta)\, d\nu(\eta) \Big| \ge c_{\alpha}.

When one considers composite alternatives, uniformly most powerful tests seldom exist, except for some specific alternatives, e.g. the one-sided alternatives (13). Now we show that under Gaussianity the Cramér-von Mises test is asymptotically equivalent to a likelihood ratio test by means of an appropriate probability measure defined on the space of local alternatives, see Theorem 6 of Bierens and Ploberger (1997) for details. From this equivalence the asymptotic admissibility of our Cramér-von Mises test under Gaussianity is established.

Assumption B: Y_t \mid \mathcal F_{t-1} \sim N(\mu, \sigma^2).

Corollary 5: Under Assumptions A1-A2 and B, the Cramér-von Mises test is asymptotically admissible; that is, there does not exist a test that is uniformly more powerful.

5. BOOTSTRAP APPROXIMATION AND FINITE SAMPLE PERFORMANCE

The asymptotic distribution of the Cramér-von Mises test statistic can be expressed as a weighted sum of independent \chi^2 random variables by application of Mercer's Theorem and, although the weights depend on the DGP, they can be approximated by various methods. A simple \chi^2 approximation is due to Satterthwaite (1941, 1946); under independence this approximation can be accurate, see Koch and Yang (1986) for an application. Here we consider an alternative approach: the main idea is to estimate the distribution of S_n(\eta) by that of

    S_n^*(\eta) = \sum_{j=1}^{n-1} (n-j)^{1/2}\, \hat\gamma_j^*(x)\, \frac{\sqrt{2}\sin j\pi\lambda}{j\pi},

with

    \hat\gamma_j^*(x) = (n-j)^{-1} \sum_{t=1+j}^{n} (Y_t - \bar Y_{n-j})\{ I(X_{t-j}\le x) - F_{n-j}(x) \} W_t,

where \{W_t\} is a sequence of independent random variables with zero mean, unit variance and bounded support, independent of the sequence \{X_t\}. This procedure has been called the wild bootstrap (see, e.g., Wu (1986) and Härdle and Mammen (1993)), and is related to the approach of Hansen (1996). The next theorem shows the validity of the bootstrap and allows us to calculate asymptotically valid critical values for the tests.

Theorem 4: Under the null hypothesis (1), under any fixed alternative hypothesis, or under the local alternatives (10),

    S_n^*(\eta) \Longrightarrow^* S(\eta), \quad a.s. \text{ in } L^2(\Pi,\nu),

where S(\eta) is the process defined in Theorem 1 and \Longrightarrow^* a.s. denotes weak convergence almost surely under the bootstrap law; that is, if the sample is \chi_n,

    \rho_w\big( \mathcal L(S_n^*(\eta) \mid \chi_n), \mathcal L(S(\eta)) \big) \to 0 \quad a.s. \text{ as } n \to \infty,

where \mathcal L(S_n^*(\eta) \mid \chi_n) is the law of S_n^*(\eta) given the sample and \rho_w is any metric metrizing weak convergence in L^2(\Pi, \nu), see Politis and Romano (1994). Therefore, we can approximate the asymptotic null distribution of the process S_n(\eta) by that of S_n^*(\eta). In particular, we can simulate the critical values for the test statistic D_n^2 by the following algorithm:

1. Calculate the test statistic D_n^2 with the original sample.
2. Generate \{W_t\}, a sequence of independent random variables with zero mean, unit variance and bounded support, independent across observations and independent of the sample.
3. Compute \hat\gamma_j^*(x), S_n^*(\eta) and D_n^{*2}.
4. Repeat steps 2 and 3 B times and compute the empirical (1-\alpha)-th sample quantile of D_n^{*2} from the B values, D_{n,\alpha}^{*2}. The proposed test rejects the null hypothesis at significance level \alpha if D_n^2 > D_{n,\alpha}^{*2}.
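The algorithm above can be sketched in code. The sketch below is illustrative rather than the authors' implementation: it assumes the MDH case Y_t = X_t, evaluates the integral over x at the sample points so that D_n^2 = \sum_j (n-j)(j\pi)^{-2} n^{-1}\sum_i \hat\gamma_j(Y_i)^2 (a representation consistent with S_n and S_n^* above, but the exact definition (7) is not reproduced in this section), and uses Rademacher weights, which also have zero mean, unit variance and bounded support; all function names are hypothetical.

```python
import numpy as np

def cvm_stat(Y, W=None):
    """D_n^2 = sum_{j=1}^{n-1} (n-j) (j*pi)^{-2} * n^{-1} sum_i gamma_j(Y_i)^2,
    where gamma_j is the pairwise measure gamma_hat_j, or its wild-bootstrap
    version gamma_hat_j^* when weights W are supplied."""
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    x = Y  # evaluation points for the integral with respect to F_n
    total = 0.0
    for j in range(1, n):
        lagged = Y[:n - j]             # Y_{t-j}, t = 1+j, ..., n
        current = Y[j:]                # Y_t
        w = np.ones(n - j) if W is None else W[j:]
        ybar = current.mean()          # sample mean over t = 1+j, ..., n
        ind = (lagged[:, None] <= x[None, :]).astype(float)  # 1(Y_{t-j} <= x_i)
        Fnj = ind.mean(axis=0)                               # F_{n-j}(x_i)
        gamma = ((current - ybar)[:, None] * (ind - Fnj[None, :])
                 * w[:, None]).mean(axis=0)
        total += (n - j) * (gamma ** 2).mean() / (j * np.pi) ** 2
    return total

def wild_bootstrap_pvalue(Y, B=199, seed=None):
    """Steps 1-4 of the algorithm: D_n^2 plus B wild-bootstrap replicates."""
    rng = np.random.default_rng(seed)
    d = cvm_stat(Y)
    boot = np.array([cvm_stat(Y, W=rng.choice([-1.0, 1.0], size=len(Y)))
                     for _ in range(B)])
    return d, float((boot >= d).mean())
```

With the golden-ratio two-point weights of the simulation section in place of the Rademacher draws, the procedure matches the Monte Carlo design described below.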

Note that, given the result obtained in Theorem 4, the proposed bootstrap test has correct asymptotic level, is consistent, and is able to detect alternatives tending to the null at the parametric rate n^{-1/2}.

In order to examine the finite sample performance of the proposed test we carry out a simulation experiment with several DGPs under the null and under the alternative. Here we focus on the usual MDH, that is, when Y_t = X_t. The first block of models considered here has been used by Dominguez and Lobato (2000) and will thus be useful for comparing both tests. We also compare the tests with the conditional heteroskedasticity robust Durlauf (1991) test proposed by Deo (2000). We briefly describe our simulation setup. We denote by D_n^2 the new Cramér-von Mises test statistic defined in (7). Let \bar Y = n^{-1}\sum_{t=1}^{n} Y_t be the usual sample mean.

Dominguez and Lobato (2000) have considered a MDH test taking into account a fixed number of lags. We denote by CvM_P and KS_P the Cramér-von Mises and Kolmogorov-Smirnov statistics, respectively, with P the number of lags used. These statistics are based on the multivariate integrated regression function, i.e.

    CvM_P = \frac{1}{n^2} \sum_{j=1}^{n} \Big[ \sum_{t=1}^{n} (Y_t - \bar Y)\, I(\tilde z_{t,P} \le \tilde z_{j,P}) \Big]^2,

    KS_P = \max_{1\le i\le n} \Big| \frac{1}{\sqrt{n}} \sum_{j=1}^{n} (Y_j - \bar Y)\, I(\tilde z_{j,P} \le \tilde z_{i,P}) \Big|,

where \tilde z_{t,P} = (Y_{t-1}, \ldots, Y_{t-P}) is the vector of P lagged values of the series. To save space, only results for P = 1 and 2 are presented. Note that CvM_1 and KS_1 are functionals of a process studied by Koul and Stute (1999) in a more general context.
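The statistics CvM_P and KS_P can be computed directly from their definitions. A sketch (hypothetical function name; the first P observations are dropped, so all sums run over the m = n - P usable time points, an edge convention the definitions leave implicit):

```python
import numpy as np

def dl_stats(Y, P):
    """CvM_P and KS_P of Dominguez and Lobato (2000), with sums taken over
    the m = n - P time points for which the lag vector z_{t,P} exists."""
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    # z_{t,P} = (Y_{t-1}, ..., Y_{t-P}) for t = P+1, ..., n
    Z = np.column_stack([Y[P - k:n - k] for k in range(1, P + 1)])
    Yc = Y[P:]
    m = n - P
    # le[t, i] = 1(z_t <= z_i), componentwise comparison of lag vectors
    le = np.all(Z[:, None, :] <= Z[None, :, :], axis=2).astype(float)
    dev = Yc - Yc.mean()
    S = dev @ le                  # S[i] = sum_t (Y_t - Ybar) 1(z_t <= z_i)
    cvm = (S ** 2).sum() / m ** 2
    ks = np.abs(S).max() / np.sqrt(m)
    return cvm, ks
```

For P = 1 the indicator comparison reduces to the univariate ordering of the lagged series, recovering the Koul-Stute marked empirical process mentioned above.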

Recently, Deo (2000) has proposed a correction of Durlauf's (1991) test to take into account conditional heteroskedasticity. The corrected statistic is

    DURC := n \sum_{j=1}^{n-1} \hat a_j^2 \Big( \frac{1}{j\pi} \Big)^2,

where

    \hat a_j = \hat\rho_j \Big[ (n-j)^{-1} \sum_{t=1}^{n-j} (Y_t - \bar Y)^2 (Y_{t+j} - \bar Y)^2 \Big]^{-1/2},

    \hat\rho_j = (n-j)^{-1} \sum_{t=1}^{n-j} (Y_t - \bar Y)(Y_{t+j} - \bar Y).
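The corrected statistic is straightforward to compute from these definitions; a sketch (hypothetical function name, using the (n-j)^{-1} normalization adopted in the text):

```python
import numpy as np

def durc(Y):
    """Deo's (2000) heteroskedasticity-corrected Durlauf statistic with the
    (n-j)^{-1} normalization of rho_hat_j used in the text."""
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    d = Y - Y.mean()
    total = 0.0
    for j in range(1, n):
        lead, lag = d[j:], d[:n - j]
        rho = (lag * lead).mean()            # rho_hat_j
        v = (lag ** 2 * lead ** 2).mean()    # studentizing factor a_hat_j^{-2} * rho_hat_j^2
        if v > 0.0:                          # skip degenerate lags
            total += n * (rho / np.sqrt(v)) ** 2 / (j * np.pi) ** 2
    return total
```

Comparing the output with the asymptotic critical values quoted below (e.g. 0.461 at the 5% level) gives the test decision.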

We have considered the factor (n-j)^{-1} in \hat\rho_j instead of n^{-1}, as used in Deo (2000), because it gives better finite sample performance. Note that we do not consider a kernel or weighting function. Under the null hypothesis of the MDS and some additional assumptions (see Deo (2000)),

    DURC \longrightarrow_d \int_0^1 B^2(t)\, dt \quad \text{as } n \to \infty,

where B(t) is the standard Brownian bridge on [0,1]. The 10%, 5% and 1% asymptotic critical values are obtained from Shorack and Wellner (1986, p. 147) and are 0.347, 0.461 and 0.743, respectively, although we have also used the empirical critical values in the simulations.

In the sequel \varepsilon_t \sim i.i.d. N(0,1). The first block of models considered in the simulations comprises two martingale difference sequences:

1. A sequence of i.i.d. N(0,1) variates.
2. GARCH(1,1) processes:

    y_t = \varepsilon_t \sigma_t,
    \sigma_t^2 = w + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2,

with w = 0.001 and the combinations (\alpha, \beta) = (0.01, 0.97), (0.09, 0.89) and (0.09, 0.90); we call these processes GARCH1, GARCH2 and GARCH3, respectively.

And the following non-martingale sequences:

3. A nonlinear moving average (NLMA) process: y_t = \varepsilon_{t-1}\varepsilon_{t-2}(\varepsilon_{t-2} + \varepsilon_t + 1).
4. Bilinear processes: y_t = \varepsilon_t + b_1 \varepsilon_{t-1} y_{t-1} + b_2 \varepsilon_{t-1} y_{t-2}, with (b_1, b_2) = (0.15, 0.05) and (0.25, 0.15); we call these processes BIL-I and BIL-II, respectively.

Note that the second and third GARCH models have unbounded eighth and sixth moments, respectively. Also note that the NLMA process is uncorrelated, and therefore the usual tests, such as those of Box and Pierce (1970), Durlauf (1991), Anderson (1993, 1997) or Hong (1996), have no asymptotic power against this model.

We consider a sample size of n = 100 for the experiments under the null and n = 100, 200 and 300 under the alternative. The number of Monte Carlo experiments is 1000 and the number of bootstrap replications is B = 500. In all replications, 200 pre-sample data values of the processes were generated and discarded. Random numbers were generated using the IMSL ggnml subroutine. We employ a sequence \{W_t\} of i.i.d. Bernoulli variates with P(W = 0.5(1-\sqrt 5)) = (1+\sqrt 5)/(2\sqrt 5) and P(W = 0.5(1+\sqrt 5)) = 1 - (1+\sqrt 5)/(2\sqrt 5). Note that the third moment of W equals 1, and hence the first three moments of the bootstrap series coincide with those of the original series, see Stute, Gonzalez-Manteiga and Presedo-Quindimil (1998).

In Tables I and II we show the empirical rejection probabilities (RP) associated with the three nominal levels 10%, 5% and 1%. The results for D_n^2, CvM_1, KS_1, CvM_2, KS_2 and DURC show good size properties and are robust to thick tails, given their behavior with GARCH models. For DURC we also present in brackets the RP using the empirical critical values based on simulations with i.i.d. standard normal variables and 10,000 replications.
In Table III we report the empirical power against the NLMA process. It increases with the sample size n, as expected. Generally, the Kolmogorov-Smirnov test has more empirical power than the Cramér-von Mises tests. No test dominates the others uniformly, although the statistics CvM_P and KS_P for P \ge 2 have less power as P increases. Deo's (2000) statistic, DURC, has no power against this alternative, as expected, because this NLMA model is uncorrelated. Among the statistics D_n^2, CvM_1 and KS_1 the difference is not substantial, because the NLMA process has dependence in the conditional mean only at the first lags.

In Tables IV and V we show the RP for the bilinear models. As in the NLMA case, no test dominates the others uniformly and, again, the statistics CvM_P and KS_P for P \ge 2 have less power in almost all cases. In the BIL-II case, CvM_1 and KS_1 perform slightly better than D_n^2, but the difference is not substantial. The empirical power of DURC is lower in all cases. In these examples the dependence is present only at the first lags, and therefore there is little difference in practice between D_n^2, CvM_1 and KS_1. For models with dependence at higher lags, D_n^2 is expected to have more empirical power than CvM_1 and KS_1, because D_n^2 considers all lags. For instance, we consider the following second block of models:

5. A linear ARMA model: (1 - 0.3L)Y_t = (1 - 0.5L^2)\varepsilon_t, where L is the lag shift operator, i.e. LY_t = Y_{t-1}.
6. A non-Gaussian moving average model (NGMA): y_t = \exp(\varepsilon_t) - 0.7\exp(\varepsilon_{t-3}).
7. A threshold autoregressive model (TAR):

    y_t = 0.1 y_{t-3} - 0.5 y_{t-4} + \varepsilon_t   if y_{t-3} \ge 1,
    y_t = -0.5 y_{t-3} + 0.4 y_{t-4} + \varepsilon_t  if y_{t-3} < 1.

The empirical power for a sample size of n = 100 at the 5% level, with the same design as in Tables III-V, is shown in Table VI. Now D_n^2 has more empirical power than CvM_P and KS_P, P = 1 and 2, because of the higher-lag dependence. The conclusion for these cases is that it is preferable to sum up pairwise information at various lags than to consider many lags simultaneously. Although our additive approach restricts the hypothesis to be tested (from (1) to (4)), it is able to break the curse of dimensionality that affects the statistics CvM_P and KS_P for moderate P. Summarizing, D_n^2 has good omnibus empirical power against all linear and nonlinear dependencies, regardless of whether the dependence is at high or low lags.

6. EXCHANGE RATES DYNAMICS

In this section we investigate, by means of our generalized spectral distribution test and the IPAF, the dynamics of the daily log price changes of the British Pound exchange rate in terms of the US Dollar (BPUSD). This problem has been explored in, e.g., Hsieh (1989), Gallant, Hsieh and Tauchen (1991) and Bera and Higgins (1997), and examined recently by Dominguez and Lobato (2000), among others. The data consist of two samples, the first from January 2nd, 1974 to December 31st, 1983 (BPUSD1), and the second from December 12th, 1985 to February 28th, 1991 (BPUSD2). For a better comparison, we discard the final 10% of observations in BPUSD2, as in Bera and Higgins (1997). The numbers of observations for BPUSD1 and BPUSD2 are then 2505 and 1210, respectively. The rates of change are calculated by taking logarithmic differences between successive trading days, i.e., Y_t = 100\log(r_t/r_{t-1}), where r_t denotes the U.S. Dollar price of a Pound at time t. Table VII provides summary statistics of the data. The sample distribution of the data has heavy tails in both periods, especially in BPUSD1, and the kurtosis coefficients are substantially larger than that of the standard normal distribution (which is 3).

Earlier investigations focused on the linear predictability (or lack thereof) of exchange rates. Usually, exchange rate data are serially uncorrelated. Here we use Deo's (2000) test statistic to check whether the data are uncorrelated. Previous results have shown that BPUSD1 has little linear dependence, see e.g. Table 2 in Hsieh (1989). This is in agreement with the p-values of the DURC test in Table VIII. These results also show that BPUSD2 has no linear dependence, which is in agreement with the findings of Bera and Higgins (1997). There has been some evidence in the literature supporting that exchange rate changes exhibit nonlinear dependence, see e.g. Hsieh (1989). We now investigate the type of serial dependence present in the BPUSD by means of the IPAF and the generalized spectral distribution tests. An important problem is to distinguish whether the serial dependence affects the conditional mean or the conditional variance. The solution of this problem is crucial because it has important implications in economics. Hsieh (1989) proposed a test based on third-order cumulants to discriminate between both types of dependence and found evidence in favor of multiplicative dependence, that is, evidence that the serial dependence affects the conditional variance, while rejecting dependence in the conditional mean.

To test these hypotheses we use our generalized spectral test and present the results for the two periods in Table VIII. To facilitate interpretation we show the p-values for D_n^2, CvM_P and KS_P for P = 1 and 2. For BPUSD1, Dominguez and Lobato (2000) found evidence against the MDH for P = 1, but for P = 2 they supported the MDH; note that this is a contradictory result. The statistic D_n^2 shows strong evidence against the MDH, which disagrees with the findings of Hsieh (1989). These results can be explained by the fact that the data have third-order cumulants equal to zero but are not a MDS. For the second period we find evidence supporting the MDH with all the tests. Hence, a nonlinear model for the conditional mean cannot explain the nonlinear dependence in BPUSD2, in particular the bilinear model used in Bera and Higgins (1997).

To gain insight into the serial dependence properties of the data we consider the IPAF for the two periods and also for the squares of the data. In Tables IX and X we show the Kolmogorov-Smirnov test statistics for the IPAF at different lags and different combinations of variables, together with the bootstrap 95% quantile, to indicate the significance of the IPAF for BPUSD1 and BPUSD2. For the definition of the corresponding Kolmogorov-Smirnov test statistics KS_{YX}(j), see Section 2. Table IX reveals that the nonlinearity in the conditional mean is significant at lags 1 and 5 for BPUSD1, confirming a weekly effect in this daily data set. BPUSD1 also appears highly heteroskedastic at all lags and conditionally asymmetric at lag j = 1. The results in Table X display a different behavior for BPUSD2, for which none of the IPAF are significant except some integrated conditional variances.

We also plot in Figures 1 and 2 the IPRF at lag j = 1 for the BPUSD1 data levels and squares, respectively (Y_t and Y_t^2 as dependent variables and Y_{t-1} as the conditioning variable), together with the corresponding uniform confidence bands under the null of the MDH and under the alternative. The pattern of the IPAF for the data confirms some stylized facts about exchange rate changes. For instance, Figure 2 shows the well-known "leverage effect" in exchange rates, whereby volatility is higher when past rate changes are negative. The IPAF with Y_t^3 as dependent variable, not shown to save space, reveals that large changes in exchange rates are often negative. On the other hand, Figure 1 shows the IPAF for the BPUSD1 data and confirms that the MDH is rejected at lag j = 1, although the autocorrelation at lag j = 1 is not significantly different from zero. This reveals that the conditional mean at this lag cannot be linear and that there is nonlinear dependence. Because the slope of the IPAF at each point is proportional to the regression function, we observe that the regression function has the same sign as the lagged exchange rate change. This feature accords with the well-known fact that the sample autocorrelation at lag j = 1 of exchange rates is usually positive.

Summarizing, our new tests find nonlinear dependence in the conditional mean of the BPUSD1 exchange rate changes, in contrast with previous studies which assume that exchange rate changes are very nearly unpredictable given past prices. The nonlinearity in the conditional mean of BPUSD1 suggests that additional effort should be devoted to investigating the form of such nonlinearity before modeling the conditional variance.

7. CONCLUSIONS AND SUMMARY

In this article we have introduced a new test for the MDH based on a functional of the generalized spectral distribution function built from the IPAF or the IPRF. These new dependence measures can play a valuable role in nonlinear time series analysis in the same way that autocovariances and cross-covariances do in the linear setup, as illustrated for the analysis of exchange

rates dynamics. Our test is able to detect failures of the MDS assumption for processes that are uncorrelated. The asymptotic distribution of the test statistic and its asymptotic power properties have been discussed in detail, and asymptotic admissibility under Gaussianity has been proved. We have also proposed approximate optimal Neyman-Pearson directional tests. In the case of dependent data, the asymptotic null distribution depends on the DGP and hence we have proposed to implement the test via a bootstrap procedure. We have justified this approximation theoretically and its performance has been shown through simulated examples. We have carried out an empirical comparison with recently proposed alternative tests for the MDH, showing that our test has omnibus empirical power against linear and nonlinear dependencies. We have also shown some evidence against the MDH in exchange rates; in particular, we have shown that these new measures of dependence yield very helpful insights into the kind of serial dependence present in both the conditional mean and the variance. In particular, we have used those measures to show that exchange rate changes can be predictable given past values of the series, although they may not be linearly predictable. A challenging problem is the specification of the nonlinearity in the conditional mean and the development of consistent specification tests in a general framework.

Department of Statistics and Econometrics University Carlos III of Madrid, 28903 Getafe, Madrid, Spain e-mail: [email protected]. and Department of Statistics and Econometrics University Carlos III of Madrid, 28911 Leganes, Madrid, Spain e-mail: [email protected].


APPENDIX: PROOFS

In all proofs, C, C_1 and C_2 are generic constants that may change from one expression to another. Let

    Z_n(\lambda, x) = \sum_{j=1}^{n-1} (n-j)^{1/2}\, \hat r_j(x)\, \frac{\sqrt{2}\sin j\pi\lambda}{j\pi},

where

    \hat r_j(x) = \frac{1}{n-j} \sum_{t=1+j}^{n} (Y_t - \mu)\{ I(X_{t-j}\le x) - F(x) \}.

First, consider the next lemma.

First, consider the next lemma.

Lemma A.1: Under (1) and Assumption A1, \|S_n(\eta) - Z_n(\eta)\| \longrightarrow_P 0.

Proof of Lemma A.1: Note that

    \hat\gamma_j(x) = \hat r_j(x) - \Big[ \frac{1}{n-j}\sum_{t=1+j}^{n} (Y_t - \mu) \Big]\Big[ \frac{1}{n-j}\sum_{t=1+j}^{n} \{ I(X_{t-j}\le x) - F(x) \} \Big],    (15)

and then we have that S_n(\eta) = Z_n(\eta) - R_n(\eta), where

    R_n(\eta) = \sum_{j=1}^{n-1} (n-j)^{1/2} \Big[ \frac{1}{n-j}\sum_{t=1+j}^{n} (Y_t - \mu) \Big]\Big[ \frac{1}{n-j}\sum_{t=1+j}^{n} \{ I(X_{t-j}\le x) - F(x) \} \Big] \frac{\sqrt{2}\sin j\pi\lambda}{j\pi}.    (16)

Hence,

    \|R_n(\eta)\|^2 = \sum_{j=1}^{n-1} \frac{1}{(j\pi)^2} \Big[ \frac{1}{(n-j)^{1/2}}\sum_{t=1+j}^{n} (Y_t - \mu) \Big]^2 \int_{[-\infty,\infty]} [F_{n-j}(x) - F(x)]^2 \, F(dx),

where

    F_{n-j}(x) = \frac{1}{n-j}\sum_{t=1+j}^{n} I(X_{t-j}\le x).

It is straightforward to show that under (1), for fixed K,

    (n-j)^{-1/2} \sum_{t=1+j}^{n} (Y_t - \mu) = O_P(1) \quad \forall\, 1\le j < K,

whereas by stationarity, ergodicity and monotonicity of F_{n-j}(x), the Glivenko-Cantelli Theorem holds, i.e.

    \sup_{x\in\mathbb R} |F_{n-j}(x) - F(x)| = o_P(1) \quad \forall\, 1\le j < K.

Then, apply a partition argument similar to that used in the proof of Theorem 1 below and conclude using Theorem 4.2 of Billingsley (1968) (cf. Theorem A3). Hence \|R_n(\eta)\|^2 = o_P(1) as n goes to infinity. ∎
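Identity (15) is the elementary "covariance" identity: for any constants \mu and F(x), the cross moment centered at the sample mean and empirical CDF equals the (\mu, F)-centered cross moment minus the product of the centered means. This can be checked numerically (the helper below is a hypothetical illustration, not part of the proof):

```python
import numpy as np

def identity_15_gap(Y, X, j, mu, F):
    """Max absolute difference between gamma_hat_j(x) and
    r_hat_j(x) - A * B(x), evaluated at the sample points x = X_i, where
    A = mean(Y_t - mu) and B(x) = mean(1(X_{t-j} <= x) - F(x))."""
    Y, X = np.asarray(Y, dtype=float), np.asarray(X, dtype=float)
    n = len(Y)
    cur, lag = Y[j:], X[:n - j]
    x = X
    ind = (lag[:, None] <= x[None, :]).astype(float)    # 1(X_{t-j} <= x_i)
    gamma = ((cur - cur.mean())[:, None]
             * (ind - ind.mean(axis=0))).mean(axis=0)   # gamma_hat_j
    r = ((cur - mu)[:, None] * (ind - F(x))).mean(axis=0)  # r_hat_j
    A = (cur - mu).mean()
    B = (ind - F(x)).mean(axis=0)
    return float(np.max(np.abs(gamma - (r - A * B))))
```

The gap is zero up to floating-point error for any choice of mu and F, which is exactly what the decomposition S_n = Z_n - R_n exploits.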

Proof of Proposition 1: From (15) it is easy to show that, under the assumptions of Proposition 1, for each fixed j, 1 \le j < n-1,

    \sup_{x\in\mathbb R} \big| (n-j)^{1/2}\{ \hat\gamma_j(x) - \hat r_j(x) \} \big| = o_P(1) \quad \text{as } n\to\infty.

Therefore, it suffices to derive the null distribution of the process \hat r_j(x). To this end, we have to show that the finite dimensional distributions converge to those of the specified Gaussian process and that the process \hat r_j(x) is tight. The finite dimensional convergence follows easily by applying the Cramér-Wold device and the central limit theorem for stationary ergodic martingale differences of Billingsley (1961). The tightness follows from Lemma 3.1 in Koul and Stute (1999). ∎

Proof of Theorem 1: We need to show that the finite dimensional projections \langle Z_n(\eta), h\rangle are asymptotically normal for all h \in L^2(\Pi,\nu), with the appropriate covariance, and that the sequence \{Z_n(\eta)\} is tight, see e.g. Theorem 1.8.4 in van der Vaart and Wellner (1996). These two conditions are satisfied if Theorems A1, A2 and A3 below hold. We write, for some integer K,

    Z_n(\eta) = \sum_{j=1}^{K} (n-j)^{1/2}\, \hat r_j(x)\, \frac{\sqrt{2}\sin j\pi\lambda}{j\pi} + \sum_{j=K+1}^{n-1} (n-j)^{1/2}\, \hat r_j(x)\, \frac{\sqrt{2}\sin j\pi\lambda}{j\pi} := Z_n^K(\eta) + R_n^K(\eta),

say.

Theorem A1: Under the conditions of Theorem 1, for an arbitrary but fixed integer K, the finite dimensional distributions of Z_n^K(\eta), \langle Z_n^K(\eta), h\rangle, converge to those of Z^K(\eta), \langle Z^K(\eta), h\rangle, for all h \in L^2(\Pi,\nu), where Z^K(\eta) is a Gaussian process with mean zero and covariance operator

    \sigma^2_{K,f} := (Rf, f) = \sum_{j=1}^{K}\sum_{k=1}^{K} E\Big[ (Y_1-\mu)^2 \int_{\Pi}\int_{\Pi} f(\eta)f(\eta')\, w_{1-j}(x)\Psi_j(\lambda)\, w_{1-k}(x')\Psi_k(\lambda')\, d\nu(\eta)\, d\nu(\eta') \Big].

Theorem A2: Under the conditions of Theorem 1, for an arbitrary but fixed integer K, the sequence \{Z_n^K(\eta)\} is tight.

Theorem A3: Under the conditions of Theorem 1, the process R_n^K(\eta) verifies, for all \varepsilon > 0,

    \lim_{K\to\infty}\lim_{n\to\infty} P\Big[ \sup_{\eta\in\Pi} \big| R_n^K(\eta) \big| > \varepsilon \Big] = 0.

Proof of Theorem A1: Note that

    \langle Z_n^K, h\rangle = \sum_{t=2}^{n} (Y_t-\mu) \sum_{j=1}^{(t-1)\wedge K} (n-j)^{-1/2} \int_{\Pi} h(\eta)\, w_{t-j}(x)\Psi_j(\lambda)\, d\nu(\eta) := \sum_{t=2}^{n} S^K_{h,t} = \sum_{t=2}^{K+1} S^K_{h,t} + \sum_{t=K+2}^{n} S^K_{h,t},    (17)

where S^K_{h,t} := (Y_t-\mu)Q_{h,t} and Q_{h,t} is implicitly defined. Under (1), \{S^K_{h,t}, \mathcal F_t\} is an adapted martingale difference sequence with stationary and ergodic differences for t \ge K+2. Applying Markov's inequality, it is easy to show that the first summand on the right-hand side of (17) goes to zero in probability. For the second, the CLT for martingales with stationary and ergodic differences (Billingsley (1961)) states that the process converges to a normal distribution. Now we check that the limit variance is the appropriate one, i.e. Var[\langle Z_n^K, h\rangle] \to Var[\langle Z^K, h\rangle]. Under (1) and stationarity,

    \hat\sigma_h := Var[\langle Z_n^K, h\rangle] = \sum_{t=2}^{n} E[S^K_{h,t} S^K_{h,t}] = \sum_{t=2}^{n} E[(Y_t-\mu)^2 Q^2_{h,t}]
    = \sum_{j=1}^{K}\sum_{k=1}^{K} (n-j)^{-1/2}(n-k)^{-1/2} \sum_{t=1+k\vee j}^{n} E[(Y_t-\mu)^2 a_{tjk}]
    = \sum_{j=1}^{K}\sum_{k=1}^{K} (n-j)^{-1/2}(n-k)^{-1/2}(n-k\vee j)\, E[(Y_1-\mu)^2 a_{1jk}]
    \longrightarrow \sum_{j=1}^{K}\sum_{k=1}^{K} E[(Y_1-\mu)^2 a_{1jk}] \quad \text{as } n\to\infty,    (18)

where

    a_{tjk} = \int_{\Pi}\int_{\Pi} h(\eta)h(\eta')\, w_{t-j}(x)\Psi_j(\lambda)\, w_{t-k}(x')\Psi_k(\lambda')\, d\nu(\eta)\, d\nu(\eta').

Then Theorem A1 follows. ∎

Proof of Theorem A2: We apply Theorem 2.1 of Politis and Romano (1994). Again, we write

    Z_n^K = n^{-1/2} \sum_{t=2}^{n} (Y_t-\mu) \sum_{j=1}^{(t-1)\wedge K} \Big( \frac{n}{n-j} \Big)^{1/2} w_{t-j}(x)\Psi_j(\lambda) := n^{-1/2}\sum_{t=2}^{n} S^K_{n,t} = n^{-1/2}\sum_{t=2}^{K+1} S^K_{n,t} + n^{-1/2}\sum_{t=K+2}^{n} S^K_{n,t}.    (19)

For each K \ge 1, the term n^{-1/2}\sum_{t=2}^{K+1} S^K_{n,t} is tight because each summand is tight, see Theorem 1.4 in Billingsley (1968), and the sum is finite. Then, we concentrate on the second summand on the right-hand side of (19). To verify Theorem 2.1 in Politis and Romano (1994) we have to show that the following conditions hold:

    \sum_{t=2}^{n} E^*\big[ \zeta_{nt}^{*2}\, 1(|\zeta_{nt}^{*}| > \delta) \big] \le \frac{1}{n}\sum_{t=2}^{n} (Y_t - \bar Y_{n-j})^2\, 1\big( |Y_t - \bar Y_{n-j}| > \delta' \sqrt{n} \big) \quad a.s.

for some positive constants \delta and \delta'. By A1 the last expression converges almost surely to zero. Then the triangular array \{\zeta^*_{nt}\} satisfies the conditions of the Lindeberg-Feller central limit theorem, conditionally on almost all samples, so that \sum_{t=2}^{n} \zeta^*_{nt} \Longrightarrow^* N(0,1) a.s.; using a strong law of large numbers for \hat\sigma_h, we have that \langle S_n^*(\eta), h\rangle \Longrightarrow^* N(0, \sigma_h^2).

Second, we have to prove the tightness of the sequence \{S_n^*(\eta)\}. Let S^*_{n,t} be the bootstrap version of S_{n,t} as in (19) but with K = n-1. Noting that S^*_{n,t} and S^*_{n,s} are independent given the sample for s \neq t, it is sufficient for tightness that E^*\|S^*_{n,t}\|^2 < \infty a.s. for all samples, which is trivially satisfied, see Example 1.8.5 in van der Vaart and Wellner (1996). The proof is finished. ∎


REFERENCES [1] Anderson, T.W. (1993): “Goodness of fit tests for spectral distributions”, Annals of Statistics, 21, 830-847. [2] –––— (1997): “Goodness-of-fit for autoregressive processes”, Journal of Time Series Analysis, 18, 321-339. [3] Auestad, B. and Tjφstheim, D. (1990): “Identification of nonlinear time series: First order characterization and order determination”, Biometrika, 77, 669-687. [4] Barro, R. J. (1981): “On the predictability of the tax rate changes”, NBER Working Paper, No 636. [5] Bera, A. K and Higgins, M. L. (1997): “Arch and bilinearity as competing models for nonlinear dependence”, Journal of Business and Economic Statistics, 15, 43-50. [6] Bickel, P.J. and Wichura, M.J. (1971): “Convergence criteria for multiparameter stochastic processes and some applications”, The Annals of Mathematical Statistics, 42, 1656-1670. [7] Bierens, H. J. (1984): “Model specification testing of time series regressions”, Journal of Econometrics, 26, 323-353. [8] –––— (1990): “A consistent conditional moment test of funcional form”, Ecomometrica, 58, 14431458. [9] Bierens, H. and Ploberger, W. (1997): “Asymptotic theory of integrated conditional moment test”, Econometrica, 65, 1129-1151. [10] Billingsley, P. (1961): “The Lindeberg-Levy theorem for martingales”, Proceedings of the American Mathematical Society, 12, 788-792. [11] –––— (1968): Convergence of Probability Measures. Wiley, New York. [12] Box, G. and Pierce, D., (1970): “Distribution of residual autocorrelations in autorregressive integrated moving average time series models”, Journal of American Statistical Association, 65, 1509-1527. [13] Chang, N. M. (1990): “Weak convergence of a self-consistent estimator of a survival function with doubly censored data”, Annals of Statictic, 18, 391-404. [14] Chen, X. and White, H. (1996): “Laws of large numbers for Hilbert space-valued mixingales with applications”, Econometric Theory, 12, 284-304. 30

[15] –––— (1998): “Central limit and functional central limit theorems for Hilbert-valued dependent heterogeneous arrays with applications”, Econometric Theory, 14, 260-284. [16] Chen, X. and Fan, Y. (1999): “Consistent hypothesis testing in semiparametric and nonparametric models for econometric time series”, Journal of Econometrics, 91, 373-401. [17] Cochrane, J. H. (1988): “How big is the radom walk in GNP?, Journal of Political Economy, 96, 893-920. [18] Dauxois, J., Pousse, A. and Romain, Y. (1982): “Asymptotic theory for the principal component analysis of a vector random function: some applications to statistical inference”, Journal of Multivariate Analysis, 12, 136-154. [19] De jong, R.M. (1996): “The Bierens’ tets under data dependence”, Journal of Econometrics, 72, 1-32. [20] Delgado, M. A. (1996): “Testing serial independence using the sample distribution function”, Journal of the Time Series Analysis, 17, 271-285. [21] Deo, R. S. (2000): “Spectral tests of the martingale hypothesis under conditional heroscedasticity”, Journal of Econometrics, 99, 291-315. [22] Dominguez, M. and Lobato, I.N. (2000): “A consistent test for the martingale difference hypothesis” Preprint. [23] Durlauf, S. (1991): “Spectral-Based test for the martingale hypothesis”, Journal of Econometrics, 50, 1-19. [24] Gallant, A. R., Hsieh, D. A. and Tauchen, G. (1991): “On Fitting a Recalcitrant Series: The Pound/Dollar Exchange Rate, 1974-1983”, in Nonparametric and Semiparametric Methods in Econometrics and Statistics. Eds.. W. A. Barnett, J. Powell and G. Tauchen, Cambridge, U.K.. Cambridge University Press, pp. 199-240. [25] Granger, C.W.J. and Terasvirta, T. (1993): Modelling Nonlinear Economic Relationships, Oxford University Press: New York. [26] Grenander, U. (1952): “Stochastic Processes and Statistical Inference”, Ark. Mat. 1, 195-277. [27] Hall, R.E. 
(1978): “Stochastic implications of the life cycle-permanent income hypothesis: theory and evidence”, Journal of Political Economy, 86, 971-987. [28] Hansen, B. (1996): ”Inference when a nuisance parameter is not identified under the null hypothesis”, Econometrica, 64, 413-430. 31

[29] Härdle, W. and Mammen, E. (1993): "Comparing nonparametric versus parametric regression fits", Annals of Statistics, 21, 1926-1947.

[30] Hinich, M. and Patterson, D. (1992): "A new diagnostic test of model inadequacy which uses the martingale difference criterion", Journal of Time Series Analysis, 13, 233-252.

[31] Hong, Y. (1996): "Consistent testing for serial correlation of unknown form", Econometrica, 64, 837-864.

[32] Hong, Y. (1998): "Testing for pairwise serial independence via the empirical distribution function", Journal of the Royal Statistical Society, Series B, 60, 429-453.

[33] Hong, Y. (1999): "Hypothesis testing in time series via the empirical characteristic function", Journal of the American Statistical Association, 94, 1201-1220.

[34] Hong, Y. (2000): "Generalized spectral test for serial dependence", Journal of the Royal Statistical Society, Series B, 62, Part 3.

[35] Hsieh, D.A. (1989): "Testing for nonlinear dependence in daily foreign exchange rates", Journal of Business, 62, 339-368.

[36] Koch, P.D. and Yang, S.-S. (1986): "A method for testing the independence of two time series that accounts for a potential pattern in the cross-correlations", Journal of the American Statistical Association, 81, 533-544.

[37] Koul, H.L. and Stute, W. (1999): "Nonparametric model checks for time series", Annals of Statistics, 27, 204-236.

[38] Li, Q. (1999): "Consistent model specification tests for time series econometric models", Journal of Econometrics, 92, 101-147.

[39] Ljung, G.M. and Box, G.E.P. (1978): "A measure of lack of fit in time series models", Biometrika, 65, 297-303.

[40] Lo, A.W. (1997): Market Efficiency: Stock Market Behaviour in Theory and Practice (Vols. I and II). Edward Elgar.

[41] Park, J.Y. and Whang, Y.-J. (1999): "Testing for the martingale hypothesis", Preprint.

[42] Parthasarathy, K.R. (1967): Probability Measures on Metric Spaces. Academic Press: New York.

[43] Politis, D.N. and Romano, J.P. (1994): "Limit theorems for weakly dependent Hilbert space valued random variables with application to the stationary bootstrap", Statistica Sinica, 4, 461-476.


[44] Robinson, P.M. (1983): "Nonparametric estimators for time series", Journal of Time Series Analysis, 4, 185-207.

[45] Satterthwaite, F.E. (1941): "Synthesis of variance", Psychometrika, 6, 309-316.

[46] Satterthwaite, F.E. (1946): "An approximate distribution of estimates of variance components", Biometrics Bulletin, 2, 110-114.

[47] Shorack, G. and Wellner, J. (1986): Empirical Processes with Applications to Statistics. Wiley: New York.

[48] Stinchcombe, M. and White, H. (1998): "Consistent specification testing with nuisance parameters present only under the alternative", Econometric Theory, 14, 295-325.

[49] Stute, W. (1997): "Nonparametric model checks for regression", Annals of Statistics, 25, 613-641.

[50] Stute, W., González-Manteiga, W. and Presedo-Quindimil, M. (1998): "Bootstrap approximations in model checks for regression", Journal of the American Statistical Association, 93, 141-149.

[51] Su, J.Q. and Wei, L.J. (1991): "A lack-of-fit test for the mean function in a generalized linear model", Journal of the American Statistical Association, 86, 420-426.

[52] Tong, H. (1990): Non-linear Time Series: A Dynamical System Approach. Clarendon Press: Oxford.

[53] van der Vaart, A.W. and Wellner, J.A. (1996): Weak Convergence and Empirical Processes. Springer: New York.

[54] Whang, Y.-J. (2000): "Consistent bootstrap tests of parametric regression functions", Journal of Econometrics, 98, 27-46.

[55] Wu, C.F.J. (1986): "Jackknife, bootstrap and other resampling methods in regression analysis (with discussion)", Annals of Statistics, 14, 1261-1350.

[56] Zheng, X. (1996): "A consistent test of functional form via nonparametric estimation techniques", Journal of Econometrics, 75, 263-289.


Table I
Size of Tests (empirical rejection frequencies, in %)

                IID                    GARCH1
          10%    5%    1%        10%    5%    1%
Dn2       9.9    5.0   0.8       10.2   4.7   0.9
CvM1      9.3    4.7   0.8        9.7   5.1   0.7
KS1      10.8    5.6   0.8       11.2   5.9   0.7
CvM2     10.3    6.1   0.9       10.6   4.9   1.3
KS2      11.5    6.5   1.9       10.4   6.2   1.2
DURC     10.8    5.0   1.1       10.8   4.8   0.9
         [9.6]  [4.2] [1.0]      [9.1] [4.1] [0.9]

Table II
Size of Tests (empirical rejection frequencies, in %)

                GARCH2                 GARCH3
          10%    5%    1%        10%    5%    1%
Dn2       9.3    5.2   1.1       10.4   4.9   1.2
CvM1      9.2    4.6   1.1       10.4   4.9   1.1
KS1      10.1    5.4   1.3       11.0   6.1   1.3
CvM2     10.8    5.9   0.9        9.5   4.9   0.9
KS2      10.3    5.5   1.1       10.3   6.1   0.8
DURC     12.0    5.0   1.0       10.2   4.1   0.7
         [8.8]  [3.9] [0.6]      [8.4] [4.1] [0.6]
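Each size entry in Tables I and II is an empirical rejection frequency across Monte Carlo replications of a null data-generating process. A minimal sketch of that workflow, using a simple stand-in statistic based on the first-order sample autocorrelation rather than the paper's Dn2/CvM/KS statistics (the names `rejection_rate`, `iid_normal` and `abs_root_n_rho1` are illustrative, not from the paper):

```python
import numpy as np

def rejection_rate(test_stat, dgp, n, crit, reps=1000, seed=0):
    """Empirical size: share of simulated null samples on which the test rejects."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        y = dgp(n, rng)
        if test_stat(y) > crit:
            rejections += 1
    return rejections / reps

def iid_normal(n, rng):
    """A martingale-difference null: i.i.d. standard normal innovations."""
    return rng.standard_normal(n)

def abs_root_n_rho1(y):
    """|sqrt(n) * rho_hat(1)|, asymptotically |N(0,1)| under i.i.d. data."""
    yc = y - y.mean()
    rho1 = (yc[:-1] * yc[1:]).sum() / (yc ** 2).sum()
    return np.sqrt(len(y)) * abs(rho1)

# Nominal 5% test: reject when the statistic exceeds the N(0,1) 97.5% quantile.
size_5pct = rejection_rate(abs_root_n_rho1, iid_normal, n=200, crit=1.96)
```

With 1000 replications the estimate should sit near the nominal 0.05, which is how each percentage in the tables is produced for the corresponding statistic and DGP.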


Table III
Power of Tests: NLMA (empirical rejection frequencies, in %)

               n=100                  n=200                  n=300
          10%    5%    1%       10%    5%    1%       10%    5%    1%
Dn2      28.0   17.9   3.7     37.5   24.5   9.3     49.1   35.5  14.2
CvM1     26.4   16.1   3.6     37.1   24.5   9.9     49.6   35.6  14.1
KS1      26.9   16.3   3.5     41.4   26.7  10.5     52.9   40.1  17.4
CvM2     20.6   12.5   2.2     28.5   16.9   4.3     35.0   23.0   8.4
KS2      21.5   11.3   2.4     32.9   21.9   6.4     41.9   28.8  11.8
DURC     15.0    6.6   0.9     14.2    7.1   0.9     14.0    7.1   1.5
        [12.8]  [6.1] [1.0]   [11.8]  [5.7] [0.8]   [12.9]  [6.2] [1.1]

Table IV
Power of Tests: BIL-I (empirical rejection frequencies, in %)

               n=100                  n=200                  n=300
          10%    5%    1%       10%    5%    1%       10%    5%    1%
Dn2      18.6    9.9   1.9     30.2   18.5   4.4     45.8   26.3   6.7
CvM1     19.2    9.8   1.8     29.2   17.2   4.0     46.9   26.5   7.1
KS1      21.1   10.2   2.8     28.4   17.8   4.8     43.9   29.2   8.1
CvM2     13.0    7.3   1.9     17.8    9.2   1.9     24.9   12.2   2.2
KS2      14.4    7.3   1.5     19.5   10.7   2.6     26.5   15.9   5.1
DURC     13.7    6.1   1.7     11.8    6.1   1.1     15.8    9.0   2.4
        [12.2]  [5.3] [1.6]   [13.5]  [6.5] [1.8]   [14.1]  [8.5] [2.2]

Table V
Power of Tests: BIL-II (empirical rejection frequencies, in %)

               n=100                  n=200                  n=300
          10%    5%    1%       10%    5%    1%       10%    5%    1%
Dn2      39.3   23.6   7.1     75.1   56.1  21.4     90.0   78.9  37.4
CvM1     41.4   23.9   7.5     76.3   58.5  21.1     92.9   79.5  33.9
KS1      43.8   28.5  10.5     72.9   56.8  26.5     88.3   77.1  42.6
CvM2     22.6   12.1   2.1     40.8   22.0   4.3     61.1   38.1   9.3
KS2      27.6   17.1   5.5     48.1   33.8  14.2     67.2   53.7  26.9
DURC     21.4   11.9   2.8     23.1   14.2   4.9     35.3   24.0   9.8
        [19.7] [11.2] [2.5]   [25.4] [17.0] [5.7]   [33.6] [22.2] [9.3]


Table VI
Power of Tests, n=100 (empirical rejection frequencies, in %)

         ARMA   NGMA    TAR
Dn2      74.9   23.0   16.6
CvM1     35.2    8.3    7.9
KS1      32.0    7.9    7.9
CvM2     32.8    1.7    8.6
KS2      36.7    3.7   11.9
DURC     83.0   14.1   14.2

Table VII
Summary Statistics of Log Price Changes, Yt = 100 log(St/St−1)

             1974-1983   1985-1991
Mean           -0.0185     -0.0227
Median          0.0000      0.0000
SD              0.5572      0.4766
Skewness       -0.5145      0.0315
Kurtosis        8.4885      4.7584
Maximum         3.4344      1.9597
Minimum        -3.8427     -2.2520
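The entries of Table VII can be recomputed directly from the raw exchange-rate series. A sketch follows; the use of the sample standard deviation (ddof=1) and of raw, non-excess kurtosis is an assumption about the table's conventions, consistent with the reported kurtosis being near 3 under normality:

```python
import numpy as np

def summary_stats(prices):
    """Summary statistics of log price changes Y_t = 100*log(S_t/S_{t-1})."""
    s = np.asarray(prices, dtype=float)
    y = 100.0 * np.log(s[1:] / s[:-1])
    m, sd = y.mean(), y.std(ddof=1)   # sample standard deviation
    z = (y - m) / sd
    return {
        "Mean": m,
        "Median": np.median(y),
        "SD": sd,
        "Skewness": np.mean(z ** 3),
        "Kurtosis": np.mean(z ** 4),  # raw kurtosis: 3 for a normal distribution
        "Maximum": y.max(),
        "Minimum": y.min(),
    }

# Deterministic check series: log changes are exactly 1, -2 and 3 (in %).
prices = np.exp(np.cumsum([0.0, 0.01, -0.02, 0.03]))
stats = summary_stats(prices)
```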

Table VIII
P-values for the BPUSD

          1974-1983   1985-1991
n           2505        1210
Dn2         0.002       0.943
CvM1        0.004       0.896
KS1         0.000       0.706
CvM2        0.016       0.870
KS2         0.000       0.673
DURC        0.230       0.860


Table IX
Generalized dependence measures for BPUSD1, 1974-1983

Lag j   KS_YY(j) [q0.95]   KS_Y2Y(j) [q0.95]   KS_Y2Y2(j) [q0.95]   KS_Y3Y(j) [q0.95]
  1      1.95* [1.31]        2.02  [1.29]         2.74* [1.24]         1.48* [1.21]
  2      0.91  [1.39]        1.56  [1.31]         2.91* [1.35]         0.78  [1.24]
  3      0.71  [1.39]        2.23  [1.26]         2.78* [1.34]         0.55  [1.18]
  4      1.14  [1.34]        1.59  [1.26]         2.30  [1.29]         0.53  [1.25]
  5      1.46* [1.37]        2.76  [1.38]         3.50* [1.41]         0.94  [1.32]
  6      0.56  [1.41]        1.59  [1.32]         2.77  [1.31]         0.86  [1.17]
  7      0.91  [1.31]        1.35  [1.29]         2.41* [1.28]         0.52  [1.21]
  8      0.88  [1.32]        1.93  [1.26]         2.44  [1.26]         1.08  [1.20]
  9      1.09  [1.33]        1.71  [1.30]         2.17* [1.25]         0.99  [1.21]
 10      0.84  [1.38]        1.47  [1.25]         2.45  [1.26]         0.46  [1.18]
 20      1.03  [1.34]        1.62* [1.30]         2.16* [1.27]         0.82  [1.16]
 30      0.66  [1.33]        1.21  [1.26]         1.98  [1.37]         0.57  [1.16]
 40      0.74  [1.38]        1.26  [1.31]         1.56* [1.26]         0.86  [1.19]
 50      1.18  [1.31]        1.91  [1.38]         1.28  [1.26]         1.41* [1.20]

Note: * significantly different from zero at the 5% level (bootstrap test).

Table X

Generalized dependence measures for BPUSD2, 1985-1991

Lag j   KS_YY(j) [q0.95]   KS_Y2Y(j) [q0.95]   KS_Y2Y2(j) [q0.95]   KS_Y3Y(j) [q0.95]
  1      0.50  [1.37]        1.40* [1.33]         0.82  [1.27]         0.76  [1.33]
  2      0.78  [1.31]        0.51  [1.23]         0.67  [1.31]         0.63  [1.18]
  3      0.86  [1.36]        1.11  [1.27]         1.63* [1.29]         0.82  [1.25]
  4      0.68  [1.35]        0.82  [1.27]         1.31  [1.33]         0.69  [1.26]
  5      0.46  [1.31]        1.19  [1.32]         1.66* [1.30]         0.77  [1.21]
  6      0.70  [1.33]        1.17  [1.38]         1.89  [1.37]         0.66  [1.26]
  7      0.83  [1.34]        0.82  [1.31]         1.43* [1.34]         0.66  [1.24]
  8      0.55  [1.30]        0.77  [1.33]         1.20  [1.32]         0.79  [1.27]
  9      1.20  [1.28]        0.71  [1.30]         0.94  [1.25]         0.89  [1.20]
 10      0.88  [1.33]        1.15  [1.31]         1.55* [1.30]         0.82  [1.29]
 20      0.91  [1.31]        0.89  [1.33]         1.06  [1.34]         1.27  [1.30]
 30      0.96  [1.29]        0.69  [1.31]         1.16  [1.33]         0.96  [1.30]
 40      0.96  [1.30]        0.88  [1.31]         1.13  [1.34]         0.71  [1.28]
 50      1.07  [1.34]        1.23  [1.37]         1.18  [1.30]         0.72  [1.30]

Note: * significantly different from zero at the 5% level (bootstrap test).
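The KS-type measures in Tables IX and X are Kolmogorov-Smirnov functionals of pairwise integrated regression functions, reported next to bootstrap 95% quantiles q0.95. The sketch below assumes one common form of such a statistic, KS_fg(j) = sup_x |(n-j)^(-1/2) * sum_t (f(Y_t) - mean) 1{g(Y_{t-j}) <= x}|, together with a Rademacher wild bootstrap for the quantile; the paper's exact normalization and resampling scheme may differ, and the function names are illustrative. For instance, KS_Y2Y(j) would correspond to f(u) = u^2 and g(u) = u.

```python
import numpy as np

def ks_measure(y, j, f=lambda u: u, g=lambda u: u):
    """sup over observed x of |(n-j)^(-1/2) sum_t (f(Y_t)-mean) 1{g(Y_{t-j}) <= x}|."""
    y = np.asarray(y, dtype=float)
    fy = f(y[j:])                  # f(Y_t), t = j+1, ..., n
    e = fy - fy.mean()             # centered terms
    glag = g(y[:-j])               # g(Y_{t-j})
    m = len(e)
    d = np.array([e[glag <= x].sum() for x in np.sort(glag)]) / np.sqrt(m)
    return np.abs(d).max()

def wild_bootstrap_q95(y, j, f=lambda u: u, g=lambda u: u, B=299, seed=0):
    """Bootstrap 95% quantile via Rademacher multipliers on the centered terms."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    fy = f(y[j:])
    e = fy - fy.mean()
    glag = g(y[:-j])
    m, grid = len(e), np.sort(glag)
    draws = []
    for _ in range(B):
        eb = e * rng.choice([-1.0, 1.0], size=m)  # random sign flips
        d = np.array([eb[glag <= x].sum() for x in grid]) / np.sqrt(m)
        draws.append(np.abs(d).max())
    return np.quantile(draws, 0.95)
```

An entry such as KS_Y2Y2(j) [q0.95] would then be read as `ks_measure(y, j, f=np.square, g=np.square)` against `wild_bootstrap_q95(y, j, f=np.square, g=np.square)`, with a star when the statistic exceeds the quantile.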


[Figure: vertical axis from -0.10 to 0.08, horizontal axis from -2 to 2]
Figure 1. IPAF for Yt, BPUSD1, at lag j = 1.
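A minimal sketch of the sample IPAF plotted here, assuming it takes the common integrated-regression form gamma_j(x) = (n-j)^(-1) * sum_{t=j+1}^{n} (Y_t - Ybar) 1{Y_{t-j} <= x} (an assumption; the paper's body gives the exact definition). For Figure 2 the same function is applied with Y_t replaced by Y_t^2:

```python
import numpy as np

def ipaf(y, j, grid):
    """Sample integrated pairwise autoregression function at lag j,
    evaluated at each point of `grid` (assumed form, see text)."""
    y = np.asarray(y, dtype=float)
    dev = y[j:] - y.mean()   # Y_t - Ybar, t = j+1, ..., n
    lag = y[:-j]             # Y_{t-j}
    return np.array([np.mean(dev * (lag <= x)) for x in grid])

# Example: evaluate on an evenly spaced grid like the figures' x-axis.
grid = np.linspace(-2.0, 2.0, 9)
```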

[Figure: vertical axis from -0.08 to 0.10, horizontal axis from -2 to 2]
Figure 2. IPAF for Yt², BPUSD1, at lag j = 1.