Quantifying changes in climate variability and extremes: pitfalls and their overcoming

Sebastian Sippel¹, Jakob Zscheischler², Martin Heimann¹, Friederike E. L. Otto³, Miguel Mahecha¹

¹ Max Planck Institute for Biogeochemistry, Jena, Germany
² Institute for Atmospheric and Climate Science, ETH Zürich, Switzerland
³ Environmental Change Institute, University of Oxford, Oxford, UK

09.06.2016


Are temperature extremes changing?

Concept from IPCC (2012): extremes could change by
- changes in mean climate
- changes in climate variability
- changes in 'symmetry'

[Figure SPM.3, IPCC (2012): The effect of changes in temperature distribution on extremes, showing the probability of occurrence without and with climate change for (a) a shifted mean (more hot and more extreme hot weather, less cold and less extreme cold weather), (b) increased variability (more hot and more cold extremes), and (c) changed symmetry (more hot and extreme hot weather, near-constant cold and extreme cold weather).]


Approaches to deal with extremes:

- Building on a reference of normality [figure: P(env_novel) vs. P(env_ref)]



CHANGE IN TEMPERATURE VARIABILITY?

Are temperature extremes changing?

Normalization: $z = \dfrac{X - \overline{X}_{\mathrm{ref}}}{s(X_{\mathrm{ref}})}$

Hansen et al. (2012), PNAS: 'An important change is the emergence of a category of summertime extremely hot outliers, more than three standard deviations (3σ) warmer than the climatology for the 1951–1980 base period [...] now typically covers 10% of the land area.'

[Figure from Hansen et al. (2012), PNAS: Area covered by temperature anomalies in the categories hot (> 0.43σ), very hot (> 2σ), and extremely hot (> 3σ), with analogous divisions for cold anomalies. Anomalies are relative to the 1951–1980 base period; σ is also from 1951–1980 data. Lowest row is Southern Hemisphere summer.]

- We show that normalization approaches are systematically biased if z-scores (i.e. the magnitude of extremes) are spatially aggregated for subsequent analysis.



The normalization issue

Consider a simple conceptual (and artificial) scenario:
- simulate 60 random Gaussian variables (i.e. '60 years') for each of many 'grid cells', and subsequently normalize with $\overline{X}_{1951\text{-}1980}$ and $s(X_{1951\text{-}1980})$
- count '2-sigma' extremes across 10 000 grid cells for each time step
(a code sketch follows the figure below)

[Figure a: Count of 2-sigma extremes (roughly 100-350) per time step (0-60) for Gaussian i.i.d. variables and for the normalized extremes.]
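A minimal Python sketch of this conceptual experiment (my own illustration, not the code behind the figure; it assumes one-sided 'hot' exceedances z > 2 and a 30-year reference period corresponding to 1951-1980):

import numpy as np

rng = np.random.default_rng(0)
n_cells, n_years, n_ref = 10_000, 60, 30           # '1951-2010', reference '1951-1980'

x = rng.standard_normal((n_cells, n_years))        # stationary i.i.d. Gaussian 'grid cells'

x_ref = x[:, :n_ref]
mu_hat = x_ref.mean(axis=1, keepdims=True)         # reference-period sample mean
sd_hat = x_ref.std(axis=1, ddof=1, keepdims=True)  # reference-period sample SD

z = (x - mu_hat) / sd_hat                          # per-cell normalization

counts = (z > 2).sum(axis=0)                       # '2-sigma' extremes per time step
print(counts[:n_ref].mean(), counts[n_ref:].mean())
# The nominal Gaussian expectation is 10000 * P(N(0,1) > 2) ~ 227.5: reference-period
# counts fall below it, out-of-base counts exceed it, although nothing has changed.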


... yields the following PDFs:

[Figure, panels b and c: (b) anomalies and (c) standardization of anomalies; PDFs over z-scores from −6 to 6 for the original Gaussian PDF, the reference-period PDF, and the out-of-base PDF.]

- the whole PDF does not appear to have changed much
- z-scores in the tails show a distinct change
- i.e. out-of-base ('future') PDFs show a heavier tail


How does the bias occur? Sample size vs. 'normalization bias':

[Figure A: Overestimation of extremes [%] (up to ~200%) as a function of the sample size in the reference period (20-140), shown separately for 1-sigma, 2-sigma, and 3-sigma extremes.]



The underlying problem:

In the temporal domain:

... what is meant to be done:
$z = \dfrac{X - \mu}{\sigma}$

... and what is really being done:
$z = \dfrac{X - \hat{\mu}_{\mathrm{ref}}}{\hat{\sigma}_{\mathrm{ref}}}$

with known behaviour of the estimators (e.g. von Storch and Zwiers, 2001):
$\hat{\mu} = \dfrac{1}{n}\sum_{i=1}^{n} x_i \sim N\!\left(\mu,\ \dfrac{\sigma^2}{n}\right)$
$\hat{\sigma}^2 = S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \hat{\mu}\right)^2 \sim \dfrac{\sigma^2}{n-1}\,\chi^2_{n-1}$

In the spatial domain, for any time t in the out-of-base period:
- Anomalies: $X_{\mathrm{anom}} = X - \hat{\mu}_{\mathrm{ref}} \sim N\!\left(0,\ \sigma^2\left(1 + \dfrac{1}{n}\right)\right)$
- Standardization: $z = \dfrac{X_{\mathrm{anom}}}{\hat{\sigma}_{\mathrm{ref}}} \sim t(n-1)\cdot\sqrt{1 + \dfrac{1}{n}}$

Normalization in the temporal domain causes biases in spatially aggregated data!
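A quick numerical check of the scaled-t result (a sketch of my own using scipy's standard distributions, not the paper's code): compare the nominal Gaussian tail probability of a 'k-sigma' event with the tail probability implied by z ~ sqrt(1 + 1/n) · t(n − 1) for a 30-year reference period.

from scipy import stats

def exceedance_after_normalization(k, n_ref):
    """P(z > k) when z ~ sqrt(1 + 1/n_ref) * t(n_ref - 1) out of base."""
    scale = (1 + 1 / n_ref) ** 0.5
    return stats.t.sf(k / scale, df=n_ref - 1)

n_ref = 30
for k in (1, 2, 3):
    p_gauss = stats.norm.sf(k)                         # nominal Gaussian tail probability
    p_norm = exceedance_after_normalization(k, n_ref)  # tail probability after normalization
    print(k, f"{100 * (p_norm / p_gauss - 1):.1f} % apparent inflation")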


A correction for stationary time series I

[Figure: quantile-quantile plots against the Gaussian distribution. Panel A: t-distribution (df = 14); panel B: tau-distribution (Thompson, 1935). Both panels show the 1:1 line and the '2-sigma extreme' and '3-sigma extreme' thresholds.]

Assumptions:
- i.i.d. Gaussian variables


A correction for stationary time series II

[Figure a: Count of 2-sigma extremes per time step (0-60) for Gaussian i.i.d. variables, the normalized extremes, and the corrected counts for the out-of-base and reference-period extremes.]
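One plausible way to implement the out-of-base part of such a correction (my own sketch, consistent with the scaled-t result above; the paper derives the exact procedure, including the reference-period case via the tau distribution): adjust the counting threshold so that its exceedance probability under the scaled t-distribution equals the nominal Gaussian probability.

import numpy as np
from scipy import stats

def corrected_threshold(k, n_ref):
    """Threshold c with P(z > c) = P(N(0,1) > k) when z ~ sqrt(1 + 1/n_ref) * t(n_ref - 1)."""
    p = stats.norm.sf(k)
    return np.sqrt(1 + 1 / n_ref) * stats.t.isf(p, df=n_ref - 1)

print(corrected_threshold(2, n_ref=30))   # slightly above 2: count z > this value instead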


Correction of non-stationary time series

[Figure b (linear trends, constant SD) and Figure c (linear trends and SD changes): counts of 2-sigma extremes per time step (0-60) for the non-changed extremes, the normalized extremes, the non-central-t correction with parameters (δ, and δ and λ) known or estimated, and the reference-period correction.]


Application I: Temperature Extremes

Is this relevant in the literature?

[Excerpt from Hansen et al. (2012), PNAS, doi:10.1073/pnas.1205276109: maps of the area covered by temperature anomalies in categories defined as hot (> 0.43σ), very hot (> 2σ), and extremely hot (> 3σ); anomalies relative to the 1951-1980 base period, σ from 1951-1980 data.]


Application I: Temperature Extremes

[Figure a: Land area [%] with T > +2 sigma (JJA, 30°N-80°N, 1950-2010). Figure b: Land area [%] with T > +3 sigma (JJA, 30°N-80°N, 1950-2010). Curves: out-of-base and reference period, each with standard normalization and with the correction; change point and piecewise linear regression indicated.]


Application II: Temperature Variability

Analysis of (temperature) variability:

[Excerpt from Huntingford et al. (2013), Nature, 500, 327-331: station-based temperature-anomaly distributions for 1958-1970, 1971-1980, 1981-1990, and 1991-2001; the time-evolving global standard deviation, normalized by 1963-1980; and maps of increases/decreases in s.d. after 1980.]

Application II: Temperature variability

[Figure a, artificial example: standard deviation over time (1960-2010) of Gaussian i.i.d. variables and of the same variables normalized to the reference period 1963-1980.]

[Figure b, 20th Century Reanalysis, global, area-weighted annual SD: global SD; normalized SD (ref. period 1963-1980); normalized SD with empirical correction (ref. period 1963-1980); normalized SD with analytical correction (ref. period 1921-1950). Change in global SD (1981-1996 vs. 1963-1980): +2% time-averaged global SD; +13% uncorrected, time-averaged normalized SD; +6% empirically corrected; +6% analytically corrected.]
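For the stationary Gaussian case, the inflation of the out-of-base standard deviation caused purely by normalization can be written down from the scaled-t result (a back-of-the-envelope sketch of my own; the reanalysis numbers in panel b additionally reflect autocorrelation and real non-stationarity, and the paper's empirical and analytical corrections are more general):

import numpy as np

def sd_inflation(n_ref):
    """Expected SD of normalized out-of-base data when nothing has changed:
    z ~ sqrt(1 + 1/n) * t(n - 1), and Var[t(nu)] = nu / (nu - 2)."""
    nu = n_ref - 1
    return np.sqrt((1 + 1 / n_ref) * nu / (nu - 2))

for n in (18, 30, 60):     # 1963-1980 is an 18-year reference period
    print(n, f"{100 * (sd_inflation(n) - 1):.1f} % apparent SD increase")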


Application III: Normalization of Precipitation

Global-scale increase in precipitation totals and extremes?

- Normalization by dividing through the mean (for, e.g., precipitation) leads to similar artefacts, affecting both spatial means and variability (a sketch follows the figure below)

[Figure a: PDFs of the original data, the reference-period data, and the out-of-base data. Figure b: mean of a GEV sample over time steps (0-60) for the original time series, the reference-period time series, and the normalized time series.]
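A minimal Monte Carlo sketch of the mean-normalization artefact (my own illustration; the gamma distribution and its parameters are arbitrary stand-ins for precipitation-like data):

import numpy as np

rng = np.random.default_rng(2)
n_cells, n_years, n_ref = 50_000, 60, 30

p = rng.gamma(shape=2.0, scale=1.0, size=(n_cells, n_years))   # stationary 'precipitation'
p_norm = p / p[:, :n_ref].mean(axis=1, keepdims=True)          # divide by reference-period mean

print(p_norm[:, :n_ref].mean(), p_norm[:, n_ref:].mean())      # out-of-base mean drifts above 1
print(p_norm[:, :n_ref].std(),  p_norm[:, n_ref:].std())       # out-of-base spread is larger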


Further relevance in the literature

- Changes in precipitation, normalized by dividing through the sample mean of a reference period
- Changes in the asymmetry of temperatures
- 'Ranking of weather events'
- Temperature extreme indices


Conclusions

- 'Reference-period normalization' is a critical data-preprocessing step and needs to be applied with caution
- An unadjusted assessment of the tails or the variability of normalized and spatially or temporally aggregated climatic data is systematically biased
- We propose a correction that accounts for these normalization issues


Advertisement: If you want to play...:

- github.com/sebastian-sippel/normalization
- Have a look at the original paper: Sippel, S., J. Zscheischler, M. Heimann, F. E. L. Otto, J. Peters, and M. D. Mahecha (2015), Quantifying changes in climate variability and extremes: Pitfalls and their overcoming, Geophysical Research Letters, doi:10.1002/2015GL066307.

Key Points:
• Conventional normalization of spatiotemporal data sets with respect to a reference period induces artifacts
• Normalization-induced artifacts are most severe if variability or extremes are under scrutiny
• The study provides an analytical correction and accurate estimate of variability and extremes

Abstract: Hot temperature extremes have increased substantially in frequency and magnitude over past decades. A widely used approach to quantify this phenomenon is standardizing temperature data relative to the local mean and variability of a reference period. Here we demonstrate that this conventional procedure leads to exaggerated estimates of increasing temperature variability and extremes. For example, the occurrence of "two-sigma extremes" would be overestimated by 48.2% compared to a given reference period of 30 years with time-invariant simulated Gaussian data. This corresponds to an increase from a 2.0% to 2.9% probability of such events. We derive an analytical correction revealing that these artifacts prevail in recent studies. Our analyses lead to a revision of earlier reports: For instance, we show that there is no evidence for a recent increase in normalized temperature variability. In conclusion, we provide an analytical pathway to describe changes in variability and extremes in climate observations and model simulations.


Thanks for your attention!


What about the dependent period?

[Figure, panels b and c (as before): (b) anomalies and (c) standardization of anomalies; PDFs over z-scores from −6 to 6 for the original Gaussian PDF, the reference-period PDF, and the out-of-base PDF.]

- Variance across grid cells changes to $X_{\mathrm{anom,ref}} \sim N\!\left(0,\ \sigma^2\left(1 - \dfrac{1}{n}\right)\right)$
- It can be shown that a dependent normalization leads to a distribution related to the t-distribution (Thompson, 1935):
$t = \dfrac{\tau \sqrt{n_{\mathrm{ref}} - 2}}{\sqrt{n_{\mathrm{ref}} - 1 - \tau^2}}. \qquad (1)$
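A small Monte Carlo sketch of the dependent-period effect (my own illustration): z-scores computed within the reference period itself are bounded and lighter-tailed than a Gaussian, in line with the tau distribution.

import numpy as np

rng = np.random.default_rng(6)
n_cells, n_ref = 200_000, 30

x = rng.standard_normal((n_cells, n_ref))
z_ref = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, ddof=1, keepdims=True)

print((z_ref > 2).mean())     # below the nominal Gaussian P(N(0,1) > 2) ~ 2.3 %
print(np.abs(z_ref).max())    # bounded: |z_ref| cannot exceed (n_ref - 1) / sqrt(n_ref)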


What about non-stationary time series?

- Trend or changes in variance? Autocorrelation?

[Figure: Overestimation of extremes [%] for 1-sigma, 2-sigma, and 3-sigma extremes as a function of (C) the mean change [sigma], (D) the change in SD [sigma], and (B) the AR(1) autocorrelation coefficient.]


What about non-stationary time series?

- Consider an offset δ(t) or a change in variance (λ · σ) in the independent (out-of-base) period: $X(t) \sim N\!\left(\mu + \delta(t),\ \lambda^2 \sigma^2\right)$.

$\dfrac{X_{\mathrm{anom}}(t)}{Y_{\hat{\sigma}_{\mathrm{ref}}}} = \sqrt{\lambda^2 + \dfrac{1}{n_{\mathrm{ref}}}} \cdot \dfrac{1}{Y_{\hat{\sigma}_{\mathrm{ref}}}}\left[\dfrac{X_{\mathrm{anom}}(t) - \delta(t)}{\sqrt{\lambda^2 + \frac{1}{n_{\mathrm{ref}}}}} + \dfrac{\delta(t)}{\sqrt{\lambda^2 + \frac{1}{n_{\mathrm{ref}}}}}\right]$

$\Rightarrow\ z = \dfrac{X_{\mathrm{anom}}(t)}{Y_{\hat{\sigma}_{\mathrm{ref}}}} \sim \sqrt{\lambda^2 + \dfrac{1}{n_{\mathrm{ref}}}} \cdot t'\!\left(\nu = n - 1,\ \mathrm{ncp} = \dfrac{\delta(t)}{\sqrt{\lambda^2 + \frac{1}{n_{\mathrm{ref}}}}}\right).$

Correction with the non-central t-distribution $T_{f,\delta} = \dfrac{Z + \delta}{\sqrt{V/f}}$!
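A sketch of how this scaled non-central-t result can be used numerically (my own illustration via scipy.stats.nct; the offset delta and scale lam below are arbitrary example values, not estimates from data):

import numpy as np
from scipy import stats

def exceedance_nonstationary(k, n_ref, delta, lam):
    """P(z > k) when z ~ sqrt(lam^2 + 1/n_ref) * t'(df = n_ref - 1, ncp = delta / sqrt(...))."""
    scale = np.sqrt(lam**2 + 1.0 / n_ref)
    return stats.nct.sf(k / scale, df=n_ref - 1, nc=delta / scale)

print(exceedance_nonstationary(2, n_ref=30, delta=0.5, lam=1.0))   # mean shift only
print(exceedance_nonstationary(2, n_ref=30, delta=0.5, lam=1.2))   # shift plus wider SD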


Application I: Temperature Extremes

[Figure c: Land area [%] with T > +2 sigma as a function of latitude (0-80°N) and year (1950-2010). Figure d: Normalization-induced bias [%] as a function of latitude and year.]


Application III: Asymmetry of temperatures

Increasing asymmetry in seasonal maxima? Kodra and Ganguly (2014), Scientific Reports, 4 (5884)

- Data preprocessing: estimation of seasonal extremes per grid cell (i.e. maximum/minimum seasonal daily temperature) in a past (1970-1999) and a future (2070-2099) period across CMIP5 models, followed by subtraction of the average historical extreme climatology from the future period. Percentile-specific changes across grid cells are then investigated.
- "virtually all statistics of temperature extremes are projected to increase, GCMs consistently show asymmetry in projected changes of seasonal extrema. For each of summer and winter maxima and minima, the highest percentiles will increase significantly more than will the lowest." (Kodra and Ganguly, 2014)


Application III: Asymmetry of temperatures

Artificial example with Gaussian data (a code sketch follows below):

[Figure A: Percentile change w.r.t. the 'historical climatology' (extreme values in Gaussian i.i.d. variables, 'future' minus 'past' per percentile, roughly −0.04 to 0.04) for the original time series and after subtracting the mean historical extreme climatology. Figure B: Difference between upper and lower tail (95th−5th, 75th−25th, 55th−45th percentiles) for both cases.]
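A sketch of this artificial example (my own implementation; the 30-'year' periods and 90-'day' seasons are assumptions for illustration): seasonal maxima of stationary Gaussian data show a spurious percentile-dependent 'change' once each cell's mean historical extreme climatology is subtracted.

import numpy as np

rng = np.random.default_rng(5)
n_cells, n_years, n_days = 5_000, 30, 90

def seasonal_maxima():
    # seasonal maximum of 'daily' Gaussian values per cell and year
    return rng.standard_normal((n_cells, n_years, n_days)).max(axis=2)

past, future = seasonal_maxima(), seasonal_maxima()     # identical, stationary 'climate'
clim = past.mean(axis=1, keepdims=True)                 # mean historical extreme climatology

q = np.arange(5, 100, 5)
raw_change = np.percentile(future, q) - np.percentile(past, q)
anom_change = np.percentile(future - clim, q) - np.percentile(past - clim, q)

print(raw_change.round(3))    # close to zero at all percentiles (sampling noise only)
print(anom_change.round(3))   # upper percentiles shifted up, lower percentiles down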


Autocorrelation and normalization

Consider an AR(1) process:
$X_t = \alpha X_{t-1} + Z_t \qquad (2)$

For autocorrelated data (see e.g. Zieba 2010, Metrol. Meas. Syst.):
$\sigma^2(\bar{X}) = \left[n + 2\sum_{k=1}^{n-1}(n - k)\rho_k\right]\dfrac{\sigma^2}{n^2}$

With $\sigma^2_{X_{AR(1)}} = \dfrac{\tau^2}{1 - \alpha^2}$, it follows for 'independent' (out-of-base) anomalies:
$\mathrm{Var}\!\left(X_{AR(1),\mathrm{anom}}\right) = \dfrac{\tau^2}{1 - \alpha^2}\left(1 + \dfrac{1}{n} + \dfrac{2}{n^2}\sum_{k=1}^{n-1}(n - k)\rho_k\right)$

The presence of any positive autocorrelation further inflates the out-of-base variance!
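A numerical cross-check of this variance inflation for AR(1) data (my own sketch; it assumes rho_k = alpha^k and takes the out-of-base value far enough ahead of the reference block to be effectively independent, matching the 'independent' case above):

import numpy as np

def anomaly_var_factor(alpha, n):
    """1 + 1/n + (2/n^2) * sum_k (n - k) * alpha^k, i.e. Var(X_anom) / Var(X)."""
    k = np.arange(1, n)
    return 1 + 1 / n + (2 / n**2) * np.sum((n - k) * alpha**k)

rng = np.random.default_rng(3)
alpha, n_ref, n_cells, gap = 0.6, 30, 100_000, 50

# AR(1) with unit marginal variance: X_t = alpha * X_{t-1} + sqrt(1 - alpha^2) * Z_t
steps = n_ref + gap + 1
x = np.empty((n_cells, steps))
x[:, 0] = rng.standard_normal(n_cells)
for t in range(1, steps):
    x[:, t] = alpha * x[:, t - 1] + np.sqrt(1 - alpha**2) * rng.standard_normal(n_cells)

anom = x[:, -1] - x[:, :n_ref].mean(axis=1)            # out-of-base anomaly
print(anom.var(), anomaly_var_factor(alpha, n_ref))    # both around 1.12-1.13 here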


Normalization of autocorrelated variables

1. The variance of the sample mean is larger for autocorrelated data.
2. The standard estimator $s^2 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$ is biased for autocorrelated data (e.g. Zieba 2010 and elsewhere!).
3. Standardized autocorrelated data are not Gaussian (as shown before).


1. An unbiased variance estimator for autocorrelated data:

For autocorrelated data, one may introduce an effective sample size (e.g. Zieba 2010):
$\sigma^2(\bar{X}) = \left[n + 2\sum_{k=1}^{n-1}(n - k)\rho_k\right]\dfrac{\sigma^2}{n^2} = \dfrac{\sigma^2}{n_{\mathrm{eff}}}$

with $n_{\mathrm{eff}} = \dfrac{n}{1 + 2\sum_{k=1}^{n-1}\frac{n-k}{n}\rho_k}$; i.e. $n_{\mathrm{eff}} < n$ for any positive autocorrelation.

It is well known that $s^2$ is biased for autocorrelated data, because $E[s^2] = \dfrac{n}{n-1}\left(\sigma^2 - \sigma^2_{\bar{X}}\right)$ with $\sigma^2_{\bar{X}} = \dfrac{\sigma^2}{n_{\mathrm{eff}}} > \dfrac{\sigma^2}{n}$. Hence, standardization with $E\!\left[\sqrt{s^2}\right] < \sigma$ will (possibly strongly) increase the variance of the 'normalized' variables, but this can be corrected with $s^2_{\mathrm{cor}} = C\,s^2$ and $C = \dfrac{n_{\mathrm{eff}}(n-1)}{n(n_{\mathrm{eff}}-1)}$ (Zieba 2010).
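A sketch of the effective-sample-size correction for an AR(1) example (my own illustration, assuming rho_k = alpha^k):

import numpy as np

def n_eff(alpha, n):
    k = np.arange(1, n)
    return n / (1 + 2 * np.sum((n - k) / n * alpha**k))

def corrected_var(x, alpha):
    """s^2_cor = C * s^2 with C = n_eff (n - 1) / (n (n_eff - 1))."""
    n = x.shape[-1]
    ne = n_eff(alpha, n)
    c = ne * (n - 1) / (n * (ne - 1))
    return c * x.var(axis=-1, ddof=1)

rng = np.random.default_rng(4)
alpha, n, n_series = 0.6, 30, 200_000
x = np.empty((n_series, n))
x[:, 0] = rng.standard_normal(n_series)
for t in range(1, n):
    x[:, t] = alpha * x[:, t - 1] + np.sqrt(1 - alpha**2) * rng.standard_normal(n_series)

print(x.var(axis=1, ddof=1).mean())       # plain s^2 is biased low (true variance is 1)
print(corrected_var(x, alpha).mean())     # corrected estimate is ~1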


2. Standardization with autocorrelated data: it can be numerically verified that 'normalized' AR(1) data approximately follow a t-distribution with k = n − 1 degrees of freedom.


Summary: Autocorrelated data:

- Generating anomalies from autocorrelated data inflates the variance (i.e. increases standardized extremes), but this can be corrected if the autocorrelation structure is known.
- The variance estimator s² is biased (underestimated) for autocorrelated data, which also increases standardized extremes; this can be corrected using the 'effective sample size'.
- For practical applications, it can probably be assumed that 'normalized' AR(1) data with white-noise innovations roughly follow a t-distribution with k = n − 1 degrees of freedom.
