Introduction

Resampling

Concentration method

Quantile method

Step-down procedure

Conclusion

Resampling-based confidence regions and multiple tests for a correlated random vector

joint work with

Sylvain Arlot (1,2), Gilles Blanchard (3), Étienne Roquain (4)

(1) Université Paris-Sud XI, Orsay
(2) Projet Select, Inria Futurs
(3) Fraunhofer FIRST.IDA, Berlin
(4) INRA, Jouy-en-Josas

Pascal Workshop and Pascal Challenge "Type I and Type II errors for Multiple Simultaneous Hypothesis Testing", May 16, 2007

1/37


Model  Observations :

  Y = (Y , . . . , Y ) =   

Y 1 , . . . , Y n ∈ RK

1

n

Y11 . . . .. . .. . YK1 . . .

Y1n .. . .. . YKn

     

symmetric, e.g. N (µ, Σ)

i.i.d.

Unknown mean µ = (µk )k Unknown covariance matrix Σ Known upper bound σ 2 ≥ maxk var(Yk1 ) n t)   = P kY − µk∞ > tn−1/2

FWER(R) = P(∃k

s.t.

⇒ confidence region and control of the FWER

7/37


Bonferroni threshold

Union bound:

    FWER(R) ≤ K sup_k P(√n |Ȳ_k − µ_k| > t) ≤ 2K Φ̄(t/σ),

where Φ̄ is the standard Gaussian upper tail function.

Bonferroni's threshold: t_α^Bonf = σ Φ̄^{−1}(α/(2K)).

• deterministic threshold
• too conservative if there are strong correlations between the coordinates Y_k (K ↔ 1 if Y_1 = · · · = Y_K)

How to do better?

8/37
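The Bonferroni cut-off above is a one-liner; a minimal sketch, using SciPy's `norm.isf` for the upper-tail inverse Φ̄^{−1} (the numeric values below are illustrative only):

```python
# Bonferroni threshold t_alpha^Bonf = sigma * Phibar^{-1}(alpha/(2K)),
# where Phibar^{-1} is the inverse of the standard Gaussian upper tail.
from scipy.stats import norm

def bonferroni_threshold(alpha, K, sigma=1.0):
    return sigma * norm.isf(alpha / (2 * K))

print(bonferroni_threshold(0.05, 1))      # ~1.96, the usual two-sided cut
print(bonferroni_threshold(0.05, 16384))  # grows only like sqrt(log K)
```

Note how slowly the threshold grows with K; the conservativeness under strong correlations comes from treating the K coordinates as if they were distinct, not from the size of the cut itself.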


Ideal threshold

    FWER(R) ≤ P(∃ k s.t. √n |Ȳ_k − µ_k| > t)
            ≤ P(√n sup_k |Ȳ_k − µ_k| > t)

Ideal threshold: t = q_α^⋆, the 1 − α quantile of L(√n ‖Ȳ − µ‖_∞).

q_α^⋆ depends on Σ, which is unknown (and K² ≫ Kn) ⇒ q_α^⋆ estimated by resampling.

9/37
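To see how much q_α^⋆ can gain over Bonferroni, here is an oracle Monte Carlo computation for an equicorrelated Gaussian model. It assumes Σ is known, which it is not in the talk's setting, so it is purely illustrative; the values of K, n, ρ are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
K, n, rho, alpha = 100, 50, 0.9, 0.05

# Equicorrelated covariance: Sigma_kk = 1, Sigma_kl = rho for k != l.
Sigma = rho * np.ones((K, K)) + (1 - rho) * np.eye(K)
L = np.linalg.cholesky(Sigma)

# In the Gaussian model, sqrt(n) * (Ybar - mu) ~ N(0, Sigma) exactly,
# so the law of sqrt(n) * ||Ybar - mu||_inf can be simulated directly.
Z = rng.standard_normal((20000, K)) @ L.T
q_star = np.quantile(np.abs(Z).max(axis=1), 1 - alpha)
```

With ρ = 0.9 the simulated q_α^⋆ lands well below the Bonferroni cut σΦ̄^{−1}(α/(2K)) ≈ 3.48; this gap is exactly what the resampling estimates aim to capture.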


Goal

Find a threshold t_α(Y) such that

    (P_α):  P(√n ‖Ȳ − µ‖_∞ > t_α(Y)) ≤ α.

If R = {k s.t. √n |Ȳ_k| > t_α(Y)}, then FWER(R) ≤ α.

(P_α) gives a confidence ball for µ.
Non-asymptotic: valid for every K and n; t_α(Y) should be close to q_α^⋆.

10/37


Outline

1. Introduction
2. Resampling
3. Concentration method
4. Quantile method
5. Step-down procedure
6. Conclusion

11/37


Resampling principle [Efron 1979; ...]

    Sample Y^1, . . . , Y^n  --resampling-->  weighted sample (W_1, Y^1), . . . , (W_n, Y^n)

Weight vector: (W_1, . . . , W_n), independent of Y.
"Y^i is kept W_i times in the resample."
Example: Efron's bootstrap ⇔ n-sample with replacement ⇔ (W_1, . . . , W_n) ∼ M(n; n^{−1}, . . . , n^{−1}).
Heuristics: (true distribution, sample) ≈ (sample, resample).

12/37
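Efron's weight-vector formulation can be sketched as follows; a minimal NumPy illustration, with n = 8 chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Efron's bootstrap: an n-sample drawn with replacement, encoded as a
# weight vector (W_1, ..., W_n) ~ Multinomial(n; 1/n, ..., 1/n).
W = rng.multinomial(n, np.full(n, 1.0 / n))

# W_i counts how many times Y^i is kept in the resample; the counts
# always sum to n, the resample size.
print(W.sum())  # n
```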


Examples of resampling weights

Efron's bootstrap: (W_1, . . . , W_n) ∼ M(n; n^{−1}, . . . , n^{−1})
Rademacher: W_i i.i.d. ∼ ½ δ_{−1} + ½ δ_1
Random hold-out: W_i = 2 · 1{i ∈ I}, I uniform over the subsets of {1, . . . , n} of cardinality n/2
Leave-one-out: W_i = (n/(n−1)) 1{i ≠ J}, J ∼ U({1, . . . , n})
V-fold cross-validation: W_i = (V/(V−1)) 1{i ∉ B_J}, J ∼ U({1, . . . , V}), for some partition B_1, . . . , B_V of {1, . . . , n}

13/37
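The weight distributions above can be sketched as generators. This is a hedged illustration; in particular, the contiguous blocks in `v_fold` are one arbitrary choice of partition:

```python
import numpy as np

rng = np.random.default_rng(0)

def efron(n):          # Multinomial(n; 1/n, ..., 1/n)
    return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)

def rademacher(n):     # W_i i.i.d. uniform on {-1, +1}
    return rng.choice([-1.0, 1.0], size=n)

def random_holdout(n):  # W_i = 2 * 1{i in I}, |I| = n/2
    W = np.zeros(n)
    W[rng.choice(n, size=n // 2, replace=False)] = 2.0
    return W

def leave_one_out(n):   # W_i = n/(n-1) * 1{i != J}
    W = np.full(n, n / (n - 1.0))
    W[rng.integers(n)] = 0.0
    return W

def v_fold(n, V):       # W_i = V/(V-1) * 1{i not in B_J}, contiguous blocks B_1..B_V
    blocks = np.array_split(np.arange(n), V)
    W = np.full(n, V / (V - 1.0))
    W[blocks[rng.integers(V)]] = 0.0
    return W
```

All of these weight vectors average to 1 over the sample (exactly, except for Rademacher, where the mean is 1 only in expectation, i.e. E W_i = 0 with the ±1 convention).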


Quantile method

Ideal threshold: q_α^⋆, the 1 − α quantile of L(√n ‖Ȳ − µ‖_∞).

⇒ Resampling estimate of q_α^⋆: q_α^quant(Y), the 1 − α quantile of L(√n ‖Ȳ_W − W̄ Ȳ‖_∞ | Y), where

    Ȳ_W := (1/n) Σ_{i=1}^n W_i Y^i   (resampling empirical mean),
    W̄ := (1/n) Σ_{i=1}^n W_i.

q_α^quant(Y) depends only on Y, so it can be computed from real data.

14/37
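A Monte Carlo version of q_α^quant can be sketched as follows, here with Rademacher weights and B draws of W conditionally on the data (the sample Y below is synthetic, and B = 1000 is an arbitrary budget):

```python
import numpy as np

def quantile_threshold(Y, alpha, B=1000, rng=None):
    """Monte Carlo sketch of q_alpha^quant(Y): the (1 - alpha) quantile of
    sqrt(n) * ||Ybar_W - Wbar * Ybar||_inf given Y, Rademacher weights."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n, K = Y.shape                 # rows: observations Y^i, columns: coordinates k
    Ybar = Y.mean(axis=0)
    stats = np.empty(B)
    for b in range(B):
        W = rng.choice([-1.0, 1.0], size=n)
        Ybar_W = (W[:, None] * Y).mean(axis=0)       # resampling empirical mean
        stats[b] = np.sqrt(n) * np.abs(Ybar_W - W.mean() * Ybar).max()
    return np.quantile(stats, 1 - alpha)

# Synthetic example: n = 100 i.i.d. N(0, I_K) observations, K = 50.
Y = np.random.default_rng(1).standard_normal((100, 50))
t = quantile_threshold(Y, alpha=0.05)
```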


Concentration method

‖Ȳ − µ‖_∞ concentrates around its expectation, with standard deviation ≤ σ n^{−1/2}.
Estimate E[‖Ȳ − µ‖_∞] by resampling:

⇒ q_α^conc(Y) = cst × √n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] + remainder(σ, α, n)

Works if the expectations (∝ √(log K)) are larger than the fluctuations (∝ n^{−1/2}).

15/37
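The conditional expectation E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] that the method rescales can be estimated the same way (Rademacher weights, Monte Carlo); the rescaling constant and the remainder term of q_α^conc are deliberately left out of this sketch:

```python
import numpy as np

def resampling_expectation(Y, B=500, rng=None):
    """Monte Carlo sketch of E[ ||Ybar_W - Wbar * Ybar||_inf | Y ],
    the quantity the concentration method rescales (Rademacher weights)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n, K = Y.shape
    Ybar = Y.mean(axis=0)
    vals = np.empty(B)
    for b in range(B):
        W = rng.choice([-1.0, 1.0], size=n)
        vals[b] = np.abs((W[:, None] * Y).mean(axis=0) - W.mean() * Ybar).max()
    return vals.mean()

# Synthetic check: for N(0, I_K) data the estimate is of order
# sqrt(2 log(K) / n) -- the expectation dominates the n^{-1/2} fluctuations.
Y = np.random.default_rng(2).standard_normal((200, 50))
est = resampling_expectation(Y)
```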


Results from empirical process theory

    Ȳ_W − W̄ Ȳ = (1/n) Σ_{i=1}^n (W_i − W̄) Y^i = Ȳ_{W−W̄}

⇒ empirical bootstrap process.

• Asymptotically (K fixed, n → ∞): many results, e.g. [van der Vaart and Wellner 1996] ⇒ both methods are asymptotically valid.
• Non-asymptotic results (learning theory, bounded case): Rademacher complexities [Koltchinskii 01], [Bartlett, Boucheron and Lugosi 02]; bootstrap penalization [Fromont 05].

16/37


Classical procedures and existing results

Parametric statistics: no, because K ≫ n.
Asymptotic results: not valid, because K ≫ n.
Randomization tests: do not work if H_{0,k} = {µ_k ≤ 0}, and do not give confidence regions.
Holm's procedure: inefficient with strong correlations.

17/37


Concentration method: three ideas

    ‖Ȳ − µ‖_∞ ≃ E[‖Ȳ − µ‖_∞]
    E[‖Ȳ − µ‖_∞] ∝ E[‖Ȳ_W − W̄ Ȳ‖_∞]
    E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] ≃ E[‖Ȳ_W − W̄ Ȳ‖_∞]

⇒ q_α^conc(Y) ∝ √n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] + remainders(σ, n, α)

18/37


First result

Theorem. Let W be exchangeable (Efron, Rademacher, random hold-out, leave-one-out). For every α ∈ (0, 1), the threshold

    q_α^{conc,1}(Y) := (√n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y]) / B_W + σ Φ̄^{−1}(α/2) (C_W / (√n B_W) + 1)

satisfies

    P(√n ‖Ȳ − µ‖_∞ > q_α^{conc,1}(Y)) ≤ α,

with σ² := max_k var(Y_k^1), and

    B_W := E[ ( (1/n) Σ_{i=1}^n (W_i − W̄)² )^{1/2} ] > 0   and   C_W := ( (n/(n−1)) E[(W_1 − W̄)²] )^{1/2}.

19/37
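A sketch of this threshold for leave-one-out weights, for which W has only n atoms, so E[·|Y] is an exact average over n weight vectors. The closed forms B_W = (n−1)^{−1/2} and C_W = √n/(n−1) used below are derived from the definitions above; treat them as an assumption to re-check:

```python
import numpy as np
from scipy.stats import norm

def conc1_threshold(Y, alpha, sigma):
    """Sketch of q_alpha^{conc,1}(Y) with leave-one-out weights.
    B_W and C_W are deterministic closed forms (assumed, see lead-in);
    E[.|Y] is computed exactly over the n leave-one-out weight vectors."""
    n, K = Y.shape
    Ybar = Y.mean(axis=0)
    B_W = 1.0 / np.sqrt(n - 1.0)
    C_W = np.sqrt(n) / (n - 1.0)
    vals = np.empty(n)
    for j in range(n):
        W = np.full(n, n / (n - 1.0))
        W[j] = 0.0                        # leave observation j out
        vals[j] = np.abs((W[:, None] * Y).mean(axis=0) - W.mean() * Ybar).max()
    expectation = vals.mean()             # E[||Ybar_W - Wbar*Ybar||_inf | Y], exactly
    deviation = sigma * norm.isf(alpha / 2) * (C_W / (np.sqrt(n) * B_W) + 1.0)
    return np.sqrt(n) * expectation / B_W + deviation

# Synthetic example only; n = 100, K = 20, sigma = 1 are made up.
Y = np.random.default_rng(3).standard_normal((100, 20))
t = conc1_threshold(Y, alpha=0.05, sigma=1.0)
```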


Sketch of the proof

Expectations: E[‖Ȳ − µ‖_∞] = B_W^{−1} E[‖Ȳ_W − W̄ Ȳ‖_∞].
Gaussian concentration theorem for ‖Ȳ − µ‖_∞: standard deviation ≤ σ n^{−1/2}.
Gaussian concentration theorem for E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y]: standard deviation ≤ C_W σ n^{−1}.

20/37


Remarks

B_W and C_W do not depend on K and are easy to compute.
In most cases, C_W B_W^{−1} = O(1).
‖·‖_∞ can be replaced by ‖·‖_p, p ≥ 1, or by sup_k (·)_+ ⇒ different shapes for the confidence regions.
True for any exchangeable weight vector.
Can be generalized to V-fold cross-validation weights (with C_W B_W^{−1} ≈ √(n/V)).
Can be extended (with larger constants) to symmetric bounded variables.
Almost deterministic threshold.

21/37


Choice of the weights

Accuracy: ratio C_W/B_W in the deviation term.
Complexity when computing E[· | Y]: cardinality of the support of W = (W_i)_i.

    Weight                 | Accuracy C_W/B_W                          | Complexity
    Efron                  | ≤ (1/2)(1 − 1/n)^{−n} → e/2  (n → ∞)      | n^n
    Rademacher             | ≤ (1 − n^{−1/2})^{−1} → 1                  | 2^n
    R. hold-out (n/2)      | = √(n/(n−1)) → 1                           | C(n, n/2) ∝ n^{−1/2} 2^n
    Leave-one-out          | = √(n/(n−1)) → 1                           | n
    regular V-fold c.-v.   | = √(n/(V−1))                               | V

For all exchangeable weights, C_W/B_W ≥ √(n/(n−1)).
⇒ Leave-one-out and regular V-fold c.-v. seem good (V to be chosen).

22/37
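The accuracy column can be checked numerically. A Monte Carlo estimate of C_W/B_W for Rademacher weights, where the table predicts a ratio barely above 1 for large n (n = 100 and the simulation budget are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

def ratio(sample_W, B=20000):
    """Monte Carlo estimate of C_W / B_W for a given weight distribution."""
    Ws = np.stack([sample_W() for _ in range(B)])
    Wbar = Ws.mean(axis=1, keepdims=True)
    # B_W = E[ sqrt( (1/n) sum_i (W_i - Wbar)^2 ) ]
    B_W = np.sqrt(((Ws - Wbar) ** 2).mean(axis=1)).mean()
    # C_W = sqrt( n/(n-1) * E[(W_1 - Wbar)^2] )
    C_W = np.sqrt(n / (n - 1.0) * ((Ws[:, 0] - Wbar[:, 0]) ** 2).mean())
    return C_W / B_W

r_rad = ratio(lambda: rng.choice([-1.0, 1.0], size=n))
```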


Simulations: n = 1000, K = 16384, σ = 1

23/37


Second concentration threshold

If q_α^{conc,1}(Y) were constant, we would take min(q_α^{conc,1}(Y), t_α^Bonf) as threshold.

Theorem. Same assumptions as Thm. 1. Then, for all α, δ ∈ (0, 1), the threshold

    q_α^{conc,2}(Y) := min( t^Bonf_{α(1−δ)} , (√n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y]) / B_W + σ Φ̄^{−1}(α(1−δ)/2) ) + (σ C_W / (√n B_W)) Φ̄^{−1}(αδ/2)

satisfies

    P(√n ‖Ȳ − µ‖_∞ > q_α^{conc,2}(Y)) ≤ α.

⇒ q_α^{conc,2}(Y) is always better than q_α^{conc,1}(Y) and than the Bonferroni threshold (up to δ).

24/37


Simulations: n = 1000, K = 16384, σ = 1

25/37


Method

Rademacher weights: W_i i.i.d. ∼ ½ δ_{−1} + ½ δ_1.

The resampling heuristics suggests that q_α^quant(Y), the 1 − α quantile of L(√n ‖Ȳ_W − W̄ Ȳ‖_∞ | Y), should satisfy P(√n ‖Ȳ − µ‖_∞ > q_α^quant(Y)) ≤ α.

    q_α^quant(Y) = inf{ x : P_W(√n ‖Ȳ_W − W̄ Ȳ‖_∞ > x) ≤ α }
                 = inf{ x : 2^{−n} Σ_{w ∈ {−1,1}^n} 1{ √n ‖ (1/n) Σ_{i=1}^n w_i (Y^i − Ȳ) ‖_∞ > x } ≤ α }

26/37


Theorem

Theorem. Let α, δ, γ ∈ (0, 1) and let f be a non-negative threshold whose FWER is bounded by αγ/2:

    P(√n ‖Ȳ − µ‖_∞ > f(Y)) ≤ αγ/2.

Then

    q_α^{quant+f}(Y) := q^quant_{α(1−δ)(1−γ)}(Y) + √(2 log(2/(δα)) / n) · f(Y)

has a FWER bounded by α:

    P(√n ‖Ȳ − µ‖_∞ > q_α^{quant+f}(Y)) ≤ α.

27/37


Remarks

Uses only the symmetry of Y around its mean.
The threshold f only appears in a second-order term.
Gaussian case ⇒ take f among three thresholds: t^Bonf_{αγ/2}, q^{conc,1}_{αγ/2} and q^{conc,2}_{αγ/2}.
In simulation experiments, f is almost unnecessary.

28/37


Simulations: n = 1000, K = 16384, σ = 1

29/37


Simulations: without the additive term?

30/37


Step-down procedure [Holm 1979; Westfall and Young 1993; Romano and Wolf 2005]

    sup_{1 ≤ k ≤ K} (Ȳ_k − µ_k) ≥ sup_{k : µ_k = 0} (Ȳ_k − µ_k) = sup_{k : µ_k = 0} Ȳ_k

31/37


Step-down procedure

Reorder the coordinates: Ȳ_{σ(1)} ≥ · · · ≥ Ȳ_{σ(K)}.
Define the thresholds t_k = t(Ȳ_{σ(k)}, . . . , Ȳ_{σ(K)}) for k = 1, . . . , K.
Define k̂ = max{ k s.t. ∀k′ ≤ k, Ȳ_{σ(k′)} > t_{k′}(Y) }.
Reject H_{0,k} for all k ≤ k̂.
⇒ this procedure has a FWER controlled by α if each threshold t_k does (using that t_K = t((Ȳ_k)_{k∈K}) is a non-decreasing function of the set K).

32/37
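The loop above can be sketched generically. Instantiating the threshold family with the Bonferroni cut recomputed on the coordinates not yet rejected recovers Holm's procedure; α, σ and the toy statistics below are made up:

```python
import numpy as np
from scipy.stats import norm

def step_down(stats, t):
    """Step-down multiple testing: sort the statistics in decreasing order,
    reject while each sorted statistic exceeds the threshold computed on
    the not-yet-rejected coordinates, stop at the first acceptance."""
    order = np.argsort(stats)[::-1]      # sigma(1), ..., sigma(K)
    s = stats[order]
    rejected = []
    for k in range(len(stats)):
        if s[k] > t(s[k:]):              # t_k depends only on Y_sigma(k..K)
            rejected.append(int(order[k]))
        else:
            break
    return rejected

# Holm's procedure = step-down Bonferroni: K is replaced at each step by
# the number of remaining coordinates.
alpha, sigma = 0.05, 1.0
holm_t = lambda remaining: sigma * norm.isf(alpha / (2 * len(remaining)))
print(step_down(np.array([4.0, 0.5, 2.6, -0.2]), holm_t))  # [0, 2]
```

The resampling thresholds of the previous sections can be plugged in as `t` in the same way, which is where the step-down refinement gains power over the single-step versions.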


Simulations: with non-zero means, 0 ≤ µ_k ≤ 0.29, b = 24

33/37


Numerical simulation: 0 ≤ µ_k ≤ 0.29, b = 24

Empirical means, with thresholds:

    Bonf.: t = 0.148
    Holm: t = 0.146
    Quant+Bonf: t = 0.123
    Uncentered quant.: t = 0.122 (= S-d Q+B)
    Quant.: t = 0.106

34/37


Conclusions

Two multiple testing procedures, using resampling techniques:
    concentration method (almost deterministic threshold)
    quantile method, with symmetrization techniques
FWER controlled by α
non-asymptotic
better than Bonferroni if there are enough correlations
step-down procedures

35/37


Future work

FDR instead of FWER?
Theoretical study of power (with regular µ)?
Self-contained result for quantiles (without f)?
Quantiles with other weights?
Quantiles without the symmetry assumption?
Real data sets?

36/37


Thank you for your attention!

37/37