Introduction
Resampling
Concentration method
Quantile method
Step-down procedure
Conclusion
Resampling-based confidence regions and multiple tests for a correlated random vector

Joint work with Sylvain Arlot (1,2), Gilles Blanchard (3) and Étienne Roquain (4)
(1) Université Paris-Sud XI, Orsay
(2) Projet Select, Inria Futurs
(3) Fraunhofer FIRST.IDA, Berlin
(4) INRA, Jouy-en-Josas

Pascal Workshop and Pascal Challenge "Type I and Type II errors for Multiple Simultaneous Hypothesis Testing", May 16, 2007
Model

Observations: Y = (Y^1, …, Y^n), with Y^1, …, Y^n ∈ R^K i.i.d. and symmetric, e.g. N(µ, Σ); equivalently, Y is the K × n data matrix (Y_k^i).

Unknown mean µ = (µ_k)_k
Unknown covariance matrix Σ
Known upper bound σ² ≥ max_k var(Y_k^1)

For a threshold t and rejection set R = {k s.t. √n |Ȳ_k| > t}:

FWER(R) = P(∃ k s.t. a true null hypothesis H_{0,k} is rejected) ≤ P(‖Ȳ − µ‖_∞ > t n^{−1/2})

⇒ controlling this probability yields both a confidence region and control of the FWER.
Bonferroni threshold

Union bound:

FWER(R) ≤ K sup_k P(√n |Ȳ_k − µ_k| > t) ≤ 2K Φ̄(t/σ),

where Φ̄ is the standard Gaussian upper tail function.

Bonferroni's threshold: t_α^Bonf = σ Φ̄^{−1}(α/(2K)).

• deterministic threshold
• too conservative if there are strong correlations between the coordinates Y_k (K ↔ 1 if Y_1 = · · · = Y_K)

How to do better?
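As an illustration (a sketch, not from the slides: the values of α, K and σ are arbitrary), the Bonferroni threshold can be computed with Python's standard library, using `statistics.NormalDist` for the Gaussian upper-tail quantile Φ̄^{−1}(p) = Φ^{−1}(1 − p):

```python
from statistics import NormalDist

def bonferroni_threshold(alpha: float, K: int, sigma: float) -> float:
    """t_alpha^Bonf = sigma * Phibar^{-1}(alpha / (2K)): the union bound
    spends alpha/K on each coordinate, split over the two tails."""
    return sigma * NormalDist().inv_cdf(1.0 - alpha / (2 * K))

# The threshold grows only like sqrt(2 log K), but ignores correlations:
t_100 = bonferroni_threshold(alpha=0.05, K=100, sigma=1.0)
t_200 = bonferroni_threshold(alpha=0.05, K=200, sigma=1.0)
```

Doubling K barely moves the threshold; its insensitivity to correlations is precisely what the resampling thresholds aim to fix.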
Ideal threshold

FWER(R) ≤ P(∃ k s.t. √n |Ȳ_k − µ_k| > t) ≤ P(√n sup_k |Ȳ_k − µ_k| > t)

Ideal threshold: t = q_α*, the (1 − α) quantile of L(√n ‖Ȳ − µ‖_∞).

q_α* depends on Σ, which is unknown (and Σ has K² entries for only Kn observations) ⇒ q_α* is estimated by resampling.
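A small Monte Carlo sketch (illustrative values, equicorrelated Gaussian; not a setting from the slides) of why q_α* can be much smaller than the Bonferroni threshold when the coordinates are strongly correlated. Since √n(Ȳ − µ) ∼ N(0, Σ) exactly in the Gaussian model, the sample size drops out and we simulate the correlated vector directly:

```python
import random

def ideal_quantile(alpha: float, K: int, rho: float,
                   B: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of q_alpha^*: the (1 - alpha) quantile of
    ||Z||_inf for Z ~ N(0, Sigma), equicorrelated with unit variances,
    built from a shared factor: Z_k = sqrt(rho) Z0 + sqrt(1 - rho) eps_k."""
    rng = random.Random(seed)
    sups = []
    for _ in range(B):
        z0 = rng.gauss(0.0, 1.0)
        sups.append(max(abs(rho**0.5 * z0 + (1.0 - rho)**0.5 * rng.gauss(0.0, 1.0))
                        for _ in range(K)))
    sups.sort()
    return sups[min(int((1 - alpha) * B), B - 1)]

q_indep = ideal_quantile(0.05, K=64, rho=0.0)    # behaves like Bonferroni
q_corr = ideal_quantile(0.05, K=64, rho=0.99)    # close to the K = 1 quantile
```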
Goal

Find a threshold t_α(Y) such that

(P_α): P(√n ‖Ȳ − µ‖_∞ > t_α(Y)) ≤ α.

If R = {k s.t. √n |Ȳ_k| > t_α(Y)}, then FWER(R) ≤ α.

(P_α) gives a confidence ball for µ.
Non-asymptotic: valid for all K and n.
t_α(Y) should be close to q_α*.
Outline

1. Introduction
2. Resampling
3. Concentration method
4. Quantile method
5. Step-down procedure
6. Conclusion
Resampling principle [Efron 1979; …]

Sample Y^1, …, Y^n → (resampling) → weighted sample (W_1, Y^1), …, (W_n, Y^n)

Weight vector: (W_1, …, W_n), independent of Y; "Y^i is kept W_i times in the resample".

Example: Efron's bootstrap ⇔ n-sample with replacement ⇔ (W_1, …, W_n) ∼ M(n; n^{−1}, …, n^{−1}).

Heuristics: (true distribution, sample) ≈ (sample, resample).
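A sketch of Efron's bootstrap weights with the standard library, mirroring the stated equivalence: drawing an n-sample with replacement and counting multiplicities gives a Multinomial(n; 1/n, …, 1/n) weight vector.

```python
import random
from collections import Counter

def efron_weights(n: int, rng: random.Random) -> list:
    """(W_1, ..., W_n) ~ M(n; 1/n, ..., 1/n): W_i is the number of times
    observation i is kept when drawing n indices with replacement."""
    counts = Counter(rng.randrange(n) for _ in range(n))
    return [counts.get(i, 0) for i in range(n)]

rng = random.Random(0)
W = efron_weights(10, rng)   # some entries 0, some >= 2; total is always n
```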
Examples of resampling weights

• Efron's bootstrap: (W_1, …, W_n) ∼ M(n; n^{−1}, …, n^{−1})
• Rademacher: W_i i.i.d. ∼ (1/2)δ_{−1} + (1/2)δ_{1}
• Random hold-out: W_i = 2 · 1_{i∈I}, with I uniform over the subsets of {1, …, n} of cardinality n/2
• Leave-one-out: W_i = (n/(n−1)) · 1_{i≠J}, with J ∼ U({1, …, n})
• V-fold cross-validation: W_i = (V/(V−1)) · 1_{i∉B_J}, with J ∼ U({1, …, V}), for some partition B_1, …, B_V of {1, …, n}
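The weight vectors above can be drawn with a few lines of standard-library Python (a sketch; the helper names are ours). Note that the last three always have Σ_i W_i = n, matching their definitions:

```python
import random

def rademacher(n, rng):
    """W_i i.i.d. ~ (1/2) delta_{-1} + (1/2) delta_{+1}."""
    return [rng.choice((-1, 1)) for _ in range(n)]

def random_holdout(n, rng):
    """W_i = 2 * 1{i in I}, I uniform over the size-(n/2) subsets."""
    I = set(rng.sample(range(n), n // 2))
    return [2 if i in I else 0 for i in range(n)]

def leave_one_out(n, rng):
    """W_i = n/(n-1) * 1{i != J}, J uniform on {0, ..., n-1}."""
    J = rng.randrange(n)
    return [0.0 if i == J else n / (n - 1) for i in range(n)]

def v_fold(n, V, rng):
    """W_i = V/(V-1) * 1{i not in B_J} for a regular partition in V blocks."""
    J = rng.randrange(V)
    size = n // V
    return [0.0 if J * size <= i < (J + 1) * size else V / (V - 1)
            for i in range(n)]

rng = random.Random(1)
```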
Quantile method

Ideal threshold: q_α*, the (1 − α) quantile of L(√n ‖Ȳ − µ‖_∞).

⇒ Resampling estimate of q_α*: q_α^quant(Y), the (1 − α) quantile of L(√n ‖Ȳ_W − W̄ Ȳ‖_∞ | Y), where

Ȳ_W := (1/n) Σ_{i=1}^n W_i Y^i (resampling empirical mean) and W̄ := (1/n) Σ_{i=1}^n W_i.

q_α^quant(Y) depends only on Y, so it can be computed with real data.
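A Monte Carlo sketch of q_α^quant(Y) (function names and toy data are ours; in practice the conditional law given Y is approximated by drawing B independent weight vectors):

```python
import random

def rademacher(n, rng):
    return [rng.choice((-1, 1)) for _ in range(n)]

def quantile_threshold(Y, alpha, weight_sampler, B=1000, seed=0):
    """Estimate the (1 - alpha) quantile, given Y, of
    sqrt(n) * || Ybar_W - Wbar * Ybar ||_inf, with
    Ybar_W = (1/n) sum_i W_i Y^i and Wbar = (1/n) sum_i W_i.
    Y: list of n observations, each a list of K coordinates."""
    rng = random.Random(seed)
    n, K = len(Y), len(Y[0])
    ybar = [sum(y[k] for y in Y) / n for k in range(K)]
    stats = []
    for _ in range(B):
        W = weight_sampler(n, rng)
        wbar = sum(W) / n
        yw = [sum(W[i] * Y[i][k] for i in range(n)) / n for k in range(K)]
        stats.append(n**0.5 * max(abs(yw[k] - wbar * ybar[k]) for k in range(K)))
    stats.sort()
    return stats[min(int((1 - alpha) * B), B - 1)]

data_rng = random.Random(2)
Y = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(50)]  # toy data
q05 = quantile_threshold(Y, alpha=0.05, weight_sampler=rademacher)
q50 = quantile_threshold(Y, alpha=0.50, weight_sampler=rademacher)
```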
Concentration method

‖Ȳ − µ‖_∞ concentrates around its expectation, with standard deviation ≤ σ n^{−1/2}.

Estimate E‖Ȳ − µ‖_∞ by resampling:

⇒ q_α^conc(Y) = cst × √n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] + remainder(σ, α, n)

Works if the expectations (∝ √(log K)) are larger than the fluctuations (∝ n^{−1/2}).
Results from empirical process theory

Ȳ_W − W̄ Ȳ = (1/n) Σ_{i=1}^n (W_i − W̄) Y^i = Ȳ_{W−W̄}

⇒ empirical bootstrap process.

• Asymptotically (K fixed, n → ∞): many results, e.g. [van der Vaart and Wellner 1996] ⇒ both methods are asymptotically valid.
• Non-asymptotic results (learning theory, bounded case): Rademacher complexities [Koltchinskii 01], [Bartlett, Boucheron and Lugosi 02]; bootstrap penalization [Fromont 05].
Classical procedures and existing results

• Parametric statistics: no, because K ≫ n.
• Asymptotic results: not valid, because K ≫ n.
• Randomization tests: do not work if H_{0,k} = {µ_k ≤ 0}, and do not give confidence regions.
• Holm's procedure: inefficient with strong correlations.
Concentration method: three ideas

1. ‖Ȳ − µ‖_∞ ≃ E‖Ȳ − µ‖_∞
2. E‖Ȳ − µ‖_∞ ∝ E‖Ȳ_W − W̄ Ȳ‖_∞
3. E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] ≃ E‖Ȳ_W − W̄ Ȳ‖_∞

⇒ q_α^conc(Y) ∝ √n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] + remainders(σ, n, α)
First result

Theorem. Let W be exchangeable (Efron, Rademacher, random hold-out, leave-one-out). For every α ∈ (0, 1),

q_α^{conc,1}(Y) := √n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] / B_W + σ Φ̄^{−1}(α/2) (C_W / (√n B_W) + 1)

satisfies

P(√n ‖Ȳ − µ‖_∞ > q_α^{conc,1}(Y)) ≤ α,

with σ² := max_k var(Y_k^1), and

B_W := E[ ((1/n) Σ_{i=1}^n (W_i − W̄)²)^{1/2} ] > 0 and C_W := ((n/(n−1)) E(W_1 − W̄)²)^{1/2}.
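A sketch of q_α^{conc,1} for Rademacher weights (our choice for illustration). A short computation from the theorem's definitions gives C_W = 1 for these weights, while B_W = E[√(1 − W̄²)] (since W_i² = 1) can be estimated by Monte Carlo; the conditional expectation E[‖Ȳ_W − W̄Ȳ‖_∞ | Y] would itself be estimated from the data, and is passed in here as a placeholder number:

```python
import random
from statistics import NormalDist

def bw_rademacher(n: int, B: int = 5000, seed: int = 0) -> float:
    """B_W = E[ ((1/n) sum_i (W_i - Wbar)^2)^{1/2} ] for Rademacher weights;
    since W_i^2 = 1 this equals E[ sqrt(1 - Wbar^2) ], close to 1 for large n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(B):
        wbar = sum(rng.choice((-1, 1)) for _ in range(n)) / n
        total += (1.0 - wbar**2) ** 0.5
    return total / B

def conc1_threshold(resample_expectation, n, alpha, sigma, B_W, C_W=1.0):
    """q^{conc,1} = sqrt(n) E[...|Y] / B_W
       + sigma * Phibar^{-1}(alpha/2) * (C_W / (sqrt(n) B_W) + 1)."""
    phibar_inv = NormalDist().inv_cdf(1.0 - alpha / 2)
    return (n**0.5 * resample_expectation / B_W
            + sigma * phibar_inv * (C_W / (n**0.5 * B_W) + 1.0))

n = 100
B_W = bw_rademacher(n)
# 0.3 is a placeholder for the conditional resampling expectation:
q = conc1_threshold(0.3, n=n, alpha=0.05, sigma=1.0, B_W=B_W)
```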
Sketch of the proof

• Expectations: E‖Ȳ − µ‖_∞ = B_W^{−1} E‖Ȳ_W − W̄ Ȳ‖_∞.
• Gaussian concentration theorem for ‖Ȳ − µ‖_∞: standard deviation ≤ σ n^{−1/2}.
• Gaussian concentration theorem for E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y]: standard deviation ≤ C_W σ n^{−1}.
Remarks

• B_W and C_W do not depend on K and are easy to compute.
• In most cases, C_W B_W^{−1} = O(1).
• ‖·‖_∞ can be replaced by ‖·‖_p, p ≥ 1, or by sup_k(·)_+ ⇒ different shapes for the confidence regions.
• True for any exchangeable weight vector.
• Can be generalized to V-fold cross-validation weights (with C_W B_W^{−1} ≈ √(n/V)).
• Can be extended (with larger constants) to symmetric bounded variables.
• Almost deterministic threshold.
Choice of the weights

Accuracy: ratio C_W/B_W in the deviation term.
Complexity when computing E[·|Y]: cardinality of the support of W = (W_i)_i.

Weight                 | Accuracy C_W/B_W                  | Complexity
Efron                  | ≤ (1/2)(1 − 1/n)^{−n} → e/2       | n^n
Rademacher             | ≤ (1 − n^{−1/2})^{−1} → 1         | 2^n
Random hold-out (n/2)  | = √(n/(n−1)) → 1                  | (n choose n/2) ∝ n^{−1/2} 2^n
Leave-one-out          | = √(n/(n−1)) → 1                  | n
Regular V-fold c.-v.   | = √(n/(V−1))                      | V

(limits as n → ∞)

For all exchangeable weights, C_W/B_W ≥ √(n/(n−1)).
⇒ Leave-one-out and regular V-fold c.-v. seem good (V to be chosen).
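The leave-one-out row of the table can be checked exactly, since there are only n equiprobable weight vectors (a verification sketch; the function name is ours):

```python
def loo_constants(n: int):
    """Exact B_W and C_W for leave-one-out weights W_i = n/(n-1) * 1{i != J},
    obtained by averaging over the n equally likely values of J."""
    Bs, C2s = [], []
    for J in range(n):
        W = [0.0 if i == J else n / (n - 1) for i in range(n)]
        wbar = sum(W) / n                   # equals 1 for every J
        Bs.append((sum((w - wbar) ** 2 for w in W) / n) ** 0.5)
        C2s.append((W[0] - wbar) ** 2)      # realizations of (W_1 - Wbar)^2
    B_W = sum(Bs) / n
    C_W = (n / (n - 1) * sum(C2s) / n) ** 0.5
    return B_W, C_W

B_W, C_W = loo_constants(20)
ratio = C_W / B_W        # the table's accuracy entry: sqrt(n / (n - 1))
```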
Simulations: n = 1000, K = 16384, σ = 1 [figure]
Second concentration threshold

If q_α^{conc,1}(Y) were constant, we would take min(q_α^{conc,1}(Y), t_α^Bonf) as threshold.

Theorem. Same assumptions as Thm. 1. Then, for all α, δ ∈ (0, 1), the threshold

q_α^{conc,2}(Y) := min( t_{α(1−δ)}^Bonf , √n E[‖Ȳ_W − W̄ Ȳ‖_∞ | Y] / B_W + σ Φ̄^{−1}(α(1−δ)/2) ) + (σ C_W / (√n B_W)) Φ̄^{−1}(αδ/2)

satisfies

P(√n ‖Ȳ − µ‖_∞ > q_α^{conc,2}(Y)) ≤ α.

⇒ q_α^{conc,2}(Y) is always better than q_α^{conc,1}(Y) and than the Bonferroni threshold (up to δ).
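The second threshold combines the two bounds; a sketch following the theorem's formula (the constants and arguments below are illustrative, not values from the slides):

```python
from statistics import NormalDist

def phibar_inv(p: float) -> float:
    """Standard Gaussian upper-tail quantile."""
    return NormalDist().inv_cdf(1.0 - p)

def conc2_threshold(resample_expectation, n, alpha, delta, sigma, B_W, C_W, K):
    """q^{conc,2}: take the better of the Bonferroni threshold and the
    concentration main term, both at level alpha(1 - delta), then add the
    deviation term of the resampled expectation at level alpha * delta."""
    t_bonf = sigma * phibar_inv(alpha * (1 - delta) / (2 * K))
    main = (n**0.5 * resample_expectation / B_W
            + sigma * phibar_inv(alpha * (1 - delta) / 2))
    return min(t_bonf, main) + sigma * C_W / (n**0.5 * B_W) * phibar_inv(alpha * delta / 2)

q_small = conc2_threshold(0.05, n=100, alpha=0.05, delta=0.1,
                          sigma=1.0, B_W=1.0, C_W=1.0, K=128)  # not capped
q_huge = conc2_threshold(100.0, n=100, alpha=0.05, delta=0.1,
                         sigma=1.0, B_W=1.0, C_W=1.0, K=128)   # capped by Bonferroni
```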
Simulations: n = 1000, K = 16384, σ = 1 [figure]
Method

Rademacher weights: W_i i.i.d. ∼ (1/2)δ_{−1} + (1/2)δ_{1}.

The resampling heuristics suggests that q_α^quant(Y), the (1 − α) quantile of L(√n ‖Ȳ_W − W̄ Ȳ‖_∞ | Y), should satisfy

P(√n ‖Ȳ − µ‖_∞ > q_α^quant(Y)) ≤ α.

q_α^quant(Y) = inf{ x : P_W(√n ‖Ȳ_W − W̄ Ȳ‖_∞ > x) ≤ α }
= inf{ x : 2^{−n} Σ_{w∈{−1,1}^n} 1{ √n ‖(1/n) Σ_{i=1}^n w_i (Y^i − Ȳ)‖_∞ > x } ≤ α }
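For small n the infimum above can be computed exactly by enumerating all 2^n sign vectors (a sketch with toy data; function names are ours):

```python
import math
import random
from itertools import product

def exact_rademacher_quantile(Y, alpha):
    """Exact q_alpha^quant(Y): the smallest x with
    2^{-n} #{ w in {-1,1}^n :
              sqrt(n) ||(1/n) sum_i w_i (Y^i - Ybar)||_inf > x } <= alpha,
    i.e. the order statistic of rank ceil((1 - alpha) * 2^n)."""
    n, K = len(Y), len(Y[0])
    ybar = [sum(y[k] for y in Y) / n for k in range(K)]
    centered = [[Y[i][k] - ybar[k] for k in range(K)] for i in range(n)]
    stats = sorted(
        n**0.5 * max(abs(sum(w[i] * centered[i][k] for i in range(n))) / n
                     for k in range(K))
        for w in product((-1, 1), repeat=n))
    idx = min(len(stats) - 1, math.ceil((1 - alpha) * len(stats)) - 1)
    return stats[idx]

rng = random.Random(3)
Y = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(10)]  # n = 10, K = 4
q05 = exact_rademacher_quantile(Y, 0.05)
q50 = exact_rademacher_quantile(Y, 0.50)
```

Enumeration is feasible only for small n (2^n terms); the Monte Carlo version with B random weight vectors is the practical substitute.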
Theorem

Let α, δ, γ ∈ (0, 1) and let f be a non-negative threshold with FWER bounded by αγ/2:

P(√n ‖Ȳ − µ‖ > f(Y)) ≤ αγ/2.

Then

q_α^{quant+f}(Y) = q_{α(1−δ)(1−γ)}^{quant}(Y) + √(2 log(2/(δα)) / n) · f(Y)

has a FWER bounded by α:

P(√n ‖Ȳ − µ‖ > q_α^{quant+f}(Y)) ≤ α.
Remarks

• Uses only the symmetry of Y around its mean.
• The threshold f only appears in a second-order term.
• Gaussian case ⇒ take f among the three thresholds t_{αγ/2}^Bonf, q_{αγ/2}^{conc,1} and q_{αγ/2}^{conc,2}.
• In simulation experiments, f is almost unnecessary.
Simulations: n = 1000, K = 16384, σ = 1 [figure]
Simulations: without the additive term? [figure]
Step-down procedure [Holm 1979; Westfall and Young 1993; Romano and Wolf 2005]

sup_{1≤k≤K} (Ȳ_k − µ_k) ≥ sup_{k : µ_k = 0} Ȳ_k

The supremum over all coordinates dominates the supremum over the true nulls, so a threshold valid for the full set is conservative: it can be recomputed on the smaller set of not-yet-rejected coordinates.
Step-down procedure

• Reorder the coordinates: Ȳ_{σ(1)} ≥ · · · ≥ Ȳ_{σ(K)}.
• Define the thresholds t_k = t(Ȳ_{σ(k)}, …, Ȳ_{σ(K)}) for k = 1, …, K.
• Define k̂ = max{ k s.t. ∀k' ≤ k, Ȳ_{σ(k')} > t_{k'}(Y) }.
• Reject H_{0,k} for all k ≤ k̂.

⇒ this procedure has a FWER controlled by α if each single-step threshold does (use that t_{K'} = t((Ȳ_k)_{k∈K'}) is a non-decreasing function of the subset K').
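A sketch of the generic step-down loop. Here `threshold_fn` stands for any single-step threshold recomputed on each subset (e.g. a resampled quantile); the toy threshold in the example is purely illustrative:

```python
def step_down(Ybar, threshold_fn):
    """Sort coordinates by decreasing empirical mean, recompute the threshold
    on the not-yet-rejected coordinates, and stop at the first failure.
    threshold_fn(values) must control the FWER on the given subset and be
    non-decreasing in the subset."""
    order = sorted(range(len(Ybar)), key=lambda k: Ybar[k], reverse=True)
    rejected = []
    for pos, k in enumerate(order):
        t = threshold_fn([Ybar[j] for j in order[pos:]])  # remaining subset
        if Ybar[k] > t:
            rejected.append(k)
        else:
            break
    return rejected

# Toy threshold, non-decreasing in the subset size (illustration only):
Ybar = [0.9, 0.05, 0.8, 0.1]
rej = step_down(Ybar, lambda vals: 0.15 * len(vals))
empty = step_down(Ybar, lambda vals: 10.0)
```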
Simulations: with non-zero means, 0 ≤ µ_k ≤ 0.29, b = 24 [figure]
Numerical simulation: 0 ≤ µ_k ≤ 0.29, b = 24

[figure: empirical means with the thresholds]
• Bonf.: t = 0.148
• Holm: t = 0.146
• Quant.+Bonf.: t = 0.123
• Uncentered quant.: t = 0.122 (= step-down Quant.+Bonf.)
• Quant.: t = 0.106
Conclusions

Two multiple testing procedures based on resampling techniques:
• the concentration method (almost deterministic threshold)
• the quantile method, with symmetrization techniques

Both procedures:
• have a FWER controlled by α
• are non-asymptotic
• are better than Bonferroni if there are enough correlations
• can be combined with step-down procedures
Future works

• FDR instead of FWER?
• Theoretical study of power (with regular µ)?
• Self-contained result for quantiles (without f)?
• Quantiles with other weights?
• Quantiles without the symmetry assumption?
• Real data sets?
Thank you for your attention!