Approximating Hereditary Discrepancy via Small Width Ellipsoids
Aleksandar Nikolov (Rutgers University)
Kunal Talwar (MSR SVC)
Nikolov, Talwar (Rutgers, MSR SVC)
Approximating Discrepancy
1 / 28
Outline
1. Introduction
2. Ellipsoids
3. Upper Bound
4. Lower Bound
5. Conclusion
Introduction
Discrepancy of Set Systems
Given a collection of m subsets {S₁, …, Sₘ} of a universe U of size n.
Discrepancy of Set Systems
Color each universe element red or blue, so that every set is as balanced as possible.
Discrepancy: the maximum imbalance over all sets (in the pictured example: 1).
Matrix Representation
[Figure: the universe elements 1–9 arranged in a 3×3 grid, with the sets drawn as regions.]
Matrix Representation
The incidence matrix A of the set system, applied to a coloring x ∈ {±1}⁹:
\[
Ax =
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1
\end{pmatrix}
\begin{pmatrix} -1\\ 1\\ 1\\ -1\\ 1\\ -1\\ 1\\ 1\\ -1 \end{pmatrix}
=
\begin{pmatrix} 1\\ 0\\ 0\\ -1 \end{pmatrix}
\]
\[
\operatorname{disc}(A) = \min_{x \in \{\pm 1\}^n} \|Ax\|_\infty
\]
Hereditary Discrepancy
For an m × n matrix A:

Discrepancy:
\[
\operatorname{disc}(A) = \min_{x \in \{\pm 1\}^n} \|Ax\|_\infty
\]

Hereditary discrepancy:
\[
\operatorname{herdisc}(A) = \max_{S \subseteq [n]} \operatorname{disc}(A|_S)
\]

A|_S is the submatrix of columns indexed by S; it corresponds to the restricted set system {S₁ ∩ S, …, Sₘ ∩ S}.
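For small instances, both quantities can be computed directly from the definitions above. A brute-force NumPy sketch (exponential in n, so for toy matrices only):

```python
import itertools
import numpy as np

def disc(A):
    """disc(A) = min over x in {-1,1}^n of ||Ax||_inf (brute force)."""
    n = A.shape[1]
    return min(np.max(np.abs(A @ np.array(x)))
               for x in itertools.product([-1, 1], repeat=n))

def herdisc(A):
    """herdisc(A) = max over column subsets S of disc(A|_S) (brute force)."""
    n = A.shape[1]
    return max(disc(A[:, list(S)])
               for r in range(1, n + 1)
               for S in itertools.combinations(range(n), r))

# Intervals ("prefix" sets {1}, {1,2}, ..., {1,...,n}) have hereditary
# discrepancy 1: alternating signs balance every prefix.
A = np.tril(np.ones((4, 4)))
assert disc(A) == 1 and herdisc(A) == 1
```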
Some Applications
- Rounding [Lovász, Spencer, and Vesztergombi, 1986]: for any y ∈ [−1, 1]ⁿ, there exists x ∈ {±1}ⁿ such that ‖Ax − Ay‖∞ ≤ 2·herdisc(A). Efficient if low-discrepancy colorings can be computed efficiently; used e.g. in [Rothvoß, 2013].
- Sparsification: constructing ε-approximations and ε-nets.
- Private data analysis [Nikolov, Talwar, and Zhang, 2013]: lower bounds on the error necessary to prevent a privacy breach.
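The rounding guarantee can be sanity-checked by direct search on toy instances (this is a brute-force check, not the efficient LSV rounding procedure):

```python
import itertools
import numpy as np

def disc(A):
    n = A.shape[1]
    return min(np.max(np.abs(A @ np.array(x)))
               for x in itertools.product([-1, 1], repeat=n))

def herdisc(A):
    n = A.shape[1]
    return max(disc(A[:, list(S)])
               for r in range(1, n + 1)
               for S in itertools.combinations(range(n), r))

def best_rounding_error(A, y):
    """min over x in {-1,1}^n of ||Ax - Ay||_inf."""
    n = A.shape[1]
    return min(np.max(np.abs(A @ (np.array(x) - y)))
               for x in itertools.product([-1, 1], repeat=n))

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(4, 5)).astype(float)
y = rng.uniform(-1, 1, size=5)
# Lovász–Spencer–Vesztergombi: some x is within 2·herdisc(A) of y under A.
assert best_rounding_error(A, y) <= 2 * herdisc(A) + 1e-9
```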
Classical Results
- [Spencer, 1985] When A ∈ [−1, 1]^{m×n}: herdisc(A) = O(√(n log(m/n))).
- [Beck and Fiala, 1981] When A = (aᵢ)ᵢ₌₁ⁿ and ‖aᵢ‖₁ ≤ 1 for all i: herdisc(A) ≤ 2.
- [Banaszczyk, 1998] When A = (aᵢ)ᵢ₌₁ⁿ and ‖aᵢ‖₂ ≤ 1 for all i: herdisc(A) ≤ O(√(log m)).
- Komlós conjecture: under the same ℓ₂ assumption, herdisc(A) ≤ O(1).
Hardness
- [Charikar, Newman, and Nikolov, 2011] NP-hard to distinguish between disc(A) = 0 and disc(A) = Ω(√n) for an O(n) × n matrix A.
- [Austrin, Guruswami, and Håstad, 2013] NP-hard to approximate herdisc to within a factor of 2.
- Is there super-constant hardness?
- The problem “herdisc(A) ≤ t?” is in Π₂ᴾ. Is it in NP? Is it Π₂ᴾ-hard?
Approximating Discrepancy
- [Bansal, 2010] If herdisc(A) ≤ D, one can efficiently find an x such that ‖Ax‖∞ ≤ O(D log m). But it is possible that ‖Ax‖∞ ≫ D.
- [Lovász, Spencer, and Vesztergombi, 1986; Matoušek, 2013] A determinant lower bound for herdisc(A) is tight within a factor of O(log^{3/2} m). But it is not efficiently computable!
- [Nikolov, Talwar, and Zhang, 2013] An O(log³ m)-approximation to herdisc(A), by relating it to the noise complexity of an efficient differentially private algorithm.
- This work: an O(log^{3/2} m)-approximation to herdisc(A), with a simpler, more direct proof.
Our Result

Theorem. There exists an efficiently computable function f such that
\[
\frac{c}{\log m}\, f(A) \;\le\; \operatorname{herdisc}(A) \;\le\; C\sqrt{\log m}\, f(A),
\]
for absolute constants c, C.

- herdisc(A) is a max over 2ⁿ subsets of a min over 2ⁿ colorings: there is no easy-to-certify upper or lower bound.
- We prove that a simple geometric certificate gives both upper and lower bounds.
- First (approximate) formulation of herdisc as a convex program.
Ellipsoids
The Min-Width Ellipsoid
- (Centrally symmetric) ellipsoid: E = F·B₂ᵐ, where B₂ᵐ is the unit Euclidean ball.
- Hypercube: B∞ᵐ = [−1, 1]ᵐ.

Convex program (MWE): let A = (a₁, …, aₙ), aᵢ ∈ ℝᵐ.
\[
f(A) = \min_{E,\, w} w \quad \text{subject to} \quad \{a_1, \ldots, a_n\} \subseteq E \subseteq w B_\infty
\]
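A candidate solution (F, w) to (MWE) can be checked with elementary linear algebra: aⱼ ∈ F·B₂ iff ‖F⁻¹aⱼ‖₂ ≤ 1, and F·B₂ ⊆ wB∞ iff every row of F has ℓ₂ norm at most w (the width of F·B₂ along eᵢ is ‖Fᵀeᵢ‖₂). A NumPy sketch (`is_feasible` is an illustrative helper, not from the paper):

```python
import numpy as np

def is_feasible(F, w, A, tol=1e-9):
    """Check {a_1,...,a_n} ⊆ E ⊆ w·B_inf for the ellipsoid E = F·B_2."""
    Finv = np.linalg.inv(F)
    # Each column a_j lies in E iff ||F^{-1} a_j||_2 <= 1.
    cols_inside = np.all(np.linalg.norm(Finv @ A, axis=0) <= 1 + tol)
    # E fits in w·B_inf iff every row of F has 2-norm <= w.
    width_ok = np.all(np.linalg.norm(F, axis=1) <= w + tol)
    return bool(cols_inside and width_ok)

# The unit ball with w = 1 is feasible whenever all columns have 2-norm <= 1.
A = np.array([[0.6, 0.0],
              [0.0, 0.8]])
assert is_feasible(np.eye(2), 1.0, A)
assert not is_feasible(np.eye(2), 0.5, A)  # the ball does not fit in 0.5·B_inf
```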
The Min-Width Ellipsoid
Minimize the width w over all E and w such that {a₁, …, aₙ} ⊆ E ⊆ wB∞.

[Figure: points a₁, …, a₅ inside an ellipsoid E = F·B₂, which is in turn inside the cube wB∞ of side length 2w.]
Proof Strategy
- Upper bound: herdisc(A) ≤ C√(log m)·f(A), via Banaszczyk’s discrepancy theorem.
- Lower bound: (c / log m)·f(A) ≤ herdisc(A). Extract a lower bound on herdisc(A) from any solution to a convex dual of the (MWE) program; the bound follows from strong duality.
Upper Bound
Banaszczyk’s Theorem
Theorem ([Banaszczyk, 1998]). Let A = (a₁, …, aₙ), where ‖aᵢ‖₂ ≤ 1 for all i. Let K ⊆ ℝᵐ be a convex body such that
\[
\Pr[g \in K] \ge \tfrac{1}{2}
\]
for g ∼ N(0, 1)ᵐ a standard Gaussian. Then there exists x ∈ {−1, 1}ⁿ such that Ax ∈ 10K.
Applying the Theorem
Take some E = F·B₂ and w such that {a₁, …, aₙ} ⊆ E ⊆ wB∞.

[Figure: the columns a₁, …, a₅ inside E = F·B₂ ⊆ wB∞.]
Applying the Theorem
Apply F⁻¹: then {F⁻¹a₁, …, F⁻¹aₙ} ⊆ B₂ ⊆ K, where K = w·F⁻¹B∞.

[Figure: the transformed points F⁻¹aᵢ inside the unit ball B₂, which sits inside the parallelepiped K = w·F⁻¹B∞.]
Applying the Theorem
Let K = w·F⁻¹B∞.
- Every facet of K is at distance at least 1 from the origin, because B₂ ⊆ K.
- Chernoff bound + union bound: Pr[g ∈ C√(log m)·K] ≥ 1/2.
- By Banaszczyk’s theorem, there exists x ∈ {−1, 1}ⁿ such that
\[
F^{-1}Ax \in C\sqrt{\log m}\, K
\;\Leftrightarrow\;
Ax \in w\, C\sqrt{\log m}\, B_\infty
\;\Leftrightarrow\;
\|Ax\|_\infty \le w\, C\sqrt{\log m}.
\]
- Hence disc(A) ≤ w·C√(log m).
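The Chernoff + union bound step can be checked empirically: when every row of F has ℓ₂ norm at most w (i.e., E = F·B₂ ⊆ wB∞), each coordinate of Fg is Gaussian with standard deviation at most w, so ‖Fg‖∞ rarely exceeds C·w·√(log m). A Monte Carlo sketch with the illustrative choice C = 2:

```python
import numpy as np

rng = np.random.default_rng(1)
m, w, C = 50, 1.0, 2.0

# Rows scaled to 2-norm exactly w, so E = F·B_2 ⊆ w·B_inf holds tightly.
F = rng.standard_normal((m, m))
F *= w / np.linalg.norm(F, axis=1, keepdims=True)

# g ∈ C·sqrt(log m)·K with K = w·F^{-1}·B_inf  ⇔  ||F g||_inf ≤ C·w·sqrt(log m).
g = rng.standard_normal((m, 10_000))
frac = np.mean(np.max(np.abs(F @ g), axis=0) <= C * w * np.sqrt(np.log(m)))
assert frac >= 0.5  # the body C·sqrt(log m)·K has Gaussian measure ≥ 1/2
```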
The Bound is Hereditary
The bound immediately applies to A|_S: {aᵢ}_{i∈S} ⊆ {a₁, …, aₙ} ⊆ E ⊆ wB∞, i.e., E and w remain feasible for A|_S. Therefore
\[
\operatorname{herdisc}(A) \le w \cdot C\sqrt{\log m}.
\]
Lower Bound
Spectral Lower Bound
Smallest singular value: σ_min(A) = min_x ‖Ax‖₂ / ‖x‖₂.

Proposition. For any m × n matrix A and any diagonal P ≥ 0 with tr(P²) = 1,
\[
\operatorname{disc}(A)^2 \ge n\, \sigma_{\min}(PA)^2.
\]
- Comes from (the dual of) a convex relaxation of disc(A).
- Implies, for any S ⊆ [n]:
\[
\operatorname{herdisc}(A)^2 \ge |S|\, \sigma_{\min}(P A|_S)^2.
\]
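The proposition is easy to test numerically against the brute-force disc from the definition (toy sizes only):

```python
import itertools
import numpy as np

def disc(A):
    """disc(A) = min over x in {-1,1}^n of ||Ax||_inf (brute force)."""
    n = A.shape[1]
    return min(np.max(np.abs(A @ np.array(x)))
               for x in itertools.product([-1, 1], repeat=n))

def spectral_lb(A, p):
    """n · σ_min(PA)^2 for P = diag(p), p ≥ 0 normalized so tr(P^2) = 1."""
    P = np.diag(p / np.linalg.norm(p))
    return A.shape[1] * np.linalg.svd(P @ A, compute_uv=False).min() ** 2

rng = np.random.default_rng(2)
A = rng.integers(0, 2, size=(5, 5)).astype(float)
p = rng.uniform(size=5)
assert disc(A) ** 2 >= spectral_lb(A, p) - 1e-9
```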
Proof.
\[
\begin{aligned}
\operatorname{disc}(A)^2
&= \min_{x \in \{-1,1\}^n} \max_{i} \Big( \sum_{j=1}^n A_{ij} x_j \Big)^2 \\
&\ge \min_{x \in \{-1,1\}^n} \sum_{i=1}^m P_{ii}^2 \Big( \sum_{j=1}^n A_{ij} x_j \Big)^2 \quad \text{(averaging)} \\
&= \min_{x \in \{-1,1\}^n} \|PAx\|_2^2 \\
&\ge n\, \sigma_{\min}(PA)^2 \quad (x \in \{-1,1\}^n \Rightarrow \|x\|_2 = n^{1/2})
\end{aligned}
\]
Dual of (MWE)
Primal:
\[
f(A) = \min w \quad \text{subject to} \quad \{a_1, \ldots, a_n\} \subseteq E \subseteq w B_\infty
\]
Nuclear norm: ‖M‖_{S₁} is the sum of the singular values of M.

Dual:
\[
f(A) = \max \|PAQ\|_{S_1} \quad \text{subject to} \quad P, Q \ge 0 \text{ diagonal},\; \operatorname{tr}(P^2) = \operatorname{tr}(Q^2) = 1
\]
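Any feasible dual pair (P, Q) certifies a lower bound on f(A) by weak duality, and the dual objective is just a nuclear norm, computable from the SVD. A small sketch (the uniform weights below are an illustrative, not optimal, choice):

```python
import numpy as np

def dual_value(A, p, q):
    """||PAQ||_{S1} for P = diag(p), Q = diag(q), with p, q ≥ 0
    normalized so that tr(P^2) = tr(Q^2) = 1."""
    p = p / np.linalg.norm(p)
    q = q / np.linalg.norm(q)
    M = np.diag(p) @ A @ np.diag(q)
    return np.linalg.svd(M, compute_uv=False).sum()

# For A = I with uniform weights, PAQ = I/n, whose nuclear norm is 1;
# so f(I) ≥ 1, matching the primal solution E = B_2, w = 1.
val = dual_value(np.eye(3), np.ones(3), np.ones(3))
assert abs(val - 1.0) < 1e-9
```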
Spectral LB from the Dual

Lemma. For any feasible P and Q, there exists a set S ⊆ [n] such that
\[
|S|\, \sigma_{\min}(P A|_S)^2 \ge \frac{c^2}{(\log m)^2} \|PAQ\|_{S_1}^2.
\]
The set S is efficiently computable.

Spectral lower bound ⇒ herdisc(A) ≥ (c / log m)·f(A).
Restricted Invertibility Principle

Theorem ([Bourgain and Tzafriri, 1987; Spielman and Srivastava, 2010]). Assume that any two nonzero singular values σᵢ, σⱼ of the m × k matrix M satisfy 1/2 ≤ σᵢ/σⱼ ≤ 2. Then there exists a subset S ⊆ [k] such that
\[
|S|\, \sigma_{\min}(M|_S)^2 \ge \frac{1}{64k} \|M\|_{S_1}^2.
\]

Simple transformations of PAQ yield a matrix M such that:
- M satisfies the assumption of the restricted invertibility principle;
- ‖M‖_{S₁} ≥ (√k / log m)·‖PAQ‖_{S₁}, i.e., M captures a large fraction of the dual value;
- all columns of M are projections of columns of PA, so spectral lower bounds for M lower bound herdisc(A).
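The restricted invertibility guarantee can be checked by exhaustive search on small matrices whose singular values are forced into a factor-2 range (`best_subset_value` is an illustrative helper, not from the paper):

```python
import itertools
import numpy as np

def best_subset_value(M):
    """max over nonempty column subsets S of |S| · σ_min(M|_S)^2 (brute force)."""
    k = M.shape[1]
    return max(r * np.linalg.svd(M[:, list(S)], compute_uv=False).min() ** 2
               for r in range(1, k + 1)
               for S in itertools.combinations(range(k), r))

rng = np.random.default_rng(3)
m, k = 6, 4
# Build M = U diag(s) V^T with singular values in [1, 2]: any two are
# within a factor of 2, so the theorem's assumption holds.
U, _ = np.linalg.qr(rng.standard_normal((m, k)))
V, _ = np.linalg.qr(rng.standard_normal((k, k)))
s = rng.uniform(1.0, 2.0, size=k)
M = U @ np.diag(s) @ V.T
nuclear = s.sum()
assert best_subset_value(M) >= nuclear ** 2 / (64 * k)
```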
Conclusion
Conclusion
This work:
- An O(log^{3/2} m)-approximation for hereditary discrepancy.
- A direct proof using geometric techniques.
- An approximate characterization of hereditary discrepancy as a convex program: the tools of convex analysis can now be used to understand herdisc.

Open problems:
- (2 + ε)-hardness of approximating hereditary discrepancy.
- How far can f(A) be from herdisc(A)?
- A constructive proof of Banaszczyk’s theorem.
- Improving the approximation ratio.
Thank you!
References
- Per Austrin, Venkatesan Guruswami, and Johan Håstad. (2 + ε)-SAT is NP-hard. ECCC TR13-159, 2013.
- Wojciech Banaszczyk. Balancing vectors and Gaussian measures of n-dimensional convex bodies. Random Structures & Algorithms, 12(4):351–360, 1998.
- N. Bansal. Constructive algorithms for discrepancy minimization. In FOCS 2010, pages 3–10. IEEE, 2010.
- József Beck and Tibor Fiala. Integer-making theorems. Discrete Applied Mathematics, 3(1):1–8, 1981.
- J. Bourgain and L. Tzafriri. Invertibility of large submatrices with applications to the geometry of Banach spaces and harmonic analysis. Israel Journal of Mathematics, 57(2):137–224, 1987.
- M. Charikar, A. Newman, and A. Nikolov. Tight hardness results for minimizing discrepancy. In SODA 2011, pages 1607–1614. SIAM, 2011.
- L. Lovász, J. Spencer, and K. Vesztergombi. Discrepancy of set-systems and matrices. European Journal of Combinatorics, 7(2):151–160, 1986.
- Jiří Matoušek. The determinant bound for discrepancy is almost tight. Proceedings of the American Mathematical Society, 141(2):451–460, 2013.
- Aleksandar Nikolov, Kunal Talwar, and Li Zhang. The geometry of differential privacy: the sparse and approximate cases. In STOC 2013, pages 351–360. ACM, 2013. doi: 10.1145/2488608.2488652.
- Thomas Rothvoß. Approximating bin packing within O(log OPT · log log OPT) bins. In FOCS 2013.
- Joel Spencer. Six standard deviations suffice. Transactions of the American Mathematical Society, 289(2):679–706, 1985.
- D. A. Spielman and N. Srivastava. An elementary proof of the restricted invertibility theorem. Israel Journal of Mathematics, pages 1–9, 2010.