Optimal Private Halfspace Counting via Discrepancy
S. Muthukrishnan, Aleksandar Nikolov
Rutgers U.
Muthu, A. Nikolov (Rutgers)
Private Range Counting
1 / 21
Range Counting
Private Range Counting
Public input: A ground set $P \subseteq \mathbb{R}^d$; a range space, i.e. a collection of sets $\mathcal{R} \subseteq 2^P$ induced by some natural geometric sets.
Private input: An integer weight $x_p$ for each $p \in P$.
Goal: For all ranges $R \in \mathcal{R}$, approximate privately $R(x) = \sum_{p \in R} x_p$.
Accuracy: The mean squared error of an algorithm $M$ is $\frac{1}{|\mathcal{R}|} \sum_{R \in \mathcal{R}} (R(x) - M(R, x))^2$.
Halfspace Counting
Each $R \in \mathcal{R}$ is the set of points of $P$ contained in some halfspace in $\mathbb{R}^d$.
Query: what is the total weight of all points of $P$ in halfspace $R$?
Fundamental in Computational Geometry. Other range queries can be expressed as halfspace queries by "lifting" them to a higher dimension.
[Figure: eight weighted points $x_1, \dots, x_8$ and a halfspace $R$; in the example shown, $R(x) = x_2 + x_3 + x_5 + x_7$.]
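The halfspace query can be made concrete with a small sketch. The points, weights, and halfspaces below are made-up illustrations, and the helper names (`in_halfspace`, `query_matrix`) are hypothetical; a halfspace is assumed to be given as a pair (normal, offset) describing the closed set where the inner product with the normal is at most the offset.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Halfspace = Tuple[Sequence[float], float]  # (normal, offset), an assumed encoding


def in_halfspace(p: Point, normal: Sequence[float], offset: float) -> bool:
    """True if <normal, p> <= offset, i.e. p lies in the closed halfspace."""
    return sum(a * c for a, c in zip(normal, p)) <= offset


def query_matrix(points: List[Point], halfspaces: List[Halfspace]) -> List[List[int]]:
    """One 0-1 row per halfspace: entry j is 1 iff point j lies in that halfspace."""
    return [[1 if in_halfspace(p, n, b) else 0 for p in points]
            for (n, b) in halfspaces]


def halfspace_count(points: List[Point], weights: List[int],
                    normal: Sequence[float], offset: float) -> int:
    """R(x): total weight of the points contained in the halfspace."""
    return sum(w for p, w in zip(points, weights)
               if in_halfspace(p, normal, offset))


# Illustrative data (not from the slides).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
weights = [3, 1, 4, 2]
halfspaces = [((1.0, 1.0), 1.0),   # x + y <= 1
              ((1.0, 0.0), 0.5)]   # x <= 0.5

A = query_matrix(points, halfspaces)
counts = [sum(a * x for a, x in zip(row, weights)) for row in A]
print(A)       # [[1, 1, 1, 0], [1, 0, 1, 0]]
print(counts)  # [8, 7]
```

The rows of `A` are exactly the range indicators, so all exact counts are the matrix-vector product of `A` with the weight vector.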
Private Linear Queries
More general algebraic problem:
Public input: A query matrix $A \in \mathbb{R}^{m \times n}$.
In range counting: each row of $A$ is the indicator vector of a range.
Private input: A vector $x \in \mathbb{Z}^n$.
In range counting: the private point weights.
Goal: An algorithm $M$ that approximates $Ax$ and satisfies a privacy guarantee ($(\varepsilon, \delta)$-differential privacy).
Accuracy: The mean squared error is $\frac{1}{m} \|Ax - M(A, x)\|_2^2$.
Differential Privacy
Definition: An algorithm $M$ with input domain $\mathbb{Z}^n$ and output range $Y$ is $(\varepsilon, \delta)$-differentially private if for every $n$, every $x, x'$ with $\|x - x'\|_1 \le 1$, and every measurable $S \subseteq Y$, $M$ satisfies
$\Pr[M(x) \in S] \le e^{\varepsilon} \Pr[M(x') \in S] + \delta$.
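As a sanity check of the definition, the sketch below looks at the simplest case: a single count released with Laplace noise of scale $1/\varepsilon$. This is a standard mechanism chosen here for illustration, not one singled out on this slide. For neighboring inputs the ratio of output densities is at most $e^{\varepsilon}$ at every point, which gives the inequality with $\delta = 0$ after integrating over $S$. The grid and the input values are arbitrary assumptions.

```python
import math


def laplace_pdf(y: float, mean: float, scale: float) -> float:
    """Density of the Laplace distribution with the given mean and scale."""
    return math.exp(-abs(y - mean) / scale) / (2.0 * scale)


def max_density_ratio(x: float, x_neighbor: float, epsilon: float, grid) -> float:
    """Largest ratio f(y | x) / f(y | x_neighbor) over the grid of outputs y."""
    scale = 1.0 / epsilon
    return max(laplace_pdf(y, x, scale) / laplace_pdf(y, x_neighbor, scale)
               for y in grid)


epsilon = 0.5
grid = [k / 10.0 for k in range(-200, 201)]  # outputs y in [-20, 20]
# Neighboring one-dimensional inputs: |10 - 11| <= 1.
worst = max_density_ratio(10, 11, epsilon, grid)
print(worst, math.exp(epsilon))  # the two values agree up to rounding
```

The check only samples a grid of outputs, so it illustrates the bound rather than proving it.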
What is Known about Halfspace Counting?
Lower bounds: $\Omega(n)$ squared error is necessary for arbitrary 0-1 matrices $A$ when $m > n$ [DN03]. This does not apply to halfspace counting! No superconstant lower bound is known.
Upper bounds: Randomized response gives $O(n \log m)$. For halfspaces $m = O(n^d)$, so $\log m = O(d \log n)$ and therefore $O(nd \log n)$ error is sufficient.
Our Results
Lower bounds: Private halfspace counting in $\mathbb{R}^d$ requires $\Omega(n^{1-1/d})$ mean squared error.
More generally: linear queries $A$ require noise lower bounded by the (hereditary) combinatorial discrepancy of $A$ (up to a log factor).
Upper bounds: Halfspace counting can be approximated privately with $O(n^{1-1/d})$ mean squared error.
More generally: range counting for range spaces with shatter function exponent $d$ can be approximated with the same error.
Bounds also extend to worst-case error (up to polylog factors).
Both results use discrepancy theory.
Lower Bounds
Lower bound: Dinur-Nissim attack
Assume: There exists $M$ such that for any $x$, w.h.p. $\|Ax - M(A, x)\|_2 \le E$.
Adversary's goal: Given the output of $M(A, x)$, compute $x'$ with $\|x - x'\|_1 \ll n$. So $M$ is not private.
Procedure: Output any $x'$ s.t. $\|Ax' - M(A, x)\|_2 \lesssim E$ (succeeds w.p. $1 - \beta$).
We have $\|Ax - Ax'\|_2 \le \|Ax' - M(A, x)\|_2 + \|Ax - M(A, x)\|_2 \lesssim E$.
Needed: $E$ such that $\|Ax' - Ax\|_2 \lesssim E \Rightarrow \|x - x'\|_1 \ll n$.
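The procedure can be sketched as a brute-force search over candidate databases. This is an illustration only: it enumerates all of $\{0,1\}^n$, so it is exponential in $n$, and the query matrix (all subset queries on 4 points), the perturbation, and the threshold are made-up assumptions rather than values from the slides.

```python
import math
from itertools import product


def matvec(A, x):
    """Exact answers Ax for a list-of-rows matrix A."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def l2(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in v))


def reconstruct(A, answers, error_bound):
    """Return the first x' in {0,1}^n with ||A x' - answers||_2 <= error_bound.

    Returns None if no candidate is consistent with the released answers.
    """
    n = len(A[0])
    for cand in product((0, 1), repeat=n):
        residual = [u - v for u, v in zip(matvec(A, cand), answers)]
        if l2(residual) <= error_bound:
            return list(cand)
    return None


# Illustrative instance: every subset of 4 points is a query (16 rows).
A = [list(bits) for bits in product((0, 1), repeat=4)]
x = [1, 0, 1, 1]
# Stand-in for M(A, x): exact answers plus a fixed perturbation of norm 0.8.
answers = [v + (0.2 if k % 2 == 0 else -0.2) for k, v in enumerate(matvec(A, x))]

recovered = reconstruct(A, answers, error_bound=1.0)
print(recovered)  # [1, 0, 1, 1]
```

In this instance any two distinct databases have answer vectors at Euclidean distance at least $\sqrt{8}$, so a candidate within distance 1 of the released answers must be $x$ itself.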
Discrepancy connection
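Discrepancy: The adversary can succeed when $E \lesssim \mathrm{disc}_\alpha(A)$, where
$\mathrm{disc}_\alpha(A) = \min_{b \in \{0, \pm 1\}^n,\ \|b\|_1 \ge \alpha n} \|Ab\|_2$.
When $\alpha = 0$, this is trivially 0.
When $\alpha = 1$, this is the classical combinatorial $\ell_2$ discrepancy.
Can we connect $\mathrm{disc}_\alpha$ to $\mathrm{disc}_1$ when $\alpha \in (0, 1)$?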
A More Robust Lower Bound
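$\mathrm{herdisc}_\alpha(A) = \max_{S \subseteq [n]} \mathrm{disc}_\alpha(A|_S)$, where $A|_S$ is $A$ restricted to the columns in $S$.
Weaker success condition for the adversary: choose a subset $S$ of $[n]$ (based only on $A$) and then guess most of $x$ restricted to $S$.
This still implies a contradiction with $(\varepsilon, \delta)$-differential privacy.
The adversary can succeed when $E \lesssim \mathrm{herdisc}_\alpha(A)$.
$\mathrm{herdisc}_\alpha(A) \ge \mathrm{herdisc}_1(A) / O(\log n)$ (for constant $\alpha$)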
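For very small matrices both quantities can be evaluated by exhaustive search, which may help in reading the definitions. The sketch below is exponential in $n$ and is meant only as an illustration; the example matrix is made up, and skipping the empty column set in the hereditary maximum is an assumed convention.

```python
import math
from itertools import combinations, product


def disc_alpha(A, alpha):
    """min ||Ab||_2 over b in {0, +1, -1}^n with ||b||_1 >= alpha * n."""
    n = len(A[0])
    best = math.inf
    for b in product((-1, 0, 1), repeat=n):
        if sum(abs(t) for t in b) >= alpha * n:
            norm = math.sqrt(sum(sum(a * t for a, t in zip(row, b)) ** 2
                                 for row in A))
            best = min(best, norm)
    return best


def restrict(A, cols):
    """A|_S: keep only the columns whose indices are in cols."""
    return [[row[j] for j in cols] for row in A]


def herdisc_alpha(A, alpha):
    """max of disc_alpha(A|_S) over nonempty column subsets S."""
    n = len(A[0])
    best = 0.0
    for k in range(1, n + 1):
        for cols in combinations(range(n), k):
            best = max(best, disc_alpha(restrict(A, cols), alpha))
    return best


# Illustrative 0-1 query matrix (not from the slides).
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 1]]

print(disc_alpha(A, 0.0))     # 0.0, attained by b = 0
print(disc_alpha(A, 1.0))     # 1.0, attained by b = (1, -1, 1)
print(herdisc_alpha(A, 1.0))  # sqrt(3), attained on the middle column alone
```

Here the hereditary discrepancy is larger than the discrepancy of the full matrix, which is the slack the more robust lower bound exploits.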
Putting it together
Theorem (Main Lower Bound): No algorithm $M$ that satisfies
$\forall x \in \{0, 1\}^n: \Pr[\|Ax - M(A, x)\|_2 = o(\mathrm{herdisc}_1(A) / \log n)] \ge 1 - \beta$
is $(\varepsilon, \delta)$-differentially private for $\varepsilon = O(1)$ and constants $\delta < 1$ and $\beta < 1$.
Halfspace counting: The mean squared error for private halfspace queries is $\Omega(n^{1-1/d} / \log n)$.
Using the hereditary structure of halfspace range spaces, we can show that the mean squared error is $\Omega(n^{1-1/d})$.
Upper Bounds
Two Tools: Input and Output Perturbation
Input perturbation: Compute $\tilde{x} = x + \mathrm{Lap}(1/\varepsilon)^n$ and output $A\tilde{x}$.
Output perturbation: Output $Ax + \mathrm{Lap}(1/\varepsilon')^m$ for $\varepsilon'$ chosen to satisfy $(\varepsilon, \delta)$-differential privacy.
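Both tools can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, and `epsilon_prime` is left as a caller-supplied parameter because the slides do not say how it is derived from $(\varepsilon, \delta)$. Laplace samples are drawn as the difference of two independent exponentials with the same mean.

```python
import random


def laplace(scale, rng):
    """Sample Lap(scale) as a difference of two i.i.d. exponentials with mean scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def matvec(A, x):
    """Exact answers Ax for a list-of-rows matrix A."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def input_perturbation(A, x, epsilon, rng):
    """Add Lap(1/epsilon) noise to every coordinate of x, then answer all queries."""
    noisy_x = [t + laplace(1.0 / epsilon, rng) for t in x]
    return matvec(A, noisy_x)


def output_perturbation(A, x, epsilon_prime, rng):
    """Add Lap(1/epsilon_prime) noise to every exact answer.

    Choosing epsilon_prime so that the whole release is (epsilon, delta)-private
    is left to the caller (an assumption of this sketch).
    """
    return [v + laplace(1.0 / epsilon_prime, rng) for v in matvec(A, x)]


def mean_squared_error(A, x, answers):
    """(1/m) * ||Ax - answers||_2^2."""
    exact = matvec(A, x)
    return sum((u - v) ** 2 for u, v in zip(exact, answers)) / len(A)


rng = random.Random(0)
A = [[1, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 1]]  # illustrative ranges
x = [3, 1, 4, 2]                                # illustrative weights
print(mean_squared_error(A, x, input_perturbation(A, x, 1.0, rng)))
print(mean_squared_error(A, x, output_perturbation(A, x, 0.5, rng)))
```

With input perturbation the noise in one answer is a sum of as many independent Laplace terms as the range has points, so its variance grows linearly with the size of the range.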
When Do the Tools Work?
For range counting:
input perturbation works well with small ranges (squared error linear in the size of the range);
output perturbation works well when each point belongs to few ranges (squared error linear in the maximum degree).
But for halfspaces most ranges are large and most points belong to many ranges.
Solution from discrepancy theory: halfspace ranges admit a nice decomposition [Mat95] (works for range spaces with VC dimension $d$ and shatter function exponent $d$).
Decomposition
Decompose $\mathcal{R}$ into a series of new range spaces $\{T_i\}_{i=1}^{\log n}$ such that approximating counts for each $T_i$ gives the counts for $\mathcal{R}$.
$\mathcal{R}$ is decomposed into:
$T_i$ with many small sets ($i$ large): can use input perturbation
$T_i$ with few large sets ($i$ small): can use output perturbation
Do we achieve the right balance? No!
Values of $i$ s.t. the noise variance is $O(n^{1-1/d})$:
Output perturbation: $i_1 = \frac{\log n}{d} - \frac{\log n}{d^2}$
Input perturbation: $i_0 = \frac{\log n}{d}$
How to Make It Work
For $i \in (i_1, i_0)$:
For any $T_i$, there are points $p$ that belong to a lot of sets and incur large privacy loss, i.e. we need noise with variance $\Omega(n)$ to preserve their privacy.
But we control both the maximum set size and the number of sets in $T_i$!
Idea: use average privacy loss (privacy loss averaged over all $p$).
The "average" $p$ requires only $O(n^{1-1/d})$ noise to preserve its privacy.
We find a set $X$ s.t.
the privacy of each $p \in X$ can be preserved by noise with variance $O(n^{1-1/d})$;
$|X| \ge |P|/2$.
A partial coloring style algorithm:
For $i \ge i_0$, use input perturbation to approximate counts for $T_i$ w.r.t. $X$.
For $i < i_0$, add Laplace noise with variance $O(n^{1-1/d} \, 2^{(i_0 - i)(1 - d)})$ to approximate counts for $T_i$ w.r.t. $X$.
This allows us to compute halfspace counts over $X$ with squared error $O(n^{1-1/d})$.
Recurse on $P \setminus X$ (still a halfspace range space).
Conclusion
This work:
Optimal upper and lower bounds for private halfspace counting.
Connection between discrepancy theory and noise lower bounds for differential privacy.
Other results: A lower bound of $\Omega((\log n)^{d-1})$ for orthogonal range counting. Tight up to the dependence on $d$.
Open question: Does discrepancy always characterize the error needed to preserve privacy of linear queries?
Thank you!
References
[BLR08] A. Blum, K. Ligett, and A. Roth. A learning theory approach to non-interactive database privacy. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, STOC '08, pages 609–618, New York, NY, USA, 2008. ACM.
[DN03] I. Dinur and K. Nissim. Revealing information while preserving privacy. In Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS '03, pages 202–210, New York, NY, USA, 2003. ACM.
[DRV10] C. Dwork, G. N. Rothblum, and S. Vadhan. Boosting and differential privacy. In Proc. 51st Annual IEEE Symp. Foundations of Computer Science (FOCS), pages 51–60, 2010.
[HR10] M. Hardt and G. N. Rothblum. A multiplicative weights mechanism for privacy-preserving data analysis. In Proc. 51st Annual IEEE Symp. Foundations of Computer Science (FOCS), pages 61–70, 2010.
[Mat95] J. Matoušek. Tight upper bounds for the discrepancy of half-spaces. Discrete and Computational Geometry, 13(1):593–601, 1995.