NeurIPS 2019 Workshop
Vancouver, December 14
Room: East MR 8 + 15
View on SlidesLive
The goal of our workshop is to bring together privacy experts working in academia and industry to discuss the present and the future of privacy-aware technologies powered by machine learning. The workshop will focus on the technical aspects of privacy research and deployment, with invited and contributed talks by distinguished researchers in the area. We encourage submissions exploring a broad range of research areas related to data privacy.
Submission deadline: September 9, 2019, 23:59 UTC
Notification of acceptance: October 1, 2019
NeurIPS early registration deadline: October 23, 2019
Workshop: December 14, 2019 (Saturday)
The submission deadline has now passed. If your submission was accepted for a poster presentation, please make your poster 36W x 48H inches (90 x 122 cm). Posters should be on lightweight paper and should not be laminated. As you design your poster, you may find the following resource helpful: Guidelines for Creating Accessible Printed Posters.
Federated Learning for Data Privacy and Confidentiality @ NeurIPS: December 13, 2019
8:10 | Opening
8:15 | Invited talk: Brendan McMahan — Privacy for Federated Learning, and Federated Learning for Privacy
9:05 | Gaussian Differential Privacy (contributed talk)
Jinshuo Dong, Aaron Roth and Weijie Su
Differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy in the past decade. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analyzing important primitives like privacy amplification by subsampling. Inspired by the hypothesis testing formulation of privacy, this paper proposes a new relaxation, which we term "f-differential privacy" (f-DP). This notion of privacy has a number of appealing properties and, in particular, avoids difficulties associated with divergence-based relaxations. First, f-DP preserves the hypothesis testing interpretation. In addition, f-DP allows for lossless reasoning about composition in an algebraic fashion. Moreover, we provide a powerful technique to import existing results proven for original DP to f-DP and, as an application, obtain a simple subsampling theorem for f-DP. In addition to the above findings, we introduce a canonical single-parameter family of privacy notions within the f-DP class that is referred to as "Gaussian differential privacy" (GDP), defined based on testing two shifted Gaussians. GDP is focal among the f-DP class because of a central limit theorem we prove. More precisely, the privacy guarantees of any hypothesis-testing-based definition of privacy (including original DP) converge to GDP in the limit under composition. The CLT also yields a computationally inexpensive tool for analyzing the exact composition of private algorithms. Taken together, this collection of attractive properties renders f-DP a mathematically coherent, analytically tractable, and versatile framework for private data analysis. Finally, we demonstrate the use of the tools we develop by giving an improved privacy analysis of noisy stochastic gradient descent.
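For readers new to the hypothesis-testing formulation, here is a minimal sketch of the notions the abstract refers to, written in standard notation rather than quoted from the paper (Φ denotes the standard normal CDF):

```latex
% Trade-off function of two distributions P and Q: the smallest type II error
% achievable by any test \phi whose type I error is at most \alpha.
\[ T(P,Q)(\alpha) \;=\; \inf_{\phi}\bigl\{\, 1 - \mathbb{E}_{Q}[\phi] \;:\; \mathbb{E}_{P}[\phi] \le \alpha \,\bigr\} \]

% A mechanism M is f-DP if, for every pair of neighboring datasets S and S',
\[ T\bigl(M(S),\, M(S')\bigr) \;\ge\; f . \]

% Gaussian differential privacy: \mu-GDP is f-DP with f = G_\mu, the trade-off
% function of testing N(0,1) against N(\mu,1):
\[ G_{\mu}(\alpha) \;=\; \Phi\bigl(\Phi^{-1}(1-\alpha) - \mu\bigr) . \]
```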
9:25 | QUOTIENT: Two-Party Secure Neural Network Training & Prediction (contributed talk)
Nitin Agrawal, Ali Shahin Shamsabadi, Matthew Kusner and Adria Gascon
Recently, there has been a wealth of effort devoted to the design of secure protocols for machine learning tasks. Much of this is aimed at enabling secure prediction from highly accurate Deep Neural Networks (DNNs). However, as DNNs are trained on data, a key question is how such models can also be trained securely. The few prior works on secure DNN training have focused either on designing custom protocols for existing training algorithms or on developing tailored training algorithms and then applying generic secure protocols. In this work, we investigate the advantages of designing training algorithms alongside a novel secure protocol, incorporating optimizations on both fronts. We present QUOTIENT, a new method for discretized training of DNNs, along with a customized secure two-party protocol for it. QUOTIENT incorporates key components of state-of-the-art DNN training such as layer normalization and adaptive gradient methods, and improves upon the state of the art in DNN training in two-party computation. Compared to prior work, we obtain an improvement of 50X in WAN time and 6% in absolute accuracy.
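The protocol itself is specified in the paper; as general background, the sketch below shows additive secret sharing over a 64-bit ring, one of the standard building blocks that secure two-party training and prediction protocols are typically assembled from. This is not the QUOTIENT protocol, and the modulus and helper names are illustrative.

```python
import secrets

MOD = 2 ** 64  # illustrative choice of ring Z_{2^64}

def share(x: int) -> tuple[int, int]:
    """Split x into two additive shares; each share alone is uniformly random."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD

def reconstruct(s0: int, s1: int) -> int:
    """Recombine the shares held by the two parties."""
    return (s0 + s1) % MOD

def add_shares(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Addition of shared values is local: each party adds its own shares."""
    return (a[0] + b[0]) % MOD, (a[1] + b[1]) % MOD

# Toy usage: share two secrets, add them under sharing, then open the result.
x, y = 42, 1000
assert reconstruct(*add_shares(share(x), share(y))) == x + y
```

Multiplication of shared values (and hence the linear algebra inside DNN training and prediction) additionally requires interaction, for example via precomputed multiplication triples or oblivious transfer, which is where most of the protocol engineering effort typically goes.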
9:45 | Coffee break
10:30 | Invited talk: Ashwin Machanavajjhala — Fair Decision Making using Privacy-Protected Data
Data collected about individuals is regularly used to make decisions that impact those same individuals. We consider settings where sensitive personal data is used to decide who will receive resources or benefits. While it is well known that there is a tradeoff between protecting privacy and the accuracy of decisions, in this talk, I will describe our recent work on a first-of-its-kind empirical study into the impact of formally private mechanisms (based on differential privacy) on fair and equitable decision-making.
11:20 | Spotlight talks
11:30 | Poster session |
12:30 | Lunch break |
14:00 | Invited talk: Lalitha Sankar — Fair Universal Representations via Generative Models and Model Auditing Guarantees
There is a growing demand for ML methods that limit inappropriate use of protected information to avoid both disparate treatment and disparate impact. In this talk, we present Generative Adversarial rePresentations (GAP) as a data-driven framework that leverages recent advancements in adversarial learning to allow a data holder to learn universal representations that decouple a set of sensitive attributes from the rest of the dataset while still allowing multiple downstream tasks to be learned. We will briefly highlight the theoretical and practical results of GAP.
In the second half of the talk we will focus on model auditing. Privacy concerns have led to the development of privacy-preserving approaches for learning models from sensitive data. Yet, in practice, models (even those learned with privacy guarantees) can inadvertently memorize unique training examples or leak sensitive features. To identify such privacy violations, existing model auditing techniques use finite adversaries defined as machine learning models with (a) access to some finite side information (e.g., a small auditing dataset), and (b) finite capacity (e.g., a fixed neural network architecture). We present requirements under which an unsuccessful attempt to identify privacy violations by a finite adversary implies that no stronger adversary can succeed at such a task. We will do so via parameters that quantify the capabilities of the finite adversary, including the size of the neural network employed by such an adversary and the amount of side information it has access to, as well as the regularity of the (perhaps privacy-guaranteeing) audited model.
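As a rough, generic illustration of the encoder-versus-adversary training pattern that adversarial representation learning relies on (this is not the GAP objective or architecture from the talk; all dimensions, losses, and weights below are placeholders):

```python
import torch
from torch import nn

# Toy dimensions; every name and size here is illustrative.
d_x, d_z = 16, 4
encoder = nn.Sequential(nn.Linear(d_x, d_z), nn.ReLU())
task_head = nn.Linear(d_z, 1)   # downstream task: predict label y from z
adversary = nn.Linear(d_z, 1)   # adversary: predict sensitive attribute s from z

opt_enc = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 1.0  # weight on the adversarial (privacy) term

for _ in range(100):
    x = torch.randn(64, d_x)                  # stand-in for real data
    y = torch.randint(0, 2, (64, 1)).float()  # downstream label
    s = torch.randint(0, 2, (64, 1)).float()  # sensitive attribute

    # 1) Train the adversary to infer s from the current representation.
    z = encoder(x).detach()
    opt_adv.zero_grad()
    bce(adversary(z), s).backward()
    opt_adv.step()

    # 2) Train the encoder to support the task while making s hard to infer.
    z = encoder(x)
    opt_enc.zero_grad()
    loss = bce(task_head(z), y) - lam * bce(adversary(z), s)
    loss.backward()
    opt_enc.step()
```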
14:50 | Pan-Private Uniformity Testing (contributed talk)
Kareem Amin, Matthew Joseph and Jieming Mao
A centrally differentially private algorithm maps raw data to differentially private outputs. In contrast, a locally differentially private algorithm may only access data through public interaction with data holders, and this interaction must be a differentially private function of the data. We study the intermediate model of pan-privacy. Unlike a locally private algorithm, a pan-private algorithm receives data in the clear. Unlike a centrally private algorithm, the algorithm receives data one element at a time and must maintain a differentially private internal state while processing this stream. First, we show that pan-privacy against multiple intrusions on the internal state is equivalent to sequentially interactive local privacy. Next, we contextualize pan-privacy against a single intrusion by analyzing the sample complexity of uniformity testing over domain [k]. Focusing on the dependence on k, centrally private uniformity testing has sample complexity Θ(√k), while noninteractive locally private uniformity testing has sample complexity Θ(k). We show that the sample complexity of pan-private uniformity testing is Θ(k^(2/3)). By a new Ω(k) lower bound for the sequentially interactive setting, we also separate pan-private from sequentially interactive locally private and multi-intrusion pan-private uniformity testing.
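As a toy illustration of the pan-private streaming model described above: one common way to keep the internal state private is to noise it at initialization and again at release, so that neither a single intrusion nor the published output ever sees exact counts. This is not the paper's uniformity tester, and the noise scales are illustrative.

```python
import numpy as np

def pan_private_counts(stream, k, epsilon, seed=0):
    """Toy pan-private histogram over the domain {0, ..., k-1}.
    The internal state starts as Laplace noise, so it never holds exact
    counts, and fresh noise is added again before release."""
    rng = np.random.default_rng(seed)
    state = rng.laplace(scale=1.0 / epsilon, size=k)  # noisy from the start
    for x in stream:                                  # x in {0, ..., k-1}
        state[x] += 1.0
    return state + rng.laplace(scale=1.0 / epsilon, size=k)  # re-noise at release

released = pan_private_counts([0, 1, 1, 3, 2, 0], k=4, epsilon=1.0)
```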
15:10 | Private Stochastic Convex Optimization: Optimal Rates in Linear Time (contributed talk)
Vitaly Feldman, Tomer Koren and Kunal Talwar
We study differentially private (DP) algorithms for stochastic convex optimization: the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions. A recent work of Bassily et al. (2019) has established the optimal bound on the excess population loss achievable given n samples. Unfortunately, their algorithm achieving this bound is relatively inefficient: it requires O(min{n^(3/2), n^(5/2)/d}) gradient computations, where d is the dimension of the optimization problem. We describe two new techniques for deriving DP convex optimization algorithms, both achieving the optimal bound on excess loss and using O(min{n, n^2/d}) gradient computations. In particular, the algorithms match the running time of the optimal non-private algorithms. The first approach relies on the use of variable batch sizes and is analyzed using the privacy amplification by iteration technique of Feldman et al. (2018). The second approach is based on a general reduction to the problem of localizing an approximately optimal solution with differential privacy. Such localization, in turn, can be achieved using existing (non-private) uniformly stable optimization algorithms. As in the earlier work, our algorithms require a mild smoothness assumption. We also give a linear-time algorithm achieving the optimal bound on the excess loss for the strongly convex case, as well as a faster algorithm for the non-smooth case.
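For context, here is a generic sketch of the kind of noisy stochastic gradient method such privacy analyses apply to: per-example gradients are clipped and Gaussian noise is added before each update. It is not the paper's variable-batch-size or localization-based algorithm, and the clipping and noise parameters are illustrative.

```python
import numpy as np

def noisy_sgd(grad_fn, w0, samples, lr=0.1, clip=1.0, sigma=1.0, seed=0):
    """Generic noisy SGD sketch: clip each per-example gradient to norm `clip`
    and add Gaussian noise of scale sigma * clip before stepping."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for x in samples:
        g = grad_fn(w, x)
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # clip gradient
        g = g + rng.normal(scale=sigma * clip, size=g.shape)  # privatize
        w = w - lr * g
    return w

# Toy usage: mean estimation with loss f(w; x) = 0.5 * ||w - x||^2, so grad = w - x.
data = np.random.default_rng(1).normal(loc=3.0, size=(200, 5))
w_hat = noisy_sgd(lambda w, x: w - x, w0=np.zeros(5), samples=data)
```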
15:30 | Coffee break
16:15 | Invited talk: Philip Leclerc — Formal Privacy At Scale: The 2020 Decennial Census TopDown Disclosure Limitation Algorithm
To control vulnerabilities to reconstruction-abetted re-identification attacks that leverage massive external data stores and cheap computation, the U.S. Census Bureau has elected to adopt a formally private approach to disclosure limitation in the 2020 Decennial Census of Population and Housing. To this end, a team of disclosure limitation specialists has worked over the past 3 years to design and implement the U.S. Census Bureau TopDown Disclosure Limitation Algorithm (TDA). This formally private algorithm generates Persons and Households micro-data, which will then be tabulated to produce the final set of demographic statistics published as a result of the 2020 Census enumeration. In this talk, I outline the main features of TDA, describe the current iteration of the process used to help policy makers decide how to set and allocate privacy-loss budget, and outline known issues with, and intended fixes for, the current implementation of TDA.
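As a very rough, purely illustrative sketch of the broader measure-then-postprocess pattern such disclosure limitation systems follow (add noise to counts, then post-process toward nonnegative, roughly consistent integers), not the TopDown Algorithm itself; the noise distribution and parameters are placeholders, and real systems solve constrained optimization problems over a geographic hierarchy instead.

```python
import numpy as np

def noisy_then_postprocess(counts, epsilon, seed=0):
    """Toy sketch: add Laplace noise to raw counts, then post-process the noisy
    measurements into nonnegative integers that roughly preserve the noisy total."""
    rng = np.random.default_rng(seed)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=len(counts))
    total = max(int(round(float(noisy.sum()))), 0)  # noisy total to preserve
    adjusted = np.clip(noisy, 0, None)              # enforce nonnegativity
    if adjusted.sum() > 0:
        adjusted *= total / adjusted.sum()          # rescale toward the noisy total
    return np.round(adjusted).astype(int)

published = noisy_then_postprocess(np.array([120, 45, 3, 0, 17]), epsilon=0.5)
```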
17:05 | Panel Discussion
17:55 | Closing
8:10 | 1st block [View recording]
Watch the opening remarks, Brendan McMahan's invited talk, and the first two contributed talks.
10:30 | 2nd block [View recording]
Watch Ashwin Machanavajjhala's invited talk.
14:00 | 3rd block [View recording]
Watch Lalitha Sankar's invited talk and two more contributed talks.
16:15 | 4th block [View recording]
Watch Philip Leclerc's invited talk and the panel discussion.
By taking a few simple steps, such as paying special attention to font sizes and captions, you can make your presentations and posters more accessible. Feel free to contact us about any accessibility concerns relating to the website, workshop, etc.