Privacy in Machine Learning

NeurIPS 2019 Workshop
Vancouver, December 14


Room: East MR 8 + 15

Scope

The goal of our workshop is to bring together privacy experts working in academia and industry to discuss the present and the future of privacy-aware technologies powered by machine learning. The workshop will focus on the technical aspects of privacy research and deployment with invited and contributed talks by distinguished researchers in the area. We encourage submissions exploring a broad range of research areas related to data privacy, including but not limited to:

  • Differential privacy: theory, applications, and implementations
  • Privacy-preserving machine learning
  • Trade-offs between privacy and utility
  • Programming languages for privacy-preserving data analysis
  • Statistical and information-theoretic notions of privacy
  • Empirical and theoretical comparisons between different notions of privacy
  • Privacy attacks
  • Policy-making aspects of data privacy
  • Secure multi-party computation techniques for machine learning
  • Learning on encrypted data, homomorphic encryption
  • Distributed privacy-preserving algorithms
  • Privacy in autonomous systems
  • Online social networks privacy
  • Interplay between privacy and adversarial robustness in machine learning
  • Relations between privacy, fairness and transparency

Call For Papers & Important Dates

Download Full CFP


Submission deadline: September 9, 2019, 23:59 UTC
Notification of acceptance: October 1, 2019
NeurIPS early registration deadline: October 23, 2019
Workshop: December 14, 2019 (Saturday)

Instructions

The submission deadline has now passed. If your submission was accepted for a poster presentation, please make your posters 36W x 48H inches (90 x 122 cm). Posters should be printed on lightweight paper and should not be laminated. As you design your poster, you may find the following resource helpful: Guidelines for Creating Accessible Printed Posters.




Related Workshops

Federated Learning for Data Privacy and Confidentiality @ NeurIPS: December 13, 2019

Invited Speakers

  • Philip Leclerc (US Census)
  • Ashwin Machanavajjhala (Duke University)
  • Brendan McMahan (Google)
  • Lalitha Sankar (Arizona State University)

Schedule

8:10 Opening
8:15 Invited talk: Brendan McMahan — Privacy for Federated Learning, and Federated Learning for Privacy   
9:05 Gaussian Differential Privacy (contributed talk)   
Jinshuo Dong, Aaron Roth and Weijie Su
Differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy in the past decade. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analyzing important primitives like privacy amplification by subsampling. Inspired by the hypothesis testing formulation of privacy, this paper proposes a new relaxation, which we term "f-differential privacy" (f-DP). This notion of privacy has a number of appealing properties and, in particular, avoids difficulties associated with divergence-based relaxations. First, f-DP preserves the hypothesis testing interpretation. In addition, f-DP allows for lossless reasoning about composition in an algebraic fashion. Moreover, we provide a powerful technique to import existing results proven for the original DP definition to f-DP and, as an application, obtain a simple subsampling theorem for f-DP. In addition to the above findings, we introduce a canonical single-parameter family of privacy notions within the f-DP class that is referred to as "Gaussian differential privacy" (GDP), defined by testing two shifted Gaussians. GDP is focal among the f-DP class because of a central limit theorem we prove. More precisely, the privacy guarantees of any hypothesis-testing-based definition of privacy (including the original DP) converge to GDP in the limit under composition. The CLT also yields a computationally inexpensive tool for analyzing the exact composition of private algorithms. Taken together, this collection of attractive properties renders f-DP a mathematically coherent, analytically tractable, and versatile framework for private data analysis. Finally, we demonstrate the use of the tools we develop by giving an improved privacy analysis of noisy stochastic gradient descent.
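For readers less familiar with the notation, here is a minimal numerical sketch (our own illustration, not the authors' code) of the two objects the abstract refers to: the Gaussian trade-off function G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu) that defines mu-GDP, and the rule that mu parameters compose in quadrature. The function names and the use of numpy/scipy are our own choices.

    # Illustrative sketch of mu-GDP quantities (not code from the paper).
    import numpy as np
    from scipy.stats import norm

    def gdp_tradeoff(alpha, mu):
        """Type II error of the optimal test between N(0,1) and N(mu,1) at level alpha."""
        return norm.cdf(norm.ppf(1.0 - alpha) - mu)

    def compose_gdp(mus):
        """Composition of mu_i-GDP mechanisms is GDP with mu = sqrt(sum of mu_i^2)."""
        return float(np.sqrt(np.sum(np.square(mus))))

    # Example: ten 0.3-GDP mechanisms compose to roughly 0.95-GDP.
    mu_total = compose_gdp([0.3] * 10)
    for alpha in (0.01, 0.05, 0.1):
        print(f"alpha={alpha:.2f}  type II error >= {gdp_tradeoff(alpha, mu_total):.3f}")
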
9:25 QUOTIENT: Two-Party Secure Neural Network Training & Prediction (contributed talk)   
Nitin Agrawal, Ali Shahin Shamsabadi, Matthew Kusner and Adria Gascon
Recently, there has been a wealth of effort devoted to the design of secure protocols for machine learning tasks. Much of this is aimed at enabling secure prediction from highly accurate Deep Neural Networks (DNNs). However, as DNNs are trained on data, a key question is how such models can also be trained securely. The few prior works on secure DNN training have focused either on designing custom protocols for existing training algorithms or on developing tailored training algorithms and then applying generic secure protocols. In this work, we investigate the advantages of designing training algorithms alongside a novel secure protocol, incorporating optimizations on both fronts. We present QUOTIENT, a new method for discretized training of DNNs, along with a customized secure two-party protocol for it. QUOTIENT incorporates key components of state-of-the-art DNN training such as layer normalization and adaptive gradient methods, and improves upon the state of the art in DNN training in two-party computation. Compared to prior work, we obtain an improvement of 50X in WAN time and 6% in absolute accuracy.
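As background on the term "discretized training", the toy sketch below shows the general idea of computing with coarsely quantized weights while keeping full-precision master weights for the update; it is our own illustration and does not reproduce QUOTIENT's quantization scheme or its secure two-party protocol.

    # Generic discretized-training illustration (not QUOTIENT itself): the
    # forward/backward pass uses a fixed-point copy of the weights, the kind of
    # representation that maps naturally onto secure integer arithmetic.
    import numpy as np

    rng = np.random.default_rng(0)

    def quantize(w, scale=8):
        """Round weights to a fixed-point grid with `scale` levels per unit."""
        return np.round(w * scale) / scale

    # Tiny linear model trained with quantized weights in the compute path.
    X = rng.normal(size=(256, 4))
    true_w = np.array([1.0, -2.0, 0.5, 0.0])
    y = X @ true_w + 0.1 * rng.normal(size=256)

    w = np.zeros(4)                              # full-precision master weights
    lr = 0.1
    for _ in range(200):
        w_q = quantize(w)                        # discretized copy used in compute
        grad = 2 * X.T @ (X @ w_q - y) / len(y)  # gradient evaluated at w_q
        w -= lr * grad                           # update the master weights
    print(np.round(quantize(w), 3))
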
9:45 Coffee break
10:30 Invited talk: Ashwin Machanavajjhala — Fair Decision Making using Privacy-Protected Data   
Data collected about individuals is regularly used to make decisions that impact those same individuals. We consider settings where sensitive personal data is used to decide who will receive resources or benefits. While it is well known that there is a tradeoff between protecting privacy and the accuracy of decisions, in this talk, I will describe our recent work on a first-of-its-kind empirical study into the impact of formally private mechanisms (based on differential privacy) on fair and equitable decision-making.
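To make the setting concrete, the toy snippet below (our own illustration, not material from the talk or the study it describes) allocates a benefit to groups whose differentially private counts exceed a threshold; this is exactly the kind of decision where added noise can flip outcomes for small groups.

    # Toy benefit-allocation decision made from Laplace-noised counts.
    import numpy as np

    rng = np.random.default_rng(0)
    true_counts = {"group_a": 1200, "group_b": 95, "group_c": 105}
    threshold = 100        # a group qualifies if its count is at least 100
    epsilon = 0.5          # privacy budget for the (sensitivity-1) counting query

    for group, count in true_counts.items():
        noisy = count + rng.laplace(scale=1.0 / epsilon)
        print(group, "qualifies (true):", count >= threshold,
              "| qualifies (noisy):", noisy >= threshold)
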
11:20 Spotlight talks   
  1. [Jonathan Lebensold, William Hamilton, Borja Balle and Doina Precup] Actor Critic with Differentially Private Critic (#08)
  2. [Andres Munoz, Umar Syed, Sergei Vassilvitskii and Ellen Vitercik] Private linear programming without constraint violations (#17)
  3. [Ios Kotsogiannis, Yuchao Tao, Xi He, Ashwin Machanavajjhala, Michael Hay and Gerome Miklau] PrivateSQL: A Differentially Private SQL Query Engine (#27)
  4. [Amrita Roy Chowdhury, Chenghong Wang, Xi He, Ashwin Machanavajjhala and Somesh Jha] Cryptϵ: Crypto-Assisted Differential Privacy on Untrusted Servers (#31)
  5. [Jiaming Xu and Dana Yang] Optimal Query Complexity of Private Sequential Learning (#32)
  6. [Hsiang Hsu, Shahab Asoodeh and Flavio Calmon] Discovering Information-Leaking Samples and Features (#43)
  7. [Martine De Cock, Rafael Dowsley, Anderson Nascimento, Davis Railsback, Jianwei Shen and Ariel Todoki] Fast Secure Logistic Regression for High Dimensional Gene Data (#44)
  8. [Giuseppe Vietri, Grace Tian, Mark Bun, Thomas Steinke and Steven Wu] New Oracle-Efficient Algorithms for Private Synthetic Data Release (#45)
11:30 Poster session
12:30 Lunch break
14:00 Invited talk: Lalitha Sankar — Fair Universal Representations via Generative Models and Model Auditing Guarantees   
There is a growing demand for ML methods that limit inappropriate use of protected information to avoid both disparate treatment and disparate impact. In this talk, we present Generative Adversarial rePresentations (GAP) as a data-driven framework that leverages recent advancements in adversarial learning to allow a data holder to learn universal representations that decouple a set of sensitive attributes from the rest of the dataset while still allowing multiple downstream tasks to be learned. We will briefly highlight the theoretical and practical results of GAP.

In the second half of the talk we will focus on model auditing. Privacy concerns have led to the development of privacy-preserving approaches for learning models from sensitive data. Yet, in practice, models (even those learned with privacy guarantees) can inadvertently memorize unique training examples or leak sensitive features. To identify such privacy violations, existing model auditing techniques use finite adversaries, defined as machine learning models with (a) access to some finite side information (e.g., a small auditing dataset) and (b) finite capacity (e.g., a fixed neural network architecture). We will present conditions under which an unsuccessful attempt to identify privacy violations by a finite adversary implies that no stronger adversary can succeed at such a task. We will do so via parameters that quantify the capabilities of the finite adversary, including the size of the neural network employed by such an adversary, the amount of side information it has access to, and the regularity of the (perhaps privacy-guaranteeing) audited model.
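The following schematic sketch (our own, written in PyTorch; it is not the GAP authors' code and omits their theoretical machinery) shows the basic adversarial-representation idea the first half of the talk builds on: an encoder is trained so that an adversary cannot recover a sensitive attribute from the representation, while a task head remains accurate.

    # Minimal adversarial-representation training loop on synthetic data.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    d_in, d_rep = 16, 8
    encoder   = nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU())
    task_head = nn.Linear(d_rep, 2)    # downstream utility task
    adversary = nn.Linear(d_rep, 2)    # tries to predict the sensitive attribute

    opt_enc = torch.optim.Adam(
        list(encoder.parameters()) + list(task_head.parameters()), lr=1e-3)
    opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()

    # Synthetic batch: features x, task label y, sensitive attribute s.
    x = torch.randn(128, d_in)
    y = torch.randint(0, 2, (128,))
    s = torch.randint(0, 2, (128,))

    for step in range(200):
        # (1) Adversary step: learn to predict s from the detached representation.
        loss_adv = ce(adversary(encoder(x).detach()), s)
        opt_adv.zero_grad()
        loss_adv.backward()
        opt_adv.step()

        # (2) Encoder step: stay accurate on the task while hurting the adversary.
        z = encoder(x)
        loss_enc = ce(task_head(z), y) - ce(adversary(z), s)
        opt_enc.zero_grad()
        loss_enc.backward()
        opt_enc.step()

    print("final task loss:", float(ce(task_head(encoder(x)), y)),
          "| adversary loss:", float(loss_adv))
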
14:50 Pan-Private Uniformity Testing (contributed talk)   
Kareem Amin, Matthew Joseph and Jieming Mao
A centrally differentially private algorithm maps raw data to differentially private outputs. In contrast, a locally differentially private algorithm may only access data through public interaction with data holders, and this interaction must be a differentially private function of the data. We study the intermediate model of pan-privacy. Unlike a locally private algorithm, a pan-private algorithm receives data in the clear. Unlike a centrally private algorithm, the algorithm receives data one element at a time and must maintain a differentially private internal state while processing this stream. First, we show that pan-privacy against multiple intrusions on the internal state is equivalent to sequentially interactive local privacy. Next, we contextualize pan-privacy against a single intrusion by analyzing the sample complexity of uniformity testing over domain [k]. Focusing on the dependence on k, centrally private uniformity testing has sample complexity Θ(√k), while noninteractive locally private uniformity testing has sample complexity Θ(k). We show that the sample complexity of pan-private uniformity testing is Θ(k^(2/3)). By a new Ω(k) lower bound for the sequentially interactive setting, we also separate pan-private from sequentially interactive locally private and multi-intrusion pan-private uniformity testing.
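As a primer on the pan-private model itself (not the uniformity tester from this paper), here is a textbook-style sketch of a pan-private counter: the internal state carries noise at all times, so an intrusion that reads it mid-stream learns little, and the final release adds fresh noise. The class name and the constants are our own illustrative choices.

    # Sketch of a pan-private counter with a differentially private internal state.
    import numpy as np

    rng = np.random.default_rng(0)

    class PanPrivateCounter:
        def __init__(self, epsilon):
            self.eps = epsilon
            # Noise added up front keeps the state private against an intrusion.
            self.state = rng.laplace(scale=2.0 / epsilon)

        def process(self, bit):
            # Stream elements are folded into the already-noisy state.
            self.state += bit

        def output(self):
            # Fresh noise protects the final release (constants illustrative only).
            return self.state + rng.laplace(scale=2.0 / self.eps)

    counter = PanPrivateCounter(epsilon=1.0)
    stream = rng.integers(0, 2, size=1000)   # 1 if the element hits a target value
    for b in stream:
        counter.process(int(b))
    print("true count:", int(stream.sum()),
          "| pan-private estimate:", round(counter.output(), 1))
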
15:10 Private Stochastic Convex Optimization: Optimal Rates in Linear Time (contributed talk)   
Vitaly Feldman, Tomer Koren and Kunal Talwar
We study differentially private (DP) algorithms for stochastic convex optimization: the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions. A recent work of Bassily et al. (2019) has established the optimal bound on the excess population loss achievable given n samples. Unfortunately, their algorithm achieving this bound is relatively inefficient: it requires O(min{n^(3/2), n^(5/2)/d}) gradient computations, where d is the dimension of the optimization problem. We describe two new techniques for deriving DP convex optimization algorithms both achieving the optimal bound on excess loss and using O(min{n, n^2/d}) gradient computations. In particular, the algorithms match the running time of the optimal non-private algorithms. The first approach relies on the use of variable batch sizes and is analyzed using the privacy amplification by iteration technique of Feldman et al. (2018). The second approach is based on a general reduction to the problem of localizing an approximately optimal solution with differential privacy. Such localization, in turn, can be achieved using existing (non-private) uniformly stable optimization algorithms. As in the earlier work, our algorithms require a mild smoothness assumption. We also give a linear-time algorithm achieving the optimal bound on the excess loss for the strongly convex case, as well as a faster algorithm for the non-smooth case.
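For context, the sketch below shows a generic noisy-SGD step (per-example gradient clipping plus Gaussian noise); it is a standard baseline shown only for illustration and does not implement the paper's variable-batch-size or localization algorithms.

    # Generic DP-SGD-style step on a toy least-squares problem.
    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_sgd_step(w, X, y, lr=0.1, clip=1.0, noise_mult=1.0):
        """Clip per-example gradients to L2 norm `clip`, then add Gaussian noise."""
        grads = 2 * (X @ w - y)[:, None] * X                # per-example gradients
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads / np.maximum(1.0, norms / clip)        # clip each gradient
        noise = rng.normal(scale=noise_mult * clip, size=w.shape)
        return w - lr * (grads.sum(axis=0) + noise) / len(X)

    X = rng.normal(size=(512, 5))
    w_true = np.array([1.0, -1.0, 0.5, 0.0, 2.0])
    y = X @ w_true + 0.1 * rng.normal(size=512)

    w = np.zeros(5)
    for _ in range(300):
        batch = rng.choice(512, size=64, replace=False)
        w = noisy_sgd_step(w, X[batch], y[batch])
    print(np.round(w, 2))
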
15:30 Coffee break
16:15 Invited talk: Philip Leclerc — Formal Privacy At Scale: The 2020 Decennial Census TopDown Disclosure Limitation Algorithm   
To control vulnerabilities to reconstruction-abetted re-identification attacks that leverage massive external data stores and cheap computation, the U.S. Census Bureau has elected to adopt a formally private approach to disclosure limitation in the 2020 Decennial Census of Population and Housing. To this end, a team of disclosure limitation specialists has worked over the past three years to design and implement the U.S. Census Bureau TopDown Disclosure Limitation Algorithm (TDA). This formally private algorithm generates Persons and Households microdata, which will then be tabulated to produce the final set of demographic statistics published as a result of the 2020 Census enumeration. In this talk, I outline the main features of TDA, describe the current iteration of the process used to help policy makers decide how to set and allocate privacy-loss budget, and outline known issues with, and intended fixes for, the current implementation of TDA.
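The snippet below is a deliberately simplified illustration of the top-down idea for readers unfamiliar with it; the production TDA uses different noise distributions, many more geographic levels and invariants, and an optimization-based post-processing step, none of which are reproduced here.

    # Simplified two-level "top-down" sketch: noise counts at each level, then
    # adjust child counts to be non-negative and consistent with their parent.
    import numpy as np

    rng = np.random.default_rng(0)

    def noisy(count, epsilon):
        return count + rng.laplace(scale=1.0 / epsilon)

    def fit_children_to_parent(noisy_children, parent_total):
        """Rescale non-negative noisy child counts so they sum to the parent estimate."""
        kids = np.maximum(np.asarray(noisy_children), 0.0)
        if kids.sum() == 0:
            return np.full(len(kids), parent_total / len(kids))
        return kids * (parent_total / kids.sum())

    true_states = np.array([900.0, 350.0, 150.0, 600.0])
    epsilon = 0.5
    national = noisy(true_states.sum(), epsilon)            # top of the hierarchy
    states = fit_children_to_parent(
        [noisy(c, epsilon) for c in true_states], national)
    print("national estimate:", round(national, 1))
    print("state estimates:  ", np.round(states, 1), "| sum:", round(states.sum(), 1))
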
17:05 Panel Discussion
17:55 Closing

View on SlidesLive

8:10 1st block      [View recording]
Watch the opening remarks, Brendan McMahan's invited talk, and the first two contributed talks.
10:30 2nd block     [View recording]
Watch Ashwin Machanavajjhala's invited talk.
14:00 3rd block      [View recording]
Watch Lalitha Sankar's invited talk and two more contributed talks.
16:15 4th block      [View recording]
Watch Philip Leclerc's invited talk and the panel discussion.

Accepted Papers

Clément Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman and Lydia Zakynthinou
Private Identity Testing for High-Dimensional Distributions    [arxiv]
Seth Neel, Zhiwei Steven Wu, Aaron Roth and Giuseppe Vietri
Differentially Private Objective Perturbation: Beyond Smoothness and Convexity    [arxiv]
Jonathan Lebensold, William Hamilton, Borja Balle and Doina Precup
Actor Critic with Differentially Private Critic    [arxiv]
Samyadeep Basu, Rauf Izmailov and Chris Mesterharm
Membership Model Inversion Attacks for Deep Networks    [arxiv]
Gautam Kamath, Janardhan Kulkarni, Zhiwei Steven Wu and Huanyu Zhang
Privately Learning Markov Random Fields    [pdf]
Fatemehsadat Mireshghallah, Mohammadkazem Taram, Prakash Ramrakhyani, Dean Tullsen and Hadi Esmaeilzadeh
Shredder: Learning Noise Distributions to Protect Inference Privacy    [arxiv]
Jinshuo Dong, Aaron Roth and Weijie Su
Gaussian Differential Privacy (contributed talk)    [arxiv]
Andres Munoz, Umar Syed, Sergei Vassilvitskii and Ellen Vitercik
Private Linear Programming Without Constraint Violations    [pdf]
Hafiz Imtiaz, Jafar Mohammadi and Anand D. Sarwate
Correlation-Assisted Distributed Differentially Private Estimation    [arxiv]
Naoise Holohan, Stefano Braghin, Pol Mac Aonghusa and Killian Levacher
Diffprivlib: The IBM Differential Privacy Library    [arxiv]
Antti Koskela, Joonas Jälkö and Antti Honkela
Computing Exact Guarantees for Differential Privacy    [arxiv]
Joonas Jälkö, Antti Honkela and Samuel Kaski
Privacy-Preserving Data Sharing via Probabilistic Modelling    [pdf]
Nitin Agrawal, Ali Shahin Shamsabadi, Matthew Kusner and Adria Gascon
QUOTIENT: Two-Party Secure Neural Network Training and Prediction (contributed talk)    [arxiv]
Dingfan Chen, Ning Yu, Yang Zhang and Mario Fritz
GAN-Leaks: A Taxonomy of Membership Inference Attacks against GANs    [pdf]
Si Kai Lee, Luigi Gresele, Mijung Park and Krikamol Muandet
Private Causal Inference using Propensity Scores    [arxiv]
Kareem Amin, Matthew Joseph and Jieming Mao
Pan-Private Uniformity Testing (contributed talk)    [arxiv]
Ios Kotsogiannis, Yuchao Tao, Xi He, Ashwin Machanavajjhala, Michael Hay and Gerome Miklau
PrivateSQL: A Differentially Private SQL Query Engine    [pdf]
Chao Jin, Ahmad Qaisar Ahmad Al Badawi, Balagopal Unnikrishnan, Jie Lin, Fook Mun Chan, James Brown, J. Peter Campbell, Michael F. Chiang, Jayashree Kalpathy-Cramer, Vijay Chandrasekhar, Pavitra Krishnaswamy and Khin Mi Mi Aung
CareNets: Efficient Homomorphic CNN for High Resolution Images    [pdf]
Amrita Roy Chowdhury, Chenghong Wang, Xi He, Ashwin Machanavajjhala and Somesh Jha
Cryptϵ: Crypto-Assisted Differential Privacy on Untrusted Servers    [pdf]
Nhathai Phan, My Thai, Devu Shila and Ruoming Jin
Differentially Private Lifelong Learning    [pdf]
Alessandro Epasto, Hossein Esfandiari, Vahab Mirrokni, Andreas Munoz Medina, Umar Syed and Sergei Vassilvitskii
Anonymizing List Data    [pdf]
Mrinank Sharma, Michael Hutchinson, Siddharth Swaroop, Antti Honkela and Richard Turner
Differentially Private Federated Variational Inference    [arxiv]
Hassan Takabi, Robert Podschwadt, Jeff Druce, Curt Wu and Kevin Procopio
Privacy preserving Neural Network Inference on Encrypted Data with GPUs    [arxiv]
Casey Meehan and Kamalika Chaudhuri
Location Trace Privacy Under Conditional Priors    [arxiv]
Zhengli Zhao, Nicolas Papernot, Sameer Singh, Neoklis Polyzotis and Augustus Odena
Improving Differentially Private Models via Active Learning    [arxiv]
Hsiang Hsu, Shahab Asoodeh and Flavio Calmon
Discovering Information-Leaking Samples and Features    [pdf]
Martine De Cock, Rafael Dowsley, Anderson Nascimento, Davis Railsback, Jianwei Shen and Ariel Todoki
Fast Secure Logistic Regression for High Dimensional Gene Data    [pdf]
Giuseppe Vietri, Grace Tian, Mark Bun, Thomas Steinke and Steven Wu
New Oracle-Efficient Algorithms for Private Synthetic Data Release    [pdf]
Shadi Rahimian, Tribhuvanesh Orekondy and Mario Fritz
Differential Privacy Defenses and Sampling Attacks for Membership Inference    [pdf]
Amos Beimel, Aleksandra Korolova, Kobbi Nissim, Or Sheffet and Uri Stemmer
The Power of Synergy in Differential Privacy: Combining a Small Curator with Local Randomizers    [pdf]

Organization


Workshop organizers

  • Borja Balle (DeepMind)
  • Kamalika Chaudhuri (UC San Diego)
  • Antti Honkela (University of Helsinki)
  • Antti Koskela (University of Helsinki)
  • Casey Meehan (UC San Diego)
  • Mijung Park (Max Planck Institute for Intelligent Systems)
  • Mary Anne Smart (UC San Diego)
  • Adrian Weller (Alan Turing Institute & Cambridge)

Program Committee

  • James Bell (University of Cambridge)
  • Aurélien Bellet (INRIA)
  • Mark Bun (Boston University)
  • Christos Dimitrakakis (Chalmers University / University of Lille / Harvard University)
  • James Foulds (University of Maryland, Baltimore County)
  • Matt Fredrikson (Carnegie Mellon University)
  • Marco Gaboardi (University at Buffalo, SUNY)
  • Adria Gascon (The Alan Turing Institute / Warwick University)
  • Alon Gonen (Princeton University)
  • Peter Kairouz (Google AI)
  • Gautam Kamath (University of Waterloo)
  • Marcel Keller (Data61)
  • Nadin Kokciyan (King's College London)
  • Aleksandra Korolova (University of Southern California)
  • Audra McMillan (Boston University and Northeastern University)
  • Olga Ohrimenko (Microsoft)
  • Jun Sakuma (University of Tsukuba)
  • Anand Sarwate (Rutgers University)
  • Phillipp Schoppmann (Humboldt University of Berlin)
  • Or Sheffet (University of Alberta)
  • Kana Shimizu (Computational Biology Research Center, AIST)
  • Thomas Steinke (IBM)
  • Kunal Talwar (Google)
  • Carmela Troncoso (Ecole Polytechnique Fédérale de Lausanne)
  • Yu-Xiang Wang (Carnegie Mellon University)

Accessibility


By taking a few simple steps, such as paying special attention to font sizes and captions, you can make your presentations and posters more accessible. Feel free to contact us about any accessibility concerns relating to the website, workshop, etc.