About the project: Experiment with Differential Privacy using IBM's Diffprivlib
Under IBM India’s Global Remote Mentoring Program, I participated in an open-source project, ‘Experimenting with Differential Privacy using IBM’s Diffprivlib’, from September 2020 to August 2021.
Table of Contents
About
Guides
- Ms. Seetha Subramaniam, IBM
- Ms. Deepshikha Sinha, IBM
- Prof. Ashutosh Muchrikar, CCEW
Teammates
- Atmaja Jape, CCEW
- Tanya Sikarwar, CCEW
Background
Differential privacy addresses the paradox of learning nothing about an individual while learning useful information about a population. Essentially, it is a definition that formalizes the idea that a query should not reveal whether any one person is present in a dataset, much less what their data are.
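For reference, the idea above is usually stated as follows. This is the standard textbook definition of ε-differential privacy rather than anything specific to this project, and the symbols are introduced here only for illustration: a randomized mechanism M is ε-differentially private if, for every pair of datasets D and D' that differ in one individual's record and every set of possible outputs S,

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
```

The smaller ε is, the closer the two output distributions must be, and the harder it becomes to tell from the output whether any one person's record was included.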
Some important terms used in differential privacy:
- Sensitivity
- Privacy Loss
- Privacy Budget
- Mechanism
Refer to this section for more information on differential privacy.
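As a rough illustration of how these terms fit together, here is a minimal sketch of the Laplace mechanism applied to a counting query, written with the Python standard library only. The function names (`laplace_noise`, `private_count`) and the sample data are hypothetical and are not part of Diffprivlib. The sketch also uses plain floating-point sampling, so treat it as a teaching aid rather than a hardened implementation.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the items in `values` satisfying `predicate`.

    Sensitivity: adding or removing one person changes a count by at most 1.
    Privacy budget: `epsilon`; smaller values mean more noise.
    Mechanism: Laplace noise with scale = sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    sensitivity = 1.0
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: how many people are aged 40 or over?
ages = [23, 35, 41, 52, 29, 67, 38, 45]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=random.Random(0))
print(f"noisy count: {noisy:.2f}")
```

Each call spends `epsilon` of the privacy budget, so repeating the query and averaging the answers would consume more budget and weaken the guarantee.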
Our Contributions
During the course of this project, we worked on the following:
- Extending the functionality of Diffprivlib
  We implemented the following differentially private statistical utilities and data visualization tools:
  - Percentiles
  - Median
  - Interquartile Range
  - Frequency Polygon
  - Bivariate Histogram
- Demonstrating privacy-preserving data analysis for real-world use cases
  Using the U.S. Census Bureau dataset, we performed a case study to analyse the algorithms.
- Promoting the use of IBM’s open-source library, Diffprivlib, in an accessible manner
  We built a user interface with Streamlit (Python) to expose our implementations.
- Enabling data analysts to experiment, investigate and develop applications using differential privacy
  We performed an extensive empirical analysis of different percentile calculation methods, including our own extended Optimal Histogram method.
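To give a feel for what a histogram-based percentile estimate can look like, here is a minimal, generic sketch using only the Python standard library. It is not the project's Optimal Histogram method and not Diffprivlib's API: the function name `private_percentile`, the fixed `n_bins`, and the sample data are all assumptions made for illustration. The idea is to add Laplace noise to each bin count, then read the percentile off the noisy cumulative counts.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) as a difference of exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_percentile(values, percent, epsilon, bounds, n_bins=20, rng=None):
    """Estimate a percentile from a Laplace-noised histogram (illustrative only)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in [0, 100]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    lower, upper = bounds
    if lower >= upper:
        raise ValueError("bounds must satisfy lower < upper")
    if rng is None:
        rng = random.Random()

    # Clip values to the public bounds and count them per equal-width bin.
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for v in values:
        clipped = min(max(v, lower), upper)
        idx = min(int((clipped - lower) / width), n_bins - 1)
        counts[idx] += 1

    # Each person falls in exactly one bin, so adding or removing one
    # record changes the histogram by at most 1 (sensitivity 1).
    # Clamping at zero is post-processing and does not affect privacy.
    noisy = [max(0.0, c + laplace_noise(1.0 / epsilon, rng)) for c in counts]

    total = sum(noisy)
    if total == 0:
        return (lower + upper) / 2  # no usable signal; fall back to the midpoint

    # Walk the noisy cumulative counts and interpolate inside the target bin.
    target = total * percent / 100
    cumulative = 0.0
    for i, c in enumerate(noisy):
        if c > 0 and cumulative + c >= target:
            frac = (target - cumulative) / c
            return lower + (i + frac) * width
        cumulative += c
    return upper


# Hypothetical data: a differentially private median of some ages.
ages = [23, 35, 41, 52, 29, 67, 38, 45, 31, 58]
estimate = private_percentile(ages, 50, epsilon=1.0, bounds=(0, 100), rng=random.Random(0))
print(f"private median estimate: {estimate:.1f}")
```

The bin count trades resolution against noise: more bins give finer estimates but spread the same data over more noisy counts, which is the kind of trade-off an empirical comparison of percentile methods would explore.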
We also presented our work to the Security Division at IBM Bangalore and to Dr. Naoise Holohan, one of the leading developers of Diffprivlib. The feedback was amazing, and it was very fulfilling to show our work to senior practitioners from industry.