ML Privacy


While concerns have been raised about the destructive potential of AI, there is a more immediate and tangible threat: human intruders who seek to use data for malicious purposes. Videos, photos, emails, banking transactions, browsing history, GPS tracks, and other personal data are continuously collected and stored in the cloud by organizations for analysis. That data may circulate around the Internet without the data owner's knowledge and be exposed to malicious entities. A few recent data leaks are described in [1].

The complete problem of maintaining privacy is complex. It is distributed temporally, since a data owner's present and past actions can both compromise privacy. It is distributed spatially, since the data owner has personal information in multiple accounts, devices, and physical locations. The approach Professor Kung's group and I explore is privacy protection in the context of machine learning, at the time and location where the data owner submits data to a machine learning service. We seek to develop systems that allow data owners to control the privacy of their data, building on the idea of Compressive Privacy [2].


Compressive Privacy in Machine Learning for Classification

We use Ridge Discriminant Component Analysis (RDCA) to desensitize data with respect to a privacy label. Based on five experiments, we show that desensitization by RDCA can effectively protect privacy (i.e., yield low accuracy on the privacy label) with a small loss in utility. On the HAR and CMU Faces datasets, using desensitized data reduces privacy accuracy to the level of random guessing, at a cost of average drops of 5.14% and 0.04%, respectively, in utility accuracy. On the Semeion Handwritten Digit dataset, accuracies on the privacy-sensitive digits are almost zero, while accuracies on the utility-relevant digits drop by 7.53% on average. This presents a promising solution to the problem of privacy in machine learning for classification. For more details, please see [3].
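The idea of desensitizing data with respect to a privacy label can be illustrated with a minimal NumPy sketch. It is not the implementation from [3]. It assumes one common reading of ridge-regularized discriminant analysis: whiten the data with the regularized scatter matrix, find the directions that best separate the privacy classes, and project them out. The function name `rdca_desensitize` and the ridge parameter `rho` are hypothetical choices made for this example.

```python
import numpy as np

def rdca_desensitize(X, privacy_labels, rho=1e-3):
    """Illustrative sketch: remove the directions that are most
    discriminative for privacy_labels. X has shape (n_samples, n_features).
    The name, signature, and default rho are assumptions for this example."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(privacy_labels)
    Xc = X - X.mean(axis=0)                 # center the data
    d = Xc.shape[1]
    classes = np.unique(y)

    # Ridge-regularized total scatter and between-class scatter.
    S_bar = Xc.T @ Xc + rho * np.eye(d)
    S_b = np.zeros((d, d))
    for c in classes:
        Xk = Xc[y == c]
        mu = Xk.mean(axis=0)
        S_b += len(Xk) * np.outer(mu, mu)

    # Whitening transform S_bar^{-1/2} (S_bar is symmetric positive definite).
    evals, evecs = np.linalg.eigh(S_bar)
    whiten = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # In the whitened space, the top (L - 1) eigenvectors of the between-class
    # scatter span the privacy "signal" subspace for L privacy classes.
    M = whiten @ S_b @ whiten
    _, V = np.linalg.eigh(M)                # eigenvalues in ascending order
    signal = V[:, -(len(classes) - 1):]

    # Project the whitened data onto the orthogonal complement of that subspace.
    P = np.eye(d) - signal @ signal.T
    return (Xc @ whiten) @ P
```

In this sketch, the class means of the privacy label coincide after projection, so a classifier that relies on mean differences has nothing left to separate. Whether a utility label survives depends on how much of its own discriminative structure lies outside the removed subspace, which is what the experiments above measure.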

Compressive Privacy in Facial Recognition

To protect images in facial recognition applications, we developed a method of selecting features from the Fast Fourier Transform and the Wavelet Transform that prevents reconstruction of the original image while maintaining accuracy. The two images below are reconstructions after a Wavelet Transform and our filtering method. For more details, please see [4].
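As a rough illustration of how discarding transform coefficients limits reconstruction, the sketch below applies a one-level 2D Haar wavelet transform, zeroes out some subbands, and inverts the transform. The selection rule used here (dropping the low-frequency LL band) is only an assumption for demonstration; the actual feature-selection method is described in [4]. All function names are hypothetical, and the image is assumed to have even height and width.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar2d(img):
    """One-level 2D Haar transform; img must have even height and width."""
    a = np.asarray(img, dtype=float)
    # Filter along columns within each row (low-pass and high-pass).
    lo = (a[:, 0::2] + a[:, 1::2]) / SQRT2
    hi = (a[:, 0::2] - a[:, 1::2]) / SQRT2
    # Filter along rows to obtain the four subbands.
    LL = (lo[0::2] + lo[1::2]) / SQRT2
    LH = (lo[0::2] - lo[1::2]) / SQRT2
    HL = (hi[0::2] + hi[1::2]) / SQRT2
    HH = (hi[0::2] - hi[1::2]) / SQRT2
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Inverse of haar2d."""
    h, w = LL.shape
    lo = np.empty((2 * h, w))
    hi = np.empty((2 * h, w))
    lo[0::2], lo[1::2] = (LL + LH) / SQRT2, (LL - LH) / SQRT2
    hi[0::2], hi[1::2] = (HL + HH) / SQRT2, (HL - HH) / SQRT2
    out = np.empty((2 * h, 2 * w))
    out[:, 0::2], out[:, 1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return out

def filter_subbands(img, keep=("LH", "HL", "HH")):
    """Reconstruct img from the kept subbands only; the rest are zeroed.
    The default choice of kept subbands is illustrative, not the method of [4]."""
    bands = dict(zip(("LL", "LH", "HL", "HH"), haar2d(img)))
    kept = {k: (v if k in keep else np.zeros_like(v)) for k, v in bands.items()}
    return ihaar2d(kept["LL"], kept["LH"], kept["HL"], kept["HH"])
```

With all four subbands kept, the image is recovered exactly. Once a subband is removed, the information it carried cannot be recovered from the remaining coefficients, which is the property a privacy-preserving feature selection relies on.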



Related Literature