Geometric Transformations for Privacy-preserving Data Classification

Keke Chen and Ling Liu

 


This project investigates a random rotation perturbation approach for privacy-preserving data classification. The goal of our rotation-perturbation approach is two-fold: preserving the accuracy of classifiers and preserving the privacy of data. To achieve the first goal, we observe that many classification models rely on geometric properties of datasets that are preserved by rotation transformation. We prove that three types of well-known classifiers deliver the same performance over the rotation-perturbed dataset as over the original dataset. As a result, our random rotation-based perturbation guarantees no loss of accuracy for these three popular classification methods. To reach the second goal, we propose a multi-column privacy model to address the problem of evaluating privacy quality for multidimensional perturbation. With this metric, we develop a locally optimal algorithm to find a good rotation perturbation in terms of privacy guarantee. Using the privacy model, we also analyze three types of inference attacks: naive estimation, ICA-based reconstruction, and distribution-based attacks. Our initial experiments show that random rotation perturbation can provide a high privacy guarantee while maintaining zero loss of accuracy for the discussed classifiers.
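The geometric invariance behind the accuracy claim can be illustrated with a minimal sketch. It assumes NumPy, draws a random orthogonal matrix via QR decomposition (one common way to obtain a random rotation, not necessarily the procedure used in the papers), and uses a 1-nearest-neighbor rule as a stand-in for a distance-based classifier. All function and variable names here are hypothetical, and the sketch covers only the rotation step, not the privacy model or the optimization over rotations.

```python
import numpy as np

def random_rotation(d, rng):
    """Draw a random d x d rotation matrix (orthogonal, det = +1)."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    q = q * np.sign(np.diag(r))      # normalize column signs
    if np.linalg.det(q) < 0:         # flip one axis to get a proper rotation
        q[:, 0] = -q[:, 0]
    return q

def pairwise_distances(a, b):
    """Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def nearest_neighbor_predict(train_x, train_y, test_x):
    """Label each test row with the label of its closest training row."""
    nearest = pairwise_distances(test_x, train_x).argmin(axis=1)
    return train_y[nearest]

rng = np.random.default_rng(0)
train_x = rng.standard_normal((200, 5))          # synthetic records (rows)
train_y = (train_x[:, 0] + train_x[:, 1] > 0).astype(int)
test_x = rng.standard_normal((50, 5))

R = random_rotation(5, rng)
train_p = train_x @ R.T                          # perturbed training data
test_p = test_x @ R.T                            # perturbed test data

original_pred = nearest_neighbor_predict(train_x, train_y, test_x)
perturbed_pred = nearest_neighbor_predict(train_p, train_y, test_p)

# Rotation preserves pairwise distances, so the predictions agree.
distances_preserved = np.allclose(
    pairwise_distances(train_x, train_x),
    pairwise_distances(train_p, train_p),
)
same_predictions = np.array_equal(original_pred, perturbed_pred)
```

Because an orthogonal matrix leaves Euclidean distances and inner products unchanged, any classifier that depends on the data only through those quantities sees the same geometry before and after perturbation, while the individual column values are mixed together.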

Further related transformations will be investigated to meet the requirements of other privacy-preserving mining tasks and models.

 

Related papers:

  • Keke Chen and Ling Liu: "Privacy-Preserving Data Classification with Rotation Perturbation", Proc. of IEEE Intl. Conf. on Data Mining 2005 (ICDM05). (pdf)
  • Keke Chen and Ling Liu: "Geometric Data Perturbation Approach to Privacy-Preserving Data Classification", under review
  • Keke Chen and Ling Liu: "Geometric Data Perturbation for Privacy-preserving Multiparty Distributed Classification", under review