research-article

Functional mechanism: regression analysis under differential privacy

Published: 01 July 2012

Abstract

ε-differential privacy is the state-of-the-art model for releasing sensitive information while protecting privacy. Numerous methods have been proposed to enforce ε-differential privacy in various analytical tasks, e.g., regression analysis. Existing solutions for regression analysis, however, are either limited to non-standard types of regression or unable to produce accurate regression results. Motivated by this, we propose the Functional Mechanism, a differentially private method designed for a large class of optimization-based analyses. The main idea is to enforce ε-differential privacy by perturbing the objective function of the optimization problem, rather than its results. As case studies, we apply the functional mechanism to two of the most widely used regression models, namely, linear regression and logistic regression. Both theoretical analysis and thorough experimental evaluations show that the functional mechanism is highly effective and efficient, and that it significantly outperforms existing solutions.
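To illustrate what perturbing the objective function rather than its results can look like, here is a minimal sketch for one-dimensional least squares without an intercept. It is not taken from the paper. The function names, the requirement that every x and y lie in [-1, 1], the sensitivity bound of 8 derived in the comments, and the fallback used when the noisy objective has no minimum are all assumptions made for this example. The paper's treatment of multi-dimensional data and of unbounded noisy objectives is not reproduced here.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_slope(xs, ys, epsilon, rng=None):
    """Illustrative objective perturbation for the model y ~ w * x.

    Assumes (hypothetically) that every x and y lies in [-1, 1].
    """
    rng = rng or random.Random()
    if any(abs(v) > 1.0 for v in list(xs) + list(ys)):
        raise ValueError("inputs must be scaled into [-1, 1]")

    # The objective sum_i (y_i - w * x_i)^2 is the polynomial
    # c + b * w + a * w^2 with the coefficients below.
    a = sum(x * x for x in xs)
    b = -2.0 * sum(x * y for x, y in zip(xs, ys))
    # c = sum_i y_i^2 does not affect the minimizer, so it is not released.

    # One record contributes at most |y^2| + |2xy| + |x^2| <= 4 to the
    # coefficients, so replacing one record changes them by at most 8 in L1.
    sensitivity = 8.0
    scale = sensitivity / epsilon

    a_noisy = a + laplace(scale, rng)
    b_noisy = b + laplace(scale, rng)

    if a_noisy <= 0.0:
        # The noisy quadratic has no minimum. Returning 0.0 is a placeholder
        # chosen for this sketch, not the paper's remedy.
        return 0.0
    # Minimizer of the noisy quadratic.
    return -b_noisy / (2.0 * a_noisy)


if __name__ == "__main__":
    xs = [i / 100 for i in range(-100, 101)]
    ys = [0.5 * x for x in xs]
    print(private_slope(xs, ys, epsilon=1.0, rng=random.Random(0)))
```

Because the noise is added to the coefficients of the objective, the noise scale does not grow with the number of records, while the true coefficients do. On small datasets the noise can therefore dominate the estimate.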



Published in

Proceedings of the VLDB Endowment, Volume 5, Issue 11
July 2012
608 pages

Publisher: VLDB Endowment
