Data Synthesis

Zhang, Ting

doi:10.1007/978-3-319-32010-6_503

Definition/Introduction

Traditionally, data synthesis refers to descriptive or interpretative narrative and tabulation in studies such as meta-analyses. In the big data context, data synthesis refers to the process of creating synthetic data. Digital technology now provides an unprecedented volume of data. Rich data from various fields can jointly offer extensive information about individual persons or organizations for finance, economics, health, and other research, as well as for evaluation and policy making. At the same time, laws protect privacy and data confidentiality, and this protection becomes increasingly important in a big data world where theft and data breaches of various scales could become much easier. Synthetic data has the same or highly similar attributes as the real data for many analytic purposes but masks the original data for greater privacy and confidentiality. Synthetic data was first proposed by...
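
As a rough illustration of the idea that synthetic data keeps similar attributes while masking the original records, the following minimal Python sketch fits a simple model to a small "real" data set and then samples new records from that model. Everything in it is an assumption made for illustration and is not taken from the entry: the bivariate normal model, the made-up age and income figures, and the function names `fit_bivariate_normal` and `synthesize`. Practical synthesizers use richer models and formal disclosure-risk checks.

```python
import math
import random
import statistics


def fit_bivariate_normal(records):
    """Estimate means, standard deviations, and correlation of (x, y) pairs."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in records) / (len(records) - 1)
    return mean_x, mean_y, sd_x, sd_y, cov / (sd_x * sd_y)


def synthesize(records, n, seed=0):
    """Draw n synthetic (x, y) pairs from the fitted model.

    Records are sampled from the model, not copied from the input.
    """
    mean_x, mean_y, sd_x, sd_y, rho = fit_bivariate_normal(records)
    rng = random.Random(seed)
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    synthetic = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + sd_x * z1
        # Mixing z1 and z2 reproduces the fitted correlation between x and y.
        y = mean_y + sd_y * (rho * z1 + residual * z2)
        synthetic.append((x, y))
    return synthetic


# Hypothetical "confidential" data: (age, income) pairs invented for this sketch.
_rng = random.Random(42)
real = []
for _ in range(500):
    age = _rng.uniform(25, 65)
    income = 20000 + 900 * age + _rng.gauss(0, 8000)
    real.append((age, income))

synthetic = synthesize(real, 500, seed=1)

# Compare summary attributes of the real and synthetic data sets.
print("real     :", [round(v, 2) for v in fit_bivariate_normal(real)])
print("synthetic:", [round(v, 2) for v in fit_bivariate_normal(synthetic)])
```

In this sketch, the two printed summaries should be close to each other, while no synthetic pair is a copy of a real pair. Matching a few summary statistics does not by itself guarantee confidentiality, so the example shows the concept only and is not a disclosure-safe procedure.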

Author information

Authors and Affiliations

Department of Accounting, Finance and Economics, Merrick School of Business, University of Baltimore, Baltimore, MD, USA
Ting Zhang

Corresponding author

Correspondence to Ting Zhang.

Editor information

Editors and Affiliations

George Mason University, Fairfax, VA, USA
Laurie A. Schintler
George Mason University, Fairfax, VA, USA
Connie L. McNeely

About this entry

Cite this entry

Zhang, T. (2022). Data Synthesis. In: Schintler, L.A., McNeely, C.L. (eds) Encyclopedia of Big Data. Springer, Cham. https://doi.org/10.1007/978-3-319-32010-6_503

DOI: https://doi.org/10.1007/978-3-319-32010-6_503
Published: 12 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32009-0
Online ISBN: 978-3-319-32010-6
eBook Packages: Business and Management; Reference Module Humanities and Social Sciences; Reference Module Business, Economics and Social Sciences
