Evidence-based medicine is useful for clinical decisions. A cornerstone of evidence-based medicine is the hierarchical system of classifying levels of evidence. In this system, randomized controlled trials are often assigned the highest level because randomization is the most reliable method to control confounding factors. However, not all randomized controlled trials are conducted properly, and their results should be scrutinized carefully [1]. Major limitations of this method are the high cost of conducting adequately powered studies, and the time lost and frustration caused to authors by extensive regulatory requirements, delays in approval, and unnecessary bureaucratic procedures. Another major limitation is that clinical trials involve selected patients with informed consent who are treated according to protocols that might not represent real-world practice. Possible solutions to these limitations have been described, such as registry-based randomized clinical trials [2, 3]. However, maintaining registries is also costly, and data elements must be manually abstracted.
Electronic health records are an alternative to traditional registries; however, obtaining them within a healthcare system is often cumbersome, and sharing electronic data across health systems remains extremely uncommon, particularly because of questions relating to patient privacy and data ownership [4]. In the era of computers, digital transformation, and governance, another possible solution is the use of synthetic data derivatives with the incorporation of artificial intelligence (AI) and machine learning (ML) methods.
Artificial intelligence and machine learning
AI is the reproduction of human intelligence via special programs and computers that are trained in a way that simulates human cognitive functions. ML is a category of AI that refers to computer algorithms that learn via training and the input of new data. Both techniques are promising and helpful in a variety of medical fields to improve patient care in diagnosis, management, research, and systems analysis. Orthopaedic surgery is amenable to digital transformation by AI. As the amount of patients’ data increases rapidly, efficiently processing and analyzing all the gathered information in order to conduct research and decide on the best therapies for any given orthopaedic disease is a very challenging task [5, 6, 7]. In that setting, clinicians will continue to play a vital role in research; AI will not make clinical values redundant, but it will make them more important [8].
Synthetic data generation
Synthetic data is non-reversible, artificially created data that replicates the statistical characteristics and correlations of real-world, raw data. Synthetic healthcare data does not contain identifiable information (e.g., names and dates of birth) because it uses a statistical approach to create a brand new data set using both discrete and non-discrete variables of interest. Synthetic data protects patient privacy while preserving maximum data utility to conduct research faster, to impact operational processes positively, and ultimately to improve patient outcomes while saving costs and resources [9]. Specifically, synthetic data describes new groups of patients that do not correspond to real patients but, at the same time, have the same statistical properties and general characteristics as the original group [10].
Synthetic data generation can be done with statistical simulation or computational derivation. Statistical simulation uses real data to generate artificial data that closely reproduces the disease distribution in the population and has characteristics similar to those of the real, patient-derived data. Statistical simulation is appropriate for broad descriptive analyses. The main limitation of this method is that it is not able to correlate patient comorbidities with clinical endpoints [11]. Computational derivation uses special computer algorithms to create synthetic data on demand, in real time, based on real patients’ data. The synthetic data includes a similar number of patients as the real data and also maintains the distribution and covariance structure of the variables in the original data [4, 10]. The resulting synthetic data no longer contains data on individual patients but rather is a collection of observations that maintain the statistical properties of the original data. Since the synthetic data does not contain details on real patients, it can be shared more easily and quickly between researchers at different institutions. Although the synthetic patients are not real, they are not fake either: as a group, they have the same characteristics as the real patients, including a similar number of patients and a similar distribution [4, 10]. Therefore, synthetic data can be a very promising alternative to classic clinical registries that come with great costs and restrictions on data sharing, paving the way for increased data availability and exchange between large health centers [9, 12]. Limitations of synthetic healthcare data generation include statistically significant differences in some variables and predictors between real and synthetic databases; therefore, researchers using these models may not be able to clarify the level of impact that individual predictors have in multivariable studies.
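The idea of maintaining "the distribution and covariance structure" of the original data can be illustrated with a minimal sketch. It is not the method used by the platforms cited above; it assumes a purely numeric cohort and a multivariate normal model, and the cohort itself (age, body mass index, length of stay) and the function name generate_synthetic are hypothetical and introduced only for illustration.

```python
import numpy as np


def generate_synthetic(real, n_synthetic=None, seed=0):
    """Draw synthetic rows from a multivariate normal fitted to `real`.

    Assumption: every column is numeric and roughly Gaussian. Real
    platforms handle mixed and non-Gaussian variables differently.
    """
    real = np.asarray(real, dtype=float)
    rng = np.random.default_rng(seed)
    mean = real.mean(axis=0)            # per-variable means
    cov = np.cov(real, rowvar=False)    # covariance between variables
    n = real.shape[0] if n_synthetic is None else n_synthetic
    return rng.multivariate_normal(mean, cov, size=n)


# Hypothetical "real" cohort, simulated here because no real data is
# available: age (years), body mass index, length of stay (days).
rng = np.random.default_rng(42)
true_mean = [65.0, 27.0, 4.0]
true_cov = [[100.0, 10.0, 8.0],
            [10.0, 16.0, 2.0],
            [8.0, 2.0, 4.0]]
real_cohort = rng.multivariate_normal(true_mean, true_cov, size=2000)

# Same number of synthetic patients as real patients.
synthetic_cohort = generate_synthetic(real_cohort, seed=1)

# Compare summary statistics between the two cohorts.
mean_gap = np.abs(real_cohort.mean(axis=0) - synthetic_cohort.mean(axis=0))
corr_gap = np.abs(np.corrcoef(real_cohort, rowvar=False)
                  - np.corrcoef(synthetic_cohort, rowvar=False))

# No synthetic row is a copy of a real patient's row.
real_rows = {tuple(row) for row in real_cohort}
n_copied = sum(tuple(row) in real_rows for row in synthetic_cohort)

print("largest difference in means:", mean_gap.max())
print("largest difference in correlations:", corr_gap.max())
print("synthetic rows identical to a real row:", n_copied)
```

A model this simple reproduces only means and linear covariances, so any nonlinear or subgroup-specific relationship between predictors and outcomes would be lost. That is one way in which the differences between real and synthetic databases noted above can arise.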
A variety of synthetic data generation methods have been developed across a wide range of domains [11]. Anyone with access to the internet has free access to AI applications that can generate synthetic patients, simulate research using models of disease progression and standards of healthcare, and construct research manuscripts, abstracts, and letters to the editor. Synthetic healthcare data has the potential to speed up medical innovation. In this setting, AI methods may offer benefits to journals, publishers, readers, and patients. However, AI applications cannot be listed as authors, and AI methods must be described in detail in the Methods section [13].
At International Orthopaedics, we aim to publish quality research, and we encourage novel methods to conduct research provided that they are clearly described and explained in detail in the methods section of the respective studies. We concur on the need for large healthcare data for evidence-based medicine, and we are alert to the protection of patients’ data privacy and to identifier issues. AI applications cannot be listed as authors in papers [13], but synthetic patients can be used as materials in studies. We expect that in the near future, the use and analysis of synthetic data will advance research by bypassing the inherent limitations of traditional methods and reducing the barriers to data sharing. The use of artificial intelligence to generate publishable material may become common; however, failing to mention it in the materials and methods section is potential fraud that authors and editors should be aware of and avoid.
References
Burns PB, Rohrich RJ, Chung KC (2011) The levels of evidence and their role in evidence-based medicine. Plast Reconstr Surg 128(1):305–310. https://doi.org/10.1097/PRS.0b013e318219c171
Lauer MS, D’Agostino RB Sr (2013) The randomized registry trial—the next disruptive technology in clinical research? N Engl J Med 369:1579–1581
James S, Rao SV, Granger CB (2015) Registry-based randomized clinical trials–a new clinical trial paradigm. Nat Rev Cardiol 12(5):312–316. https://doi.org/10.1038/nrcardio.2015.33
Greenberg JK, Landman JM, Kelly MP, Pennicooke BH, Molina CA, Foraker RE, Ray WZ (2022) Leveraging artificial intelligence and synthetic data derivatives for spine surgery research. Global Spine J 3:21925682221085536. https://doi.org/10.1177/21925682221085535
Benzakour A, Altsitzioglou P, Lemée JM, Ahmad A, Mavrogenis AF, Benzakour T (2023) Artificial intelligence in spine surgery. Int Orthop 47(2):457–465. https://doi.org/10.1007/s00264-022-05517-8
Lopez IB, Benzakour A, Mavrogenis A, Benzakour T, Ahmad A, Lemée JM (2023) Robotics in spine surgery: systematic review of literature. Int Orthop 47(2):447–456. https://doi.org/10.1007/s00264-022-05508-9
Scarlat MM, Sun J, Fucs PMB, Giannoudis P, Mavrogenis AF, Benzakour T, Quaile A, Waddell JP (2020) Maintaining education, research and innovation in orthopaedic surgery during the COVID-19 pandemic. The role of virtual platforms. From presential to virtual, front and side effects of the pandemic. Int Orthop 44(11):2197–2202. https://doi.org/10.1007/s00264-020-04848-8
Panchmatia JR, Visenio MR, Panch T (2018) The role of artificial intelligence in orthopaedic surgery. Br J Hosp Med (Lond) 79(12):676–681. https://doi.org/10.12968/hmed.2018.79.12.676
Lieber D (2021) The people in this medical research are fake. The innovations are real. WSJ 2021 April 6. https://www.wsj.com/articles/the-people-in-this-medical-research-are-fake-the-innovations-are-real-11617717623?reflink=desktopwebshare_permalink.
Foraker RE, Yu SC, Gupta A, Michelson AP, Pineda Soto JA, Colvin R, Loh F, Kollef MH, Maddox T, Evanoff B, Dror H, Zamstein N, Lai AM, Payne PRO (2020) Spot the difference: comparing results of analyses from real patient data and synthetic derivatives. JAMIA Open 3(4):557–566. https://doi.org/10.1093/jamiaopen/ooaa060
Walonoski J, Kramer M, Nichols J, Quina A, Moesel C, Hall D, Duffett C, Dube K, Gallagher T, McLachlan S (2018) Synthea: an approach, method, and software mechanism for generating synthetic patients and the synthetic electronic health care record. J Am Med Inform Assoc 25(3):230–238. https://doi.org/10.1093/jamia/ocx079. Erratum in: J Am Med Inform Assoc 2018 Jul 1;25(7):921
Pugely AJ, Martin CT, Harwood J, Ong KL, Bozic KJ, Callaghan JJ (2015) Database and registry research in orthopaedic surgery: part 2: clinical registry data. J Bone Joint Surg Am 97(21):1799–1808. https://doi.org/10.2106/JBJS.O.00134
Leopold SS, Haddad FS, Sandell LJ, Swiontkowski M (2023) Artificial intelligence applications and scholarly publication in orthopaedic surgery. J Bone Joint Surg Am. https://doi.org/10.2106/JBJS.23.00293
Cite this article
Mavrogenis, A.F., Scarlat, M.M. Artificial intelligence publications: synthetic data, patients, and papers. International Orthopaedics (SICOT) 47, 1395–1396 (2023). https://doi.org/10.1007/s00264-023-05830-w