Methods Inf Med 2015; 54(04): 328-337
DOI: 10.3414/ME14-01-0093
Original Articles
Schattauer GmbH

Linked Records of Children with Traumatic Brain Injury[*]

Probabilistic Linkage without Use of Protected Health Information
T. D. Bennett (1, 2), J. M. Dean (1), H. T. Keenan (1), M. H. McGlincy (3), A. M. Thomas (1), L. J. Cook (1)
1   Pediatric Critical Care, University of Utah School of Medicine, Salt Lake City, UT, USA
2   Current address: Pediatric Critical Care, University of Colorado School of Medicine, Aurora, CO, USA
3   Strategic Matching, Inc., Morrisonville, NY, USA

Publication History

received: 15 September 2014

accepted: 15 March 2015

Publication Date: 22 January 2018 (online)

Summary

Objective: Record linkage may create powerful datasets with which investigators can conduct comparative effectiveness studies evaluating the impact of tests or interventions on health. All linkages of health care data files to date have used protected health information (PHI) in their linkage variables. A technique to link datasets without using PHI would be advantageous both to preserve privacy and to increase the number of potential linkages.

Methods: We applied probabilistic linkage to records of injured children in the National Trauma Data Bank (NTDB, N = 156,357) and the Pediatric Health Information System (PHIS, N = 104,049) databases from 2007 to 2010. We used 49 match variables without PHI, many of them administrative variables and indicators for procedures recorded as International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) codes. We validated the accuracy of the linkage using identified data from a single center that submits to both databases.
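
As a rough illustration of how probabilistic linkage scores a candidate record pair, the sketch below uses classical Fellegi-Sunter agreement weights. It is not the authors' model: the field names, the m- and u-probabilities, and the example records are all hypothetical, chosen only to show how non-PHI fields such as a procedure indicator could contribute to a match weight.

```python
import math

# Hypothetical m/u probabilities for three non-PHI match variables.
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
FIELDS = {
    "admit_year": (0.99, 0.25),
    "length_of_stay": (0.90, 0.05),
    "craniotomy_flag": (0.95, 0.50),
}


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum log2 agreement or disagreement weights over the compared fields."""
    total = 0.0
    for name, (m, u) in fields.items():
        if rec_a[name] == rec_b[name]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


# Hypothetical records, one from each file.
phis_rec = {"admit_year": 2009, "length_of_stay": 12, "craniotomy_flag": 1}
ntdb_same = {"admit_year": 2009, "length_of_stay": 12, "craniotomy_flag": 1}
ntdb_other = {"admit_year": 2008, "length_of_stay": 3, "craniotomy_flag": 0}

weight_same = match_weight(phis_rec, ntdb_same)
weight_other = match_weight(phis_rec, ntdb_other)
```

Fields whose values are rare among non-matches (small u) carry the most weight when they agree, which is one way a large set of individually weak variables can add up to a confident link.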

Results: We accurately linked the PHIS and NTDB records for 69% of children with any injury, and 88% of those with severe traumatic brain injury eligible for a study of intervention effectiveness (positive predictive value of 98%, specificity of 99.99%). Accurate linkage was associated with longer lengths of stay, more severe injuries, and multiple injuries.
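
The accuracy measures quoted above follow from a standard confusion matrix over candidate record pairs. The sketch below shows the arithmetic only; the counts are hypothetical and are not the study's validation data.

```python
def linkage_metrics(tp, fp, fn, tn):
    """Accuracy measures for a linkage judged against a gold standard.

    tp: true links declared as links
    fp: non-links declared as links
    fn: true links that were missed
    tn: non-links correctly left unlinked
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true links recovered
        "ppv": tp / (tp + fp),          # share of declared links that are true
        "specificity": tn / (tn + fp),  # share of non-links left unlinked
    }


# Hypothetical counts, for illustration only.
metrics = linkage_metrics(tp=90, fp=10, fn=30, tn=9870)
```

Because non-matching pairs vastly outnumber true matches in any linkage, specificity tends to sit very close to 100%, so positive predictive value and the share of records linked are usually the more informative figures.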

Conclusion: In populations with substantial illness or injury severity, accurate record linkage may be possible in the absence of PHI. This methodology may enable linkages and, in turn, comparative effectiveness studies that would be unlikely or impossible otherwise.

* Supplementary material is published on our website, www.methods-online.com


 