Published online by De Gruyter, May 10, 2023

Exact correction factor for estimating the OR in the presence of sparse data with a zero cell in 2 × 2 tables

  • Malavika Babu, Thenmozhi Mani, Marimuthu Sappani, Sebastian George, Shrikant I. Bangdiwala and Lakshmanan Jeyaseelan

Abstract

In case-control studies, odds ratios (ORs) are calculated from 2 × 2 tables, and in some instances one of the cells has a small count or a count of zero. Several corrections for calculating the OR in the presence of empty cells are available in the literature, including the Yates continuity correction and the Agresti and Coull correction. However, these methods prescribe different corrections, and the situations in which each should be applied are not clear. The current research therefore proposes an iterative algorithm for estimating an exact (optimum) correction factor for a given sample size. The algorithm was evaluated by simulating data with varying proportions and sample sizes. The correction factor was selected on the basis of the bias, the standard error of the odds ratio, the root mean square error and the coverage probability. We also present a linear function that identifies the exact correction factor from the sample size and proportion.
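To make the idea of a correction factor concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given in the abstract. It assumes a constant k is added to every cell of the 2 × 2 table, uses the standard Woolf standard error for the log odds ratio, and scores each candidate k by simulation on the log-OR scale. The function names, the equal group sizes, the candidate grid and the "smallest RMSE" selection rule are all hypothetical choices made for illustration.

```python
import math
import random


def corrected_odds_ratio(a, b, c, d, k=0.5):
    """Return (OR, lower, upper) after adding k to every cell of a 2 x 2 table.

    Layout assumed here: a, b = exposed / unexposed cases;
    c, d = exposed / unexposed controls.
    """
    if k <= 0:
        raise ValueError("k must be positive so that no cell is zero")
    a, b, c, d = (x + k for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf SE of the log OR
    z = 1.959964  # two-sided 95% normal quantile
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def simulate(k, n, p_case, p_control, reps=2000, seed=0):
    """Monte Carlo bias and RMSE (log-OR scale) and 95% CI coverage for one k.

    Assumes n cases and n controls with exposure probabilities p_case and
    p_control (an illustrative design, not taken from the article).
    """
    rng = random.Random(seed)
    true_or = (p_case / (1 - p_case)) / (p_control / (1 - p_control))
    errors, covered = [], 0
    for _ in range(reps):
        a = sum(rng.random() < p_case for _ in range(n))     # exposed cases
        c = sum(rng.random() < p_control for _ in range(n))  # exposed controls
        or_hat, lo, hi = corrected_odds_ratio(a, n - a, c, n - c, k)
        errors.append(math.log(or_hat) - math.log(true_or))
        covered += lo <= true_or <= hi
    bias = sum(errors) / reps
    rmse = math.sqrt(sum(e * e for e in errors) / reps)
    return bias, rmse, covered / reps


def best_k(candidates, n, p_case, p_control, **kwargs):
    """Pick the candidate k with the smallest RMSE (hypothetical criterion)."""
    return min(candidates, key=lambda k: simulate(k, n, p_case, p_control, **kwargs)[1])
```

For example, `corrected_odds_ratio(10, 0, 5, 5)` adds 0.5 to each cell and returns an odds ratio of 21 with a finite interval, where the uncorrected estimate would be undefined. A call such as `best_k([0.1, 0.25, 0.5, 1.0], n=20, p_case=0.1, p_control=0.3)` then compares candidate factors for one sample size and pair of proportions. A fuller evaluation would weigh bias and coverage alongside RMSE, as the abstract describes.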


Corresponding author: Lakshmanan Jeyaseelan, College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, United Arab Emirates

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.


Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/ijb-2022-0040).


Received: 2022-01-05
Accepted: 2023-03-27
Published Online: 2023-05-10

© 2023 Walter de Gruyter GmbH, Berlin/Boston
