Comparative Evaluation of Classification Indexes and Outlier Detection of Microcytic Anaemias in a Portuguese Sample

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13566)

Abstract

Anaemia is often caused by a nutritional problem or by genetic diseases. The world prevalence of anaemia is estimated to be 24.8%, which underlines the need for appropriate methods to discriminate between the different types of this disease, an essential step in choosing the best treatment and offering genetic counselling. Several indexes based on haematological features have been proposed to address the challenge of classifying microcytic anaemias. However, they have not been tested extensively or optimised for different countries. Here we test existing binary classification indexes in a Portuguese sample of 390 patients diagnosed with microcytic anaemia and propose novel classification methods to discriminate between the disease classes. We show that existing indexes for the binary classification of Iron Deficiency Anaemia (IDA) and \(\beta \)-thalassaemia trait are well adapted to this sample, with RDWI (red cell distribution width index) achieving a median accuracy of 95.4%, a performance we were also able to achieve using Random Forests. A multi-class classifier was also developed to discriminate between three microcytic anaemias and healthy subjects, achieving a median accuracy of 93.0%. In addition, we developed a semi-automatic method to identify outliers, which were shown to correspond to subjects with unexpected features given their class and who may represent clinical misclassifications that require further analysis. The results illustrate that it is possible to achieve excellent performance using only the information obtained through an affordable Complete Blood Count test, thus highlighting the potential of artificial intelligence in classifying microcytic anaemias.
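As a rough illustration of how an index such as RDWI can act as a binary classifier, the sketch below uses the definition commonly quoted in the literature, RDWI = MCV × RDW / RBC, with a cutoff of 220 (higher values suggesting IDA, lower values suggesting \(\beta \)-thalassaemia trait). The formula, the cutoff, the function names and the sample values are assumptions made for this example; they are not taken from the paper and are not patient data.

```python
# Minimal sketch of an RDWI-based binary classifier.
# Assumptions (not from the paper): RDWI = MCV * RDW / RBC, cutoff = 220,
# and the toy samples below, which are invented for illustration only.

def rdwi(mcv_fl, rdw_pct, rbc_e12_per_l):
    """Red cell distribution width index from three CBC features."""
    return mcv_fl * rdw_pct / rbc_e12_per_l

def classify_rdwi(mcv_fl, rdw_pct, rbc_e12_per_l, cutoff=220.0):
    """Return 'IDA' at or above the cutoff, otherwise 'BTT'."""
    return "IDA" if rdwi(mcv_fl, rdw_pct, rbc_e12_per_l) >= cutoff else "BTT"

def accuracy(samples):
    """Fraction of (mcv, rdw, rbc, label) tuples classified correctly."""
    hits = sum(classify_rdwi(m, w, r) == label for m, w, r, label in samples)
    return hits / len(samples)

# Hypothetical samples: (MCV in fL, RDW in %, RBC in 10^12/L, label)
toy_samples = [
    (62.0, 15.0, 5.8, "BTT"),
    (72.0, 18.0, 4.0, "IDA"),
    (65.0, 14.5, 6.0, "BTT"),
    (70.0, 17.0, 4.2, "IDA"),
]

if __name__ == "__main__":
    for m, w, r, label in toy_samples:
        print(f"RDWI={rdwi(m, w, r):.1f} predicted={classify_rdwi(m, w, r)} true={label}")
    print(f"accuracy={accuracy(toy_samples):.2f}")
```

A median accuracy like the one reported above would be obtained by repeating such an evaluation over several data splits and taking the median of the resulting accuracies; the sketch evaluates a single toy set only.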

Partially supported through FCT (UIDB/50021/2020, PTDC/CCI-BIO/4180/2020, DSAIPA/DS/0026/2019) and EU Horizon 2020 (No. 951970). INSEF, developed within the scope of the Pre-defined Project of the Programa Iniciativas em Saúde Pública, was promoted by INSA-DEP and benefited from financial support granted by Iceland, Liechtenstein and Norway, through the EEA Grants.

Notes

  1. https://github.com/kiecodes/genetic-algorithms

References

  1. Aslan, D., Gümrük, F., Gürgey, A., Altay, C.: Importance of RDW value in differential diagnosis of hypochrome anemias. Am. J. Hematol. 69(1), 31–33 (2002)

  2. Bengfort, B., Bilbro, R.: Yellowbrick: visualizing the scikit-learn model selection process. J. Open Source Softw. 4(35), 1075 (2019)

  3. Camaschella, C.: Iron-deficiency anemia. N. Engl. J. Med. 372(19), 1832–1843 (2015)

  4. Cascio, M.J., DeLoughery, T.G.: Anemia: evaluation and diagnostic tests. Med. Clin. 101(2), 263–284 (2017)

  5. England, J., Bain, B., Fraser, P.: Differentiation of iron deficiency from thalassaemia trait. Lancet 301(7818), 1514 (1973)

  6. Faleiro, B.D.: Hereditary anemia - characterization of the genetic basis and subjacent mechanisms. Tese de mestrado em Biologia Humana e Ambiente, Universidade de Lisboa, Faculdade de Ciências (2020)

  7. Fonseca, C., Marques, F., Robalo Nunes, A., Belo, A., Brilhante, D., Cortez, J.: Prevalence of anaemia and iron deficiency in Portugal: the EMPIRE study. Intern. Med. J. 46(4), 470–478 (2016)

  8. Harris, C.R., et al.: Array programming with NumPy. Nature 585(7825), 357–362 (2020)

  9. Hunter, J.D.: Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9(3), 90–95 (2007)

  10. Jahangiri, M., Rahim, F., Malehi, A.S.: Diagnostic performance of hematological discrimination indices to discriminate between \(\beta \)eta thalassemia trait and iron deficiency anemia and using cluster analysis: introducing two new indices tested in Iranian population. Sci. Rep. 9(1), 1–13 (2019)

  11. Jaiswal, M., Srivastava, A., Siddiqui, T.J.: Machine learning algorithms for anemia disease prediction. In: Khare, A., Tiwary, U.S., Sethi, I.K., Singh, N. (eds.) Recent Trends in Communication, Computing, and Electronics. LNEE, vol. 524, pp. 463–469. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-2685-1_44

  12. Jamieson, K., Talwalkar, A.: Non-stochastic best arm identification and hyperparameter optimization. In: Artificial Intelligence and Statistics, pp. 240–248 (2016)

  13. Kabootarizadeh, L., Jamshidnezhad, A., Koohmareh, Z.: Differential diagnosis of iron-deficiency anemia from \(\beta \)-thalassemia trait using an intelligent model in comparison with discriminant indexes. Acta Informatica Medica 27(2), 78 (2019)

  14. Matos, J.F., et al.: Comparison of discriminative indices for iron deficiency anemia and \(\beta \) thalassemia trait in a Brazilian population. Hematology 18(3), 169–174 (2013)

  15. McKinney, W.: Data structures for statistical computing in Python. In: Proceedings of the 9th Python in Science Conference, pp. 56–61 (2010)

  16. Nunes, B., et al.: The first Portuguese national health examination survey (2015): design, planning and implementation. J. Public Health 41(3), 511–517 (2019)

  17. Old, J.: Screening and genetic diagnosis of haemoglobin disorders. Blood Rev. 17(1), 43–53 (2003)

  18. Patel, B.A., Parikh, A.: Impact analysis of the complete blood count parameter using Naive Bayes. In: 2020 International Conference on Inventive Computation Technologies (ICICT), pp. 7–12 (2020)

  19. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  20. Purwar, S., Tripathi, R.K., Ranjan, R., Saxena, R.: Detection of microcytic hypochromia using CBC and blood film features extracted from convolution neural network by different classifiers. Multimed. Tools Appl. 79(7), 4573–4595 (2020)

  21. Samões, C., et al.: Prevalence of anemia in the Portuguese adult population: results from the first national health examination survey (INSEF 2015). J. Public Health 1–8 (2020)

  22. Sirdah, M., Tarazi, I., Al Najjar, E., Al Haddad, R.: Evaluation of the diagnostic reliability of different RBC indices and formulas in the differentiation of the \(\beta \)-thalassaemia minor from iron deficiency in Palestinian population. Int. J. Lab. Hematol. 30(4), 324–330 (2008)

  23. Tefferi, A.: Anemia in adults: a contemporary approach to diagnosis. Mayo Clin. Proc. 78(10), 1274–1280 (2003)

  24. WHO: Worldwide prevalence of anaemia 1993–2005: WHO global database on anaemia. World Health Organization (2008)

  25. WHO: Serum ferritin concentrations for the assessment of iron status and iron deficiency in populations. World Health Organization (2011)

Acknowledgements

The data used to test the existing indexes and to train and test the predictive models were obtained from the Biobank of the Human Genetics Department of the National Institute of Health Dr. Ricardo Jorge (INSA). In addition, some data were added from the INSEF 2015 project [16], carried out by the Department of Epidemiology of INSA. The authors wish to thank Marta Barreto, Bárbara Faleiro, and Daniela Santos for their support in the molecular diagnosis of some samples.

Author information

Corresponding author

Correspondence to Susana Vinga.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Leitão, B.N., Faustino, P., Vinga, S. (2022). Comparative Evaluation of Classification Indexes and Outlier Detection of Microcytic Anaemias in a Portuguese Sample. In: Marreiros, G., Martins, B., Paiva, A., Ribeiro, B., Sardinha, A. (eds) Progress in Artificial Intelligence. EPIA 2022. Lecture Notes in Computer Science, vol 13566. Springer, Cham. https://doi.org/10.1007/978-3-031-16474-3_19

  • DOI: https://doi.org/10.1007/978-3-031-16474-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16473-6

  • Online ISBN: 978-3-031-16474-3

  • eBook Packages: Computer Science, Computer Science (R0)
