Resampling Strategies for Mitigating Class Imbalance of ASD Dataset on the Performance of Machine Learning Classifiers

  • Conference paper
Advanced Computational and Communication Paradigms (ICACCP 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 535)

Abstract

In the supplied ASD dataset, the number of samples in the two classes is extremely imbalanced. If this imbalance is not addressed, binary classification algorithms applied to such data produce heavily biased results. The imbalance also affects the relationships between features. The resulting misclassifications could affect decisions about medical treatment and cause protracted delays for those who urgently require medical intervention. In the current study, we use a variety of resampling strategies to address the issue of class imbalance. Precision, recall, and F1-score are used as evaluation measures for all models. We also examine the AUC score, which shows encouraging outcomes for the use of resampling methods on imbalanced datasets.
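As a rough illustration of the idea described in the abstract, the sketch below balances a toy two-class dataset by random oversampling and computes precision, recall, and F1-score for the positive class. Random oversampling is used only as a simple stand-in: the abstract does not say which resampling strategies the authors apply, and the function names (`random_oversample`, `precision_recall_f1`) and the toy data are hypothetical, not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1-score for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy imbalanced data (assumed for illustration): 8 negatives, 2 positives.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
balanced_counts = Counter(y_res)

# Toy predictions (assumed) to show the metric computation.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
metrics = precision_recall_f1(y_true, y_pred)
```

In practice, resampling would normally be applied to the training split only, so that the evaluation measures are computed on data with the original class distribution.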

Acknowledgements

The work carried out by the first author is supported by a GATE scholarship from the Ministry of Education, India.

Author information

Corresponding author

Correspondence to Rahul Kumar Gupta.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Gupta, R.K., Dutta, K. (2023). Resampling Strategies for Mitigating Class Imbalance of ASD Dataset on the Performance of Machine Learning Classifiers. In: Borah, S., Gandhi, T.K., Piuri, V. (eds) Advanced Computational and Communication Paradigms. ICACCP 2023. Lecture Notes in Networks and Systems, vol 535. Springer, Singapore. https://doi.org/10.1007/978-981-99-4284-8_18
