DOI: 10.1145/3464968.3468408
short-paper

Impact of programming languages on machine learning bugs

Published: 11 July 2021

ABSTRACT

Machine learning (ML) is becoming ubiquitous in modern software, yet using it remains challenging for software developers. So far, research has focused on ML libraries to find and mitigate these challenges. However, there is initial evidence that programming languages also contribute to them, visible as differing distributions of bugs in ML programs. To fill this research gap, we propose the first empirical study on the impact of programming languages on bugs in ML programs. We plan to analyze software from GitHub, together with related discussions in GitHub issues and on Stack Overflow, for bug distributions in ML programs, aiming to identify correlations with the chosen programming language, its features, and the application domain. The results of this study would enable better-targeted use of available programming language technology in ML programs, preventing bugs, reducing errors, and speeding up development.
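
To make the idea of comparing bug distributions across languages concrete, the following is a minimal sketch, not the authors' method. The abstract does not say which statistic the study will use; the Pearson chi-squared test of independence shown here is only one common choice for categorical data. The language names, the bug categories ("type", "api-misuse"), the counts, and the helper names `contingency_table` and `chi_squared` are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def contingency_table(records):
    """Count (language, bug_type) pairs into a table with one row per language."""
    counts = Counter(records)
    languages = sorted({lang for lang, _ in records})
    bug_types = sorted({bug for _, bug in records})
    table = [[counts[(lang, bug)] for bug in bug_types] for lang in languages]
    return languages, bug_types, table


def chi_squared(table):
    """Return the Pearson chi-squared statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if bug type were independent of language.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical labelled bug reports (toy counts, not real data).
records = (
    [("Python", "type")] * 30 + [("Python", "api-misuse")] * 20
    + [("Scala", "type")] * 10 + [("Scala", "api-misuse")] * 40
)

languages, bug_types, table = contingency_table(records)
stat, dof = chi_squared(table)
print(languages, bug_types, table)
print(f"chi-squared = {stat:.2f}, degrees of freedom = {dof}")
```

A statistic well above the critical value for the given degrees of freedom (3.84 at the 5% level for one degree of freedom) would suggest that the bug-type distribution differs between the languages in the sample. It would not by itself show that the language causes the difference, since factors such as application domain can confound the comparison.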


Published in

AISTA 2021: Proceedings of the 1st ACM International Workshop on AI and Software Testing/Analysis
July 2021, 20 pages
ISBN: 9781450385411
DOI: 10.1145/3464968
General Chairs: Shuai Wang, Xiaofei Xie, Lei Ma

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States
