DOI: 10.1145/3611643.3613882
research-article

Automated Test Generation for Medical Rules Web Services: A Case Study at the Cancer Registry of Norway

Published: 30 November 2023

ABSTRACT

The Cancer Registry of Norway (CRN) collects, curates, and manages data related to cancer patients in Norway, supported by an interactive, human-in-the-loop, socio-technical decision support software system. Automated testing of this system is essential, yet it is currently limited in CRN’s practice. To this end, we present an industrial case study that evaluates an AI-based system-level testing tool, EvoMaster, in terms of its effectiveness in testing CRN’s software system. In particular, we focus on GURI, CRN’s medical rule engine, which is a key component at the CRN. We test GURI with EvoMaster’s black-box and white-box tools and study their test effectiveness regarding code coverage, errors found, and domain-specific rule coverage. The results show that all EvoMaster tools achieve similar code coverage (around 19% line, 13% branch, and 20% method) and find a similar number of errors (1 in GURI’s code). Concerning domain-specific coverage, EvoMaster’s black-box tool is the most effective in generating tests that lead to applied rules (100% of the aggregation rules and between 12.86% and 25.81% of the validation rules) and to diverse rule execution results: 86.84% to 89.95% of the aggregation rules and 0.93% to 1.72% of the validation rules pass, while 1.70% to 3.12% of the aggregation rules and 1.58% to 3.74% of the validation rules fail. We further observe that the results are consistent across 10 versions of the rules. Based on these results, we recommend using EvoMaster’s black-box tool to test GURI, since it provides good results and advances the current state of practice at the CRN. Nonetheless, EvoMaster needs to be extended with domain-specific optimization objectives to improve test effectiveness further. Finally, we conclude with lessons learned and potential research directions, which we believe are applicable in a general context.

          Published in

          ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
          November 2023
          2215 pages
          ISBN: 9798400703270
          DOI: 10.1145/3611643

          Copyright © 2023 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 112 of 543 submissions, 21%
