Automated Test Generation for Medical Rules Web Services: A Case Study at the Cancer Registry of Norway

Authors:
Christoph Laaber (Simula Research Laboratory, Oslo, Norway)
Tao Yue (Simula Research Laboratory, Oslo, Norway)
Shaukat Ali (Simula Research Laboratory, Oslo, Norway / Oslo Metropolitan University, Oslo, Norway)
Thomas Schwitalla (Cancer Registry of Norway, Oslo, Norway)
Jan Nygård (Cancer Registry of Norway, Oslo, Norway / UiT The Arctic University of Norway, Tromsø, Norway)

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, November 2023, Pages 1937–1948
https://doi.org/10.1145/3611643.3613882

Published: 30 November 2023

ABSTRACT

The Cancer Registry of Norway (CRN) collects, curates, and manages data related to cancer patients in Norway, supported by an interactive, human-in-the-loop, socio-technical decision support software system. Automated testing of this system is a necessity, yet it is currently limited in CRN’s practice. To this end, we present an industrial case study that evaluates an AI-based system-level testing tool, EvoMaster, in terms of its effectiveness in testing CRN’s software system. In particular, we focus on GURI, CRN’s medical rule engine, which is a key component at the CRN. We test GURI with EvoMaster’s black-box and white-box tools and study their test effectiveness regarding code coverage, errors found, and domain-specific rule coverage. The results show that all EvoMaster tools achieve similar code coverage (around 19% line, 13% branch, and 20% method) and find a similar number of errors (1 in GURI’s code). Concerning domain-specific coverage, EvoMaster’s black-box tool is the most effective in generating tests that lead to applied rules (100% of the aggregation rules and between 12.86% and 25.81% of the validation rules) and to diverse rule execution results: 86.84% to 89.95% of the aggregation rules and 0.93% to 1.72% of the validation rules pass, and 1.70% to 3.12% of the aggregation rules and 1.58% to 3.74% of the validation rules fail. We further observe that the results are consistent across 10 versions of the rules. Based on these results, we recommend using EvoMaster’s black-box tool to test GURI, since it provides good results and advances the current state of practice at the CRN. Nonetheless, EvoMaster needs to be extended with domain-specific optimization objectives to improve test effectiveness further. Finally, we conclude with lessons learned and potential research directions, which we believe are applicable in a general context.


Published in
ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2023
2215 pages
ISBN: 9798400703270
DOI: 10.1145/3611643
General Chair: Satish Chandra (Google, USA)
Program Chairs: Kelly Blincoe (University of Auckland, New Zealand), Paolo Tonella (USI Lugano, Switzerland)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
REST APIs
automated software testing
cancer registry
electronic health records
rule engine
test generation
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 112 of 543 submissions, 21%