Identifying duplicate functionality in textual use cases by aligning semantic actions

  • Regular Paper
Software & Systems Modeling

Abstract

Developing high-quality requirements specifications often demands thoughtful analysis and adequate expertise from analysts. Requirements modeling techniques provide mechanisms for abstraction and clarity and foster the reuse of shared functionality (e.g., via UML relationships for use cases), yet they are seldom employed in practice. A particular quality problem of textual requirements, such as use cases, is duplicate functionality scattered across the specifications. Duplicate functionality can sometimes improve readability for end users, but it hinders development-related tasks such as effort estimation, feature prioritization, and maintenance. Unfortunately, inspecting textual requirements by hand to find redundant functionality can be an arduous, time-consuming, and error-prone activity for analysts. In this context, we introduce a novel approach called ReqAligner that helps analysts spot signs of duplication in use cases in an automated fashion. To do so, ReqAligner combines several text processing techniques, including a use case-aware classifier and a customized sequence alignment algorithm. The classifier converts the use cases into an abstract representation consisting of sequences of semantic actions. These sequences are then compared pairwise to identify matching actions, which are reported as possible duplications. We have applied our technique to five real-world specifications, achieving promising results and identifying many sources of duplication in the use cases.
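
The pairwise comparison step can be illustrated with a minimal sketch. It assumes each use case has already been reduced to a list of semantic-action labels, and it substitutes plain Smith-Waterman local alignment for the customized alignment algorithm, whose details the abstract does not give. The function name `local_align`, the scoring parameters, and the action labels (`INPUT`, `VALIDATE`, and so on) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def local_align(
    a: List[str],
    b: List[str],
    match: int = 2,      # assumed reward for identical action labels
    mismatch: int = -1,  # assumed penalty for differing labels
    gap: int = -1,       # assumed penalty for skipping a step
) -> Tuple[int, List[Tuple[int, int]]]:
    """Smith-Waterman local alignment over two sequences of action labels.

    Returns the best local score and the (index_in_a, index_in_b) pairs
    of identical labels along one best-scoring local alignment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # skip a step in a
                score[i][j - 1] + gap,      # skip a step in b
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    pairs: List[Tuple[int, int]] = []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


if __name__ == "__main__":
    # Hypothetical action sequences for two use cases.
    uc1 = ["INPUT", "VALIDATE", "QUERY", "DISPLAY"]
    uc2 = ["SELECT", "VALIDATE", "QUERY", "DISPLAY", "NOTIFY"]
    print(local_align(uc1, uc2))
```

With these example sequences, the shared run `VALIDATE`, `QUERY`, `DISPLAY` is aligned at positions 1 to 3 of both use cases with a score of 6, which an analyst could then inspect as a candidate duplication. A local (rather than global) alignment is used in this sketch because duplicated behavior would typically cover only a fragment of each use case.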

Notes

  1. OpenNLP: available at http://opennlp.apache.org.

  2. Stanford CoreNLP: available at http://nlp.stanford.edu/software/corenlp.shtml.

  3. Snowball: available at http://snowball.tartarus.org.

  4. Mate-Tools: available at http://code.google.com/p/mate-tools/.

  5. Stop-word list: available at http://www.ranks.nl/resources/stopwords.html.

  6. Mulan: available at http://mulan.sourceforge.net/.

  7. The projects and the use cases used for the training phase were developed by System Engineering students, in the context of a Software Development Methodologies course taught at UNICEN University in 2011 and 2012.

  8. Please note that this schema is more rigorous than a traditional k-fold cross-validation evaluation, in which the dataset is partitioned into k subsets, with k-1 subsets used for training and just 1 subset for testing.

  9. JAligner: available at http://jaligner.sourceforge.net.

  10. Software Development Studio (SDS) Development Server: available at http://sds.cs.put.poznan.pl.

  11. To reduce confounding factors, we ensured these analysts had a college degree in Software Engineering with a solid academic background. Moreover, the final meeting between the four of them allowed us to manage the different learning curves which might have affected their findings.

  12. Because it exploits peculiarities in the domain of use cases, especially of textual use case specifications.

Acknowledgments

The authors would like to thank Paula Frade and Miguel Ruival, who implemented the ReqAligner prototype and evaluated the technique as part of their final project for the degree of Bachelor in Systems Engineering at UNICEN University. Also, the authors are grateful to the analysts who defined the reference solution for the evaluation of the technique. The authors also thank the anonymous reviewers for their feedback that helped to improve the quality of the manuscript.

Author information

Correspondence to Alejandro Rago, Claudia Marcos or J. Andres Diaz-Pace.

Additional information

Communicated by Prof. Daniel Amyot.

This work was partially supported by ANPCyT, CONICET and CIC (Argentina) through PICT Project 2010 No. 2247, PIP Project 2012-2014 No. 11220110100078, and Project No. 813/13, respectively.

About this article

Cite this article

Rago, A., Marcos, C. & Diaz-Pace, J.A. Identifying duplicate functionality in textual use cases by aligning semantic actions. Softw Syst Model 15, 579–603 (2016). https://doi.org/10.1007/s10270-014-0431-3

  • DOI: https://doi.org/10.1007/s10270-014-0431-3
