Abstract
Our goal is to analyze the optimality of search strategies for use in systematic reviews of software engineering experiments. Retrieving studies is an important problem in any evidence-based discipline, but it has not yet been examined for evidence-based software engineering. We ran several searches that used different terms denoting experiments and evaluated their recall and precision. Based on this evaluation, we propose using a high-recall strategy when resources are plentiful or the results need to be exhaustive. For all other cases, we propose optimal, or even just acceptable, search strategies. As a secondary goal, we analyzed trends and weaknesses in the terminology used in articles reporting software engineering experiments. We found that no search strategy can retrieve 100% of the experiments of interest (as also happens in other experimental disciplines), because the community lacks reporting standards.
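As an illustration of how a search strategy can be scored against a gold standard, the sketch below computes recall and precision using their standard information retrieval definitions. The function name, the article identifiers, and the sample sets are hypothetical and are not taken from the study.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search run.

    retrieved: identifiers returned by the search strategy
    relevant:  identifiers in the gold standard (hypothetical data here)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant articles the search actually found

    # Recall: share of the gold standard that the search retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    # Precision: share of the retrieved articles that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: 4 gold-standard articles, 5 retrieved, 2 in common.
gold_standard = {"a1", "a2", "a3", "a4"}
search_results = {"a1", "a2", "x1", "x2", "x3"}

recall, precision = recall_precision(search_results, gold_standard)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

A high-recall strategy pushes the first number toward 1.0, usually at the cost of the second, which is why it suits reviews with plentiful resources.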
Notes
The complete catalog of articles is published in (Kampenes 2007).
Although these two surveys considered a broader range of study types, their definition of the term “experiment” is the same as the one in the gold standard.
References
Bailey J, Zhang C, Budgen D (2007) Search engine overlaps: do they agree or disagree? Proceedings of the 2nd International Workshop on Realising Evidence-Based Software Engineering REBSE’07. Minneapolis, USA, 1–6
Brereton P, Kitchenham B, Budgen D, Turner M, Khalil M (2007) Lessons from applying the systematic literature review process within the software engineering domain. J Syst Softw 80(4):571–583. doi:10.1016/j.jss.2006.07.009
Davis A, Dieste O, Juristo N, Moreno A (2006) Effectiveness of requirements elicitation techniques: empirical results derived from a systematic review. Proceedings of the 14th IEEE International Requirements Engineering Conference RE’06. Minneapolis, USA, 179–188
Dybå T, Kitchenham B, Jørgensen M (2005) Evidence-based software engineering for practitioners. IEEE Softw 22:58–65. doi:10.1109/MS.2005.6
Dybå T, Kampenes V, Sjøberg D (2006) A systematic review of statistical power in software engineering experiments. Inf Softw Technol 48(8):745–755. doi:10.1016/j.infsof.2005.08.009
Dybå T, Arisholm E, Sjøberg D, Hannay J, Shull F (2007a) Are two heads better than one? On the effectiveness of pair programming. IEEE Softw 24(6):10–13. doi:10.1109/MS.2007.158
Dybå T, Dingsøyr T, Hanssen GK (2007b) Applying systematic reviews to diverse study types: an experience report. Proceedings of the First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007). Madrid, Spain, 225–234
EBM Working Group (1992) Evidence-based medicine: a new approach to teaching the practice of medicine. J Am Med Assoc 268(17):2420–2425
Hannay J, Sjøberg D, Dybå T (2007) A systematic review of theory use in software engineering experiments. IEEE Trans Softw Eng 33(2):87–107
Higgins J, Green S (2006) Cochrane handbook for systematic reviews of interventions 4.2.6 [updated September 2006]. In: The Cochrane library (vol 4). Wiley, Chichester, UK
Jedlitschka A, Pfahl D (2005) Reporting experiments in software engineering. Proceedings of the International Symposium on Empirical Software Engineering (ISESE’05). Noosa Heads, Australia, 95–104
Jørgensen M (2004) A review of studies on expert estimation of software development effort. J Syst Softw 70(1–2):37–60
Jørgensen M, Shepperd M (2007) A systematic review of software development cost estimation studies. IEEE Trans Softw Eng 33(1):33–53
Juristo N, Moreno A, Vegas S (2004) Reviewing 25 years of testing technique experiments. Empir Softw Eng 9:7–44
Kampenes VB (2007) Quality of design, analysis and reporting of software engineering experiments: a systematic review. Ph.D. dissertation Nr 671, University of Oslo. ISSN 1501-7710
Kampenes VB, Dybå T, Hannay JE, Sjøberg DIK (2007) A systematic review of effect size in software engineering experiments. Inf Softw Technol 49(11–12):1073–1086
Kitchenham B, Pfleeger S, Pickard L, Jones P, Hoaglin D, Emam K, Rosenberg J (2002) Preliminary guidelines for empirical research in software engineering. IEEE Trans Softw Eng 28:721–734
Kitchenham B, Dybå T, Jørgensen M (2004) Evidence-based software engineering. Proceedings of the 26th International Conference on Software Engineering (ICSE’04). Scotland, UK, 273–284
Kitchenham BA, Mendes E, Travassos GH (2007) Cross versus within-company cost estimation studies: a systematic review. IEEE Trans Softw Eng 33(5):316–329
Lajeunesse MJ, Forbes M (2003) Variable reporting and quantitative reviews: a comparison of three meta-analytical techniques. Ecol Lett 6:448–454
Mendes E (2005) A systematic review of web engineering research. Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering. Noosa Heads, Australia, 498–507
Petitti D (2000) Meta-analysis, decision analysis and cost-effectiveness analysis. Oxford University Press, Oxford
Singer J (1999) Using the American Psychological Association (APA) style guidelines to report experimental results. Proceedings of the Workshop on Empirical Studies in Software Maintenance, Oxford, UK, 2–5
Sjøberg D, Hannay J, Hansen O, Kampenes V, Karahasanovic A, Liborg N, Rekdal A (2005) A survey of controlled experiments in software engineering. IEEE Trans Softw Eng 31:733–753
Straus SE, Richardson W, Glasziou P, Haynes RB (2005) Evidence-based medicine. How to practice and teach EBM. Elsevier, Oxford
van Rijsbergen CJ (1979) Information retrieval. Department of Computer Science, University of Glasgow, Glasgow
Zelkowitz MV, Wallace DR (1998) Experimental models for validating technology. IEEE Comput 31(5):23–31
Acknowledgements
We would like to thank Dag Sjøberg and Jo Hannay, of Simula Research Laboratory, for providing the references to the 103 articles that were used to formalize the gold standard used in this research.
Additional information
Editor: Per Runeson
About this article
Cite this article
Dieste, O., Grimán, A. & Juristo, N. Developing search strategies for detecting relevant experiments. Empir Software Eng 14, 513–539 (2009). https://doi.org/10.1007/s10664-008-9091-7
DOI: https://doi.org/10.1007/s10664-008-9091-7