Predicting Fault-Prone Modules by Word Occurrence in Identifiers

Kawashima, Naoki; Mizuno, Osamu

doi:10.1007/978-3-319-11265-7_7

Part of the book series: Studies in Computational Intelligence (SCI, volume 578)

Abstract

Prediction of fault-prone modules is an important area of software engineering. We assume that the occurrence of faults is related to the semantics of source code modules, and that these semantics can be extracted from the identifiers in a module. We therefore analyze the relationship between the occurrence of “words” in identifiers and the existence of faults. To do so, we first decompose the identifiers into words and investigate the occurrence of those words in each module. Using the random forest technique, we build a model relating the occurrence of words to the existence of faults. We compare this word occurrence model with traditional models based on CK metrics and LOC. The comparison shows that the occurrence of words is a good prediction measure, as are CK metrics and LOC.
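
The decomposition step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the splitting rules, so the regular expression, the helper names (split_identifier, word_occurrence, to_feature_vector), and the choice of raw counts rather than presence/absence flags are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed splitting rule (not from the chapter): break identifiers at
# underscores, digits, and camelCase boundaries, keeping acronyms whole.
_WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def split_identifier(identifier):
    """Decompose one identifier into lower-cased words."""
    return [word.lower() for word in _WORD_RE.findall(identifier)]


def word_occurrence(identifiers):
    """Count how often each word occurs across a module's identifiers."""
    counts = Counter()
    for identifier in identifiers:
        counts.update(split_identifier(identifier))
    return counts


def to_feature_vector(counts, vocabulary):
    """Project word counts onto a fixed vocabulary (hypothetical word list)."""
    return [counts.get(word, 0) for word in vocabulary]


if __name__ == "__main__":
    # Identifiers of one made-up module, purely for illustration.
    module_identifiers = ["getBufferSize", "buffer_size", "setBuffer"]
    counts = word_occurrence(module_identifiers)
    print(to_feature_vector(counts, ["buffer", "size", "fault"]))
```

Vectors of this kind, one per module and paired with a faulty/non-faulty label, are the sort of input a random forest classifier would take. The vocabulary and labels would have to come from the studied projects.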

Currently, the author is at Nara Advanced Institute of Science and Technology.

Notes

1. https://github.com/.
2. http://promisedata.googlecode.com/.
3. https://github.com/doofuslarge/lscp. lscp is a lightweight source code preprocessor. It can be used to isolate and manipulate the linguistic data (i.e., identifier names, comments, and string literals) from source code files.

Acknowledgments

This work was supported by JSPS KAKENHI Grant Number 24500038.

Author information

Authors and Affiliations

Software Engineering Laboratory, Graduate School of Science and Technology, Kyoto Institute of Technology, Kyoto, Japan
Naoki Kawashima & Osamu Mizuno

Corresponding author

Correspondence to Naoki Kawashima.

Editor information

Editors and Affiliations

Software Engineering and Information Technology Institute, Central Michigan University, Mount Pleasant, Michigan, USA
Roger Lee

About this chapter

Cite this chapter

Kawashima, N., Mizuno, O. (2015). Predicting Fault-Prone Modules by Word Occurrence in Identifiers. In: Lee, R. (eds) Software Engineering Research, Management and Applications. Studies in Computational Intelligence, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-319-11265-7_7

DOI: https://doi.org/10.1007/978-3-319-11265-7_7
Published: 02 November 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11264-0
Online ISBN: 978-3-319-11265-7
eBook Packages: Engineering, Engineering (R0)
