ABSTRACT
Unused definitions are values that are assigned to variables but never used. Because unused definitions are usually regarded as redundant code with no severe consequences beyond wasted CPU cycles, system developers tend to treat them as mild warnings and simply remove them. In this paper, we reevaluate the effect of unused definitions and find that some of them can indicate non-trivial bugs, such as security issues or data corruption, and therefore deserve more attention from developers.
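To make the notion concrete, the following is a minimal sketch of an unused definition that hides a real bug. It is an illustration only and is not taken from the paper; the helpers `write_header` and `write_body` and both `save_record_*` functions are hypothetical names introduced here.

```python
def write_header(ok: bool) -> int:
    """Hypothetical helper: return 0 on success, -1 on failure."""
    return 0 if ok else -1


def write_body() -> int:
    """Hypothetical helper that always succeeds."""
    return 0


def save_record_buggy(header_ok: bool) -> int:
    status = write_header(header_ok)  # definition of `status` ...
    status = write_body()             # ... overwritten before it is ever read
    return status                     # a header failure is silently lost


def save_record_fixed(header_ok: bool) -> int:
    status = write_header(header_ok)
    if status != 0:                   # the first definition is now used
        return status
    return write_body()
```

In the buggy variant the first assignment to `status` is an unused definition. Deleting it would silence a warning, but the underlying problem is the missing error check, which the fixed variant restores.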
Although techniques for detecting unused definitions exist, it remains challenging to detect critical bugs from them because only a small proportion of unused definitions are real bugs. In this paper, we present ValueCheck, a static analysis framework that addresses the challenges of detecting bugs from unused definitions. First, we make the unique observation that unused definitions on the boundary of developers' interactions are prone to be bugs. Second, we summarize syntactic and semantic patterns in which unused definitions are written intentionally and should therefore not be considered bugs. Third, to distill bugs from unused definitions, we adopt code familiarity metrics from the software engineering field to rank the detected bugs, which lets developers prioritize their attention.
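The three steps above can be sketched roughly as follows. This is a hedged illustration, not ValueCheck's implementation: the `Candidate` record, its fields, the `familiarity` score in [0, 1], and the ordering rule are all assumptions made for the example, and the abstract does not specify how the actual metrics are computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A reported unused definition (all fields are hypothetical)."""
    location: str        # e.g. "file.c:120"
    def_author: str      # who wrote the unused assignment
    value_author: str    # who wrote the code that produces the value
    familiarity: float   # assumed score in [0, 1]; lower = less familiar
    intentional: bool    # matches a known "written on purpose" pattern


def crosses_author_boundary(c: Candidate) -> bool:
    # Assumption: a "boundary of developers' interactions" is approximated
    # here by the two pieces of code having different authors.
    return c.def_author != c.value_author


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    # Step 2: drop definitions that match intentional patterns.
    kept = [c for c in candidates if not c.intentional]
    # Steps 1 and 3: cross-author candidates first, then the least
    # familiar author first within each group.
    return sorted(
        kept,
        key=lambda c: (not crosses_author_boundary(c), c.familiarity),
    )
```

With this ordering, a report where an unfamiliar author discards a value produced by someone else's code appears at the top of the list, while same-author reports sink to the bottom.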
We evaluate ValueCheck on large system software and libraries, including Linux, MySQL, OpenSSL, and NFS-ganesha. ValueCheck helps detect 210 previously unknown bugs in these applications, 154 of which have been confirmed by developers. Compared to state-of-the-art tools, ValueCheck detects bugs effectively with few false positives.