DOI: 10.1145/3627703.3629576

Effective Bug Detection with Unused Definitions

Published: 22 April 2024

ABSTRACT

Unused definitions are values assigned to variables but never used. Because unused definitions are usually considered redundant code with no severe consequence beyond wasted CPU cycles, system developers tend to treat them as mild warnings and simply remove them. In this paper, we reevaluate the effect of unused definitions and discover that some of them can indicate non-trivial bugs, such as security issues or data corruption, which calls for more attention from developers.
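To illustrate the kind of bug the abstract describes, here is a minimal hypothetical C sketch (`check_permission` and `do_privileged_write` are invented names, not from the paper): the result of a security check is assigned to a variable that is never read, so the check has no effect.

```c
/* Hypothetical: returns 0 if the user may write, -1 if denied. */
int check_permission(int uid) {
    return uid == 0 ? 0 : -1;
}

/* BUG: `err` is an unused definition -- the result of the security
 * check is assigned but never read, so a denial is silently ignored
 * and the privileged operation proceeds for any uid. A compiler
 * would flag `err` only as a mild unused-variable warning. */
int do_privileged_write(int uid) {
    int err = check_permission(uid);  /* unused definition */
    return 0;                         /* write happens unconditionally */
}
```

A conventional response to the unused-variable warning would be to delete `err`, which silences the warning while preserving the missing-check bug; the paper's point is that the warning itself can signal the bug.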

Although techniques exist to detect unused definitions, detecting critical bugs from them remains challenging because only a small proportion of unused definitions are real bugs. In this paper, we present ValueCheck, a static analysis framework that addresses the challenges of detecting bugs from unused definitions. First, we make the unique observation that unused definitions on the boundary of developers' interactions are prone to being bugs. Second, we summarize syntactic and semantic patterns in which unused definitions are written intentionally and should therefore not be considered bugs. Third, to distill bugs from unused definitions, we adopt code familiarity metrics from the software engineering field to rank the detected bugs, enabling developers to prioritize their focus.
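The abstract mentions patterns where unused definitions are intentional; the sketch below shows two plausible examples of such patterns (these are illustrative guesses, not the paper's actual pattern list): a defensive default that is overwritten on most paths, and a variable whose only use disappears in release builds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pattern 1 (semantic): defensive initialization. When src != NULL,
 * the store to dst[0] is overwritten before any read -- a dead
 * definition on that path, but a deliberate safety default. */
void copy_name(char *dst, size_t cap, const char *src) {
    dst[0] = '\0';
    if (src != NULL) {
        strncpy(dst, src, cap - 1);
        dst[cap - 1] = '\0';
    }
}

/* Pattern 2 (syntactic): debug-only use. With NDEBUG defined,
 * assert() expands to nothing, so `before` becomes an unused
 * definition -- intentionally. */
int counter_total;
void add_count(int n) {
    int before = counter_total;       /* unused in release builds */
    counter_total += n;
    assert(counter_total >= before);  /* overflow sanity check */
}
```

A detector that reports every unused definition would flag both functions; filtering such intentional patterns is what keeps the false-positive rate low.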

We evaluate ValueCheck on large system software and libraries, including Linux, MySQL, OpenSSL, and NFS-ganesha. ValueCheck detects 210 previously unknown bugs in these applications, 154 of which have been confirmed by developers. Compared to state-of-the-art tools, ValueCheck detects bugs effectively with few false positives.


Published in

EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems
April 2024, 1245 pages
ISBN: 9798400704376
DOI: 10.1145/3627703

Copyright © 2024 Owner/Author. This work is licensed under a Creative Commons Attribution 4.0 International License.
Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers: research-article, refereed limited

Overall Acceptance Rate: 241 of 1,308 submissions, 18%