DOI: 10.1145/3236454.3236501
research-article

NJR: a normalized Java resource

Published: 16 July 2018

ABSTRACT

We are on the cusp of a major opportunity: software tools that take advantage of Big Code. Specifically, Big Code will enable novel tools in areas such as security enhancers, bug finders, and code synthesizers. What do researchers need from Big Code to make progress on their tools? Our answer is an infrastructure that consists of 100,000 executable Java programs together with a set of working tools and an environment for building new tools. This Normalized Java Resource (NJR) will lower the barrier to implementation of new tools, speed up research, and ultimately help advance research frontiers.

Researchers get significant advantages from using NJR. They can write scripts that base their new tool on NJR's already-working tools, and they can search NJR for programs with desired characteristics. They will receive the search result as a container that they can run either locally or on a cloud service. Additionally, they benefit from NJR's normalized representation of each Java program, which enables scalable running of tools on the entire collection. Finally, they will find that NJR's collection of programs is diverse because of our efforts to run clone detection and near-duplicate removal. In this paper we describe our vision for NJR and our current prototype.

Published in

ISSTA '18: Companion Proceedings for the ISSTA/ECOOP 2018 Workshops
July 2018
143 pages
ISBN: 9781450359399
DOI: 10.1145/3236454

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 58 of 213 submissions, 27%
