NJR: a normalized Java resource

Authors:
Jens Palsberg, University of California
Cristina V. Lopes, University of California

ISSTA '18: Companion Proceedings for the ISSTA/ECOOP 2018 Workshops, July 2018, Pages 100–106
https://doi.org/10.1145/3236454.3236501

Published: 16 July 2018

ABSTRACT

We are on the cusp of a major opportunity: software tools that take advantage of Big Code. Specifically, Big Code will enable novel tools in areas such as security enhancers, bug finders, and code synthesizers. What do researchers need from Big Code to make progress on their tools? Our answer is an infrastructure that consists of 100,000 executable Java programs together with a set of working tools and an environment for building new tools. This Normalized Java Resource (NJR) will lower the barrier to implementation of new tools, speed up research, and ultimately help advance research frontiers.

Researchers get significant advantages from using NJR. They can write scripts that base their new tool on NJR's already-working tools, and they can search NJR for programs with desired characteristics. They will receive the search result as a container that they can run either locally or on a cloud service. Additionally, they benefit from NJR's normalized representation of each Java program, which enables scalable running of tools on the entire collection. Finally, they will find that NJR's collection of programs is diverse because of our efforts to run clone detection and near-duplicate removal. In this paper we describe our vision for NJR and our current prototype.

Index Terms

1. Social and professional topics
  1. Professional topics
    1. History of computing
      1. History of programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages

Published in
ISSTA '18: Companion Proceedings for the ISSTA/ECOOP 2018 Workshops
July 2018
143 pages
ISBN: 9781450359399
DOI: 10.1145/3236454
Conference Chairs: Julian Dolby (IBM Thomas J. Watson Research Center), William G. J. Halfond (University of Southern California), Ashish Mishra (Northeastern University, United States)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
100,000 Java programs
plug-and-play environment
reproducible results
software tools
static and dynamic analyses
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 58 of 213 submissions, 27%