ABSTRACT
Software development projects, in particular open source ones, rely heavily on tools to support, coordinate, and promote development activities. Despite their paramount value, these tools fragment project data, which challenges practitioners and researchers who want to derive insightful analytics about software projects. In this demo we present Perceval, a loyal helper able to perform automatic and incremental data gathering from almost any tool related to open source development, including, among others, source code management systems, issue tracking systems, mailing lists, forums, and social media. Perceval is an industry-strength free software tool that has been widely used at Bitergia, a company devoted to offering commercial analytics of software projects. It hides the technical complexities of data acquisition and eases the definition of analytics. A video showcasing the main features of Perceval can be found at https://youtu.be/eH1sYF0Hdc8.
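To make the idea of incremental data gathering concrete, the following is a minimal sketch in Python. It is not Perceval's API: the names `Item`, `fetch`, `from_date`, and `TRACKER` are hypothetical, and the in-memory list stands in for a real data source such as an issue tracker. It shows only the general pattern of a second run retrieving just what changed since the first.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Item:
    """A hypothetical record retrieved from a data source (e.g. an issue)."""
    uid: str
    updated_on: datetime


def fetch(source: Iterable[Item],
          from_date: Optional[datetime] = None) -> Iterator[Item]:
    """Yield items in update order, skipping those updated before from_date."""
    for item in sorted(source, key=lambda i: i.updated_on):
        if from_date is None or item.updated_on >= from_date:
            yield item


# Toy in-memory "tracker" standing in for a remote data source (assumption).
TRACKER = [
    Item("issue-1", datetime(2018, 1, 1)),
    Item("issue-2", datetime(2018, 2, 1)),
    Item("issue-3", datetime(2018, 3, 1)),
]

# First run: no starting date, so everything is gathered.
first_run = list(fetch(TRACKER))
last_seen = max(item.updated_on for item in first_run)

# New activity appears in the source after the first run.
TRACKER.append(Item("issue-4", datetime(2018, 4, 1)))

# Second run: only items updated at or after the last seen date are gathered.
second_run = list(fetch(TRACKER, from_date=last_seen))
```

The comparison is inclusive, so the boundary item is fetched again on the second run. This sketch assumes the consumer deduplicates by `uid`, which avoids missing items that share the same timestamp.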