What Are They Talking About? Analyzing Code Reviews in Pull-Based Development Model

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Code reviews in the pull-based model are open to community users on GitHub. A variety of participants take part in the review discussions, and the review topics cover not only improvements to code contributions but also project evolution and social interaction. A comprehensive understanding of the review topics in the pull-based model would help to better organize the code review process and optimize review tasks such as reviewer recommendation and pull-request prioritization. In this paper, we first conduct a qualitative study on three popular open-source software projects hosted on GitHub and construct a fine-grained two-level taxonomy covering four level-1 categories (code correctness, pull-request decision-making, project management, and social interaction) and 11 level-2 subcategories (e.g., defect detecting, reviewer assigning, contribution encouraging). Second, we conduct a preliminary quantitative analysis on a large set of review comments labeled by TSHC (a two-stage hybrid classification algorithm), which automatically classifies review comments by combining rule-based and machine-learning techniques. Through the quantitative study, we explore typical review patterns. We find that the three projects present a similar distribution of comments across the subcategories. Pull-requests submitted by inexperienced contributors tend to contain potential issues even though they have passed the tests. Furthermore, external contributors are more likely to break project conventions in their early contributions.
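The abstract says only that TSHC combines rule-based and machine-learning techniques in two stages. As a rough illustration of that general idea, and not the authors' algorithm, the sketch below tries hand-written rules first and falls back to a small naive Bayes text classifier when no rule fires. The rules, the training comments, and all function and class names are hypothetical; only the label names are taken from the subcategory examples in the abstract.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stage-1 rules: (regex, label). Not taken from the paper.
RULES = [
    (re.compile(r"@\w+.*\b(review|take a look)\b", re.I), "reviewer assigning"),
    (re.compile(r"\b(thanks|thank you|great work|nice job)\b", re.I),
     "contribution encouraging"),
]

# Hypothetical toy training data for the stage-2 fallback classifier.
TRAIN = [
    ("this loop crashes with a null pointer when the list is empty", "defect detecting"),
    ("off by one error in the index, this will fail", "defect detecting"),
    ("thanks for the patch, great work", "contribution encouraging"),
    ("nice contribution, appreciate the effort", "contribution encouraging"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rule_stage(comment):
    """Return the label of the first matching rule, or None if no rule fires."""
    for pattern, label in RULES:
        if pattern.search(comment):
            return label
    return None


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (the fallback stage)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify(comment, model):
    """Rules take precedence; the learned model handles everything else."""
    return rule_stage(comment) or model.predict(comment)


model = NaiveBayes().fit(TRAIN)
print(classify("@alice could you review this?", model))
print(classify("this will crash when the index is empty", model))
```

Letting rules take precedence is only one way to combine the two techniques; a hybrid could equally feed rule matches into the learned model as features, and the abstract does not say which design TSHC uses.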

Author information

Corresponding author

Correspondence to Yue Yu.

Electronic supplementary material

ESM 1 (PDF 331 kb)

About this article

Cite this article

Li, ZX., Yu, Y., Yin, G. et al. What Are They Talking About? Analyzing Code Reviews in Pull-Based Development Model. J. Comput. Sci. Technol. 32, 1060–1075 (2017). https://doi.org/10.1007/s11390-017-1783-2

  • DOI: https://doi.org/10.1007/s11390-017-1783-2
