Abstract
Open Source Software (OSS) projects rely on a continuous stream of new contributors for their livelihood. Recent studies have reported that new contributors face many barriers when making their first contribution, with social barriers being critical. A number of studies have investigated the social barriers facing new contributors. We hypothesize that a negative first response may cause an unpleasant feeling and subsequently lead the contributor to discontinue any future contribution. We execute the protocols of a registered report to analyze 2,765,917 first contributions made as Pull Requests (PRs), with 642,841 first responses. We characterize most first responses as being positive, but less responsive, and as exhibiting sentiments of fear, joy and love. Results also indicate that negative first responses carry the literal intention of arousing emotions, being either constructive (50.71%) or criticizing (37.68%) in nature. Running different machine learning models, we find that performance in predicting future interactions is low (F1 score of 0.6171), but relatively better than the baselines. Furthermore, an analysis of these models shows that interactions are positively correlated with a future contribution, with other dimensions (i.e., project, contributor, contribution) having a large effect.
Data Availability
The datasets generated during and/or analysed during the current study are available in the FirstResponsePR repository, https://github.com/NAIST-SE/FirstResponsePR.
Notes
Updated as of 2021/03/06.
Please refer to https://github.com/NAIST-SE/FirstResponsePR/blob/main/DEVIATIONS.md for details.
Funding
This work has been supported by JSPS KAKENHI Grant Numbers JP20H05706 and JP20K19774.
Author information
Contributions
Noppadol Assavakamhaenghan: Conceptualisation, Methodology, Investigation, Data Collection, Qualitative Analysis, Writing (original draft), Visualisation. Supatsara Wattanakriengkrai: Investigation, Qualitative Analysis, Writing (original draft), Review. Naomichi Shimada: Investigation, Qualitative Analysis, Review. Raula Gaikovina Kula: Conceptualisation, Funding Acquisition, Review and Editing of Drafts, Supervision, Project Administration. Takashi Ishio: Funding Acquisition, Review and Editing of Drafts, Supervision, Project Administration. Kenichi Matsumoto: Funding Acquisition, Review and Editing of Drafts, Supervision, Project Administration.
Ethics declarations
Conflicts of interest
Raula Gaikovina Kula is on the Editorial Board.
Additional information
Communicated by: David Lo, Tegawendé F. Bissyandé.
About this article
Cite this article
Assavakamhaenghan, N., Wattanakriengkrai, S., Shimada, N. et al. Does the first response matter for future contributions? A study of first contributions. Empir Software Eng 28, 75 (2023). https://doi.org/10.1007/s10664-023-10299-7
DOI: https://doi.org/10.1007/s10664-023-10299-7