When conversations turn into work: a taxonomy of converted discussions and issues in GitHub

Empirical Software Engineering

Abstract

Popular and large contemporary open-source projects now embrace a diverse set of documentation and communication channels. Examples include contribution guidelines (i.e., commit message guidelines, coding rules, submission guidelines), codes of conduct (i.e., rules and behavior expectations), governance policies, and Q&A forums. In 2020, GitHub released Discussions to distinguish between communication and collaboration. However, it remains unclear how developers maintain these channels, how trivial conversions between them are, and whether deciding on a conversion takes time. We conducted an empirical study on 259 NPM and 148 PyPI repositories, devising two taxonomies of reasons for converting discussions into issues and vice versa. The most frequent reason for converting a discussion into an issue is that developers ask a contributor to clarify their idea in an issue (Reporting a Clarification Request: 35.1% and 34.7% for NPM and PyPI, respectively), while the most frequent reason for converting an issue into a discussion is agreement that the topic is non-actionable (Q&A, ideas, feature requests: 55.0% and 42.0%, respectively). Furthermore, we show that not all reasons for conversion are trivial (e.g., not a bug), and that raising a conversion intent can take time (i.e., a median of 15.2 and 35.1 hours, respectively, for issue-to-discussion conversions). Our work complements the GitHub guidelines and helps developers use the Issue and Discussion communication channels effectively to maintain their collaboration.

Data availability

The datasets generated and/or analysed during the current study are available at https://github.com/posl/GitHub_Discussion_Conversion

Acknowledgements

This work is supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI grants (JP20K19774, JP20H05706, JP22K17874, JP21H04877, JP23K16864), and by JSPS and SNSF for the project “SENSOR” (JPJSJRP20191502).

Author information

Corresponding author

Correspondence to Dong Wang.

Ethics declarations

Conflict of interest

The authors declare that Raula Gaikovina Kula and Yasutaka Kamei are members of the EMSE Editorial Board. All co-authors have seen and agreed with the contents of the manuscript and there is no financial interest to report.

Additional information

Communicated by: Jeffrey C. Carver

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, D., Kondo, M., Kamei, Y. et al. When conversations turn into work: a taxonomy of converted discussions and issues in GitHub. Empir Software Eng 28, 138 (2023). https://doi.org/10.1007/s10664-023-10366-z

  • DOI: https://doi.org/10.1007/s10664-023-10366-z
