DOI: 10.1145/3524610.3527878
short-paper

pycefr: Python competency level through code analysis

Published: 20 October 2022

ABSTRACT

Python is known to be a versatile language, well suited to both beginners and advanced users. Some elements of the language are easier to understand than others: some are found in any kind of code, while others are used only by experienced programmers. The use of these elements leads to different ways of coding, depending on experience with the language, knowledge of its elements, general programming competence, programming skills, etc. In this paper, we present pycefr, a tool that detects the use of the different elements of the Python language, effectively measuring the level of Python proficiency required to comprehend and deal with a fragment of Python code. Following the well-known Common European Framework of Reference for Languages (CEFR), widely used for natural languages, pycefr categorizes Python code into six levels, depending on the proficiency required to create and understand it. We also discuss different use cases for pycefr: identifying code snippets that can be understood by developers with a certain proficiency, labeling code examples in online resources such as Stack Overflow and GitHub to match a certain level of competency, helping in the onboarding process of new developers in Open Source Software projects, etc. A video shows the availability and usage of the tool: https://tinyurl.com/ypdt3fwe.
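
To make the idea in the abstract concrete, the following is a minimal sketch of how language elements could be detected and mapped to CEFR-style levels using Python's standard `ast` module. It is not pycefr's implementation: the `LEVEL_BY_NODE` table, the function names `level_counts` and `required_level`, and the choice to summarize a snippet by its highest detected level are all assumptions made for illustration. Only the six level names (A1 to C2) come from the CEFR.

```python
import ast
from collections import Counter

# Hypothetical mapping from AST node types to CEFR-style levels.
# These assignments are illustrative assumptions, not pycefr's configuration.
LEVEL_BY_NODE = {
    ast.Assign: "A1",
    ast.For: "A1",
    ast.If: "A1",
    ast.FunctionDef: "A2",
    ast.ListComp: "B1",
    ast.With: "B1",
    ast.Lambda: "B2",
    ast.ClassDef: "B2",
    ast.GeneratorExp: "C1",
    ast.AsyncFunctionDef: "C2",
}

# The six CEFR levels, ordered from lowest to highest proficiency.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def level_counts(source: str) -> Counter:
    """Count how many detected elements fall into each level."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        level = LEVEL_BY_NODE.get(type(node))
        if level is not None:
            counts[level] += 1
    return counts


def required_level(source: str) -> str:
    """Return the highest level among detected elements (A1 if none).

    Using the maximum is one possible summary, assumed here for simplicity.
    """
    counts = level_counts(source)
    present = [lvl for lvl in LEVELS if counts[lvl] > 0]
    return present[-1] if present else "A1"


snippet = """
squares = [n * n for n in range(10)]
for s in squares:
    print(s)
"""

print(required_level(snippet))  # prints B1
```

In this sketch the assignment and the `for` loop each count toward A1, while the list comprehension counts toward B1, so the snippet as a whole is labeled B1 under the assumed mapping. A per-level histogram such as the one returned by `level_counts` could support the use cases the abstract mentions, for example filtering code examples by the proficiency they require.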


Published in

    ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension
    May 2022
    698 pages
    ISBN: 9781450392983
    DOI: 10.1145/3524610
    Conference Chairs: Ayushi Rastogi, Rosalia Tufano
    General Chair: Gabriele Bavota
    Program Chairs: Venera Arnaoudova, Sonia Haiduc

    Copyright © 2022 ACM

    Publisher: Association for Computing Machinery, New York, NY, United States
