short-paper

pycefr: Python competency level through code analysis

Authors:
Gregorio Robles (Universidad Rey Juan Carlos, Madrid, Spain)
Raula Gaikovina Kula (NAIST, Nara, Japan)
Chaiyong Ragkhitwetsagul (Mahidol University, Nakhon Pathom, Thailand)
Tattiya Sakulniwat (NAIST, Nara, Japan)
Kenichi Matsumoto (NAIST, Nara, Japan)
Jesus M. Gonzalez-Barahona (Universidad Rey Juan Carlos, Madrid, Spain)

ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, May 2022, Pages 173–177. https://doi.org/10.1145/3524610.3527878

Published: 20 October 2022

ABSTRACT

Python is known to be a versatile language, well suited to both beginners and advanced users. Some elements of the language are easier to understand than others: some are found in any kind of code, while others are used only by experienced programmers. The use of these elements leads to different ways of coding, depending on experience with the language, knowledge of its elements, general programming competence and skills, etc. In this paper, we present pycefr, a tool that detects the use of the different elements of the Python language, effectively measuring the level of Python proficiency required to comprehend and deal with a fragment of Python code. Following the well-known Common European Framework of Reference for Languages (CEFR), widely used for natural languages, pycefr categorizes Python code into six levels, depending on the proficiency required to create and understand it. We also discuss different use cases for pycefr: identifying code snippets that can be understood by developers with a certain proficiency, labeling code examples in online resources such as Stack Overflow and GitHub to match them to a certain level of competency, helping in the onboarding process of new developers in Open Source Software projects, etc. A video shows the availability and usage of the tool: https://tinyurl.com/ypdt3fwe.
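
To make the idea of detecting language elements and assigning them a level more concrete, the following is a minimal sketch using Python's standard `ast` module. It is not pycefr's implementation: the `LEVEL_BY_NODE` table, the function names, and the specific element-to-level assignments are hypothetical and chosen only for illustration. Only the six level labels (A1 to C2) come from the CEFR itself.

```python
import ast
from collections import Counter

# The six CEFR levels, ordered from lowest to highest proficiency.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# HYPOTHETICAL mapping from AST node types to levels.
# These assignments are illustrative assumptions, not pycefr's actual table.
LEVEL_BY_NODE = {
    ast.Assign: "A1",
    ast.If: "A1",
    ast.For: "A2",
    ast.FunctionDef: "B1",
    ast.ClassDef: "B1",
    ast.ListComp: "B2",
    ast.Lambda: "B2",
    ast.GeneratorExp: "C1",
    ast.AsyncFunctionDef: "C2",
}


def level_counts(source: str) -> Counter:
    """Count how many detected elements fall into each level."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        level = LEVEL_BY_NODE.get(type(node))
        if level is not None:
            counts[level] += 1
    return counts


def highest_level(source: str):
    """Return the highest level whose elements appear in the code, or None."""
    counts = level_counts(source)
    present = [lvl for lvl in LEVELS if counts[lvl] > 0]
    return present[-1] if present else None


if __name__ == "__main__":
    snippet = "squares = [x * x for x in range(10)]\n"
    print(dict(level_counts(snippet)))
    print(highest_level(snippet))
```

Under this toy mapping, a snippet is labeled by the most advanced element it contains; a real tool could equally report the full distribution of levels rather than a single label.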


Published in
ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension
May 2022
698 pages
ISBN: 9781450392983
DOI: 10.1145/3524610
Conference Chairs:
Ayushi Rastogi (University of Groningen, The Netherlands)
Rosalia Tufano (USI Università della Svizzera italiana, Switzerland)
General Chair:
Gabriele Bavota (USI Università della Svizzera italiana, Switzerland)
Program Chairs:
Venera Arnaoudova (Washington State University, United States of America)
Sonia Haiduc (Florida State University, United States of America)
Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States