Abstract
Code summarization is the task of creating short, natural language descriptions of source code. It is an important part of code comprehension and a powerful method of documentation. Previous work has made progress in identifying where programmers focus in code as they write their own summaries (i.e., Writing). However, there is a gap in studying programmers’ attention as they read code with pre-written summaries (i.e., Reading). As a result, it is unknown how these two forms of code comprehension, Reading and Writing, compare. There is also a limited understanding of programmer attention with respect to program semantics. We address these shortcomings with a human eye-tracking study (n=27) comparing Reading and Writing. We examined programmers’ attention with respect to fine-grained program semantics, including their attention sequences (i.e., scan paths). We find distinctions in programmer attention across the comprehension tasks, similarities in reading patterns between them, and differences mediated by demographic factors. These findings can help guide code comprehension in both CS education and automated code summarization. Furthermore, we mapped programmers’ gaze data onto the Abstract Syntax Tree to explore another representation of human attention. We find that visual behavior on this structure is not always consistent with that on source code.