Automated program improvement with reinforcement learning and graph neural networks

Sukur, Nataša; Milošević, Nemanja; Pracner, Doni; Budimac, Zoran

doi:10.1007/s00500-023-08559-1

Automated program improvement with reinforcement learning and graph neural networks

Application of soft computing
Published: 03 June 2023

Volume 28, pages 2593–2604, (2024)
Cite this article

Soft Computing Aims and scope Submit manuscript

238 Accesses
1 Citation
Explore all metrics

Abstract

Automated software transformations and the process of automated program repair and improvement are an important aspect of modern software engineering practices. In this paper, we describe a system which uses a graph-based deep learning model that can be trained to automatically transform and improve computer programs. By operating on language-agnostic, universal graph-like structures easily extractable from source code files (abstract syntax trees), the deep learning agent learns which transformations should be effectively applied to various structures recognized in the source code in order to improve it. By defining a metric which we want to improve and introducing an optimization task—a reinforcement learning setting, an agent learns to automatically apply a chain of transformations to the program, drastically improving it. While similar program improvement processes exist, they exclusively use exhaustive search algorithms to try all the possible code transformations which is a long process susceptible to local optimum issues. Our solution aims to model and embed structural knowledge about the programs being transformed which greatly helps the agent to choose best possible code transformations to apply. Elements of the approach we present in this paper are further applicable not just to automatic software improvement tasks, but also to other code-related tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Hybrid Machine Learning Model for Code Optimization

Article 22 September 2023

Introduction to Reinforcement Learning

A survey on interpretable reinforcement learning

Article 19 April 2024

Data availability

All software presented in this paper is Free and Open Source software under the GNU Public License. Our source code, datasets, results and instructions for reproducing the results are available in the following GitHub repository: https://github.com/nsukur/rl. We also provide a training set of intermediate graphs for training of the embedding model, which is linked in the aforementioned GitHub repository.

Notes

References

Allamanis M, Brockschmidt M, Khademi M (2017) Learning to represent programs with graphs. arXiv preprint arXiv:1711.00740
Alon U, Zilberstein M, Levy O, Yahav E (2019) code2vec: Learning distributed representations of code. In: Proceedings of the ACM on programming languages 3(POPL), 1–29
Arcuri A, Yao X (2008) A novel co-evolutionary approach to automatic software bug fixing. In: Evolutionary computation, 2008. CEC 2008.(IEEE world congress on computational intelligence). IEEE Congress On, pp 162–168. IEEE
Balog M, Gaunt AL, Brockschmidt M, Nowozin S, Tarlow D (2016) Deepcoder: learning to write programs. arXiv preprint arXiv:1611.01989
Ben-Nun T, Jakobovits AS, Hoefler T (2018) Neural code comprehension: a learnable representation of code semantics. Adv Neural Inf Process Syst 31
Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, Zaremba W (2016) Openai gym. arXiv preprint arXiv:1606.01540
Bunel R, Hausknecht M, Devlin J, Singh R, Kohli P (2018) Leveraging grammar and reinforcement learning for neural program synthesis. arXiv preprint arXiv:1805.04276
Campbell GA, SonarSource S (2020) Cognitive complexity. SonarSource, Geneva
Google Scholar
Cummins C, Fisches ZV, Ben-Nun T, Hoefler T, Leather H (2020) Programl: Graph-based deep learning for program optimization and analysis. arXiv preprint arXiv:2003.10536
Dasgupta SS, Ray SN, Talukdar P (2018) Hyte: Hyperplane-based temporally aware knowledge graph embedding. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 2001–2011
Ebert C, Cain J, Antoniol G, Counsell S, Laplante P (2016) Cyclomatic complexity. IEEE softw 33(6):27–29
Article Google Scholar
Fey M, Lenssen JE (2019) Fast graph representation learning with pytorch geometric. arXiv preprint arXiv:1903.02428
Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: Proceedings of the 11th annual conference on genetic and evolutionary computation, pp 947–954. ACM
Fu C, Chen H, Liu H, Chen X, Tian Y, Koushanfar F, Zhao J (2019) Coda: an end-to-end neural program decompiler. Adv Neural Inf Process Syst 32
Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 855–864
Gupta R, Pal S, Kanade A, Shevade S (2017) Deepfix: fixing common c language errors by deep learning. In: Thirty-first AAAI conference on artificial intelligence
Hagberg A, Swart P, S Chult D (2008) Exploring network structure, dynamics, and function using networkx. Technical report, Los Alamos National Lab.(LANL), Los Alamos, NM (United States)
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Adv Neural Inf Process Syst 30
Harrand N, Soto-Valero C, Monperrus M, Baudry B (2020) Java decompiler diversity and its application to meta-decompilation. J Syst Softw 168:110645
Article Google Scholar
Heo K, Lee W, Pashakhanloo P, Naik M (2018) Effective program debloating via reinforcement learning. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, pp 380–394
Huang S, Ontañón S (2020) A closer look at invalid action masking in policy gradient algorithms. arXiv preprint arXiv:2006.14171
Katz DS, Ruchti J, Schulte E (2018) Using recurrent neural networks for decompilation. In: 2018 IEEE 25th international conference on software analysis, evolution and reengineering (SANER), pp 346–356. IEEE
Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: Proceedings of the 2013 international conference on software engineering. ICSE ’13, pp 802–811. IEEE Press, San Francisco, CA, USA
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980
Le Goues C, Nguyen T, Forrest S, Weimer W (2012) Genprog: a generic method for automatic software repair. IEEE Trans Software Eng 38(1):54
Article Google Scholar
Liang C, Berant J, Le Q, Forbus KD, Lao N (2016) Neural symbolic machines: learning semantic parsers on freebase with weak supervision. arXiv preprint arXiv:1611.00020
Li Z, Wu Q, Qian K (2019) Adabot: Fault-tolerant java decompiler. arXiv preprint arXiv:1908.06748
Małkowski A, Grzechociński J, Wawrzyński P (2022) Graph autoencoder with constant dimensional latent space. arXiv preprint arXiv:2201.12165
Nie M, Chen D, Wang D (2023) Reinforcement learning on graphs: a survey. IEEE Trans Emerg Top Comput Intell. https://doi.org/10.1109/TETCI.2022.3222545
Article Google Scholar
Parisotto E, Mohamed A-r, Singh R, Li L, Zhou D, Kohli P (2016) Neuro-symbolic program synthesis. arXiv preprint arXiv:1611.01855
Pawlak R, Monperrus M, Petitprez N, Noguera C, Seinturier L (2015) Spoon: a library for implementing analyses and transformations of java source code. Softw Pract Exp 46:1155–1179. https://doi.org/10.1002/spe.2346
Article Google Scholar
Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: Online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 701–710
Pracner D, Budimac Z (2017) Enabling code transformations with fermat on simplified bytecode. J Softw Evol Process 29(5):1857. https://doi.org/10.1002/smr.1857
Article Google Scholar
Raffin A, Hill A, Ernestus M, Gleave A, Kanervisto A, Dormann N (2019) Stable baselines 3
Russell SJ, Norvig P (2016) Artificial intelligence: a modern approach. Pearson Higher Ed, London
Google Scholar
Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347
Shin ECR, Song D, Moazzezi R (2015) Recognizing functions in binaries with neural networks. In: 24th USENIX security symposium (USENIX Security 15), pp 611–626
Sukur N, Pracner D, Budimac Z (2019) Fitness functions and transformations in an automated process. In: Budimac Z, Koteska B (eds) SQAMIA 2019, 8th workshop of software quality, analysis, monitoring, improvement, and applications, pp 17–011710. CEUR-WS.org, Ohrid, North Macedonia. http://ceur-ws.org/Vol-2217/
Tang C-Y, Liu C-H, Chen W-K, You SD (2020) Implementing action mask in proximal policy optimization (ppo) algorithm. ICT Express 6(3):200–203
Article Google Scholar
Wang X, Lyu D, Li M, Xia Y, Yang Q, Wang X, Wang X, Cui P, Yang Y, Sun B et al (2021) Apan: Asynchronous propagation attention network for real-time temporal graph embedding. In: Proceedings of the 2021 international conference on management of data, pp 2628–2638
Wang H, Wang S, Xu D, Zhang X, Liu X (2020) Generating effective software obfuscation sequences with reinforcement learning. IEEE Trans Depend Secure Comput
Ward M (2013) Assembler restructuring in fermat. In: SCAM, pp 147–156. IEEE, Eindhoven, Netherlands. https://doi.org/10.1109/SCAM.2013.6648196
Watson AH, Wallace DR, McCabe TJ (1996) Structured testing: A testing methodology using the cyclomatic complexity metric. Special Publication (NIST SP), National Institute of Standards and Technology, Gaithersburg, MD, Gaithersburg, MD, USA
Yuan Y, Banzhaf W (2018) Arja: automated repair of java programs via multi-objective genetic programming. IEEE Trans Software Eng 46(10):1040–1067. https://doi.org/10.1109/TSE.2018.2874648
Article Google Scholar
Yuan Y, Banzhaf W (2019) Toward better evolutionary program repair: an integrated approach. ACM Trans Softw Eng Methodol. https://doi.org/10.1145/3360004
Article Google Scholar
Zhang L, Rosenblatt G, Fetaya E, Liao R, Byrd W, Might M, Urtasun R, Zemel R (2018) Neural guided constraint logic programming for program synthesis. Adv Neural Inf Process Syst 31

Download references

Acknowledgements

The authors gratefully acknowledge the financial support of the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Grant No. 451-03-47/2023-01/200125).

Funding

This study was funded with the financial support of the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Grant No. 451-03-47/2023-01/200125)

Author information

Authors and Affiliations

University of Novi Sad, Faculty of Sciences, Department of Mathematics and Informatics, Trg Dositeja Obradovića 4, Novi Sad, 21000, Serbia
Nataša Sukur, Nemanja Milošević, Doni Pracner & Zoran Budimac

Authors

Nataša Sukur
View author publications
You can also search for this author in PubMed Google Scholar
Nemanja Milošević
View author publications
You can also search for this author in PubMed Google Scholar
Doni Pracner
View author publications
You can also search for this author in PubMed Google Scholar
Zoran Budimac
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Nataša Sukur.

Ethics declarations

Conflict of interest

All authors of this manuscript declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

This article does not contain any studies with human participants performed by any of the authors, therefore informed consent was not required.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Sukur, N., Milošević, N., Pracner, D. et al. Automated program improvement with reinforcement learning and graph neural networks. Soft Comput 28, 2593–2604 (2024). https://doi.org/10.1007/s00500-023-08559-1

Download citation

Accepted: 18 May 2023
Published: 03 June 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s00500-023-08559-1

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Automated program improvement with reinforcement learning and graph neural networks

Abstract

Access this article

Similar content being viewed by others

A Hybrid Machine Learning Model for Code Optimization

Introduction to Reinforcement Learning

A survey on interpretable reinforcement learning

Data availability

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Ethical approval

Informed consent

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Automated program improvement with reinforcement learning and graph neural networks

Abstract

Access this article

Similar content being viewed by others

A Hybrid Machine Learning Model for Code Optimization

Introduction to Reinforcement Learning

A survey on interpretable reinforcement learning

Data availability

Notes

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Ethical approval

Informed consent

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation