Skip to main content
Log in

Automated program improvement with reinforcement learning and graph neural networks

  • Application of soft computing
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

Automated software transformations and the process of automated program repair and improvement are an important aspect of modern software engineering practices. In this paper, we describe a system which uses a graph-based deep learning model that can be trained to automatically transform and improve computer programs. By operating on language-agnostic, universal graph-like structures easily extractable from source code files (abstract syntax trees), the deep learning agent learns which transformations should be effectively applied to various structures recognized in the source code in order to improve it. By defining a metric which we want to improve and introducing an optimization task—a reinforcement learning setting, an agent learns to automatically apply a chain of transformations to the program, drastically improving it. While similar program improvement processes exist, they exclusively use exhaustive search algorithms to try all the possible code transformations which is a long process susceptible to local optimum issues. Our solution aims to model and embed structural knowledge about the programs being transformed which greatly helps the agent to choose best possible code transformations to apply. Elements of the approach we present in this paper are further applicable not just to automatic software improvement tasks, but also to other code-related tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

Data availability

All software presented in this paper is Free and Open Source software under the GNU Public License. Our source code, datasets, results and instructions for reproducing the results are available in the following GitHub repository: https://github.com/nsukur/rl. We also provide a training set of intermediate graphs for training of the embedding model, which is linked in the aforementioned GitHub repository.

Notes

  1. http://www.gkc.org.uk/fermat.html.

  2. https://perun.pmf.uns.ac.rs/pracner/transformations.

  3. https://www.inria.fr/en.

  4. https://www.ow2.org/.

  5. https://gitlab.com/clatu/fermat3.

References

  • Allamanis M, Brockschmidt M, Khademi M (2017) Learning to represent programs with graphs. arXiv preprint arXiv:1711.00740

  • Alon U, Zilberstein M, Levy O, Yahav E (2019) code2vec: Learning distributed representations of code. In: Proceedings of the ACM on programming languages 3(POPL), 1–29

  • Arcuri A, Yao X (2008) A novel co-evolutionary approach to automatic software bug fixing. In: Evolutionary computation, 2008. CEC 2008.(IEEE world congress on computational intelligence). IEEE Congress On, pp 162–168. IEEE

  • Balog M, Gaunt AL, Brockschmidt M, Nowozin S, Tarlow D (2016) Deepcoder: learning to write programs. arXiv preprint arXiv:1611.01989

  • Ben-Nun T, Jakobovits AS, Hoefler T (2018) Neural code comprehension: a learnable representation of code semantics. Adv Neural Inf Process Syst 31

  • Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, Zaremba W (2016) Openai gym. arXiv preprint arXiv:1606.01540

  • Bunel R, Hausknecht M, Devlin J, Singh R, Kohli P (2018) Leveraging grammar and reinforcement learning for neural program synthesis. arXiv preprint arXiv:1805.04276

  • Campbell GA, SonarSource S (2020) Cognitive complexity. SonarSource, Geneva

    Google Scholar 

  • Cummins C, Fisches ZV, Ben-Nun T, Hoefler T, Leather H (2020) Programl: Graph-based deep learning for program optimization and analysis. arXiv preprint arXiv:2003.10536

  • Dasgupta SS, Ray SN, Talukdar P (2018) Hyte: Hyperplane-based temporally aware knowledge graph embedding. In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 2001–2011

  • Ebert C, Cain J, Antoniol G, Counsell S, Laplante P (2016) Cyclomatic complexity. IEEE softw 33(6):27–29

    Article  Google Scholar 

  • Fey M, Lenssen JE (2019) Fast graph representation learning with pytorch geometric. arXiv preprint arXiv:1903.02428

  • Forrest S, Nguyen T, Weimer W, Le Goues C (2009) A genetic programming approach to automated software repair. In: Proceedings of the 11th annual conference on genetic and evolutionary computation, pp 947–954. ACM

  • Fu C, Chen H, Liu H, Chen X, Tian Y, Koushanfar F, Zhao J (2019) Coda: an end-to-end neural program decompiler. Adv Neural Inf Process Syst 32

  • Grover A, Leskovec J (2016) node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 855–864

  • Gupta R, Pal S, Kanade A, Shevade S (2017) Deepfix: fixing common c language errors by deep learning. In: Thirty-first AAAI conference on artificial intelligence

  • Hagberg A, Swart P, S Chult D (2008) Exploring network structure, dynamics, and function using networkx. Technical report, Los Alamos National Lab.(LANL), Los Alamos, NM (United States)

  • Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Adv Neural Inf Process Syst 30

  • Harrand N, Soto-Valero C, Monperrus M, Baudry B (2020) Java decompiler diversity and its application to meta-decompilation. J Syst Softw 168:110645

    Article  Google Scholar 

  • Heo K, Lee W, Pashakhanloo P, Naik M (2018) Effective program debloating via reinforcement learning. In: Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, pp 380–394

  • Huang S, Ontañón S (2020) A closer look at invalid action masking in policy gradient algorithms. arXiv preprint arXiv:2006.14171

  • Katz DS, Ruchti J, Schulte E (2018) Using recurrent neural networks for decompilation. In: 2018 IEEE 25th international conference on software analysis, evolution and reengineering (SANER), pp 346–356. IEEE

  • Kim D, Nam J, Song J, Kim S (2013) Automatic patch generation learned from human-written patches. In: Proceedings of the 2013 international conference on software engineering. ICSE ’13, pp 802–811. IEEE Press, San Francisco, CA, USA

  • Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980

  • Le Goues C, Nguyen T, Forrest S, Weimer W (2012) Genprog: a generic method for automatic software repair. IEEE Trans Software Eng 38(1):54

    Article  Google Scholar 

  • Liang C, Berant J, Le Q, Forbus KD, Lao N (2016) Neural symbolic machines: learning semantic parsers on freebase with weak supervision. arXiv preprint arXiv:1611.00020

  • Li Z, Wu Q, Qian K (2019) Adabot: Fault-tolerant java decompiler. arXiv preprint arXiv:1908.06748

  • Małkowski A, Grzechociński J, Wawrzyński P (2022) Graph autoencoder with constant dimensional latent space. arXiv preprint arXiv:2201.12165

  • Nie M, Chen D, Wang D (2023) Reinforcement learning on graphs: a survey. IEEE Trans Emerg Top Comput Intell. https://doi.org/10.1109/TETCI.2022.3222545

    Article  Google Scholar 

  • Parisotto E, Mohamed A-r, Singh R, Li L, Zhou D, Kohli P (2016) Neuro-symbolic program synthesis. arXiv preprint arXiv:1611.01855

  • Pawlak R, Monperrus M, Petitprez N, Noguera C, Seinturier L (2015) Spoon: a library for implementing analyses and transformations of java source code. Softw Pract Exp 46:1155–1179. https://doi.org/10.1002/spe.2346

    Article  Google Scholar 

  • Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: Online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, pp 701–710

  • Pracner D, Budimac Z (2017) Enabling code transformations with fermat on simplified bytecode. J Softw Evol Process 29(5):1857. https://doi.org/10.1002/smr.1857

    Article  Google Scholar 

  • Raffin A, Hill A, Ernestus M, Gleave A, Kanervisto A, Dormann N (2019) Stable baselines 3

  • Russell SJ, Norvig P (2016) Artificial intelligence: a modern approach. Pearson Higher Ed, London

    Google Scholar 

  • Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347

  • Shin ECR, Song D, Moazzezi R (2015) Recognizing functions in binaries with neural networks. In: 24th USENIX security symposium (USENIX Security 15), pp 611–626

  • Sukur N, Pracner D, Budimac Z (2019) Fitness functions and transformations in an automated process. In: Budimac Z, Koteska B (eds) SQAMIA 2019, 8th workshop of software quality, analysis, monitoring, improvement, and applications, pp 17–011710. CEUR-WS.org, Ohrid, North Macedonia. http://ceur-ws.org/Vol-2217/

  • Tang C-Y, Liu C-H, Chen W-K, You SD (2020) Implementing action mask in proximal policy optimization (ppo) algorithm. ICT Express 6(3):200–203

    Article  Google Scholar 

  • Wang X, Lyu D, Li M, Xia Y, Yang Q, Wang X, Wang X, Cui P, Yang Y, Sun B et al (2021) Apan: Asynchronous propagation attention network for real-time temporal graph embedding. In: Proceedings of the 2021 international conference on management of data, pp 2628–2638

  • Wang H, Wang S, Xu D, Zhang X, Liu X (2020) Generating effective software obfuscation sequences with reinforcement learning. IEEE Trans Depend Secure Comput

  • Ward M (2013) Assembler restructuring in fermat. In: SCAM, pp 147–156. IEEE, Eindhoven, Netherlands. https://doi.org/10.1109/SCAM.2013.6648196

  • Watson AH, Wallace DR, McCabe TJ (1996) Structured testing: A testing methodology using the cyclomatic complexity metric. Special Publication (NIST SP), National Institute of Standards and Technology, Gaithersburg, MD, Gaithersburg, MD, USA

  • Yuan Y, Banzhaf W (2018) Arja: automated repair of java programs via multi-objective genetic programming. IEEE Trans Software Eng 46(10):1040–1067. https://doi.org/10.1109/TSE.2018.2874648

    Article  Google Scholar 

  • Yuan Y, Banzhaf W (2019) Toward better evolutionary program repair: an integrated approach. ACM Trans Softw Eng Methodol. https://doi.org/10.1145/3360004

    Article  Google Scholar 

  • Zhang L, Rosenblatt G, Fetaya E, Liao R, Byrd W, Might M, Urtasun R, Zemel R (2018) Neural guided constraint logic programming for program synthesis. Adv Neural Inf Process Syst 31

Download references

Acknowledgements

The authors gratefully acknowledge the financial support of the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Grant No. 451-03-47/2023-01/200125).

Funding

This study was funded with the financial support of the Ministry of Science, Technological Development and Innovation of the Republic of Serbia (Grant No. 451-03-47/2023-01/200125)

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Nataša Sukur.

Ethics declarations

Conflict of interest

All authors of this manuscript declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

This article does not contain any studies with human participants performed by any of the authors, therefore informed consent was not required.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Sukur, N., Milošević, N., Pracner, D. et al. Automated program improvement with reinforcement learning and graph neural networks. Soft Comput 28, 2593–2604 (2024). https://doi.org/10.1007/s00500-023-08559-1

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-023-08559-1

Keywords

Navigation