DOI: 10.1145/3624062.3624280
research-article

Laminar: A New Serverless Stream-based Framework with Semantic Code Search and Code Completion

Published: 12 November 2023

ABSTRACT

This paper introduces Laminar, a novel serverless framework based on dispel4py, a parallel stream-based dataflow library. Laminar efficiently manages streaming workflows and their components through a dedicated registry, offering a seamless serverless experience. Leveraging large language models, Laminar also provides semantic code search, code summarization, and code completion. This contribution advances serverless computing by simplifying the execution of streaming computations, managing data streams more efficiently, and offering a valuable tool for both researchers and practitioners.

    Published in

      SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis
      November 2023
      2180 pages
      ISBN: 9798400707858
      DOI: 10.1145/3624062

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited