On the Efficiency of Building Large Collections of Software: Modeling, Algorithms, and Experimental Results

  • Conference paper
Software Technologies (ICSOFT 2022)

Abstract

Building software from source code requires a build environment that meets certain requirements, such as the presence of specific compilers, libraries, or other tools. Unfortunately, requirements for different packages can conflict with each other, so it is often impossible to use a single build environment when building a large collection of software. This paper develops techniques to minimize the number of distinct build environments required, and measures the practical impact of our techniques on build time. In particular, we introduce the notion of a “conflict graph,” and prove that the problem of minimizing the number of build environments is equivalent to the graph coloring problem on this graph. We explore several heuristic techniques to compute conflict graph colorings, finding solutions that result in surprisingly small sets of build environments. Using Ubuntu 20.04 as our primary experimental dataset, we computed just 4 different environments that were sufficient for building the “Top 500” most popular source packages, and 11 build environments were sufficient for building all 30,646 source packages included in Ubuntu 20.04. Finally, we experimentally evaluate the benefit of these environments by comparing the work required for building the “Top 500” with our environments to the work required using the traditional minimal environment build. We saw that the total work required for building these packages dropped from 139 h 36 min to 54 h 18 min, a 61% reduction.
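
The link between build environments and graph coloring can be illustrated with a small sketch. It assumes that each vertex of the conflict graph is a source package and that an edge joins two packages whose build requirements cannot be satisfied in the same environment; a color then corresponds to one build environment. The largest-degree-first greedy rule used here is a generic textbook heuristic and is not claimed to be one of the heuristics evaluated in the paper. The package names and conflict pairs are hypothetical.

```python
def greedy_coloring(packages, conflicts):
    """Assign each package an environment index so that no two conflicting
    packages share an index. Generic largest-degree-first greedy heuristic;
    it does not guarantee the minimum number of environments."""
    # Build an undirected adjacency map; packages without conflicts keep an empty set.
    adjacency = {p: set() for p in packages}
    for a, b in conflicts:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Visit packages with the most conflicts first; break ties by name
    # so the result is deterministic.
    order = sorted(adjacency, key=lambda p: (-len(adjacency[p]), p))

    environment = {}
    for package in order:
        # Environment indices already taken by conflicting neighbors.
        used = {environment[n] for n in adjacency[package] if n in environment}
        index = 0
        while index in used:
            index += 1
        environment[package] = index
    return environment


# Hypothetical packages and conflict pairs, for illustration only.
packages = ["pkg-a", "pkg-b", "pkg-c", "pkg-d", "pkg-e"]
conflicts = [
    ("pkg-a", "pkg-b"),
    ("pkg-b", "pkg-c"),
    ("pkg-a", "pkg-c"),
    ("pkg-c", "pkg-d"),
]

assignment = greedy_coloring(packages, conflicts)
num_environments = max(assignment.values()) + 1
print(assignment, num_environments)
```

In this toy input, pkg-a, pkg-b, and pkg-c conflict pairwise, so any valid assignment needs at least three environments, and the greedy pass uses exactly three. On general graphs a greedy ordering can use more colors than the optimum, which is why the choice of heuristic matters at the scale of a full distribution.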

Notes

  1. https://pbuilder-team.pages.debian.net/pbuilder/

  2. https://apt-team.pages.debian.net/python-apt/library

  3. https://github.com/srtate/BuildEnvAnalysis

  4. http://webdocs.cs.ualberta.ca/~joe/Coloring/

  5. Software available at http://cucis.ece.northwestern.edu/projects/MAXCLIQUE/

Author information

Corresponding author

Correspondence to Stephen R. Tate.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tate, S.R., Yuan, B. (2023). On the Efficiency of Building Large Collections of Software: Modeling, Algorithms, and Experimental Results. In: Fill, HG., van Sinderen, M., Maciaszek, L.A. (eds) Software Technologies. ICSOFT 2022. Communications in Computer and Information Science, vol 1859. Springer, Cham. https://doi.org/10.1007/978-3-031-37231-5_7

  • DOI: https://doi.org/10.1007/978-3-031-37231-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37230-8

  • Online ISBN: 978-3-031-37231-5

  • eBook Packages: Computer Science, Computer Science (R0)