Abstract
Building software from source code requires a build environment that meets certain requirements, such as the presence of specific compilers, libraries, or other tools. Unfortunately, the requirements of different packages can conflict with each other, so it is often impossible to use a single build environment when building a large collection of software. This paper develops techniques to minimize the number of distinct build environments required, and measures the practical impact of these techniques on build time. In particular, we introduce the notion of a “conflict graph,” and prove that the problem of minimizing the number of build environments is equivalent to the graph coloring problem on this graph. We explore several heuristic techniques for computing conflict graph colorings, finding solutions that result in surprisingly small sets of build environments. Using Ubuntu 20.04 as our primary experimental dataset, we found that just 4 environments were sufficient for building the “Top 500” most popular source packages, and that 11 build environments were sufficient for building all 30,646 source packages included in Ubuntu 20.04. Finally, we experimentally evaluate the benefit of these environments by comparing the work required to build the “Top 500” with our environments against the work required using the traditional minimal-environment build. The total work required to build these packages dropped from 139 h 36 min to 54 h 18 min, a 61% reduction.
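To make the connection between build environments and graph coloring concrete, the following is a minimal sketch, not the heuristics evaluated in the paper. It assumes that each vertex of the conflict graph is a source package and that an edge joins two packages whose build requirements cannot coexist in one environment; the package names and the conflict list are hypothetical. It applies a generic largest-degree-first greedy coloring, where each color stands for one build environment.

```python
from collections import defaultdict


def greedy_coloring(packages, conflicts):
    """Give each package an environment index so that no two
    conflicting packages share an index (greedy, largest degree first)."""
    # Build adjacency sets for the (assumed) conflict graph.
    adj = defaultdict(set)
    for a, b in conflicts:
        adj[a].add(b)
        adj[b].add(a)

    # Visit packages with the most conflicts first; break ties by name
    # so the result is deterministic.
    order = sorted(packages, key=lambda p: (-len(adj[p]), p))

    color = {}
    for p in order:
        # Environment indices already taken by conflicting packages.
        used = {color[q] for q in adj[p] if q in color}
        c = 0
        while c in used:
            c += 1
        color[p] = c
    return color


# Hypothetical example: pkg-a, pkg-b, pkg-c conflict pairwise,
# and pkg-d conflicts only with pkg-c.
packages = ["pkg-a", "pkg-b", "pkg-c", "pkg-d"]
conflicts = [
    ("pkg-a", "pkg-b"),
    ("pkg-b", "pkg-c"),
    ("pkg-a", "pkg-c"),
    ("pkg-c", "pkg-d"),
]

coloring = greedy_coloring(packages, conflicts)
num_environments = len(set(coloring.values()))
print(coloring)
print("environments needed:", num_environments)
```

In this toy instance the three mutually conflicting packages force three environments, and the fourth package can reuse one of them. A greedy pass like this only gives an upper bound on the minimum; it does not guarantee an optimal coloring in general.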
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tate, S.R., Yuan, B. (2023). On the Efficiency of Building Large Collections of Software: Modeling, Algorithms, and Experimental Results. In: Fill, HG., van Sinderen, M., Maciaszek, L.A. (eds) Software Technologies. ICSOFT 2022. Communications in Computer and Information Science, vol 1859. Springer, Cham. https://doi.org/10.1007/978-3-031-37231-5_7
DOI: https://doi.org/10.1007/978-3-031-37231-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37230-8
Online ISBN: 978-3-031-37231-5
eBook Packages: Computer Science (R0)