Recovering High-Level Structure of Software Systems Using a Minimum Description Length Principle

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2464)

Abstract

In [12], a system was described for finding good hierarchical decompositions of complex systems represented as collections of nodes and links. The system uses a genetic algorithm with an information-theoretic fitness function (representing complexity) derived from a minimum description length principle. This paper describes the application of this approach to the problem of reverse engineering the high-level structure of software systems.
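
The abstract does not state the fitness function itself, so the following is only a minimal sketch of the general idea: score a candidate decomposition by the number of bits needed to describe the nodes and links under it, so that a search procedure (a genetic algorithm in [12]) can prefer shorter descriptions. The encoding, the flat (non-hierarchical) module assignment, and the names `description_length` and `clique` are all assumptions made for illustration and are not taken from the paper.

```python
from math import log2


def description_length(nodes, edges, assignment):
    """Bits to state a module assignment plus every link (illustrative encoding only)."""
    # Group nodes by the module they are assigned to.
    modules = {}
    for node in nodes:
        modules.setdefault(assignment[node], []).append(node)
    k = len(modules)  # assumes at least one node

    # Cost of naming the module of every node.
    bits = len(nodes) * log2(k)

    for a, b in edges:
        size_a = len(modules[assignment[a]])
        size_b = len(modules[assignment[b]])
        if assignment[a] == assignment[b]:
            # Intra-module link: 1 flag bit, one module name, two local node names.
            bits += 1 + log2(k) + 2 * log2(size_a)
        else:
            # Inter-module link: 1 flag bit, two module names, two local node names.
            bits += 1 + 2 * log2(k) + log2(size_a) + log2(size_b)
    return bits


def clique(members):
    """All undirected links between the given nodes (hypothetical helper)."""
    return [(a, b) for i, a in enumerate(members) for b in members[i + 1:]]


# Toy system: two tightly linked groups of four nodes joined by a single link.
nodes = list(range(8))
edges = clique([0, 1, 2, 3]) + clique([4, 5, 6, 7]) + [(3, 4)]

candidates = {
    "by_clique": {n: n // 4 for n in nodes},   # modules follow the two groups
    "one_module": {n: 0 for n in nodes},       # no decomposition at all
    "by_parity": {n: n % 2 for n in nodes},    # modules cut across the groups
}
for name, assignment in candidates.items():
    print(name, description_length(nodes, edges, assignment))
```

On this toy system the assignment that follows the two groups receives the shortest description, which is the behaviour a complexity-based fitness function is meant to reward. A hierarchical decomposition, as in [12], would need a richer encoding than this flat one.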

References

  1. Briand, L.C., Morasca, S., and Basili, V.R. (1996) Property-based software engineering measurement: Refining the additivity properties. IEEE Transactions on Software Engineering, 22(1):68–86.

  2. Collins, R. and Jefferson, D. (1991) Selection in massively parallel genetic algorithms. Proceedings of the Fourth International Conference on Genetic Algorithms, ICGA-91, Belew, R.K. and Booker, L.B. (eds.), Morgan Kaufmann.

  3. Doval, D., Mancoridis, S., and Mitchell, B.S. (1999) Automatic Clustering of Software Systems using a Genetic Algorithm. IEEE Proceedings of the 1999 International Conference on Software Tools and Engineering Practice (STEP’99).

  4. Glover, F. (1989) Tabu Search-Part I. ORSA Journal on Computing, 1(3):190–206.

  5. Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley.

  6. Harman, M., Hierons, R., and Proctor, M. (2002) A New Representation and Crossover Operator for Search-Based Optimization of Software Modularization. Submitted to GECCO-2002.

  7. Holland, J.H. (1975) Adaptation in Natural and Artificial Systems. Now published by MIT Press.

  8. Hutchens, D., and Basili, V.R. (1985) System Structure Analysis: Clustering with Data Bindings. IEEE Transactions on Software Engineering, SE-11(8):749–757.

  9. Kirkpatrick, S., Gelatt Jr., C.D., and Vecchi, M.P. (1983) Optimization by Simulated Annealing. Science, 220(4598):671–680.

  10. Koza, J.R. (1992) Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press.

  11. Li, M. and Vitanyi, P. (1997) An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag.

  12. Lutz, R. (2001) Evolving Good Hierarchical Decompositions of Complex Systems. Journal of Systems Architecture, 47, pp. 613–634.

  13. Mancoridis, S., Mitchell, B.S., Rorres, C., Chen, Y., and Gansner, E.R. (1998) Using automatic clustering to produce high-level system organizations of source code. In International Workshop on Program Comprehension (IWPC’98), IEEE Computer Society Press, Los Alamitos, California, USA, pp. 45–53.

  14. McIlhagga, M., Husbands, P., and Ives, R. (1996) A comparison of simulated annealing, dispatching rules and a coevolutionary distributed genetic algorithm as optimization techniques for various integrated manufacturing planning problems. In Proceedings of PPSN IV, Volume I. LNCS 1141, pp. 604–613, Springer-Verlag.

  15. Mitchell, M. (1996) An Introduction to Genetic Algorithms. MIT Press.

  16. Mitchell, T.M. (1997) Machine Learning. McGraw-Hill.

  17. Rissanen, J. (1978) Modelling by the shortest data description. Automatica-J.IFAC, 14, pp. 465–471.

  18. Shannon, C.E. (1948) A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656.

  19. Thornton, C.J. and du Boulay, B. (1992) Artificial Intelligence Through Search. Intellect, Oxford, England.

  20. Wiggerts, T. (1997) Using clustering algorithms in legacy systems remodularisation. In Proc. Working Conference on Reverse Engineering (WCRE’97).

  21. Wood, J.A. (1998) Improving Software Designs via the Minimum Description Length Principle. Ph.D. Thesis, University of Sussex (available from http://cogslib.cogs.susx.ac.uk)

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lutz, R. (2002). Recovering High-Level Structure of Software Systems Using a Minimum Description Length Principle. In: O’Neill, M., Sutcliffe, R.F.E., Ryan, C., Eaton, M., Griffith, N.J.L. (eds) Artificial Intelligence and Cognitive Science. AICS 2002. Lecture Notes in Computer Science, vol 2464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45750-X_8

  • DOI: https://doi.org/10.1007/3-540-45750-X_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44184-7

  • Online ISBN: 978-3-540-45750-3

  • eBook Packages: Springer Book Archive
