Software Project Effort Estimation Based on Multiple Parametric Models Generated Through Data Clustering

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Parametric software effort estimation models usually consist of a single mathematical relationship. With the advent of software repositories containing data from heterogeneous projects, such models suffer from poor fit and poor predictive accuracy. One possible way to alleviate this problem is to use a set of mathematical equations obtained by dividing the historical project dataset, according to different parameters, into subdatasets called partitions. In turn, partitions are divided into clusters that serve as the basis for more accurate models. In this paper, we describe the process, tool and results of such an approach through a case study using a publicly available repository, ISBSG. The results suggest that the technique is adequate as an extension of existing single-expression models, without making the estimation process much more complex than one that uses a single estimation model. A tool to support the process is also presented.
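The general idea in the abstract, one parametric equation per cluster instead of a single equation for the whole repository, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the power-law form effort = a * size^b, the use of one-dimensional k-means on log size as the clustering step, and all function names. The data in the usage lines is synthetic and is not drawn from ISBSG.

```python
import math
from dataclasses import dataclass


@dataclass
class PowerModel:
    """Hypothetical parametric model: effort = a * size ** b."""
    a: float
    b: float

    def predict(self, size):
        return self.a * size ** self.b


def fit_power_model(sizes, efforts):
    """Fit effort = a * size**b by ordinary least squares in log-log space.

    Assumes at least two distinct, strictly positive sizes.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return PowerModel(a, b)


def nearest(centroids, value):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))


def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm in one dimension; returns k centroids in ascending order."""
    ordered = sorted(values)
    # Initialise centroids at evenly spaced quantiles of the data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centroids, v)].append(v)
        # An empty group keeps its previous centroid.
        updated = [sum(g) / len(g) if g else centroids[j]
                   for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def fit_segmented(sizes, efforts, k):
    """Cluster projects by log(size), then fit one model per cluster."""
    logs = [math.log(s) for s in sizes]
    centroids = kmeans_1d(logs, k)
    models = {}
    for j in range(k):
        members = [(s, e) for s, e, x in zip(sizes, efforts, logs)
                   if nearest(centroids, x) == j]
        if len(members) >= 2:  # too few points: no model for this cluster
            models[j] = fit_power_model(*zip(*members))
    return centroids, models


def estimate(centroids, models, size):
    """Estimate effort with the model of the cluster nearest to the new project."""
    return models[nearest(centroids, math.log(size))].predict(size)


# Synthetic example with two regimes (illustrative numbers only).
small = [50, 80, 100, 150, 200]
large = [1000, 1500, 2000, 3000, 4000]
sizes = small + large
efforts = [10 * s ** 0.9 for s in small] + [2 * s ** 1.2 for s in large]
centroids, models = fit_segmented(sizes, efforts, k=2)
print(estimate(centroids, models, 120))
```

Estimating a new project then costs one cluster lookup plus one equation evaluation, which is in line with the abstract's remark that the process is not much more complex than using a single model. A real study would cluster on more attributes than size and would evaluate each model on held-out projects.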

Author information

Corresponding author

Correspondence to Juan J. Cuadrado Gallego.

Additional information

This work is supported by the Spanish Ministry of Science and Technology under Grant No. CICYT TIN2004-06689-C03.

About this article

Cite this article

Gallego, J.J.C., Rodríguez, D., Sicilia, M.Á. et al. Software Project Effort Estimation Based on Multiple Parametric Models Generated Through Data Clustering. J Comput Sci Technol 22, 371–378 (2007). https://doi.org/10.1007/s11390-007-9043-5

  • DOI: https://doi.org/10.1007/s11390-007-9043-5
