Automated Problem Determination Using Call-Stack Matching

Journal of Network and Systems Management

Abstract

We present an architecture and algorithms for automated software problem determination using call-stack matching. When software is used by a large user community, the same problem may recur many times. We show that such recurrences can be detected by matching the program's call-stack against a historical database of call-stacks, so that once a problem has been resolved, future occurrences of the same or similar problems can be resolved automatically. This would greatly reduce the number of cases that human support analysts must handle. We also show how a call-stack matching algorithm can be learned automatically from a small sample of call-stacks labeled by human analysts, and we examine the performance of this learning algorithm on two different data sets.
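To make the idea of matching a call-stack against a historical database concrete, here is a minimal Python sketch. It is not the algorithm from the article, whose matching function is learned from analyst-labeled data. The similarity measure (a normalized longest common subsequence over function names), the `KnownProblem` record, the `best_match` helper, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class KnownProblem:
    """A previously resolved problem (hypothetical record layout)."""
    problem_id: str      # illustrative identifier, not from the article
    stack: List[str]     # function names, outermost call first


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two frame lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: List[str], b: List[str]) -> float:
    """Assumed similarity in [0, 1]: 2 * LCS / (len(a) + len(b))."""
    if not a or not b:
        return 0.0
    return 2.0 * lcs_length(a, b) / (len(a) + len(b))


def best_match(
    stack: List[str],
    history: List[KnownProblem],
    threshold: float = 0.8,  # assumed cut-off; a real system would tune or learn it
) -> Optional[Tuple[KnownProblem, float]]:
    """Return the most similar known problem at or above the threshold, if any."""
    best: Optional[Tuple[KnownProblem, float]] = None
    for known in history:
        score = similarity(stack, known.stack)
        if score >= threshold and (best is None or score > best[1]):
            best = (known, score)
    return best


# Example with made-up function names.
history = [
    KnownProblem("P1", ["main", "open_db", "read_page", "memcpy"]),
    KnownProblem("P2", ["main", "render", "draw_text"]),
]
new_stack = ["main", "open_db", "read_page", "check_crc", "memcpy"]
match = best_match(new_stack, history)
```

In this example the new stack differs from the stored `P1` stack by one extra frame, so it is still reported as a match. A stack that resembles nothing in the history returns `None`, which would correspond to escalating the case to a human analyst.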

Author information

Corresponding author

Correspondence to Mark Brodie.

Additional information

Mark Brodie is a research staff member in the “Machine Learning for Systems” group at the IBM T.J. Watson Research Center in Hawthorne, NY. He did his undergraduate work in Mathematics at the University of the Witwatersrand in South Africa and received his PhD in Computer Science at the University of Illinois in 2000. His research interests include machine learning, data mining, and problem determination.

Sheng Ma received his BS degree in Electrical Engineering from Tsinghua University, Beijing, China, in 1992, and his MS and PhD with honors in Electrical Engineering from Rensselaer Polytechnic Institute, Troy, NY, in 1995 and 1998, respectively. He joined the IBM T.J. Watson Research Center as a research staff member in 1998 and became manager of the “Machine Learning for Systems” group in 2001. His current research interests include machine learning, data mining, network traffic modeling and control, and network and computer systems management.

Leonid Rachevsky is a software systems analyst in the “Machine Learning for Systems” group at the IBM T.J. Watson Research Center in Hawthorne, NY. He obtained his MSc in Mathematics at Kazan State University and his PhD in Technical Science (Applied Mathematics) at the Kazan Institute of Chemical Engineering in Kazan, USSR (now Russia). He has worked extensively as a software engineer and senior software analyst in Israel, Canada and the United States.

Jon Champlin is an advisory software engineer with the Lotus division of IBM’s Software Group. He received a Bachelor’s degree in Computer Science from Siena College in 1993. He is part of the external support group and has developed several serviceability features for Lotus Notes/Domino.

About this article

Cite this article

Brodie, M., Ma, S., Rachevsky, L. et al. Automated Problem Determination Using Call-Stack Matching. J Netw Syst Manage 13, 219–237 (2005). https://doi.org/10.1007/s10922-005-4443-8

  • DOI: https://doi.org/10.1007/s10922-005-4443-8
