Fmeter: Extracting Indexable Low-Level System Signatures by Counting Kernel Function Calls

  • Conference paper
Middleware 2012

Abstract

System monitoring tools give operators and developers insight into system execution and an understanding of system behavior under a variety of scenarios. Many system abnormalities, whether they arise from performance issues, bugs, or errors, leave a significant mark on system execution. The ability to quantify and search such behavior in the system execution history can facilitate new ways of looking at problems. For example, operators may use clustering to group and visualize similar system behaviors. We propose a monitoring system that extracts formal, indexable, low-level system signatures using the classical vector space model from the field of information retrieval and text mining. We draw an analogy between kernel function invocations and terms within text documents. This parallel allows us to automatically index, store, and later retrieve and compare the system signatures. As with information retrieval, the key insight is that we need not rely on the semantic information in a document. Instead, we consider only the statistical properties of the terms belonging to the document (and to the corpus), which enables us both to extract signatures efficiently at runtime and to analyze them using formal statistical methods. We have built a prototype in Linux, Fmeter, which extracts such low-level system signatures by recording all kernel function invocations. We show that the signatures are naturally amenable to formal processing with statistical methods like clustering and supervised machine learning.
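The analogy can be made concrete with a small sketch. The abstract does not state the exact term weighting Fmeter uses, so the code below assumes the textbook tf-idf scheme and cosine similarity from the vector space model; the function names `tfidf_signatures` and `cosine_similarity`, the sample kernel function names, and the invocation counts are all illustrative and not taken from the paper. Each monitoring interval plays the role of a document, and each kernel function plays the role of a term.

```python
import math
from collections import Counter


def tfidf_signatures(interval_counts):
    """Turn per-interval kernel function call counts into tf-idf vectors.

    interval_counts: list of dicts mapping function name -> invocation count.
    Returns (vocab, signatures), where signatures[i] is aligned with vocab.
    Assumption: plain tf-idf; the paper's actual weighting may differ.
    """
    n_docs = len(interval_counts)
    # Document frequency: in how many intervals each function was called.
    doc_freq = Counter()
    for counts in interval_counts:
        doc_freq.update(name for name, c in counts.items() if c > 0)
    vocab = sorted(doc_freq)

    signatures = []
    for counts in interval_counts:
        total = sum(counts.values())
        vector = []
        for name in vocab:
            tf = counts.get(name, 0) / total if total else 0.0
            idf = math.log(n_docs / doc_freq[name])
            vector.append(tf * idf)
        signatures.append(vector)
    return vocab, signatures


def cosine_similarity(u, v):
    """Cosine of the angle between two signature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical counts for three monitoring intervals (made-up numbers).
intervals = [
    {"vfs_read": 90, "schedule": 10},     # file-read-heavy interval
    {"vfs_read": 80, "schedule": 20},     # another file-read-heavy interval
    {"tcp_sendmsg": 70, "schedule": 30},  # network-heavy interval
]
vocab, sigs = tfidf_signatures(intervals)
sim_read_read = cosine_similarity(sigs[0], sigs[1])
sim_read_net = cosine_similarity(sigs[0], sigs[2])
```

In this toy corpus `schedule` occurs in every interval, so its inverse document frequency is zero and it drops out of every signature, much as a stop word would in text retrieval. The two read-heavy intervals then point in the same direction, while the network-heavy interval is orthogonal to both, which is the kind of structure a clustering or classification step could exploit.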

Copyright information

© 2012 IFIP International Federation for Information Processing

About this paper

Cite this paper

Marian, T., Weatherspoon, H., Lee, KS., Sagar, A. (2012). Fmeter: Extracting Indexable Low-Level System Signatures by Counting Kernel Function Calls. In: Narasimhan, P., Triantafillou, P. (eds) Middleware 2012. Middleware 2012. Lecture Notes in Computer Science, vol 7662. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35170-9_5

  • DOI: https://doi.org/10.1007/978-3-642-35170-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35169-3

  • Online ISBN: 978-3-642-35170-9

  • eBook Packages: Computer Science, Computer Science (R0)