An Empirical Study of Two Approaches to Sequence Learning for Anomaly Detection

Lane, Terran; Brodley, Carla E.

doi:10.1023/A:1021830128811

Published: April 2003

Machine Learning, Volume 51, pages 73–107 (2003)

Abstract

This paper introduces the computer security domain of anomaly detection and formulates it as a machine learning task on temporal sequence data. In this domain, the goal is to develop a model or profile of the normal working state of a system user and to detect anomalous conditions as long-term deviations from the expected behavior patterns. We introduce two approaches to this problem: one employing instance-based learning (IBL) and the other using hidden Markov models (HMMs). Though not suitable for a comprehensive security solution, both approaches achieve anomaly identification performance sufficient for a low-level “focus of attention” detector in a multitier security system. Further, we evaluate model scaling techniques for the two approaches: two clustering techniques for the IBL approach and variation of the number of hidden states for the HMM approach. We find that over both model classes and a wide range of model scales, there is no significant difference in performance at recognizing the profiled user. We take this invariance as evidence that, in this security domain, limited memory models (e.g., fixed-length instances or low-order Markov models) can learn only part of the user identity information in which we're interested and that substantially different models will be necessary if dramatic improvements in user-based anomaly detection are to be achieved.
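To make the instance-based formulation concrete, the following is a minimal sketch of profiling a user with fixed-length instances and watching for sustained drops in similarity. It is not the authors' implementation: the abstract does not specify the similarity measure, the window length, or the smoothing scheme, so the position-match similarity, the window length of 3, the trailing-mean smoothing, and the toy command traces below are all assumptions chosen for illustration.

```python
from collections import deque

def windows(tokens, length):
    """Return every contiguous fixed-length window of the token stream."""
    return [tuple(tokens[i:i + length]) for i in range(len(tokens) - length + 1)]

def similarity(a, b):
    """Fraction of positions where two equal-length windows agree (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def profile_scores(profile, tokens, length):
    """Score each test window by its best match against the stored profile instances."""
    return [max(similarity(w, p) for p in profile) for w in windows(tokens, length)]

def smoothed(scores, span):
    """Trailing mean over `span` scores, so one odd window alone does not raise an alarm."""
    buf, out = deque(maxlen=span), []
    for s in scores:
        buf.append(s)
        out.append(sum(buf) / len(buf))
    return out

# Hypothetical command traces; real data would be a user's shell history.
train = "ls cd ls vi make ls cd ls vi make".split()
profile = set(windows(train, 3))          # the user's profile: a set of instances
same_user = "cd ls vi make ls".split()
other_user = "ssh scp tar rm ssh".split()

print(smoothed(profile_scores(profile, same_user, 3), 2))
print(smoothed(profile_scores(profile, other_user, 3), 2))
```

A detector built this way would flag a session when the smoothed score stays below some threshold, which matches the abstract's framing of anomalies as long-term deviations rather than single unusual events. Because every instance has a fixed length, the model can only capture behavior visible within that window, which is the limited-memory property the abstract identifies as a constraint on performance.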

Author information

Authors and Affiliations

Department of Computer Science, University of New Mexico, Albuquerque, NM, USA
Terran Lane
School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
Carla E. Brodley

About this article

Cite this article

Lane, T., Brodley, C.E. An Empirical Study of Two Approaches to Sequence Learning for Anomaly Detection. Machine Learning 51, 73–107 (2003). https://doi.org/10.1023/A:1021830128811

Issue Date: April 2003
DOI: https://doi.org/10.1023/A:1021830128811
