Foundations of r-contiguous matching in negative selection for anomaly detection

Natural Computing

Abstract

Negative selection with the associated r-contiguous matching rule is a popular immune-inspired method for anomaly detection. In recent years, however, problems such as poor scalability and a high false positive rate have been observed empirically. In this article, negative selection and the associated r-contiguous matching rule are investigated from a pattern classification perspective. This includes insights into the generalization capability of negative selection and the computational complexity of finding r-contiguous detectors.
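
As an illustration of the rule named in the abstract, the following minimal sketch assumes the usual definition: two strings of equal length match if they agree in at least r contiguous positions. The function names (`r_contiguous_match`, `generate_detectors`, `is_anomalous`) are hypothetical, and the exhaustive generate-and-test loop is only a simple stand-in for detector generation, not the procedure analysed in the article.

```python
from itertools import product


def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """Return True if a and b agree in at least r contiguous positions."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    run = 0  # length of the current run of agreeing positions
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_detectors(self_set, l, r):
    """Keep every binary string of length l that matches no self string.

    Exhaustive enumeration costs 2**l candidates, so this is only
    practical for small l (an illustrative assumption, not the article's method).
    """
    candidates = ("".join(bits) for bits in product("01", repeat=l))
    return [d for d in candidates
            if not any(r_contiguous_match(d, s, r) for s in self_set)]


def is_anomalous(sample, detectors, r):
    """A sample is flagged as anomalous if at least one detector matches it."""
    return any(r_contiguous_match(d, sample, r) for d in detectors)


if __name__ == "__main__":
    self_set = ["0000", "0011"]  # hypothetical normal examples
    detectors = generate_detectors(self_set, l=4, r=3)
    print(len(detectors), is_anomalous("1111", detectors, 3))
```

By construction no detector matches a self string, so self strings are never flagged. Non-self strings that no detector can match are the "holes" discussed in the appendix figures.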

Notes

  1. “To find” or “to generate” detectors means the same in this article.

  2. The symbol * represents either a 1 or 0.

  3. s[1,…, l] denotes characters of s at positions 1…l.

  4. The link between recurrent events, renewal theory and the r-contiguous matching probability was originally discovered in Percus et al. (1993) and rediscovered in Ranang (2002). Percus et al. (1993) presented the probability approximation (2), which is only valid for r ≥ l/2. However, they also cited Uspensky’s textbook [see p. 77 in Uspensky (1937)], where the approximation of the r-contiguous matching probability for 1 ≤ r ≤ l is presented.

  5. The DLL algorithm is sometimes also called DPL or DPLL algorithm (Freeman 1996; Ouyang 1998).

  6. It is still an open problem to prove where the exact phase transition threshold is located. Recent theoretical work (Achlioptas et al. 2005) shows that the threshold \(r_k\) lies within the bounds \(2.68 < r_k < 4.51\) for \(k = 3\).
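
The validity range mentioned in note 4 can be checked numerically. The sketch below computes the exact probability that two uniformly random binary strings of length l match in at least r contiguous positions, and compares it with the closed form 2^(-r)[(l − r)/2 + 1] that is commonly attributed to Percus et al. (1993). Whether this closed form coincides with the article's approximation (2) is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction


def exact_match_probability(l: int, r: int) -> Fraction:
    """Exact probability of an r-contiguous match between two random bit strings.

    Position-wise agreement of two uniform strings is itself a uniform bit
    string, so we count length-l bit strings without a run of r or more ones.
    """
    if not 1 <= r <= l:
        raise ValueError("require 1 <= r <= l")
    no_run = []  # no_run[n] = number of length-n strings with no run of >= r ones
    for n in range(l + 1):
        if n < r:
            no_run.append(2 ** n)  # too short to contain a run of length r
        else:
            # Split on the first zero: k ones (k < r), one zero, then a valid rest.
            no_run.append(sum(no_run[n - k - 1] for k in range(r)))
    return 1 - Fraction(no_run[l], 2 ** l)


def closed_form(l: int, r: int) -> Fraction:
    """Closed form 2^(-r) * ((l - r)/2 + 1); assumed form of the approximation."""
    return Fraction(1, 2 ** r) * (Fraction(l - r, 2) + 1)


if __name__ == "__main__":
    l = 12
    for r in range(1, l + 1):
        exact, approx = exact_match_probability(l, r), closed_form(l, r)
        print(r, float(exact), float(approx), exact == approx)
```

Under these assumptions the two values agree whenever r ≥ l/2, because two disjoint matching runs cannot both fit into the string. For smaller r the closed form overcounts and can even exceed 1.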

References

  • Achlioptas D, Naor A, Peres Y (2005) Rigorous location of phase transitions in hard optimization problems. Nature 435:759–764

  • Ayara M, Timmis J, de Lemos R, de Castro LN, Duncan R (2002) Negative selection: how to generate detectors. In: Proceedings of the 1st International Conference on Artificial Immune Systems (ICARIS), pp 89–98. University of Kent at Canterbury Printing Unit

  • Balthrop J, Forrest S, Glickman M (2002) Revisiting lisys: Parameters and normal behavior. In: Proceedings of congress on evolutionary computation (CEC). IEEE Press, New York, pp 1045–1050

  • Bishop CM (1994) Novelty detection and neural network validation. IEE Proceedings - Vision, Image and Signal processing 141(4):217–222

  • Brueggemann T, Kern W (2004) An improved deterministic local search algorithm for 3-SAT. Theoretical Computer Science 329(1–3):303–313

  • Cormen TH, Leiserson CE, Rivest RL, Stein C (2002) Introduction to algorithms, 2nd edn. MIT Press, Cambridge

  • Dasgupta D, Forrest S (1995) Tool breakage detection in milling operations using a negative-selection algorithm. Tech. Rep. CS95-5, University of New Mexico

  • Dasgupta D, Forrest S (1996) Novelty detection in time series data using ideas from immunology. In: Proceedings of the 5th international conference on intelligent systems, pp 82–87

  • Dasgupta D, Forrest S (1998) An anomaly detection algorithm inspired by the immune system. In: Artificial immune systems and their applications. Springer-Verlag, NY, pp 262–277

  • Davis M, Putnam H (1960) A computing procedure for quantification theory. Journal of the ACM (JACM) 7(3):201–215

  • Davis M, Logemann G, Loveland D (1962) A machine program for theorem-proving. Communications of the ACM 5(7):394–397

  • de Castro LN, Timmis J (2002) Artificial immune systems: a new computational intelligence approach. Springer-Verlag, London

  • D’haeseleer P, Forrest S, Helman P (1996) An immunological approach to change detection: algorithms, analysis, and implications. In: Proceedings of the Symposium on Research in Security and Privacy, pp 110–119. IEEE Computer Society Press

  • Esponda F, Forrest S (2002) Detector coverage under the r-contiguous bits matching rule. Tech. Rep. TR-CS-2002-03, University of New Mexico

  • Esponda F, Forrest S, Helman P (2003) The crossover closure and partial match detection. In: Proceedings of the 2nd international conference on artificial immune systems (ICARIS), Lecture Notes in Computer Science, vol 2787. Springer-Verlag, NY, pp 249–260

  • Feller W (1968) An introduction to probability theory and its applications, vol 1, 3rd edn. Wiley, NY

  • Forrest S, Perelson AS, Allen L, Cherukuri R (1994) Self-nonself discrimination in a computer. In: Proceedings of the Symposium on Research in Security and Privacy. IEEE Computer Society Press, pp 202–212

  • Freeman JW (1996) Hard random 3-SAT problems and the Davis–Putnam procedure. Artificial Intelligence 81(1–2):183–198

  • Freitas AA, Timmis J (2003) Revisiting the foundations of artificial immune systems: a problem-oriented perspective. In: Proceedings of the 2nd international conference on artificial immune systems (ICARIS), Lecture Notes in Computer Science, vol 2787. Springer-Verlag, NY, pp 229–241

  • Gent IP, Walsh T (1994) The SAT phase transition. In: Proceedings of the 11th European conference on artificial intelligence. Wiley, NY, pp 105–109

  • Glickman M, Balthrop J, Forrest S (2005) A machine learning evaluation of an artificial immune system. Evolutionary Computation 13(2):179–212

  • González F, Dasgupta D, Gómez J (2003) The effect of binary matching rules in negative selection. In: Genetic and evolutionary computation—GECCO. Lecture Notes in Computer Science, vol 2723. Springer-Verlag, Chicago, pp 195–206

  • Hofmeister T, Schöning U, Schuler R, Watanabe O (2002) A probabilistic 3-SAT algorithm further improved. In: 19th Annual symposium on theoretical aspects of computer science (STACS), Lecture Notes in Computer Science, vol 2285. Springer-Verlag, NY, pp 192–202

  • Hofmeyr SA (1999) An immunological model of distributed detection and its application to computer security. Ph.D. thesis, University of New Mexico, Albuquerque, NM

  • Mitchell T (1997) Machine learning. McGraw Hill

  • Ouyang M (1998) How good are branching rules in DPLL. Discrete Applied Mathematics 89(1–3):281–286

  • Percus JK, Percus OE, Perelson AS (1993) Predicting the size of the T-cell receptor and antibody combining region from consideration of efficient self-nonself discrimination. Proc Natl Acad Sci USA 90:1691–1695

  • Ranang MT (2002) An artificial immune system approach to preserving security in computer networks. Master’s thesis, Norges Teknisk-Naturvitenskapelige Universitet

  • Reischuk KR (1990) Einführung in die Komplexitätstheorie. BG Teubner, Stuttgart

  • Schölkopf B, Platt JC, Shawe-Taylor J, Smola AJ, Williamson RC (2001) Estimating the support of a high-dimensional distribution. Neural Computation 13(7):1443–1471

  • Selman B, Mitchell DG, Levesque HJ (1996) Generating hard satisfiability problems. Artificial Intelligence 81(1–2):17–29

  • Singh S (2002) Anomaly detection using negative selection based on the r-contiguous matching rule. In: Proceedings of the 1st international conference on artificial immune systems (ICARIS). University of Kent at Canterbury Printing Unit, pp 99–106

  • Steinwart I, Hush D, Scovel C (2005) A classification framework for anomaly detection. Journal of Machine Learning Research 6:211–232

  • Stibor T, Timmis J, Eckert C (2006a) Generalization regions in Hamming negative selection. In: Intelligent information processing and web mining, advances in soft computing. Springer-Verlag, Germany, pp 447–456

  • Stibor T, Timmis J, Eckert C (2006b) The link between r-contiguous detectors and k-CNF satisfiability. In: Proceedings of congress on evolutionary computation (CEC). IEEE Press, New York, USA, pp 491–498

  • Tarassenko L, Hayton P, Cerneaz N, Brady M (1995) Novelty detection for the identification of masses in mammograms. In: Proceedings of the 4th IEE international conference on artificial neural networks, pp 442–447

  • Tax DMJ, Duin RPW (1999) Data domain description using support vectors. In: European symposium on artificial neural networks—ESANN, pp 251–256

  • Taylor DW, Corne DW (2003) An investigation of the negative selection algorithm for fault detection in refrigeration systems. In: Proceedings of the 2nd international conference on artificial immune systems (ICARIS), Lecture Notes in Computer Science, vol 2787. Springer-Verlag, NY, pp 34–45

  • Uspensky JV (1937) Introduction to mathematical probability. McGraw-Hill

  • Welzl E (2005) Boolean satisfiability—combinatorics and algorithms. Lecture Notes. http://www.inf.ethz.ch/~emo/SmallPieces/SAT.ps

  • Wierzchoń ST (2000a) Generating optimal repertoire of antibody strings in an artificial immune system. In: Intelligent Information Systems. Springer-Verlag, NY, pp 119–133

  • Wierzchoń ST (2000b) Discriminative power of the receptors activated by k-contiguous bits rule. Journal of Computer Science and Technology 1(3):1–13

Acknowledgment

The author thanks Erin Gardner and Dawn Yackzan for their valuable suggestions and comments.

Author information

Corresponding author

Correspondence to Thomas Stibor.

Appendix

Fig. 14

Coherence between detector coverage and induced holes for stepwise increasing matching length r. The gray shaded area is covered by the generated detectors, the white area represents holes. The black points represent normal examples (\(|{\mathcal{S}}| = 250\)) which are generated by a mixture of Gaussian distributions (see Fig. 6a)

Fig. 15

Visualized experimental results as in Fig. 14, however with \(|{\mathcal{S}}| = 5,000\) generated normal examples

Fig. 16

Visualized experimental results as in Fig. 14, however the \(|{\mathcal{S}}| = 250\) generated normal examples are sampled from the probability distribution depicted in Fig. 6b

Fig. 17

Visualized experimental results as in Fig. 16, however with \(|{\mathcal{S}}| = 5,000\) generated normal examples

About this article

Cite this article

Stibor, T. Foundations of r-contiguous matching in negative selection for anomaly detection. Nat Comput 8, 613–641 (2009). https://doi.org/10.1007/s11047-008-9097-5

  • DOI: https://doi.org/10.1007/s11047-008-9097-5
