ABSTRACT
Cyber entities in many ways mimic the behavior of organic systems. Individuals or groups compete for limited resources using a variety of strategies, the most effective of which are reused and refined in later 'generations'. Traditionally this behavior has made detection of malicious entities very difficult because 1) recognition systems are often built on exact matching to a pattern that can only be 'learned' after a malicious entity reveals itself and 2) the enormous volume and variation of benign entities are an overwhelming source of previously unseen entities that often confound detectors.
To turn the tables of complexity on the would-be attackers, we have developed a method for mapping the sequence of behaviors in which cyber entities engage to strings of text and analyzing these strings using modified bioinformatics algorithms. Bioinformatics algorithms optimize the alignment between text strings even in the presence of mismatches, insertions, or deletions. They require neither an a priori definition of the patterns one is seeking nor any type of exact matching. This allows the data itself to suggest meaningful patterns that are conserved between cyber entities. We demonstrate this method on data generated from live network traffic.
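The idea of encoding behaviors as text and aligning the resulting strings can be illustrated with a minimal sketch. The abstract does not name the encoding or the specific algorithm, so everything here is an assumption: the event labels and the `EVENT_ALPHABET` mapping are hypothetical, and Smith-Waterman local alignment with a linear gap penalty stands in for the authors' modified bioinformatics algorithms. The scoring values are likewise illustrative.

```python
# Hypothetical mapping from behavior labels to letters; the source does not
# specify its encoding, so these names are assumptions for illustration only.
EVENT_ALPHABET = {
    "dns_query": "D",
    "http_get": "G",
    "http_post": "P",
    "ssh_login": "S",
    "file_transfer": "F",
}


def encode(events):
    """Map a sequence of behavior labels to a text string."""
    return "".join(EVENT_ALPHABET[e] for e in events)


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    Uses a linear gap penalty and keeps only two rows of the score matrix.
    The score tolerates mismatches, insertions, and deletions, and needs no
    predefined pattern.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


known = encode(["dns_query", "http_get", "http_post", "file_transfer"])
unseen = encode(["dns_query", "http_get", "http_get", "http_post", "file_transfer"])
score = local_alignment_score(known, unseen)
```

In this sketch the unseen entity differs from the known one by a single repeated request, yet the two strings still align with a high score, which exact pattern matching would miss.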
The impact of this approach is that it can rapidly calculate measures of similarity between previously unseen cyber entities and well-characterized entities. These measures may also be used to organize large collections of data into families, making it possible to identify motifs indicative of each family.
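Organizing entities into families by pairwise similarity might look like the following sketch. It is not the authors' procedure: `difflib.SequenceMatcher.ratio()` from the Python standard library is used as a stand-in similarity measure in place of an alignment score, the grouping is simple single-linkage clustering, and the `threshold` value and the example strings are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; an alignment score could be used instead."""
    return SequenceMatcher(None, a, b).ratio()


def group_into_families(strings, threshold=0.7):
    """Single-linkage grouping: link any two strings whose similarity
    meets the (hypothetical) threshold, then collect the linked groups."""
    parent = list(range(len(strings)))

    def find(i):
        # Follow parent links to the group's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            if similarity(strings[i], strings[j]) >= threshold:
                parent[find(j)] = find(i)

    families = {}
    for i, s in enumerate(strings):
        families.setdefault(find(i), []).append(s)
    return list(families.values())


# Hypothetical behavior strings for four entities.
families = group_into_families(["DGGPF", "DGPF", "SSSFS", "SSFS"])
```

Once entities are grouped this way, substrings shared by every member of a family are candidates for the motifs the abstract mentions.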
Index Terms
- An organic model for detecting cyber-events