Abstract
While conventional malware detection approaches increasingly fail, modern heuristic strategies often rely on dynamic analysis, which is infeasible in many applications because of the effort involved and the number of files to be processed.
Building on existing work from [1] and [2], we analyse an approach to the statistical malware detection of PE executables. One benefit is its simplicity (it evaluates 23 static features under moderate resource constraints), so it may be applicable to large numbers of files, e.g. for network operators or a posteriori analyses in archival systems. After identifying promising features and their typical values, we generate and evaluate a custom hypothesis-based classification model and a statistical classification approach using the WEKA machine learning tool [3]. The results of large-scale classifications are compared, showing that the custom hypothesis-based approach performs better on the chosen setup than the general-purpose statistical algorithms. In conclusion, malicious samples often have distinctive characteristics, so existing malware scanners can be effectively supported.
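To illustrate the general idea of a hypothesis-based classifier over static PE features, the following is a minimal sketch, not the model evaluated in the paper. The abstract does not list the 23 features or their typical values, so the three features, the hypotheses, and all thresholds below are assumptions chosen purely for illustration. Only the header layout (the `MZ` magic, the `e_lfanew` offset at 0x3C, the `PE\0\0` signature, and the 20-byte COFF file header) follows the PE format.

```python
import struct


def extract_features(data: bytes) -> dict:
    """Read a few static fields from the DOS and COFF headers of a PE image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE signature, stored at offset 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 24 or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF file header directly follows the 4-byte signature.
    (machine, num_sections, timestamp, _sym_ptr, _num_syms,
     _opt_size, _characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "pe_offset": pe_offset,
        "machine": machine,
        "num_sections": num_sections,
        "timestamp": timestamp,
    }


# Hypothetical hypotheses: each one flags a feature value assumed to be
# atypical for benign files. Thresholds are illustrative, not from the paper.
HYPOTHESES = {
    "very_few_sections": lambda f: f["num_sections"] <= 1,
    "zeroed_timestamp": lambda f: f["timestamp"] == 0,
    "late_pe_header": lambda f: f["pe_offset"] > 0x200,
}


def suspicion_score(features: dict) -> int:
    """Count how many hypotheses hold for the given feature values."""
    return sum(1 for test in HYPOTHESES.values() if test(features))


def is_suspicious(features: dict, threshold: int = 2) -> bool:
    """Flag a file when at least `threshold` hypotheses hold (assumed cut-off)."""
    return suspicion_score(features) >= threshold
```

Because this sketch only reads a few header bytes per file and never executes the sample, it reflects the kind of cheap static evaluation that makes large-scale offline scanning plausible. A real model would need feature thresholds derived from labelled benign and malicious samples.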
References
[1] Merkel, R.: Statistische Merkmale zur Anomaliedetektion in ausführbaren Dateien. Diploma thesis, Otto-von-Guericke-University of Magdeburg (2009)
[2] Hoppe, T., Merkel, R., Krätzer, C., Dittmann, J.: Statistische Schadcodedetektion in ausführbaren Dateien. In: D-A-CH Security 2009, Syssec (2009)
[3] Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
[4] Microsoft Corporation: Microsoft Portable Executable and Common Object File Format Specification, Revision 8.1 (2008)
[5] Cohen, F.: Computer Viruses - Theory and Experiments, publication for PhD thesis (1984), http://all.net/books/virus/index.html (last access March 2010)
[6] Szor, P.: The Art of Computer Virus Research and Defense. Symantec Press (2005)
[7] Skoudis, E., Zeltser, L.: Malware: Fighting Malicious Code, 2nd edn. Prentice-Hall, Englewood Cliffs (2004)
[8] Treadwell, S., Zhou, M.: A heuristic approach for detection of obfuscated malware. In: Proceedings of the 2009 IEEE ISI, Richardson, Texas, USA, June 08-11, pp. 291–299 (2009)
[9] Landwehr, N., Hall, M., Frank, E.: Logistic Model Trees. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) ECML 2003. LNCS (LNAI), vol. 2837, pp. 241–252. Springer, Heidelberg (2003)
[10] Borgelt, C., Timm, H., Kruse, R.: Probabilistic networks and fuzzy clustering as generalizations of naïve bayes classifiers. In: Reusch, B., Temme, K.-H. (eds.) Computational Intelligence in Theory and Practice, Heidelberg, Germany (2001)
[11] Quinlan, R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo (1993)
[12] Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Proc International Conference on Machine Learning. Morgan Kaufmann, San Francisco (1996)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Merkel, R., Hoppe, T., Kraetzer, C., Dittmann, J. (2010). Statistical Detection of Malicious PE-Executables for Fast Offline Analysis. In: De Decker, B., Schaumüller-Bichl, I. (eds) Communications and Multimedia Security. CMS 2010. Lecture Notes in Computer Science, vol 6109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13241-4_10
DOI: https://doi.org/10.1007/978-3-642-13241-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13240-7
Online ISBN: 978-3-642-13241-4
eBook Packages: Computer Science, Computer Science (R0)