ABSTRACT
Malware detection is a crucial aspect of software security. Current malware detectors work by checking for "signatures," which attempt to capture (syntactic) characteristics of the machine-level byte sequence of the malware. This reliance on a syntactic approach makes such detectors vulnerable to code obfuscations, increasingly used by malware writers, that alter syntactic properties of the malware byte sequence without significantly affecting its execution behavior. This paper takes the position that the key to malware identification lies in its semantics. It proposes a semantics-based framework for reasoning about malware detectors and proving properties such as soundness and completeness of these detectors. Our approach uses a trace semantics to characterize the behaviors of malware as well as the program being checked for infection, and uses abstract interpretation to "hide" irrelevant aspects of these behaviors. As a concrete application of our approach, we show that the semantics-aware malware detector proposed by Christodorescu et al. is complete with respect to a number of common obfuscations used by malware writers.
References
- B. Barak, O. Goldreich, R. Impagliazzo, S. Rudich, A. Sahai, S. Vadhan, and K. Yang. On the (im)possibility of obfuscating programs. In Advances in Cryptology (CRYPTO'01), volume 2139 of Lecture Notes in Computer Science, pages 1--18, Santa Barbara, CA, USA, Aug. 19--23, 2001. Springer Berlin/Heidelberg.
- D. Chess and S. White. An undetectable computer virus. In Proceedings of the 2000 Virus Bulletin Conference (VB2000), Orlando, FL, USA, Sept. 27--29, 2000. Virus Bulletin.
- S. Chow, Y. Gu, H. Johnson, and V. Zakharov. An approach to the obfuscation of control-flow of sequential computer programs. In G. Davida and Y. Frankel, editors, Proceedings of the 4th International Information Security Conference (ISC'01), volume 2200 of Lecture Notes in Computer Science, pages 144--155, Malaga, Spain, Oct. 1--3, 2001. Springer Berlin/Heidelberg.
- M. Christodorescu, S. Jha, S. A. Seshia, D. Song, and R. E. Bryant. Semantics-aware malware detection. In Proceedings of the 2005 IEEE Symposium on Security and Privacy (S&P'05), pages 32--46, Oakland, CA, USA, May 8--11, 2005. IEEE Computer Society.
- F. B. Cohen. Computer viruses: Theory and experiments. Computers and Security, 6:22--35, 1987.
- C. Collberg, C. Thomborson, and D. Low. A taxonomy of obfuscating transformations. Technical Report 148, Department of Computer Sciences, The University of Auckland, July 1997.
- C. Collberg, C. Thomborson, and D. Low. Manufacturing cheap, resilient, and stealthy opaque constructs. In Proceedings of the 25th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'98), pages 184--196, San Diego, CA, USA, Jan. 19--21, 1998. ACM Press.
- P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction of approximation of fixed points. In Proceedings of the 4th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'77), pages 238--252, Los Angeles, CA, USA, Jan. 17--19, 1977. ACM Press.
- P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In Proceedings of the 6th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'79), pages 269--282, San Antonio, TX, USA, Jan. 29--31, 1979. ACM Press.
- P. Cousot and R. Cousot. Systematic design of program transformation frameworks by abstract interpretation. In Proceedings of the 29th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'02), pages 178--190, Portland, OR, USA, Jan. 16--18, 2002. ACM Press.
- M. Dalla Preda and R. Giacobazzi. Control code obfuscation by abstract interpretation. In Proceedings of the 3rd IEEE International Conference on Software Engineering and Formal Methods (SEFM'05), pages 301--310, Koblenz, Germany, Sept. 5--9, 2005. IEEE Computer Society.
- M. Dalla Preda and R. Giacobazzi. Semantic-based code obfuscation by abstract interpretation. In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP'05), volume 3580 of Lecture Notes in Computer Science, pages 1325--1336, Lisboa, Portugal, July 11--15, 2005. Springer Berlin/Heidelberg.
- T. Detristan, T. Ulenspiegel, Y. Malcom, and M. S. von Underduk. Polymorphic shellcode engine using spectrum analysis. Phrack, 11(61): published online at http://www.phrack.org (last accessed on Jan. 16, 2004), Aug. 2003.
- S. Goldwasser and Y. T. Kalai. On the impossibility of obfuscation with auxiliary input. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05), pages 553--562, Washington, DC, USA, Oct. 22--25, 2005. IEEE Computer Society.
- A. Gupta and R. Sekar. An approach for detecting self-propagating email using anomaly detection. In G. Vigna, E. Jonsson, and C. Kruegel, editors, Proceedings of the 6th International Symposium on Recent Advances in Intrusion Detection (RAID'03), volume 2820 of Lecture Notes in Computer Science, pages 55--72, Pittsburgh, PA, USA, Sept. 8--10, 2003. Springer Berlin/Heidelberg.
- Intel Corporation. IA-32 Intel Architecture Software Developer's Manual.
- M. Jordan. Dealing with metamorphism. Virus Bulletin, pages 4--6, Oct. 2002.
- J. Kinder, S. Katzenbeisser, C. Schallhart, and H. Veith. Detecting malicious code by model checking. In K. Julisch and C. Krügel, editors, Proceedings of the 2nd International Conference on Intrusion and Malware Detection and Vulnerability Assessment (DIMVA'05), volume 3548 of Lecture Notes in Computer Science, pages 174--187, Vienna, Austria, July 7--8, 2005. Springer Berlin/Heidelberg.
- J. Z. Kolter and M. A. Maloof. Learning to detect malicious executables in the wild. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'04), pages 470--478, Seattle, WA, USA, Aug. 22--25, 2004. ACM Press.
- W.-J. Li, K. Wang, S. J. Stolfo, and B. Herzog. Fileprints: Identifying file types by n-gram analysis. In Proceedings of the 6th Annual IEEE Systems, Man, and Cybernetics (SMC) Workshop on Information Assurance (IAW'05), pages 64--71, West Point, NY, June 15--17, 2005. United States Military Academy.
- C. Linn and S. Debray. Obfuscation of executable code to improve resistance to static disassembly. In Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS'03), pages 290--299, Washington, DC, USA, Oct. 27--30, 2003. ACM Press.
- P. Morley. Processing virus collections. In Proceedings of the 2001 Virus Bulletin Conference (VB2001), pages 129--134, Prague, Czech Republic, Sept. 27--28, 2001. Virus Bulletin.
- C. Nachenberg. Computer virus-antivirus coevolution. Communications of the ACM, 40(1):46--51, Jan. 1997.
- Rajaat. Polymorphism. 29A Magazine, 1(3), 1999.
- Symantec Corporation. Symantec Internet Security Threat Report: Trends for January 06--June 06, volume X. Symantec Corporation, Sept. 25, 2006.
- P. Ször. The Art of Computer Virus Research and Defense. Addison-Wesley Professional, 2005.
- P. Ször and P. Ferrie. Hunting for metamorphic. In Proceedings of the 2001 Virus Bulletin Conference (VB2001), pages 123--144, Prague, Czech Republic, Sept. 27--28, 2001. Virus Bulletin.
- H. Wee. On obfuscating point functions. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC'05), pages 523--532, Baltimore, MD, USA, May 21--24, 2005. ACM Press.
- z0mbie. Automated reverse engineering: Mistfall engine. Published online at http://www.madchat.org//vxdevl/papers/vxers/Z0mbie/autorev.txt (last accessed on Sep. 29, 2006).
- z0mbie. Real permutating engine. Published online at http://vx.netlux.org/vx.php?id=er05 (last accessed on Sep. 29, 2006).
A semantics-based approach to malware detection
Proceedings of the 2007 POPL Conference