DOI: 10.1145/1190216.1190270
Article

A semantics-based approach to malware detection

Published: 17 January 2007

ABSTRACT

Malware detection is a crucial aspect of software security. Current malware detectors work by checking for "signatures," which attempt to capture (syntactic) characteristics of the machine-level byte sequence of the malware. This reliance on a syntactic approach makes such detectors vulnerable to code obfuscations, increasingly used by malware writers, that alter syntactic properties of the malware byte sequence without significantly affecting their execution behavior. This paper takes the position that the key to malware identification lies in their semantics. It proposes a semantics-based framework for reasoning about malware detectors and proving properties such as soundness and completeness of these detectors. Our approach uses a trace semantics to characterize the behaviors of malware as well as the program being checked for infection, and uses abstract interpretation to "hide" irrelevant aspects of these behaviors. As a concrete application of our approach, we show that the semantics-aware malware detector proposed by Christodorescu et al. is complete with respect to a number of common obfuscations used by malware writers.
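
The contrast drawn above between syntactic signatures and semantic matching can be illustrated with a toy sketch. This is not the paper's formal framework: the three-instruction language and the helper names `run`, `abstract`, `syntactic_match`, and `semantic_match` are all assumptions made for illustration. The sketch records a trace of program states, then applies a simple abstraction that keeps only chosen variables and collapses repeated states, which loosely mirrors the idea of "hiding" irrelevant aspects of a behavior.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy instructions: ("set", var, n), ("add", var, n), ("nop",)
Instr = Tuple
State = Tuple[Tuple[str, int], ...]


def run(program: List[Instr]) -> List[State]:
    """Execute a straight-line toy program and return its trace of states."""
    env: Dict[str, int] = {}
    trace: List[State] = [tuple(sorted(env.items()))]
    for ins in program:
        op = ins[0]
        if op == "set":
            env[ins[1]] = ins[2]
        elif op == "add":
            env[ins[1]] = env.get(ins[1], 0) + ins[2]
        elif op != "nop":
            raise ValueError(f"unknown instruction: {ins!r}")
        trace.append(tuple(sorted(env.items())))
    return trace


def abstract(trace: List[State], observed: Set[str]) -> List[State]:
    """Keep only the observed variables and collapse consecutive duplicates."""
    out: List[State] = []
    for state in trace:
        view = tuple((k, v) for k, v in state if k in observed)
        if not out or out[-1] != view:
            out.append(view)
    return out


def contains(haystack: list, needle: list) -> bool:
    """True if needle occurs as a contiguous sublist of haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def syntactic_match(program: List[Instr], malware: List[Instr]) -> bool:
    """Signature-style check: the malware instructions appear verbatim."""
    return contains(program, malware)


def semantic_match(program: List[Instr], malware: List[Instr],
                   observed: Set[str]) -> bool:
    """Trace-based check: the abstracted malware trace appears in the
    abstracted program trace."""
    return contains(abstract(run(program), observed),
                    abstract(run(malware), observed))
```

With `malware = [("set", "x", 1), ("add", "x", 2)]` and the variant `[("set", "x", 1), ("nop",), ("add", "x", 2)]`, `syntactic_match` returns `False` for the variant because the inserted `nop` breaks the verbatim sequence, while `semantic_match` with `observed={"x"}` returns `True` because the abstraction collapses the state repeated by the `nop`. Real obfuscations and real detectors are far richer than this; the sketch only shows why comparing abstracted behaviors can survive a transformation that defeats a byte-level signature.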

References

  1. B. Barak, O. Goldreich, R. Impagliazzo, S. Rudich, A. Sahai, S. Vadhan, and K. Yang. On the (im)possibility of obfuscating programs. In Advances in Cryptology (CRYPTO'01), volume 2139 of Lecture Notes in Computer Science, pages 1--18, Santa Barbara, CA, USA, Aug. 19--23, 2001. Springer Berlin/Heidelberg.
  2. D. Chess and S. White. An undetectable computer virus. In Proceedings of the 2000 Virus Bulletin Conference (VB2000), Orlando, FL, USA, Sept. 27--29, 2000. Virus Bulletin.
  3. S. Chow, Y. Gu, H. Johnson, and V. Zakharov. An approach to the obfuscation of control-flow of sequential computer programs. In G. Davida and Y. Frankel, editors, Proceedings of the 4th International Information Security Conference (ISC'01), volume 2200 of Lecture Notes in Computer Science, pages 144--155, Malaga, Spain, Oct. 1--3, 2001. Springer Berlin/Heidelberg.
  4. M. Christodorescu, S. Jha, S. A. Seshia, D. Song, and R. E. Bryant. Semantics-aware malware detection. In Proceedings of the 2005 IEEE Symposium on Security and Privacy (S&P'05), pages 32--46, Oakland, CA, USA, May 8--11, 2005. IEEE Computer Society.
  5. F. B. Cohen. Computer viruses: Theory and experiments. Computers and Security, 6:22--35, 1987.
  6. C. Collberg, C. Thomborson, and D. Low. A taxonomy of obfuscating transformations. Technical Report 148, Department of Computer Sciences, The University of Auckland, July 1997.
  7. C. Collberg, C. Thomborson, and D. Low. Manufacturing cheap, resilient, and stealthy opaque constructs. In Proceedings of the 25th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'98), pages 184--196, San Diego, CA, USA, Jan. 19--21, 1998. ACM Press.
  8. P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction of approximation of fixed points. In Proceedings of the 4th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'77), pages 238--252, Los Angeles, CA, USA, Jan. 17--19, 1977. ACM Press.
  9. P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In Proceedings of the 6th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'79), pages 269--282, San Antonio, TX, USA, Jan. 29--31, 1979. ACM Press.
  10. P. Cousot and R. Cousot. Systematic design of program transformation frameworks by abstract interpretation. In Proceedings of the 29th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'02), pages 178--190, Portland, OR, USA, Jan. 16--18, 2002. ACM Press.
  11. M. Dalla Preda and R. Giacobazzi. Control code obfuscation by abstract interpretation. In Proceedings of the 3rd IEEE International Conference on Software Engineering and Formal Methods (SEFM'05), pages 301--310, Koblenz, Germany, Sept. 5--9, 2005. IEEE Computer Society.
  12. M. Dalla Preda and R. Giacobazzi. Semantic-based code obfuscation by abstract interpretation. In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP'05), volume 3580 of Lecture Notes in Computer Science, pages 1325--1336, Lisboa, Portugal, July 11--15, 2005. Springer Berlin/Heidelberg.
  13. T. Detristan, T. Ulenspiegel, Y. Malcom, and M. S. von Underduk. Polymorphic shellcode engine using spectrum analysis. Phrack, 11(61): published online at http://www.phrack.org (last accessed on Jan. 16, 2004), Aug. 2003.
  14. S. Goldwasser and Y. T. Kalai. On the impossibility of obfuscation with auxiliary input. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05), pages 553--562, Washington, DC, USA, Oct. 22--25, 2005. IEEE Computer Society.
  15. A. Gupta and R. Sekar. An approach for detecting self-propagating email using anomaly detection. In G. Vigna, E. Jonsson, and C. Kruegel, editors, Proceedings of the 6th International Symposium on Recent Advances in Intrusion Detection (RAID'03), volume 2820 of Lecture Notes in Computer Science, pages 55--72, Pittsburgh, PA, USA, Sept. 8--10, 2003. Springer Berlin/Heidelberg.
  16. Intel Corporation. IA-32 Intel Architecture Software Developer's Manual.
  17. M. Jordan. Dealing with metamorphism. Virus Bulletin, pages 4--6, Oct. 2002.
  18. J. Kinder, S. Katzenbeisser, C. Schallhart, and H. Veith. Detecting malicious code by model checking. In K. Julisch and C. Krügel, editors, Proceedings of the 2nd International Conference on Intrusion and Malware Detection and Vulnerability Assessment (DIMVA'05), volume 3548 of Lecture Notes in Computer Science, pages 174--187, Vienna, Austria, July 7--8, 2005. Springer Berlin/Heidelberg.
  19. J. Z. Kolter and M. A. Maloof. Learning to detect malicious executables in the wild. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'04), pages 470--478, Seattle, WA, USA, Aug. 22--25, 2004. ACM Press.
  20. W.-J. Li, K. Wang, S. J. Stolfo, and B. Herzog. Fileprints: Identifying file types by n-gram analysis. In Proceedings of the 6th Annual IEEE Systems, Man, and Cybernetics (SMC) Workshop on Information Assurance (IAW'05), pages 64--71, West Point, NY, June 15--17, 2005. United States Military Academy.
  21. C. Linn and S. Debray. Obfuscation of executable code to improve resistance to static disassembly. In Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS'03), pages 290--299, Washington, DC, USA, Oct. 27--30, 2003. ACM Press.
  22. P. Morley. Processing virus collections. In Proceedings of the 2001 Virus Bulletin Conference (VB2001), pages 129--134, Prague, Czech Republic, Sept. 27--28, 2001. Virus Bulletin.
  23. C. Nachenberg. Computer virus-antivirus coevolution. Communications of the ACM, 40(1):46--51, Jan. 1997.
  24. Rajaat. Polymorphism. 29A Magazine, 1(3), 1999.
  25. Symantec Corporation. Symantec Internet Security Threat Report: Trends for January 06--June 06, volume X. Symantec Corporation, Sept. 25, 2006.
  26. P. Ször. The Art of Computer Virus Research and Defense. Addison-Wesley Professional, 2005.
  27. P. Ször and P. Ferrie. Hunting for metamorphic. In Proceedings of the 2001 Virus Bulletin Conference (VB2001), pages 123--144, Prague, Czech Republic, Sept. 27--28, 2001. Virus Bulletin.
  28. H. Wee. On obfuscating point functions. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC'05), pages 523--532, Baltimore, MD, USA, May 21--24, 2005. ACM Press.
  29. z0mbie. Automated reverse engineering: Mistfall engine. Published online at http://www.madchat.org//vxdevl/papers/vxers/Z0mbie/autorev.txt (last accessed on Sep. 29, 2006).
  30. z0mbie. Real permutating engine. Published online at http://vx.netlux.org/vx.php?id=er05 (last accessed on Sep. 29, 2006).

Published in

POPL '07: Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
January 2007
400 pages
ISBN: 1595935754
DOI: 10.1145/1190216

ACM SIGPLAN Notices, Volume 42, Issue 1
Proceedings of the 2007 POPL Conference
January 2007
379 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1190215

Copyright © 2007 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 824 of 4,130 submissions, 20%
