Elsevier

Computers & Security

Volume 115, April 2022, 102613
Enhancing malware analysis sandboxes with emulated user behavior

https://doi.org/10.1016/j.cose.2022.102613

Abstract

Cybersecurity teams widely use malware analysis sandboxes to investigate malware threats. In response, armored malware adopts various anti-sandbox techniques to evade analysis, ranging from simple detection of environment-specific traits to complex verification that a real user operates the environment. In particular, malware may identify sandbox environments by checking system artifacts that accumulate through normal user activities, such as file accesses. Defeating this type of anti-sandbox technique remains a challenge. In this paper, we design an emulation-based system called UBER to enhance malware analysis sandboxes. The core idea is to generate realistic system artifacts based on automatically derived user profile models. We solve two major challenges. First, we generate authentic system artifacts continuously to emulate real-user behaviors. Second, we integrate the generated artifacts stealthily to hide traces of the emulation. We implement a system prototype using Python system-event monitoring and automation control modules. Our experimental results demonstrate that UBER is capable of generating believable system artifacts and effectively mitigates sandbox evasion techniques that exploit system fingerprinting.

Introduction

As a growing volume of sophisticated malware adopts packing and obfuscation techniques (Cheng, Ming, Fu, Peng, Chen, Zhang, Marion, 2018, Sharif, Yegneswaran, Saidi, Porras, Lee, 2008), malware analysis sandbox systems have been widely used to help malware analysts dissect malicious functionality at a fine granularity by monitoring malware’s run-time behaviors (Analysis, Inc, 2021, McAfee). Meanwhile, malware developers have begun to develop countermeasures to circumvent sandboxes (Afianian et al., 2019). For example, malware can identify sandbox environments by checking system traits such as usernames, system settings, analysis instrumentation files, and installed drivers (Chen, Andersen, Mao, Bailey, Nazario, 2008, Kapravelos, Cova, Kruegel, Vigna, 2011, Keragala, 2016, Lindorfer, Kolbitsch, Comparetti, 2011). Moreover, advanced malware may evade sandboxes by performing timing attacks (Brengel, Backes, Rossow, 2016, Pék, Bencsáth, Buttyán, 2011), detecting CPU virtualization (Alwabel, Shi, Bartlett, Mirkovic, 2014, Martignoni, Paleari, Fresi Roglia, Bruschi, 2010, Paleari, Martignoni, Roglia, Bruschi, 2009), checking process introspection indicators (Blackthorne, Bulazel, Fasano, Biernat, Yener, 2016, Petsas, Voyatzis, Athanasopoulos, Polychronakis, Ioannidis, 2014), or even leveraging reverse Turing tests (Baird, Coates, Fateman, 2003, Yokoyama, Ishii, Tanabe, Papa, Yoshioka, Matsumoto, Kasama, Inoue, Brengel, Backes, et al., 2016).

Researchers have proposed various mitigation approaches such as binary modification (Vasudevan and Yerraballi, 2006), hiding environmental artifacts (Willems et al., 2007), path exploration (Branco et al., 2012), state modification (Kang et al., 2009), heterogeneous analysis (Xu, Kim, 2017, Yan, Jayachandra, Zhang, Yin, 2012), and bare-metal analysis (Spensky et al., 2016) to minimize or eliminate discrepancies between malware analysis sandboxes and real systems. However, it remains a challenge to defeat a new anti-sandbox technique that leverages system artifacts (i.e., the registry entries, the event logs, the browsing histories, and the cached files) accumulated by normal user operations to distinguish sandbox environments from real systems (Miramirkhani et al., 2017).

To address the lack of authentic system artifacts in existing sandbox designs, one straightforward approach is to construct the sandbox environments by directly cloning real user systems. However, this approach has some limitations (Miramirkhani et al., 2017). First, the system artifacts from a real user system may contain the user’s private information, which needs to be quarantined. It is challenging to thoroughly remove private user information while retaining the most authentic system artifacts. Second, the system artifacts in the cloned system quickly become outdated without continuous user interactions. It is time-consuming to clone the latest real user system and then conduct data cleaning for each round of malware analysis.

In this paper, we propose a new framework called User Behavior Emulator (UBER) that profiles authentic user activities to generate realistic system artifacts via user behavior emulation for sandbox environments. Our design is based on one basic observation: most malware is developed for mass attacks that do not aim at compromising a specific targeted computer. Therefore, it is plausible to derive a user behavior model from any ordinary real user and then simulate user activities according to the abstracted model. UBER consists of four components, namely, the computer usage collector, the user profile generator, the artifacts generator, and the update scheduler. The computer usage collector first leverages system-event monitoring to capture long-term computer usage information from real users. Then, the user profile generator performs statistical analysis on the collected usage information to construct a user behavior profile. Based on the user behavior profile, the artifacts generator continuously generates realistic system artifacts by emulating authentic user behavior via automation control techniques. Finally, the update scheduler periodically integrates the emulated system artifacts into the malware analysis sandbox to ensure "up-to-date" artifacts in the sandbox.
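As a rough illustration of how an artifacts generator could turn an abstracted profile into emulated activity, the Python sketch below samples one day of high-level actions in proportion to how often a profiled user performed them. The `UserProfile` class, its fields, and the action names are hypothetical and are not taken from the UBER prototype; a real generator would hand each sampled action to an automation control module instead of returning a list.

```python
import random
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Hypothetical abstracted profile (not the UBER data model)."""
    action_weights: dict   # high-level action name -> relative frequency
    actions_per_day: int   # how many actions to emulate per day


def plan_day(profile: UserProfile, seed: int) -> list:
    """Sample a day's sequence of high-level actions from the profile.

    A seeded generator keeps the plan reproducible for testing; a
    deployment would more likely use an unpredictable seed.
    """
    rng = random.Random(seed)
    actions = list(profile.action_weights)
    weights = [profile.action_weights[a] for a in actions]
    # Weighted sampling with replacement: frequent actions appear more often.
    return rng.choices(actions, weights=weights, k=profile.actions_per_day)


# Illustrative values only; these weights are assumptions, not measured data.
example_profile = UserProfile(
    action_weights={"browse_web": 5.0, "edit_document": 3.0, "open_file": 2.0},
    actions_per_day=20,
)
example_plan = plan_day(example_profile, seed=7)
```

Sampling at the level of actions rather than individual artifacts mirrors the idea that artifacts are a side effect of behavior: each emulated action is left to produce its own registry, log, and cache entries.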

Instead of directly modeling the patterns of various low-level system artifacts, we choose to emulate the high-level user behavior, which is the source of artifact generation. It is difficult to generate complete and consistent system artifacts directly; however, we can derive the user behavior from the collected system artifacts, similar to user profiling in the intrusion detection field (Peng et al., 2016). UBER collects comprehensive computer usage information, including system events and application logs, to construct user behavior profiles. To minimize privacy leakage, UBER only records the statistical characteristics (i.e., application usage times, file operations, and UI events) of the computer usage information. Moreover, UBER relies on statistical data from public websites (e.g., Alexa, Google Trends) to profile web browser activities, such as accessing top sites and searching common terms.
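The idea of keeping only statistical characteristics can be sketched as follows. This is a minimal illustration under assumed inputs: the event dictionary layout (`app`, `kind`, `duration_s`, `path`) is hypothetical and is not the format used by the UBER collector. The point is that the resulting profile holds totals and counts, while identifying details such as file paths are never copied into it.

```python
from collections import Counter, defaultdict


def build_profile(events):
    """Reduce raw usage events to aggregate statistics.

    Only per-application usage time and per-kind event counts are
    retained; any other field of an event (for example a file path)
    is deliberately ignored.
    """
    usage_seconds = defaultdict(float)
    event_counts = Counter()
    for event in events:
        usage_seconds[event["app"]] += event.get("duration_s", 0.0)
        event_counts[event["kind"]] += 1
    return {
        "usage_seconds": dict(usage_seconds),
        "event_counts": dict(event_counts),
    }


# Hypothetical raw events; the path values stand in for private data.
raw_events = [
    {"app": "editor", "kind": "file_write", "duration_s": 120.0, "path": "C:/Users/alice/tax.docx"},
    {"app": "editor", "kind": "file_read", "duration_s": 30.0, "path": "C:/Users/alice/notes.txt"},
    {"app": "browser", "kind": "ui_click", "duration_s": 45.0},
]
profile = build_profile(raw_events)
```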

We adopt a new deployment strategy in UBER: we perform user behavior emulation within an isolated always-on system (continuously running to accumulate user artifacts) and then copy this system to the malware analysis sandboxes on demand. This approach has three advantages over deploying the emulator directly in the malware analysis sandboxes. First, it prevents the emulation processes from being exploited by evasive malware as an indicator to identify sandboxes. In other words, it keeps our design much stealthier. Second, it avoids enlarging the attack surface of the sandboxes, in contrast to emulating user behavior directly in sandbox environments. Third, it prevents the emulation processes from competing for system resources and interfering with the analysis results. Because the system artifacts copied into the sandbox environments become obsolete without persistent user operations, UBER includes an update scheduler that keeps the artifacts in sandboxes up to date by performing the copy process regularly or on demand. Given that a malware analysis sandbox is usually rolled back to its initial state after each malware analysis (Yokoyama et al., 2016), UBER replaces the initial state with the always-on emulated system containing the most up-to-date artifacts.
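The "regularly or on demand" copy policy can be sketched as a small decision rule. The `UpdateScheduler` class below is a hypothetical illustration, not the UBER implementation: it only decides when the sandbox's initial state should be refreshed from the always-on system, and it leaves the actual copy step (for example, replacing a virtual machine snapshot) to the caller.

```python
from dataclasses import dataclass


@dataclass
class UpdateScheduler:
    """Hypothetical refresh policy for the sandbox's initial state."""
    max_age_s: float                     # refresh once the copy is this old
    last_sync_ts: float = float("-inf")  # never synced yet

    def needs_sync(self, now: float, on_demand: bool = False) -> bool:
        """Return True if the sandbox image should be refreshed."""
        # An explicit request always wins; otherwise compare the copy's age.
        return on_demand or (now - self.last_sync_ts) >= self.max_age_s

    def mark_synced(self, now: float) -> None:
        """Record that the always-on system was just copied to the sandbox."""
        self.last_sync_ts = now


# Timestamps are plain seconds so the example does not depend on wall-clock time.
scheduler = UpdateScheduler(max_age_s=3600.0)
first_check = scheduler.needs_sync(now=0.0)        # no copy exists yet
scheduler.mark_synced(now=0.0)
half_hour_check = scheduler.needs_sync(now=1800.0)
forced_check = scheduler.needs_sync(now=1800.0, on_demand=True)
hour_check = scheduler.needs_sync(now=3600.0)
```

Keeping the policy separate from the copy mechanism means the same rule could drive either a periodic job or a hook that runs before each analysis.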

We implement a prototype of UBER based on Python system-event monitoring and automation control modules (Hammond, Mangalapilly, Moses-palmer, Pywinauto, Selenium). To evaluate the effectiveness of the artifacts generated by UBER, we deploy the artifacts generator on a virtual machine with a fresh (newly installed) Windows OS as a sandbox and, for comparison, simultaneously operate a clone of the fresh virtual machine manually as a "real" system. After running these two systems for one month, we observe that both systems accumulate comparable overall amounts of system artifacts. We further explore the daily variation of artifact generation and find that UBER is able to simulate the artifact accumulation processes of real user systems. Finally, we leverage the state-of-the-art classifier provided by Miramirkhani et al. (2017) to verify the authenticity of the system deployed with UBER. The experimental results indicate that UBER can effectively generate realistic artifacts via the emulation of real user operations to defeat the anti-sandbox technique that leverages system artifacts. Moreover, UBER can be extended to analyze targeted malware that aims to compromise a specific target machine by automatically profiling user behavior models that represent specific user activities.

In summary, this paper makes the following contributions:

  • We design an emulation-based system called User Behavior Emulator (UBER) to enhance malware analysis sandboxes by generating realistic system artifacts based on an automatically derived user profile model.

  • We develop a new approach to create high-fidelity sandbox environments by emulating high-level user behavior in an isolated always-on system and then stealthily merging this system into the malware analysis sandboxes.

  • We implement a prototype of UBER, and our experimental results demonstrate the effectiveness of UBER in defeating sandbox evasion techniques that exploit system fingerprinting.

This paper is an extension of our conference paper (Feng et al., 2019), which has been published in the Proceedings of ICICS 2019. In this version, we improve the original approach in the following aspects:

  • Automating user profile generation: we redesign the collector to monitor comprehensive computer usage information automatically and derive user profiles from it, replacing the manual collection process.

  • Extending the system artifacts list: we design a new emulation strategy to complement 44 artifacts through user behavior emulation, instead of generating a small number of typical artifacts.

  • Reevaluating the prototype system: we enrich experiments to verify the effectiveness of our approach against user behavior artifacts detection for the sandbox.

  • Adding detailed discussion about this approach: we compare our solution with existing research and other alternate solutions of user artifacts generation.

This paper is structured as follows. Section 2 presents the security goal and threat model of UBER. Section 3 presents an overview of the UBER system design, and Section 4 presents the detailed implementation of the UBER prototype. Section 5 evaluates the effectiveness of UBER via experiments. Section 6 discusses the limitations and future directions of UBER. Section 7 outlines related work. Section 8 concludes this paper.

Threat model

In this paper, we focus on defeating evasion techniques in which malware checks for authentic system artifacts generated by normal user activities (Miramirkhani, Appini, Nikiforakis, Polychronakis, 2017, Yokoyama, Ishii, Tanabe, Papa, Yoshioka, Matsumoto, Kasama, Inoue, Brengel, Backes, et al., 2016). In a real user system, normal users perform various actions such as browsing websites, working in office software, and writing computer programs. All those actions can generate accumulated system

System design

In this section, we first present the overview architecture of the UBER system, then we introduce the design of each component.

Prototype implementation

We implement a prototype of UBER on Windows OS. The main reason is that the most effective sandbox detection solution (Miramirkhani et al., 2017), which checks wear-and-tear artifacts, was developed on Windows. Thus, we implement UBER on Windows to show that our sandbox system can defeat the detection mechanism proposed by Miramirkhani et al. (2017). The implementation architecture is shown in Fig. 4. UBER first leverages Python monitor modules including watchdog (Mangalapilly, 2021), pynput (

Effectiveness evaluation

We evaluate the defense effectiveness of UBER against user behavior artifacts detection. To the best of our knowledge, there is no publicly released malware that detects the sandbox with the technique introduced by Miramirkhani et al. (2017). Therefore, it is difficult to evaluate our solution using real malware. To overcome this challenge, we present the defense effectiveness of UBER in three aspects. First, we compare the long-term accumulated artifacts generated by the real user and

More fine-grained profiling

UBER generates the most commonly identified user artifacts by emulating the relevant activities. This approach does not apply to generating artifacts associated with proprietary, customized, or otherwise less popular software, which usually generates its own unique artifacts. If an attacker targets a machine that uses such software, this approach becomes less effective. To defeat the sandbox evasion technique that utilizes these software-specific artifacts, we could

Artifacts identification-based sandbox evasion techniques

Many researchers have studied sandbox evasion techniques and corresponding mitigation strategies. Afianian et al. (2019) present a detailed survey of existing sandbox evasion techniques and summarize possible countermeasures against evasive malware. Bulazel and Yener (2017) systematically review evasion techniques against automated dynamic malware analysis on PC, mobile, and web. Keragala (2016) provides an overview of commonly used evasion techniques

Conclusion

With the wide application of malware analysis sandboxes, malware authors have also developed sophisticated techniques to evade sandbox environments. One particular technique is to fingerprint various artifacts generated during the normal usage of a real system, which cannot be countered by state-of-the-art mitigation strategies. Given that existing malware analysis sandboxes are deployed on pristine operating system images, we investigate the typical system artifacts and propose a

CRediT authorship contribution statement

Songsong Liu: Conceptualization, Methodology, Software, Validation, Writing – review & editing. Pengbin Feng: Conceptualization, Methodology, Software, Validation, Writing – original draft. Shu Wang: Visualization, Writing – review & editing. Kun Sun: Supervision, Funding acquisition. Jiahao Cao: Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We would like to thank the authors (Miramirkhani et al., 2017) for helping verify the effectiveness of our approach and Tommy Chin for proofreading this paper. This work is partially supported by ONR grants N00014-16-1-3214, N00014-16-1-3216, and N00014-18-2893.

References (54)

  • R. Ferreira et al.

    Repositioning privacy concerns: web servers controlling URL metadata

    Journal of Information Security and Applications

    (2019)
  • P. Megyesi et al.

    User behavior based traffic emulator: a framework for generating test data for DPI tools

    Comput. Networks

    (2015)
  • J. Peng et al.

    User profiling in intrusion detection: a review

    Journal of Network and Computer Applications

    (2016)
  • A. Afianian et al.

    Malware dynamic analysis evasion techniques: a survey

    ACM Computing Surveys (CSUR)

    (2019)
  • A. Alwabel et al.

    Safe and automated live malware experimentation on public testbeds

    7th Workshop on Cyber Security Experimentation and Test (CSET 14)

    (2014)
  • Analysis, F. M., 2021. Safely execute and analyze malware in a secure environment. accessed in May....
  • H.S. Baird et al.

    PessimalPrint: a reverse Turing test

    Int. J. Doc. Anal. Recogn.

    (2003)
  • J. Blackthorne et al.

    AVLeak: fingerprinting antivirus emulators through black-box testing

    10th USENIX Workshop on Offensive Technologies (WOOT 16)

    (2016)
  • T.M. Braje

    Advanced tools for cyber ranges

    Technical Report

    (2016)
  • R.R. Branco et al.

    Scientific but not academical overview of malware anti-debugging, anti-disassembly and anti-vm technologies

    Black Hat

    (2012)
  • M. Brengel et al.

    Detecting hardware-assisted virtualization

    International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment

    (2016)
  • A. Bulazel et al.

    A survey on automated dynamic malware analysis evasion and counter-evasion: PC, mobile, and web

    Proceedings of the 1st Reversing and Offensive-oriented Trends Symposium

    (2017)
  • P.M. Cao et al.

    CAUDIT: Continuous auditing of SSH servers to mitigate brute-force attacks

    16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)

    (2019)
  • T. Chakraborty et al.

    Forge: a fake online repository generation engine for cyber deception

    IEEE Trans Dependable Secure Comput

    (2019)
  • X. Chen et al.

    Towards an understanding of anti-virtualization and anti-debugging behavior in modern malware

    2008 IEEE international conference on dependable systems and networks with FTCS and DCC (DSN)

    (2008)
  • B. Cheng et al.

    Towards paving the way for large-scale Windows malware analysis: Generic binary unpacking with orders-of-magnitude performance boost

    Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security

    (2018)
  • P. Dutta et al.

    Simulated user bots: Real time testing of insider threat detection systems

    2018 IEEE Security and Privacy Workshops (SPW)

    (2018)
  • P. Feng et al.

    UBER: Combating sandbox evasion via user behavior emulators

    ICICS

    (2019)
  • Hammond, M., 2021. pywin32. accessed in May....
  • F. Hassan et al.

    Utility-preserving privacy protection of textual documents via word embeddings

    IEEE Trans Knowl Data Eng

    (2021)
  • M.G. Kang et al.

    Emulating emulation-resistant malware

    Proceedings of the 1st ACM workshop on Virtual machine security

    (2009)
  • A. Kapravelos et al.

    Escape from monkey island: Evading high-interaction honeyclients

    International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment

    (2011)
  • D. Keragala

    Detecting malware and sandbox evasion techniques

    SANS Institute InfoSec Reading Room

    (2016)
  • D. Kirat et al.

    Malgene: Automatic extraction of malware analysis evasion signature

    Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security

    (2015)
  • D. Kirat et al.

    Barecloud: Bare-metal analysis-based evasive malware detection

    23rd USENIX Security Symposium (USENIX Security 14)

    (2014)
  • C. Kruegel

    Full system emulation: Achieving successful automated dynamic analysis of evasive malware

    Proc. BlackHat USA Security Conference

    (2014)
  • M. Lindorfer et al.

    Detecting environment-sensitive malware

    International Workshop on Recent Advances in Intrusion Detection

    (2011)

    Songsong Liu received his B.S. degree in Information Security from Wuhan University in 2012 and his M.S. degree in Information Security from Huazhong University of Science and Technology in 2015. He is currently a Ph.D. candidate in the Center for Secure Information Systems (CSIS) at George Mason University. His research interests include system security, moving target defense, and digital forensics.

    Pengbin Feng received his B.S. degree in computer science and technology from Xidian University in 2013 and his Ph.D. degree in computer architecture from Xidian University in 2019. He was a postdoctoral research fellow at George Mason University between 2019 and 2021. He is currently an Assistant Professor with the School of Cyber Engineering at Xidian University. His research interests include information flow analysis, malware detection, and binary analysis.

    Shu Wang received his B.S. degree in Communication Engineering in 2014 and his M.S. degree in Signal and Information Processing in 2017 from Nanjing University of Posts and Telecommunications. He is currently a Ph.D. student in the Center for Secure Information Systems (CSIS) at George Mason University. His research interests include software security, IoT security, and adversarial machine learning.

    Kun Sun received his Ph.D. degree in Computer Science at North Carolina State University in 2006. Now he is an Associate Professor in the Department of Information Sciences and Technology at George Mason University. He is also the director of Sun Security Laboratory. Before joining GMU, he was an Assistant Professor at the College of William and Mary. His research focuses on systems and network security. Dr. Sun has more than 15 years of working experience in both industry and academia, publishing over 80 conference and journal papers. His current research focuses on the trustworthy computing environment, moving target defense, software security, password management, and software-defined networking.

    Jiahao Cao received his B.Eng. degree in communication engineering from Beijing University of Posts and Telecommunications in 2015 and his Ph.D. degree in computer science and technology from Tsinghua University in 2020. He was a visiting student at George Mason University between 2018 and 2019. He is currently a Postdoctoral Researcher with the Department of Computer Science and Technology, Tsinghua University. His research interests include network traffic analysis, SDN security, and container security.

    1 Both authors contributed equally to this research.
