Enhancing malware analysis sandboxes with emulated user behavior
Introduction
As a growing number of sophisticated malware samples adopt packing and obfuscation techniques (Cheng, Ming, Fu, Peng, Chen, Zhang, Marion, 2018, Sharif, Yegneswaran, Saidi, Porras, Lee, 2008), malware analysis sandbox systems have been widely used to help malware analysts dissect malicious functionality in fine detail by monitoring malware’s run-time behavior (Analysis, Inc, 2021, McAfee). Meanwhile, malware developers have begun to develop countermeasures to circumvent sandboxes (Afianian et al., 2019). For example, malware can identify sandbox environments by checking system traits such as usernames, system settings, analysis instrumentation files, and installed drivers (Chen, Andersen, Mao, Bailey, Nazario, 2008, Kapravelos, Cova, Kruegel, Vigna, 2011, Keragala, 2016, Lindorfer, Kolbitsch, Comparetti, 2011). Moreover, advanced malware may evade sandboxes by performing timing attacks (Brengel, Backes, Rossow, 2016, Pék, Bencsáth, Buttyán, 2011), detecting CPU virtualization (Alwabel, Shi, Bartlett, Mirkovic, 2014, Martignoni, Paleari, Fresi Roglia, Bruschi, 2010, Paleari, Martignoni, Roglia, Bruschi, 2009), checking process introspection indicators (Blackthorne, Bulazel, Fasano, Biernat, Yener, 2016, Petsas, Voyatzis, Athanasopoulos, Polychronakis, Ioannidis, 2014), or even leveraging reverse Turing tests (Baird, Coates, Fateman, 2003, Yokoyama, Ishii, Tanabe, Papa, Yoshioka, Matsumoto, Kasama, Inoue, Brengel, Backes, et al., 2016).
Researchers have proposed various mitigation approaches such as binary modification (Vasudevan and Yerraballi, 2006), hiding environmental artifacts (Willems et al., 2007), path exploration (Branco et al., 2012), state modification (Kang et al., 2009), heterogeneous analysis (Xu, Kim, 2017, Yan, Jayachandra, Zhang, Yin, 2012), and bare-metal analysis (Spensky et al., 2016) to minimize or eliminate discrepancies between malware analysis sandboxes and real systems. However, it remains a challenge to defeat a new anti-sandbox technique that leverages system artifacts (i.e., the registry entries, the event logs, the browsing histories, and the cached files) accumulated by normal user operations to distinguish sandbox environments from real systems (Miramirkhani et al., 2017).
To address the lack of authentic system artifacts in existing sandbox designs, one straightforward approach is to construct the sandbox environments by directly cloning real user systems. However, this approach has some limitations (Miramirkhani et al., 2017). First, the system artifacts from a real user system may contain the user’s private information, which must be sanitized. It is challenging to thoroughly remove private user information while retaining the most authentic system artifacts. Second, the system artifacts in the cloned system become outdated quickly without continuous user interactions. It is time-consuming to clone the latest real user system and then clean its data for each round of malware analysis.
In this paper, we propose a new framework called User Behavior Emulator (UBER) that profiles authentic user activities to generate realistic system artifacts for sandbox environments via user behavior emulation. Our design is based on one basic observation: most malware is developed for mass attacks that do not aim at compromising a specific targeted computer. Therefore, it is plausible to build a user behavior model from any ordinary, authentic user and then simulate user activities according to that abstracted model. UBER consists of four components, namely, the computer usage collector, the user profile generator, the artifacts generator, and the update scheduler. The computer usage collector first leverages system-event monitoring to capture long-term computer usage information from real users. Then, the user profile generator performs statistical analysis on the collected usage information to construct a user behavior profile. Based on the user behavior profile, the artifacts generator continuously generates realistic system artifacts by emulating authentic user behavior via automation control techniques. Finally, the update scheduler periodically integrates the emulated system artifacts into the malware analysis sandbox to ensure "up-to-date" artifacts in the sandbox.
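The flow through the four components can be sketched roughly as follows. This is only a minimal illustration under simplifying assumptions, not the UBER implementation: the event format, the function names, and the dictionary used as "sandbox state" are all hypothetical stand-ins, and the real generator drives actual applications through automation control rather than sampling labels.

```python
import random
from collections import Counter


def collect_usage():
    """Computer usage collector (stand-in): observed (hour, action) events."""
    return [(9, "browse"), (9, "browse"), (10, "edit_document"),
            (14, "browse"), (15, "open_file"), (15, "edit_document")]


def generate_profile(events):
    """User profile generator: relative frequency of each observed action."""
    counts = Counter(action for _, action in events)
    total = sum(counts.values())
    return {action: count / total for action, count in counts.items()}


def generate_artifacts(profile, steps, seed=0):
    """Artifacts generator (stand-in): sample the actions to emulate."""
    rng = random.Random(seed)
    actions = sorted(profile)
    weights = [profile[a] for a in actions]
    return rng.choices(actions, weights=weights, k=steps)


def schedule_update(emulated_actions, sandbox_state):
    """Update scheduler (stand-in): merge emulated activity into the sandbox."""
    updated = dict(sandbox_state)
    for action in emulated_actions:
        updated[action] = updated.get(action, 0) + 1
    return updated


profile = generate_profile(collect_usage())
sandbox = schedule_update(generate_artifacts(profile, steps=20), {})
```

Sampling from the profile, rather than replaying the recorded events, reflects the idea of emulating an abstracted behavior model instead of one user's exact history.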
Instead of directly modeling the patterns of various low-level system artifacts, we choose to emulate the high-level user behavior, which is the source of artifact generation. It is difficult to generate complete and consistent system artifacts directly; however, we can derive the user behavior from the collected system artifacts, similar to user profiling in the intrusion detection field (Peng et al., 2016). UBER collects comprehensive computer usage information, including system events and application logs, to construct user behavior profiles. To minimize privacy leakage, UBER only records the statistical characteristics (i.e., application usage times, file operations, and UI events) of the computer usage information. Moreover, UBER relies on statistical data from public websites (e.g., Alexa, Google Trends) to profile web browser activities, such as accessing top sites and searching for common terms.
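To illustrate what recording only statistical characteristics might look like, the sketch below reduces raw usage events to aggregate counts. The event fields, the sample values, and the choice to keep the file extension are assumptions made for illustration; the source does not specify the record format.

```python
from collections import Counter
from pathlib import PureWindowsPath

# Hypothetical raw events; the paths contain private details.
RAW_EVENTS = [
    {"app": "winword.exe", "op": "modify",
     "path": r"C:\Users\alice\Documents\salary.docx", "seconds": 420},
    {"app": "winword.exe", "op": "create",
     "path": r"C:\Users\alice\Documents\notes.docx", "seconds": 180},
    {"app": "chrome.exe", "op": "create",
     "path": r"C:\Users\alice\Downloads\ticket.pdf", "seconds": 900},
]


def summarize(events):
    """Keep only aggregate statistics; drop file names and directories."""
    app_seconds = Counter()
    file_ops = Counter()
    for event in events:
        app_seconds[event["app"]] += event["seconds"]
        # Assumption: only the operation type and file extension are kept.
        extension = PureWindowsPath(event["path"]).suffix.lower()
        file_ops[(event["op"], extension)] += 1
    return {"app_seconds": dict(app_seconds), "file_ops": dict(file_ops)}


summary = summarize(RAW_EVENTS)
```

The resulting summary carries usage time per application and operation counts per file type, but none of the user-specific file names.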
We adopt a new deployment strategy in UBER: we perform user behavior emulation within an isolated always-on system (which runs continuously to accumulate user artifacts) and then copy this system to the malware analysis sandboxes on demand. This approach has three advantages over deploying the emulator directly in the malware analysis sandboxes. First, it prevents evasive malware from using the emulation processes as an indicator to identify sandboxes. In other words, it keeps our design much stealthier. Second, it avoids enlarging the attack surface of the sandboxes. Third, it prevents the emulation processes from competing for system resources and interfering with the analysis results. Because the system artifacts copied into the sandbox environments become obsolete without persistent user operations, UBER includes an update scheduler that keeps the artifacts in the sandboxes up to date by performing the copy process regularly or on demand. Given that a malware analysis sandbox is usually rolled back to its initial state after each malware analysis (Yokoyama et al., 2016), UBER replaces the initial state with the always-on emulated system containing the most up-to-date artifacts.
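The update logic described above can be sketched as follows. This is a simplified model under stated assumptions: the class name, its methods, and the use of a dictionary in place of a full system image are hypothetical, and a real deployment would copy a virtual machine image or snapshot instead.

```python
from datetime import datetime, timedelta


class UpdateScheduler:
    """Replaces the sandbox's initial state with the always-on system's state."""

    def __init__(self, interval):
        self.interval = interval      # regular copy period
        self.last_copy = None         # time of the most recent copy
        self.base_snapshot = None     # state the sandbox rolls back to

    def due(self, now, on_demand=False):
        """A copy is due on request, on first use, or once the interval elapsed."""
        if on_demand or self.last_copy is None:
            return True
        return now - self.last_copy >= self.interval

    def refresh(self, always_on_state, now, on_demand=False):
        """Copy the always-on state into the base snapshot if a copy is due."""
        if not self.due(now, on_demand):
            return False
        self.base_snapshot = dict(always_on_state)
        self.last_copy = now
        return True

    def rollback(self):
        """State restored after each malware analysis run."""
        if self.base_snapshot is None:
            raise RuntimeError("no snapshot has been copied yet")
        return dict(self.base_snapshot)
```

Because the sandbox always rolls back to `base_snapshot`, the emulator itself never has to run inside the analysis environment.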
We implement a prototype of UBER based on Python system-event monitor and automation control modules (Hammond, Mangalapilly, Moses-palmer, Pywinauto, Selenium). To evaluate the effectiveness of the artifacts generated by UBER, we deploy the artifacts generator on a virtual machine with a fresh (newly installed) Windows OS as a sandbox and, for comparison, simultaneously operate a clone of the fresh virtual machine by hand as a "real" system. After running these two systems for one month, we observe that both systems accumulate comparable overall amounts of system artifacts. We further explore the daily variation of artifact generation and find that UBER is able to simulate the artifact accumulation processes of real user systems. Finally, we leverage the state-of-the-art classifier provided by Miramirkhani et al. (2017) to verify the authenticity of a system deployed with UBER. The experimental results indicate that UBER can effectively generate realistic artifacts via the emulation of real user operations to defeat the anti-sandbox technique that leverages system artifacts. Moreover, UBER can be extended to analyze targeted malware that aims to compromise a specific target machine by automatically profiling user behavior models representing specific user activities.
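One simple way to compare the accumulation of an artifact on two systems is to look at the daily increments and at the relative gap between the final totals. The helper names below are hypothetical, and the two series are placeholder numbers chosen only to exercise the functions; they are not measurements from the paper.

```python
def daily_increments(cumulative):
    """Per-day growth derived from a series of cumulative artifact counts."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def relative_difference(a, b):
    """Gap between two totals as a fraction of the larger one."""
    largest = max(a, b)
    if largest == 0:
        return 0.0
    return abs(a - b) / largest


# Placeholder cumulative counts for one artifact (illustrative values only).
emulated = [100, 130, 170, 200]
manual = [100, 140, 165, 210]

emulated_growth = daily_increments(emulated)
manual_growth = daily_increments(manual)
final_gap = relative_difference(emulated[-1], manual[-1])
```

A small `final_gap` together with similar growth series would correspond to the "comparable amounts" and similar daily variation described in the text.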
In summary, this paper makes the following contributions:
- We design an emulation-based system called the User Behavior Emulator (UBER) to enhance malware analysis sandboxes by generating realistic system artifacts based on an automatically derived user profile model.
- We develop a new approach to create high-fidelity sandbox environments by emulating high-level user behavior in an isolated always-on system and then stealthily merging this system into the malware analysis sandboxes.
- We implement a prototype of UBER, and our experimental results demonstrate the effectiveness of UBER in defeating the sandbox evasion technique that relies on fingerprinting various system artifacts.
This paper is an extension of our conference paper (Feng et al., 2019), which has been published in the Proceedings of ICICS 2019. In this version, we improve the original approach in the following aspects:
- Automated user profile generation: we redesign the collector to monitor comprehensive computer usage information automatically and derive user profiles from it, instead of relying on a manual collection process.
- Extending the system artifacts list: we design a new emulation strategy to cover 44 artifacts through user behavior emulation, instead of generating a small number of typical artifacts.
- Reevaluating the prototype system: we enrich the experiments to verify the effectiveness of our approach against user behavior artifact detection for the sandbox.
- Adding a detailed discussion of this approach: we compare our solution with existing research and with alternative solutions for user artifact generation.
This paper is structured as follows. Section 2 presents the security goal and threat model of UBER. Section 3 presents the overall design of the UBER system, and Section 4 presents the detailed implementation of the UBER prototype. Section 5 evaluates the effectiveness of UBER via experiments. Section 6 discusses the limitations and future directions of UBER. Section 7 outlines related work. Section 8 concludes this paper.
Threat model
In this paper, we focus on defeating the malware-used evasion techniques of checking authentic system artifacts generated in normal user activities (Miramirkhani, Appini, Nikiforakis, Polychronakis, 2017, Yokoyama, Ishii, Tanabe, Papa, Yoshioka, Matsumoto, Kasama, Inoue, Brengel, Backes, et al., 2016). In a real user system, normal users perform various actions such as browsing websites, editing office software, and coding computer programs. All those actions can generate accumulated system
System design
In this section, we first present the overview architecture of the UBER system, then we introduce the design of each component.
Prototype implementation
We implement a prototype of UBER on Windows OS. The main reason is that the most effective sandbox detection solution (Miramirkhani et al., 2017), which checks wear-and-tear artifacts, was developed on Windows. Thus, we implement UBER on Windows to show that our sandbox system can defeat the detection mechanism proposed by Miramirkhani et al. (2017). The implementation architecture is shown in Fig. 4. UBER first leverages Python monitor modules including watchdog (Mangalapilly, 2021), pynput (
Effectiveness evaluation
We evaluate the defense effectiveness of UBER against user behavior artifact detection. To the best of our knowledge, there is no publicly released malware that detects the sandbox with the technique introduced by Miramirkhani et al. (2017). Therefore, it is difficult to evaluate our solution using real malware. To overcome this challenge, we present the defense effectiveness of UBER in three aspects. First, we compare the long-term accumulated artifacts generated by the real user and
More fine-grained profiling
UBER generates the most commonly identified user artifacts by emulating the relevant activities. This approach does not apply to generating artifacts associated with proprietary, customized, or otherwise less popular software. This type of software usually generates its own unique artifacts. If an attacker targets a machine that runs such software, this approach becomes less effective. To defeat the sandbox evasion technique that utilizes these software-specific artifacts, we could
Artifacts identification-based sandbox evasion techniques
Many researchers have studied sandbox evasion techniques and corresponding mitigation strategies. Afianian et al. (2019) present a detailed survey on existing sandbox evasion techniques and summarize possible countermeasures against evasive malware. Bulazel and Yener (2017) systematically review evasion techniques against automated dynamic malware analysis on PC, mobile, and web. Keragala (2016) provides an overview of commonly used evasion techniques
Conclusion
With the wide application of malware analysis sandboxes, malware authors have also developed sophisticated evasion techniques to evade sandbox environments. Among them, one particular technique is to fingerprint various artifacts generated during the normal usage of the real system, which cannot be countered via state-of-the-art mitigation strategies. Given existing malware analysis sandboxes deployed on pristine operating system images, we investigate the typical system artifacts and propose a
CRediT authorship contribution statement
Songsong Liu: Conceptualization, Methodology, Software, Validation, Writing – review & editing. Pengbin Feng: Conceptualization, Methodology, Software, Validation, Writing – original draft. Shu Wang: Visualization, Writing – review & editing. Kun Sun: Supervision, Funding acquisition. Jiahao Cao: Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We would like to thank the authors (Miramirkhani et al., 2017) for helping verify the effectiveness of our approach and Tommy Chin for proofreading this paper. This work is partially supported by ONR grants N00014-16-1-3214, N00014-16-1-3216, and N00014-18-2893.
Songsong Liu received his B.S. degree in Information Security from Wuhan University in 2012 and his M.S. degree in Information Security from Huazhong University of Science and Technology in 2015. He is currently a Ph.D. candidate in the Center for Secure Information Systems (CSIS) at George Mason University. His research interests include system security, moving target defense, and digital forensics.
References (54)
- et al. Repositioning privacy concerns: web servers controlling url metadata. Journal of Information Security and Applications (2019)
- et al. User behavior based traffic emulator: a framework for generating test data for dpi tools. Comput. Networks (2015)
- et al. User profiling in intrusion detection: a review. Journal of Network and Computer Applications (2016)
- et al. Malware dynamic analysis evasion techniques: a survey. ACM Computing Surveys (CSUR) (2019)
- et al. Safe and automated live malware experimentation on public testbeds. 7th Workshop on Cyber Security Experimentation and Test (CSET 14) (2014)
- Analysis, F. M., 2021. Safely execute and analyze malware in a secure environment. accessed in May....
- et al. Pessimalprint: a reverse turing test. Int. J. Doc. Anal. Recogn. (2003)
- et al. Avleak: fingerprinting antivirus emulators through black-box testing. 10th USENIX Workshop on Offensive Technologies (WOOT 16) (2016)
- Advanced tools for cyber ranges. Technical Report (2016)
- et al. Scientific but not academical overview of malware anti-debugging, anti-disassembly and anti-vm technologies. Black Hat (2012)
- Detecting hardware-assisted virtualization. International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment
- A survey on automated dynamic malware analysis evasion and counter-evasion: Pc, mobile, and web. Proceedings of the 1st Reversing and Offensive-oriented Trends Symposium
- CAUDIT: Continuous auditing of SSH servers to mitigate brute-force attacks. 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19)
- Forge: a fake online repository generation engine for cyber deception. IEEE Trans Dependable Secure Comput
- Towards an understanding of anti-virtualization and anti-debugging behavior in modern malware. 2008 IEEE international conference on dependable systems and networks with FTCS and DCC (DSN)
- Towards paving the way for large-scale windows malware analysis: Generic binary unpacking with orders-of-magnitude performance boost. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
- Simulated user bots: Real time testing of insider threat detection systems. 2018 IEEE Security and Privacy Workshops (SPW)
- Uber: Combating sandbox evasion via user behavior emulators. ICICS
- Utility-preserving privacy protection of textual documents via word embeddings. IEEE Trans Knowl Data Eng
- Emulating emulation-resistant malware. Proceedings of the 1st ACM workshop on Virtual machine security
- Escape from monkey island: Evading high-interaction honeyclients. International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment
- Detecting malware and sandbox evasion techniques. SANS Institute InfoSec Reading Room
- Malgene: Automatic extraction of malware analysis evasion signature. Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security
- Barecloud: Bare-metal analysis-based evasive malware detection. 23rd USENIX Security Symposium (USENIX Security 14)
- Full system emulation: Achieving successful automated dynamic analysis of evasive malware. Proc. BlackHat USA Security Conference
- Detecting environment-sensitive malware. International Workshop on Recent Advances in Intrusion Detection
Pengbin Feng received his B.S. degree in computer science and technology from Xidian University in 2013 and his Ph.D. degree in computer architecture from Xidian University in 2019. He was a postdoctoral research fellow at George Mason University between 2019 and 2021. He is currently an Assistant Professor with the School of Cyber Engineering at Xidian University. His research interests include information flow analysis, malware detection, and binary analysis.
Shu Wang received his B.S. degree in Communication Engineering in 2014 and his M.S. degree in Signal and Information Processing in 2017 from Nanjing University of Posts and Telecommunications. He is currently a Ph.D. student in the Center for Secure Information Systems (CSIS) at George Mason University. His research interests include software security, IoT security, and adversarial machine learning.
Kun Sun received his Ph.D. degree in Computer Science at North Carolina State University in 2006. Now he is an Associate Professor in the Department of Information Sciences and Technology at George Mason University. He is also the director of Sun Security Laboratory. Before joining GMU, he was an Assistant Professor at the College of William and Mary. His research focuses on systems and network security. Dr. Sun has more than 15 years of working experience in both industry and academia, publishing over 80 conference and journal papers. His current research focuses on the trustworthy computing environment, moving target defense, software security, password management, and software-defined networking.
Jiahao Cao received his B.Eng. degree in communication engineering from Beijing University of Posts and Telecommunications in 2015 and his Ph.D. degree in computer science and technology from Tsinghua University in 2020. He was a visiting student at George Mason University between 2018 and 2019. He is currently a Postdoctoral Researcher with the Department of Computer Science and Technology, Tsinghua University. His research interests include network traffic analysis, SDN security, and container security.
1. Both authors contributed equally to this research.