Domain Generation Algorithm Detection Utilizing Model Hardening Through GAN-Generated Adversarial Examples

Gould, Nathaniel; Nishiyama, Taishi; Kamiya, Kazunori

doi:10.1007/978-3-030-59621-7_5

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1271)

Included in the following conference series:

International Workshop on Deployable Machine Learning for Security Defense


Abstract

Modern malware families often utilize Domain Generation Algorithms (DGAs) to register addresses for their Command and Control (C&C) servers. Instead of hardcoding the address of the C&C domain in the malware, DGAs are used to frequently change the address of the C&C server, causing static detection methods, such as blacklists, to be ineffective. In response, DGA detection methods have been proposed which attempt to detect these DGA-produced domains in live traffic.
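As a purely illustrative sketch of the idea described above (not any real malware family's algorithm), a DGA can be thought of as a deterministic function of a shared seed and the current date. The function name `toy_dga`, the seed string, the hashing scheme, and the reserved `.example` suffix are all assumptions made for this example.

```python
import hashlib
from datetime import date


def toy_dga(seed, day, count=5, length=12, tld=".example"):
    """Derive `count` pseudo-random domain names from a seed and a date.

    Hypothetical teaching example: both sides of a connection can compute the
    same list for a given day, so no fixed address needs to be hardcoded.
    """
    domains = []
    for i in range(count):
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) to a letter in the range 'a'..'p'.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2020, 10, 18)):
        print(name)
```

Because the output changes whenever the date changes, a static blacklist built from one day's names says nothing about the next day's names, which is the weakness the abstract refers to.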

Previous research has investigated using domains generated by a Generative Adversarial Network (GAN) to improve a detection model's ability to detect unseen DGA variants. Building upon this concept, we conduct a similar experiment using an improved GAN and detection model. For the GAN, we train a Gradient Penalty Wasserstein GAN using benign domains as input to produce a set of generated domains that are difficult to differentiate from real domains. The resulting set of domains has characteristics, such as character distribution, that more closely resemble real domains than the sets produced in previous research. We then use these GAN-produced domains as additional examples of DGA domains to augment the training set for a DGA detection model. While previous research used a feature engineering approach, we use a deep learning detection model based on a convolutional neural network and long short-term memory, which had significantly higher hold-out detection rates for many DGA families. After training, we evaluate the model by comparing its detection rate on several holdout DGA families when trained with GAN augmentation against that of the same model trained on an unaugmented training set. GAN augmentation is shown to increase the detection rate of the classifier (at a standardized false positive rate) on certain DGA families. Further, unlike previous approaches, we conduct significance testing on the resulting detection rates to show more accurately the effect that adversarial hardening had on the model.
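The evaluation metric mentioned above, detection rate at a standardized false positive rate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function name `detection_rate_at_fpr`, the score convention (higher means "more likely DGA"), and the sample scores are all hypothetical.

```python
import math


def detection_rate_at_fpr(benign_scores, dga_scores, target_fpr=0.01):
    """Return (threshold, detection_rate) at a fixed false positive rate.

    The threshold is chosen from the benign scores so that at most
    `target_fpr` of benign domains score strictly above it; the detection
    rate is the fraction of DGA scores strictly above that same threshold.
    """
    benign = sorted(benign_scores)
    n = len(benign)
    # Number of benign domains we are allowed to flag (small epsilon guards
    # against floating-point error in the product).
    allowed = min(math.floor(target_fpr * n + 1e-9), n - 1)
    # Every benign score strictly above this value is a false positive.
    threshold = benign[n - allowed - 1]
    detected = sum(1 for s in dga_scores if s > threshold)
    return threshold, detected / len(dga_scores)


if __name__ == "__main__":
    benign = [i / 10 for i in range(10)]   # hypothetical classifier scores
    dga = [0.5, 0.85, 0.95, 0.99]          # hypothetical scores for one family
    print(detection_rate_at_fpr(benign, dga, target_fpr=0.1))
```

Fixing the threshold on benign traffic first means that the augmented and unaugmented models are compared at the same false positive budget, so a difference in detection rate is not just an artifact of one model flagging more domains overall.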



Author information

Authors and Affiliations

Georgia Institute of Technology, Atlanta, Georgia
Nathaniel Gould
NTT Secure Platform Laboratories, Tokyo, Japan
Taishi Nishiyama & Kazunori Kamiya

Corresponding author

Correspondence to Nathaniel Gould.

Editor information

Editors and Affiliations

University of Illinois at Urbana Champaign, Urbana, IL, USA
Gang Wang
Blue Hexagon Inc., Sunnyvale, CA, USA
Arridhana Ciptadi
Blue Hexagon Inc., Sunnyvale, CA, USA
Ali Ahmadzadeh

About this paper

Cite this paper

Gould, N., Nishiyama, T., Kamiya, K. (2020). Domain Generation Algorithm Detection Utilizing Model Hardening Through GAN-Generated Adversarial Examples. In: Wang, G., Ciptadi, A., Ahmadzadeh, A. (eds) Deployable Machine Learning for Security Defense. MLHat 2020. Communications in Computer and Information Science, vol 1271. Springer, Cham. https://doi.org/10.1007/978-3-030-59621-7_5

DOI: https://doi.org/10.1007/978-3-030-59621-7_5
Published: 18 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59620-0
Online ISBN: 978-3-030-59621-7
eBook Packages: Computer Science, Computer Science (R0)