Abstract
Malicious domains are widely used in network attacks. Time-based features of domains, which are DNS traffic features related to domain access time, can describe the regularity of malicious activities and cannot be circumvented by attackers, so they are commonly used in malicious domain detection. Traditional detection methods generally analyze time-based features through static statistical values and ignore the temporal regularity of the features, which results in inaccurate feature extraction. In view of this, we propose an analysis method for time-based features of malicious domains based on time series clustering. Firstly, density clustering is used to divide the time series into time intervals so that consecutive requests to a malicious domain are kept together. Secondly, multiple time-based features are selected to depict malicious activity patterns. Thirdly, dynamic time warping is applied as the series similarity measure and hierarchical clustering as the clustering method. The proposed method identifies malicious domains by analyzing the similarity between the time-based feature series of different domains. Experimental results show that, compared with the detection method using static statistical values, the proposed method improves accuracy from 88.49% to 96.30% and precision from 59.21% to 92.43%, which shows that it can help detect malicious domains effectively.
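The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, feature definitions, or linkage criterion, so everything concrete here is an assumption. In particular, `split_by_density` is a simplified one-dimensional stand-in for density clustering (it groups requests whose gaps are at most `eps` and drops sparse groups), `dtw_distance` uses an absolute-difference cost, `agglomerative` uses average linkage with a distance threshold, and the function names, thresholds, and toy feature series are all hypothetical.

```python
from itertools import combinations


def split_by_density(timestamps, eps, min_pts):
    """Group request timestamps into dense runs (assumed simplification of
    density clustering): consecutive gaps <= eps stay in the same run, and
    runs with fewer than min_pts requests are discarded as noise."""
    runs, current = [], []
    for t in sorted(timestamps):
        if current and t - current[-1] > eps:
            runs.append(current)
            current = []
        current.append(t)
    if current:
        runs.append(current)
    return [r for r in runs if len(r) >= min_pts]


def dtw_distance(a, b):
    """Classic dynamic time warping with absolute-difference cost,
    computed row by row in O(len(a) * len(b)) time."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[len(b)]


def agglomerative(series, threshold):
    """Average-linkage agglomerative clustering over DTW distances.
    Merging stops once the closest pair of clusters is farther apart
    than threshold. Returns clusters as lists of indices into series."""
    n = len(series)
    dist = {
        (i, j): dtw_distance(series[i], series[j])
        for i, j in combinations(range(n), 2)
    }

    def linkage(c1, c2):
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        (p, q), best = min(
            (
                ((p, q), linkage(clusters[p], clusters[q]))
                for p, q in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # p < q, so index p is unaffected
    return clusters


# Toy data: request timestamps (seconds) for one hypothetical domain.
intervals = split_by_density([0, 1, 2, 50, 100, 101, 102, 103], eps=5, min_pts=3)
print(intervals)  # [[0, 1, 2], [100, 101, 102, 103]]

# Toy per-interval feature series (e.g. request counts) for three domains.
feature_series = [
    [5, 5, 5, 5],     # domain 0: steady activity
    [5, 5, 6, 5, 5],  # domain 1: steady, slightly different length
    [1, 9, 1, 9],     # domain 2: bursty activity
]
groups = agglomerative(feature_series, threshold=3)
print(groups)  # [[0, 1], [2]]
```

In this toy run the two steady series end up in one cluster even though they differ in length, which is the property that motivates dynamic time warping over a pointwise distance. A practical version would more likely rely on an established DBSCAN and hierarchical clustering implementation than on these hand-written helpers.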
Acknowledgement
This work was supported by the National Key R&D Program of China (2020YFB1805601) and the Research and Planning Project for Higher Education Science of the China Association of Higher Education (22XX0403). The computing work in this paper was supported by the Public Service Platform of High Performance Computing of the Network & Computation Center of HUST.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yan, G., Wen, K., Hong, J., Liu, L., Zhou, L. (2023). An Analysis Method for Time-Based Features of Malicious Domains Based on Time Series Clustering. In: Yuan, L., Yang, S., Li, R., Kanoulas, E., Zhao, X. (eds) Web Information Systems and Applications. WISA 2023. Lecture Notes in Computer Science, vol 14094. Springer, Singapore. https://doi.org/10.1007/978-981-99-6222-8_29
DOI: https://doi.org/10.1007/978-981-99-6222-8_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6221-1
Online ISBN: 978-981-99-6222-8
eBook Packages: Computer Science, Computer Science (R0)