Digital Library of the European Council for Modelling and Simulation

 

Title:

Improving Clustering Of Web Bot And Human Sessions By Applying Principal Component Analysis

Authors:

Grazyna Suchacka

Published in:

ECMS 2019 Proceedings (2019). Edited by: Mauro Iacono, Francesco Palmieri, Marco Gribaudo, Massimo Ficco. European Council for Modeling and Simulation.

 

DOI: http://doi.org/10.7148/2019

 

ISSN: 2522-2422 (ONLINE)

ISSN: 2522-2414 (PRINT)

ISSN: 2522-2430 (CD-ROM)

 

33rd International ECMS Conference on Modelling and Simulation, Caserta, Italy, June 11th – June 14th, 2019

 

 

Citation format:

Grazyna Suchacka (2019). Improving Clustering Of Web Bot And Human Sessions By Applying Principal Component Analysis, ECMS 2019 Proceedings Edited by: Mauro Iacono, Francesco Palmieri, Marco Gribaudo, Massimo Ficco, European Council for Modeling and Simulation. doi: 10.7148/2019-0434

DOI:

https://doi.org/10.7148/2019-0434

Abstract:

The paper addresses the problem of modeling Web sessions of bots and legitimate users (humans) as feature vectors for use as input to classification models. So far, many different features for discriminating between bots’ and humans’ navigational patterns have been considered in session models, but very few studies have been devoted to feature selection and dimensionality reduction in the context of bot detection. We propose applying Principal Component Analysis (PCA) to develop improved session models based on predictor variables that are efficient discriminants of Web bots. The proposed models are used in session clustering, whose performance is evaluated in terms of the purity of the generated clusters. The efficiency of the proposed approach is experimentally verified using real server log data. Results show that PCA may be very efficient in dimensionality reduction and feature selection for session classification aimed at distinguishing Web robots.
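
The abstract evaluates clustering by the purity of the generated clusters. As an illustration only, the sketch below computes purity in its commonly used form: the fraction of sessions that belong to the majority class of their cluster. The function name `cluster_purity`, the toy cluster assignments, and the "bot"/"human" labels are hypothetical and are not taken from the paper, which may define or compute the measure differently.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, true_labels):
    """Return the fraction of items that match the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one session is required")

    # Count how often each true label occurs inside each cluster.
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        per_cluster[cluster][label] += 1

    # Add up the size of the majority class in every cluster.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: six sessions assigned to two clusters.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["bot", "bot", "human", "human", "human", "human"]
purity = cluster_purity(clusters, labels)
print(purity)
```

A purity of 1.0 means every cluster contains sessions of a single class. Because purity can be raised simply by producing more clusters, it is usually compared only across runs that use the same number of clusters.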

Full text: