An anomaly-based approach to the analysis of the social behavior of VoIP users

doi:10.1016/j.comnet.2013.02.009

Computer Networks

Volume 57, Issue 6, 22 April 2013, Pages 1545-1559

https://doi.org/10.1016/j.comnet.2013.02.009

Abstract

In this paper we present the results of a study we recently conducted by analyzing a large data set of VoIP Call Detail Records (CDRs) provided by an Italian telecom operator. The objectives of this study were twofold: (i) first, to provide a representation of users' behavior, as well as of their mutual interaction and communication patterns, that makes it possible to identify certain easily separable user categories; and (ii) second, to design and implement a framework that computes such a representation from CDRs, operates within given time constraints, and groups users using unsupervised techniques.

The paper shows how we can reliably identify behavioral patterns associated with the most common anomalous behaviors of VoIP users. It also exploits the expressive power of relational graphs in order to both validate the results of the unsupervised analysis and ease their interpretation by human operators.

Introduction

The foundation of the work described in this paper is the definition of a behavioral model for profiling users of a commercial VoIP (Voice over IP) service. A user profile consists of a set of user-related information, including identifiers, characteristics, preferences, as well as data related to activities that are relevant for understanding and predicting the user’s behavior in a certain domain.

User profiling is the continuous process of creating, maintaining and updating user profiles so that they can be exploited in different fields, e.g., marketing, provisioning of targeted services, and optimization tasks of various kinds.

As part of the profiling activity, clustering and segmentation of customers can help the service provider understand the main emerging behavioral attitudes it has to meet.

These considerations also hold for VoIP network and service providers.

In this field, it is very useful for telecom operators to gain knowledge about both global information regarding VoIP usage (e.g., in order to manage resource deployment) and more fine-grained details about users’ habits (e.g., in order to design tailored value-added ICT services). Moreover, user profiling can be profitably exploited for monitoring and protecting the security of VoIP platforms. As a matter of fact, several types of VoIP threats consist of malicious users (either human agents or automatic tools) issuing unsolicited and unwanted calls towards legitimate VoIP customers with the purpose of telemarketing, committing fraud, or, to some extent, causing denial of service. Indeed, a high number of unsolicited calls targeting legitimate users, in addition to raising the customers’ dissatisfaction with respect to the negotiated service level, can even result in network faults and service blocks. The literature refers to this kind of attack as a “social threat”, due to the direct contact the attacker establishes with the victim and to the exploitation of the “social” implications of the usage of VoIP.

In this paper we aim to show how it is possible to capture different families of VoIP callers’ behaviors by means of a profiling system based on a well-designed user behavior model. In particular, we show how we succeed in isolating anomalous behaviors that are potentially malicious and therefore deserve the attention of the security manager. The study is conducted on real-world data, in the form of VoIP Call Detail Records (CDRs), made available to us by an Italian telecom operator in the framework of a sponsored research project called CAST (Countering Attacks and Social Threats in VoIP Networks). The profiling system we describe herein has been developed in the context of this project, so we will refer to it as the CAST system.

The rest of the paper is structured as follows. In Section 2 we present a brief taxonomy of the most well-known security threats currently affecting VoIP networks, with a focus on social threats. In Section 3 we present the current state of the art in the field of VoIP security, with an eye on the most interesting proposals related to the detection of social attacks. Section 4 presents an overview of the CAST architecture.

The main details associated with its implementation are provided in Section 5, whereas Section 6 illustrates the results we obtained by applying the CAST profiler to the analysis of the real data about VoIP calls which were made available to us. As a further contribution to the analysis, we also provide in Section 7 more information about the application of the theory of social graphs to the analyzed data set. Finally, Section 8 summarizes the main results of our current efforts and indicates the most interesting directions of our future work.

Section snippets

VoIP security threats

VoIP (Voice over IP) communications are becoming increasingly widespread, thanks to their positive impact on both capital and operational expenses. However, the possibility of leveraging the existing Internet infrastructure in order to provide this kind of service also entails a number of new challenges that the operators must face, with special regard to the unavoidable issues associated with the exploitation of ‘open’ solutions, both at the architectural and at the protocol layer.

State of the art in social attacks detection

In this section we briefly present the most interesting research studies aimed at countering SPIT [3]. As will become apparent in the following, most of the literature embraces approaches which belong to the wide field of Intrusion Detection, ranging from artificial intelligence algorithms to the exploitation of the typical parameters characterizing VoIP networks, through the study of the connections of the social network associated with this kind of communication infrastructure. This brief

CAST approach to behavior profiling

CAST (Countering Attacks and Social Threats in VoIP Networks¹) mainly focuses on social aspects associated with user interaction patterns, with the final goal of extracting the typical profiles and thus being capable of identifying potentially anomalous behaviors in the analyzed data. This objective can be achieved through the application of artificial intelligence techniques allowing for the classification of the multiple instances

CAST implementation

Given the general architecture described above, in the following section we will focus on the details related to the critical aspects we had to consider during the implementation of the system, namely feature selection and interpretation of the output of the classification process through the application of the theory of social graphs.
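As a rough illustration of what a feature selection step over CDRs might produce, the following minimal sketch turns raw call records into one feature vector per caller. The record fields (`caller`, `callee`, `start`, `duration`) and the four features are hypothetical choices made for this example; the snippet above does not list the features actually used by CAST.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CDR:
    """One call detail record; the field names are illustrative assumptions."""
    caller: str
    callee: str
    start: float     # seconds since an arbitrary epoch
    duration: float  # seconds; 0 is taken to mean the call was not answered


def caller_features(cdrs):
    """Return {caller: feature dict} computed over the whole observation window."""
    by_caller = defaultdict(list)
    for record in cdrs:
        by_caller[record.caller].append(record)

    features = {}
    for caller, calls in by_caller.items():
        answered = [c for c in calls if c.duration > 0]
        features[caller] = {
            # Overall activity of the caller.
            "calls": len(calls),
            # How many different users the caller tried to reach.
            "distinct_callees": len({c.callee for c in calls}),
            # Share of calls that were actually answered.
            "answered_ratio": len(answered) / len(calls),
            # Average length of answered calls (0.0 if none were answered).
            "mean_duration": (
                sum(c.duration for c in answered) / len(answered)
                if answered else 0.0
            ),
        }
    return features
```

Calling `caller_features` on a list of `CDR` objects yields a dictionary that can be flattened into numeric vectors for a clustering step. Features of this kind are chosen here only because a caller with many calls, many distinct callees and short answered calls is intuitively different from an ordinary subscriber.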

CAST in action: user profiling based on social habits

In this section we will finally see CAST in the field, by showing the results of the analysis of a very large data set. More precisely, as announced in the previous section, after a description of the main patterns embedded in the input data, we will first illustrate the clustering results and then assess their validity through a comparison with the most representative charts produced by the graphical analysis of the data set.
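To make the idea of grouping users with an unsupervised technique concrete, here is a minimal sketch of k-means (Lloyd's algorithm) applied to per-user feature vectors. The snippet above does not name the clustering algorithm; k-means is used here for illustration, in line with how one of the citing works listed further down describes the approach. The function name, the seeding with the first `k` points and the sample vectors are assumptions of this example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        # (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical (calls per day, distinct callees per day) vectors.
users = [(1.0, 1.0), (50.0, 40.0), (1.2, 0.9), (52.0, 41.0), (0.8, 1.1)]
labels, centroids = kmeans(users, k=2)
```

In practice the features would normally be rescaled before clustering so that no single one dominates the distance, and small clusters lying far from the bulk of the users would be the natural candidates for closer inspection.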

An in-depth analysis based on relational graphs

As we mentioned earlier, relational graphs highlighting the relationships among callers and callees within a predefined observation window are a very useful analysis tool. A relational graph is a graph where each node represents a user and each edge represents a relation (a call, in this context) between users. As an example, such graphs can help identify at a glance the presence of so-called hub nodes, i.e., those users attracting the highest number of calls, both
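The relational graph defined above can be sketched in a few lines: nodes are users, a directed edge from caller to callee carries the number of calls, and hub candidates are the users who receive the most calls. The function names and the `(caller, callee)` input format are assumptions made for this example, and ranking by received calls only is a simplification, since the snippet is cut off before completing the definition of a hub.

```python
from collections import defaultdict


def build_call_graph(calls):
    """Build a directed graph as {caller: {callee: number_of_calls}}."""
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee in calls:
        graph[caller][callee] += 1
    return graph


def received_calls(graph):
    """Count, for every user, the calls received within the window."""
    received = defaultdict(int)
    for callees in graph.values():
        for callee, count in callees.items():
            received[callee] += count
    return received


def hubs(graph, top=1):
    """Return the `top` users by received calls (ties broken by name)."""
    received = received_calls(graph)
    return sorted(received, key=lambda user: (-received[user], user))[:top]


# Hypothetical observation window with five calls.
window = [("a", "x"), ("b", "x"), ("c", "x"), ("a", "y"), ("a", "y")]
graph = build_call_graph(window)
top_users = hubs(graph, top=2)
```

The same adjacency structure also gives the number of distinct callees per caller (`len(graph[caller])`), which is the kind of fan-out a human operator would look for when inspecting the graph of a suspicious cluster.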

Conclusions and future work

The work documented herein has been devoted to defining, implementing and testing a framework and a user behavior model capable of exposing typical and anomalous behaviors in operational VoIP platforms. The analysis has been conducted on 2 months of real VoIP traffic data in the form of anonymized “call detail records” provided by an Italian telecom operator, and by exploiting the CAST system, developed with the goal of detecting social attacks by leveraging both clustering algorithms and anomaly

Simonpietro Chiappetta received both his B.Sc. and M.Sc. degrees in Computer Engineering from the University of Napoli “Federico II” in 2007 and 2011, respectively. For his thesis, he cooperated with the research group in Computer Networks at the University of Napoli and worked on the CAST project for the protection of VoIP infrastructures from social attacks. He is currently looking for new professional challenges in Dublin, Ireland.

References (17)

H. Kim et al., DEVS-based modeling of VoIP spam callers’ behavior for SPIT level calculation, Simulation Modelling Practice and Theory (2009)
C. Mazzariello et al., Clustering NGN user behavior for anomaly detection, Information Security Technical Report (2011)
V.S. Alliance, VoIP Security and Privacy Threat Taxonomy (2005)
K. Mitnick, Ghost in the Wires: My Adventures as the World’s Most Wanted Hacker (2011)
A. Keromytis, A comprehensive survey of VoIP security research, IEEE Communications Surveys & Tutorials PP (2011)
C. Porschmann, H. Knospe, Analysis of Spectral Parameters of Audio Signals for the Identification of Spam Over IP...
J. Quittek, S. Niccolini, S. Tartarelli, M. Stiemerling, M. Brunner, T. Ewald, Detecting SPIT Calls by Checking Human...
R. Dantu, P. Kolan, Detecting Spam in VoIP Networks, in: Proceedings of the Steps to Reducing Unwanted Traffic on the...

There are more references available in the full text version of this article.

Cited by (15)

Socioscope: I know who you are, a robo, human caller or service number
2020, Future Generation Computer Systems
Citation Excerpt :
Li et al. [32] use 29 features along with the machine learning algorithms to predict whether the subscriber is a legitimate user or a spammer. Chiappetta et al. [33] used an unsupervised clustering algorithm i.e., the K-Means algorithm to group users based on the behavioral model. Collaborative approaches have been proposed [34,35] where the number of telecommunication operators collaborates for improving the detection time and the detection rate.
Rapid detection of spammers through collaborative information sharing across multiple service providers
2019, Future Generation Computer Systems
Citation Excerpt :
In a real network scenario, it is possible that spammers have different calling behaviors. In this simulation setup, we evaluate the performance of COSDS system for three types of callers [37]. (1) callers calling a large number of unique callees, all their successful calls have a good call duration, but callers do not receive calls from their callees.
Early identification of spammers through identity linking, social network and call features
2017, Journal of Computational Science
Citation Excerpt :
In this section, we compute reputation of an individual (after linking his identities) and classify individuals as spammer if global reputation score of individual is less than automated threshold (Section 4.3). We generated a synthetic data-set for spam and non-spam users using the approach presented in [5,53]. However, in this simulation setup we changed identities of spammers over the time for different overlaps in a friendship network.
Kerberos: A real-time fraud detection system for IMS-enabled VoIP networks
2017, Journal of Network and Computer Applications
VoIP security auditing model based on COBIT 4.1
2022, International Journal of Security and Networks
Privy: Privacy Preserving Collaboration Across Multiple Service Providers to Combat Telecom Spams
2020, IEEE Transactions on Emerging Topics in Computing


Claudio Mazzariello received a degree in Telecommunications Engineering in 2004, and a Ph.D. in Computer Engineering in 2007, both from the University of Napoli Federico II, Italy. He is currently a PostDoc at the Computer Science Department of the University of Napoli Federico II. His research interests fall in the fields of networking and artificial intelligence, focussing on network security, intrusion and botnet detection, and multiple classifier systems. He is currently involved in several research projects concerning the design of intelligent and cooperating solutions for network security.

Roberta Presta received both her B.Sc. and M.Sc. degrees in Telecommunications Engineering from the University of Napoli “Federico II” in 2005 and 2009, respectively. She is currently a Ph.D. student in Computer Engineering and Systems at the Computer Science Department of the University of Napoli “Federico II”. Her research interests primarily fall in the field of networking, with special regard to real-time multimedia architectures and network security.

Simon Pietro Romano received the degree in Computer Engineering from the University of Napoli “Federico II”, Italy, in 1998. He obtained a Ph.D. degree in Computer Networks in 2001. He is currently an Assistant Professor at the Computer Science Department of the University of Napoli.

His research interests primarily fall in the field of networking, with special regard to real-time multimedia applications, network security and autonomic network management. He is currently involved in a number of European research projects, whose main objective is the design and implementation of effective solutions for Critical Infrastructure Protection. He actively participates in IETF standardization activities, in the RAI (Real-time Applications and Infrastructure) area, where he chairs the SPLICES working group on loosely coupled SIP devices.
