Classification of Mobile Customers Behavior and Usage Patterns using Self-Organizing Neural Networks

Mobile usage is witnessing booming growth, attributed to advances in smartphone technologies, an extremely high penetration rate, and the availability of popular mobile applications. As a result, telecommunication markets have been making huge investments to meet the sheer demand for wireless networks and mobile services. Such potential highlights the importance of behavioral segmentation of mobile network users, which allows operators to target different sectors of customers with efficient marketing strategies and to ensure customer retention in light of intense competition. A major hurdle in applying this approach is the number of dimensions underlying customer preferences, which makes it hard to visualize similarities among customers and to formulate behavioral segments correctly and efficiently. In this paper, we use self-organizing maps to detect different usage patterns of mobile users. The proposed system is tested on a large sample of customer data provided by a major mobile operator in Jordan. The study detects different behavioral segments in this market and highlights the role of data users in modern mobile markets. In this context, we give a detailed analysis of our results on user behavioral segmentation.


I. INTRODUCTION
Recent years have witnessed booming growth in the mobile industry, fueled by a high penetration rate, advances in smartphone technologies, and the availability of mobile applications, some of them indispensable, that address a wide spectrum of sectors such as entertainment, gaming, business and banking transactions, reservations, Internet access, and many others. Mobile technologies have also proven to be an effective and efficient tool in development initiatives, as in [1], [2], and [3]. The popularity of mobile technologies has attracted huge investments, and as a result markets are fast-paced, with intense competition between service providers. Besides being inherently wide, the customer base of such markets is heterogeneous: customers have different service requirements, which lead to different usage patterns. Nowadays, one important concern of mobile service providers is customer retention through loyalty and customer satisfaction. This highlights the importance of effective marketing strategies that target different sectors of customers with the services that suit them and maximize their satisfaction.
One strategy in this regard is market segmentation, which refers to the process of designating groups of customers that are homogeneous within (that is, have common preferences) to be targeted with appropriate products and services [4]. A major hurdle in applying this approach is the number of dimensions underlying customer preferences, which makes it hard to visualize similarities among behaviors and formulate segments. A tool that can be effective in this regard is Kohonen's Self-Organizing Map (SOM), a neural-network-based approach that maps a multidimensional input space to a one- or two-dimensional space that can be visualized [5]. This approach helps perform segmentation while preserving topological relations among the different elements of the space. Unlike other dimension reduction techniques, such as component and factor analysis, a salient feature of SOM is that it makes no assumptions about the underlying distribution [6].
The main advantage of SOM is its capability as a classification method. Typically, its main output relates the generated clusters to one another and exposes hidden relations between the different classification attributes. In particular, SOM can generate a list of non-correlated factors from data in which such structure is not initially expressed. Thus, SOM can uncover different patterns in complex data [7,8]. Recent work shows how temporal analysis can be used with evolutionary SOM [9]. This paper provides a case study on customer segmentation in mobile telecommunication markets using SOM. The study is performed on a rich sample of customers' mobile usage data provided by a major mobile operator in Jordan. The importance of this study is twofold. First, it is based on real data rather than on surveys or artificially simulated data, which might not sample the actual behavior of customers accurately. Second, the study helps identify behavioral segments of mobile customers in a geographical region where in-depth market studies are scarce. In fact, it is common to target customers based on demographic factors rather than behavioral profiles, which can lead to mismatches between customers and optimal offers [10].
The study is performed on a data set that describes the behavioral profiles of a sample of more than 6000 customers. Each profile is composed of 10 attributes that describe both voice and data usage. For example, we consider attributes such as the number and duration of outgoing and incoming calls. We also consider attributes that cover on-net and off-net calling behavior. We use SOM to categorize customers according to this set of attributes. Our results show 6 different segments, among which data users constitute by far the largest, an indication of the importance of data services in modern mobile telecommunication markets.
In this study, we also attempt to assess the value of each segment to the operator from a business standpoint. To quantify this value, we use the average loyalty rate of the customers in each segment. The loyalty rate of each customer is provided by the service provider. The rates are computed according to usage criteria such as the customer's subscription period, revenue per month, and accumulated revenue, on a scale in which 1 represents the minimum loyalty rate and 10 the maximum. The loyalty rate of each customer is recalculated each month. We average each customer's rate over six months and compare the averages against our segmentation results. We show that data users, while forming the largest segment, have the lowest loyalty rate. This counterintuitive result highlights inefficiencies in marketing strategies that can overlook such a wide segment.
The rest of this paper is organized as follows. In section 2, we provide a background on applying SOM for data analysis. We also compare our study to some recent work that has been done in the area of wireless mobile markets. In section 3, we describe the technique of SOM and the different methods for visualizing clusters obtained by this technique. In section 4, we describe our study and the procedure followed in training our data set and formulating segments. In section 5, we shed light on the value of our results by interpreting certain characteristics of the obtained segments. The paper concludes in section 6.

A. Market Segmentation
Market segmentation clusters customers based on their usage behavior, which allows better marketing strategies. In fast-paced and highly competitive markets, business intelligence has become an indispensable tool for drafting market strategies, with the objective of maintaining an accurate and up-to-date market segmentation. Inaccurate segmentation could have a negative impact on operations. Motivated by this, we address the problem of recognizing and classifying users based on their usage behavior. It is worth mentioning that this behavior is usually not correlated with their demographic profiles [11].
There have been many research efforts in the literature that address the automatic recognition of user behavior. Hanafizadeh and Mirzazadeh [12] compare and review clustering methods used for market segmentation. Their comparison makes clear that the self-organizing map has been used for markets with multidimensional demands [13] and does not require any prior knowledge about the number and nature of the clusters. On the other hand, methods such as K-means and fuzzy clustering have been used in the online shopping market; they need the number of clusters to be defined in advance and, unlike SOM, do not give any relation between clusters. In previous work, we examined a fuzzy logic approach to understanding user behavior [27].

B. Data Analysis and Clustering Techniques
There is a wide range of applications where SOM can be applied for data analysis and interpretation. Applications include, but are not limited to, speech recognition [14], medical imaging analysis [15], and system control [16]. However, the use of SOM for market segmentation, especially for telecommunication markets, remains a relatively new area of research with high potential due to the steep demand for mobile telecommunication services. In fact, it is predicted that the market for mobile and connected devices will nearly double to reach $847 billion by 2016 [17].
Multiple market segmentation algorithms have been proposed in the literature. Among the most popular are K-means, Kohonen Vector Quantization (KVQ), and SOM. These methods work by iteratively updating seeds that become representatives of clusters, and they are known to handle large data sets [18]. Balakrishnan et al. [19] compare several clustering methods and show that, for uniformly distributed data, K-means outperforms KVQ in classification accuracy. Kiang et al. [20] apply a modified SOM technique, called extended SOM, in which nodes are grouped into a fixed number of clusters. The technique is applied to a data set of AT&T customers, and it is shown to outperform K-means in uncovering market segments.
One main challenge in any clustering problem is determining the correct number of clusters, and this can be addressed using SOM. The method detects the different patterns in the data, and visualizing the resulting map allows a suitable number of clusters to be identified [21,22,23].

C. Telecommunication market behavioral segmentation
We consider behavioral segmentation of the telecommunication market to be a complex nonlinear problem with hidden relationships among a large set of parameters. It is practically impossible to know the relationships between these factors in advance. Based on a heuristic evaluation in the context of self-organizing relationships, network patterns can emerge from a training process and can then be observed and analyzed. Furthermore, once the training phase is finished, the trained system can be applied to a new data set to observe changes in behavior. The objective is to apply these techniques to different market data sets and to compare the results, as SOM is also known to have relevant properties when it is used in a supervised context [24].
The main contribution of this work is the construction of a self-organizing map from the data. The map allows us to analyze the major factors that contribute to the emergence of clusters. A customer segmentation is then produced and related to a loyalty rate.
Such analysis is relevant to strategic planners, who should be aware of the main factors behind cluster formation. This allows them to draw up better marketing plans adapted to users' needs. The method is particularly relevant to telecommunication customers because their usage depends heavily on nonlinear factors with complex relations.
Jordan has the most deregulated telecommunications market in the MENA (Middle East and North Africa) region [21]. This is the main reason that one of the major mobile operators in Jordan collaborated with us, expecting innovative analysis and techniques to better adapt its customer segmentation to rapidly evolving customer uses and technological practices. It is also the reason this study has both theoretical significance, in developing the self-organizing map algorithm, and practical significance, in using the results to adapt future marketing strategies.
The closest work to ours is Hanafizadeh and Mirzazadeh [12], which investigates factors that affect customer behavior in an ADSL market. The authors use SOM to study relationships between different market variables. The variables are divided into four categories (geographic, demographic, behavioral, and psychological) and are then ranked by their influence on the market using a fuzzy Delphi method. In contrast, our work uses a data set that belongs to a different market and is much larger than that of [12]. Furthermore, our data set is obtained from the operator, so we do not depend on questionnaires, which are not necessarily accurate.

III. SELF-ORGANIZING MAP (SOM)
SOM is a methodology that belongs to the wider framework of unsupervised neural computing networks. It extracts useful information from raw data by mapping data samples from a multidimensional space to a reduced space of two or three dimensions, which helps visualize hidden relations between the samples. This is achieved by clustering samples that have common traits. The basis of SOM is the Kohonen network illustrated in Figure 1. The network consists of two layers of nodes, input and output, where the input layer is fully connected to the output layer. Each node in the input layer is associated with one of the attributes of the data samples. The nodes in the output layer are called neurons and are arranged in a two-dimensional grid. Each neuron corresponds to an information vector with the same dimension as the data samples.
Figure 1. A Kohonen network for implementing SOM. There are two layers, input and output. Each node in the input layer is fully connected to the nodes (neurons) in the output layer. The neurons are distributed on a grid and align themselves according to the data set fed into the network.
SOM formulates clusters by performing a number of training cycles on part of the data set or on the complete data set. The procedure starts by giving the neurons random weights. The data samples are then processed in sequence, and the winning neuron for a given sample is the neuron that has the smallest Euclidean distance to the sample. The weights of the winning neuron and its neighbors are then updated using a weight adaptation function. This process continues until related neurons form clusters. Furthermore, clusters preserve their relationships to each other: related clusters lie close to each other on the map, which maintains the underlying data structure. In the following, we give a detailed description of this procedure.

A. SOM Algorithm
Consider a data set where each element is an M-dimensional vector. Assume there are N neurons. Let $w_{ij}(n)$ represent the weight of neuron $j$ with respect to node $i$ of the input layer at time step $n$. The collective set of weights of neuron $j$ at time $n$ can thus be represented by the vector
$$W_j(n) = [w_{1j}(n), w_{2j}(n), \ldots, w_{Mj}(n)].$$
Let $\alpha(n)$ be the learning rate parameter at time step $n$. The procedure for applying the SOM algorithm can be given as follows:
1. Initialization ($n = 0$): Assign each neuron a random weight vector with values taken from the interval $[0, 1]$ and let $\alpha(0)$ be a small positive value.
2. Training ($n \geq 1$):
(i) Activation and similarity matching: Present one data sample at a time to the Kohonen network. The winning neuron is identified as the one that has the shortest Euclidean distance from the sample vector. Thus, if the sample is given by $X(n) = [x_1(n), x_2(n), \ldots, x_M(n)]$, the winning neuron $j_X(n)$ satisfies:
$$\lVert X(n) - W_{j_X}(n) \rVert = \min_{1 \le j \le N} \lVert X(n) - W_j(n) \rVert.$$
(ii) Learning: Update the weight vector of each neuron as follows:
$$W_j(n+1) = W_j(n) + \Delta W_j(n),$$
where $\Delta W_j(n)$ is computed according to the following weight adaptation function:
$$\Delta W_j(n) = \alpha(n)\, h_{j,j_X}(n)\, [X(n) - W_j(n)].$$
Here $h_{j,j_X}(n)$ is the neighborhood rate of neuron $j$ with respect to the winning neuron.
The training process is repeated to cover all or part of the elements of the data set. Typically, there is more than one training phase. Through each phase, the learning rate and the size of the neighborhood around each neuron are decreased so that, towards the end, clusters are formed and the map stabilizes, as illustrated in the example in Figure 2.
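The procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: it assumes a rectangular grid instead of a hexagonal one, a Gaussian neighborhood rate, and linearly decaying learning rate and radius. The function names `best_matching_unit` and `train_som` and the default values `alpha0=0.5` and `sigma0` are hypothetical choices made for this sketch.

```python
import math
import random

def best_matching_unit(weights, sample):
    """Index of the neuron whose weight vector is closest (Euclidean) to the sample."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], sample))

def train_som(samples, rows, cols, steps, alpha0=0.5, sigma0=None, seed=0):
    """Sequential SOM training on a rows x cols rectangular grid (illustrative only)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2  # assumed initial radius: half the map size
    # Grid position of every neuron, used by the neighborhood function.
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialization: random weights drawn from [0, 1].
    weights = [[rng.random() for _ in range(dim)] for _ in coords]
    for n in range(steps):
        frac = n / steps
        alpha = alpha0 * (1.0 - frac)            # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        x = samples[n % len(samples)]            # present one sample at a time
        win = best_matching_unit(weights, x)     # similarity matching
        for j, w in enumerate(weights):
            d2 = (coords[j][0] - coords[win][0]) ** 2 + (coords[j][1] - coords[win][1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood rate
            for i in range(dim):
                w[i] += alpha * h * (x[i] - w[i])   # weight adaptation step
    return weights, coords
```

Calling `train_som(samples, 3, 3, steps=400)` on normalized samples returns the trained weight vectors together with the grid coordinates of the neurons; `best_matching_unit` can then map any sample to its winning neuron.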

B. Visualizing SOM
There are multiple methods for presenting the clusters formed by SOM. Perhaps the most important is the unified distance matrix, or U-matrix for short [22]. In this method, the distances between the weight vectors of adjacent neurons are computed. In a gray scale representation, light colors can then depict neurons that are close to each other while dark colors depict widely separated neurons. Clusters are thus represented by areas of light color, while areas of dark color represent the boundaries between clusters.
Another method for visualizing clusters is the distance matrix, or D-matrix, which calculates distances between neurons and constructs a map based on this information. Basically, the D-matrix uses the size of each neuron to indicate its average distance to its neighbors [23].
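As a rough illustration of the quantity both visualizations build on, the sketch below computes, for every neuron, the mean distance between its weight vector and those of its grid neighbors. It assumes a rectangular grid with 4-connected neighbors stored in row-major order, which is simpler than the hexagonal grid used later in the study; the function name `average_neighbor_distances` is hypothetical.

```python
import math

def average_neighbor_distances(weights, rows, cols):
    """Mean Euclidean distance from each neuron's weight vector to the weight
    vectors of its 4-connected grid neighbors (row-major `weights`)."""
    result = []
    for r in range(rows):
        row_values = []
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r * cols + c], weights[nr * cols + nc]))
            # A 1 x 1 map has no neighbors; report 0.0 in that case.
            row_values.append(sum(dists) / len(dists) if dists else 0.0)
        result.append(row_values)
    return result
```

Small values mark neurons inside a cluster and large values mark cluster boundaries, which is what the light and dark areas (or the low and high elevations) of the maps encode.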

IV. CASE STUDY
In this section, we describe the case study we performed for segmenting a set of telecommunication customers based on their behavioral profiles using SOM. We also describe some methods to visualize and understand the obtained clusters.

A. Preparation of the Dataset
The data set includes blind and random profiles of 6315 customers of a major mobile telecommunication operator in Jordan. Each profile covers the monthly usage behavior of one customer via 10 attributes listed in Table 1. Values of the attributes are averaged over a span of 6 months. Furthermore, in order to make values comparable to each other so that the results are not biased towards large values, all the attributes are linearly normalized.
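The text states only that the attributes are linearly normalized, without giving the exact scheme. A minimal sketch, assuming min-max scaling of each attribute to the range [0, 1] (one common linear normalization), could look as follows; the function name `normalize_columns` and the example values are hypothetical.

```python
def normalize_columns(profiles):
    """Linearly rescale every attribute (column) to the range [0, 1]."""
    columns = list(zip(*profiles))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in profiles:
        normalized.append([
            # A constant column carries no information; map it to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return normalized
```

For instance, `normalize_columns([[10, 200.0], [20, 400.0], [30, 300.0]])` rescales both columns so that an attribute measured in hundreds no longer dominates one measured in tens when Euclidean distances are computed.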

B. Network training
The map structure used in this study is a hexagonal grid of neurons. A common rule for choosing the total number of neurons is to use $5\sqrt{n}$ neurons, where $n$ is the number of data samples [24]. The neurons are trained with the data set in the MATLAB 7.0 environment. The training is performed in two sequential phases using random initial weights and a Gaussian neighborhood rate function defined for neurons $k$ and $l$ as:
$$h_{kl}(n) = \exp\left(-\frac{\lVert r_k - r_l \rVert^2}{2\sigma^2(n)}\right),$$
where $r_k$ and $r_l$ are the positions of neurons $k$ and $l$ on the grid and $\sigma(n)$ is the neighborhood radius at time $n$. A detailed description of the training process is given as follows:
1. Coarse phase: This phase aims at quickly adjusting the neurons into groups. It therefore uses a relatively large initial learning rate $\alpha(0)$. The initial neighborhood radius is also chosen to be large, $\sigma(0)$ = half of the map height = 16, to cover a large number of neurons in each iteration. Through this phase, the radius is decreased so that at the end of the phase only the direct neighbors of a neuron are affected. This phase is performed for a number of iterations equal to the number of samples in the data set.
2. Fine-tuning phase: This is a relatively longer phase that runs for twice as many iterations as the previous phase. It aims at shaping the clusters by using a much smaller initial learning rate $\alpha(0)$ and a smaller initial neighborhood radius $\sigma(0)$. The neighborhood radius finally equals 1, so that the update affects only the winning node and none of the surrounding nodes. Table 2 summarizes all the initial parameters.
The quality of the formed clusters can be assessed using two types of error measurements: (i) the average quantization error, which measures the accuracy of the data representation as the average distance between data vectors and their best matching units on the map, and (ii) the average topological error, which measures the percentage of data vectors for which the first and second best matching units are not adjacent. Both errors are computed for the implemented SOM: the average quantization error is 0.045% and the average topological error is 8.6%.
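The two error measures can be sketched as follows. This is an illustration under simplifying assumptions, not the MATLAB routine used in the study: the grid is rectangular, two units count as adjacent when they are direct horizontal or vertical neighbors, and the function names are hypothetical.

```python
import math

def two_best_units(weights, sample):
    """Indices of the best and second-best matching units for a sample."""
    order = sorted(range(len(weights)), key=lambda j: math.dist(weights[j], sample))
    return order[0], order[1]

def quantization_error(weights, samples):
    """Mean distance between each sample and its best matching unit."""
    total = sum(math.dist(s, weights[two_best_units(weights, s)[0]]) for s in samples)
    return total / len(samples)

def topological_error(weights, coords, samples):
    """Fraction of samples whose two best matching units are not grid neighbors."""
    errors = 0
    for s in samples:
        first, second = two_best_units(weights, s)
        (r1, c1), (r2, c2) = coords[first], coords[second]
        if abs(r1 - r2) + abs(c1 - c2) != 1:  # not directly adjacent on the grid
            errors += 1
    return errors / len(samples)
```

A low quantization error indicates that the neurons represent the data closely, while a low topological error indicates that neighboring regions of the input space are mapped to neighboring neurons.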

C. Visualization
The first step in analyzing the obtained clusters is visual inspection, as it gives an overview of the clusters and their relationships to each other. Three main visual inspection techniques are used in this regard: the U-matrix, the D-matrix, and the component planes.

1) The D-matrix
A D-matrix presentation of the obtained clusters is shown in Figure 3. Here we use a three-dimensional map where the elevation of a neuron is proportional to its average distance from its neighbors. Therefore, areas with low elevation correspond to nodes that are close to each other and thus form clusters.

2) The U-matrix
The U-matrix presentation of the map is shown in Figure 4. The figure is shown in gray scale for clarity. Lightly colored areas, where distances between the neurons are relatively small, correspond to clusters, while dark colored areas separate clusters from each other.

3) The component planes
One useful approach to gaining an initial understanding of the properties underlying the clusters in the SOM is to investigate correlations between the different attributes of the data set. This can be achieved by looking at the component planes comprising the U-matrix, as shown in Figure 5. Each subfigure corresponds to a different attribute, with a color ramp that extends from dark blue (lowest value) to dark red (highest value). A set of observations can be drawn from the planes in Figure 5: (i) There is a positive correlation between the duration of on-net calls and the duration of all calls generated by the customers. In other words, customers who use the network intensively use it more for on-net calls, and vice versa. (ii) Customers with high data usage have very low call duration, i.e., data users tend to be less concerned with voice calls. (iii) Customers with a high number of generated calls have low call duration, and those with high call duration have a relatively low number of generated calls. (iv) Customers with high international call duration keep a low profile in using the other services provided by the operator: they have low domestic call duration, SMS, and data usage. Namely, there is a group of customers who mainly need the network for making international calls.
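The visual comparison of component planes can be complemented numerically. The sketch below is not part of the original study: it extracts one attribute across all neurons and computes the Pearson correlation between two such planes. The function names `component_plane` and `pearson` are hypothetical.

```python
import math

def component_plane(weights, attribute_index):
    """Values of one attribute across all neurons (one component plane)."""
    return [w[attribute_index] for w in weights]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A coefficient close to +1 between two planes would correspond to an observation such as (i), while a coefficient close to -1 would correspond to observations such as (ii) and (iii).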

4) Map clustering
A major issue with visualizing the D-matrix and the U-matrix is the possible lack of sharp edges for identifying clusters, as is evident from Figures 3 and 4. Clearly identified clusters make it easier to understand the common properties underlying them. For this purpose, we apply K-means clustering to the U-matrix to obtain 6 clusters with sharp boundaries, as shown in Figure 6. Each of these newly obtained clusters corresponds to a cluster identified visually on the D-matrix in Figure 3. The newly obtained clusters are the basis for our analysis in the next section.
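The clustering step can be illustrated with a plain K-means routine. The sketch below is an assumption-laden illustration: it clusters a list of vectors (for example, the trained neuron weight vectors) rather than reproducing exactly how the study applied K-means to the U-matrix, and it uses a deterministic farthest-first initialization chosen here for reproducibility. The function names are hypothetical.

```python
import math

def farthest_first(vectors, k):
    """Deterministic seeding: start from the first vector, then repeatedly add
    the vector that is farthest from its nearest already-chosen centroid."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def kmeans(vectors, k, max_iterations=100):
    """Lloyd's K-means; returns (centroids, labels)."""
    centroids = farthest_first(vectors, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

With `k=6`, each neuron receives one of six labels, which gives the sharp cluster boundaries that the U-matrix and D-matrix views lack.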

A. Segment characteristics
Now that we have identified the clusters obtained by the SOM, we investigate the common characteristics of the customers constituting these clusters, which we refer to from now on as segments. The characteristics of each segment are as follows:
Segment 1: high duration but low number of generated calls, and moderate number of sent and received SMS.
Segment 2: moderate duration of generated calls and moderate to high data usage.
Segment 3: high duration of off-net calls, high number of generated calls, and moderate to low duration of generated calls and data usage.
Segment 4: very low call duration, high number of sent and received SMS, and high data usage.
Segment 5: very low data usage, low duration of generated calls, and a high number of received calls relative to the number of generated calls.
Segment 6: relatively high duration of international calls.
Based on these characteristics, each segment can be labeled as follows:
Communication users (segment 1): communication is essential to their lives. Users in this segment mainly use the network for making long voice calls.
Moderate users (segment 2): they are moderate voice and data users. Namely, this class of users has moderate call duration and moderate to high data usage.
Active online users (segment 3): this class of users uses the network for a wide range of purposes, but not for long conversations. Furthermore, data services are important to them to a certain extent. In other words, this class can be thought of as customers who use the network for business purposes.
Data users (segment 4): they depend on the network mainly for data communications but not for voice calls.
Passive users (segment 5): they receive relatively more calls than they generate.
International service users (segment 6): they depend on the network mainly for making international calls.
This information provides a clear view of the market segments and their characteristics. It is essential for the operator to design its marketing strategies by adapting to the characteristics of these segments.

B. Segment profiling
Next we investigate the share of each segment with respect to the operator's subscriber base. For this purpose, we compute the number of customers constituting each segment by counting the number of hits on each neuron. The results are given in Table 3, which shows that segment 4 has the largest population, accounting for 58.4% of the customers, while segment 3 is the smallest with only 2.9%. The fact that segment 4 is the largest shows that the tendency in mobile markets is towards growing data usage and slowing voice usage. What is striking in these results is that this segment is by far the largest among all the segments. This emphasizes that data users play a key role in any successful marketing strategy, as they have the largest market share.
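The hit-counting step can be sketched as follows. The inputs are assumptions made for illustration: `sample_bmus` holds the index of the winning neuron for each customer and `neuron_segment` holds the segment label assigned to each neuron; the function name `segment_shares` is hypothetical.

```python
from collections import Counter

def segment_shares(sample_bmus, neuron_segment):
    """Share of customers per segment, given each customer's winning neuron
    and the segment label assigned to every neuron."""
    hits = Counter(sample_bmus)  # number of hits on each neuron
    counts = Counter()
    for neuron, n_hits in hits.items():
        counts[neuron_segment[neuron]] += n_hits  # add hits to the neuron's segment
    total = sum(counts.values())
    return {segment: count / total for segment, count in counts.items()}
```

Multiplying the returned shares by 100 gives percentages of the kind reported in Table 3.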
We also investigate the value of each segment to the operator. To this end, we asked the operator to provide us with the loyalty rate of each customer included in the study. The rates are computed according to usage criteria such as the customer's subscription period and average monthly bill. Values are represented on a scale from 1 to 10, where 10 denotes the most loyal. Table 3 shows the average loyalty rate for the customers in each segment. It turns out that segment 6 has the highest average loyalty rate. Interestingly, customers who populate this segment mainly use the network for international calls.
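The averaging described above amounts to two small steps, sketched below: each customer's monthly loyalty rates are averaged over six months, and the customer averages are then averaged per segment. The function names and the sample values in the usage note are hypothetical and serve only as an illustration; they are not data from the study.

```python
def six_month_average(monthly_rates):
    """Average of a customer's monthly loyalty rates (scale 1 to 10)."""
    return sum(monthly_rates) / len(monthly_rates)

def average_loyalty_by_segment(customer_segments, customer_loyalty):
    """Mean loyalty rate of the customers in each segment."""
    totals, counts = {}, {}
    for segment, rate in zip(customer_segments, customer_loyalty):
        totals[segment] = totals.get(segment, 0.0) + rate
        counts[segment] = counts.get(segment, 0) + 1
    return {segment: totals[segment] / counts[segment] for segment in totals}
```

For example, with made-up inputs, `average_loyalty_by_segment([4, 4, 6], [2.0, 4.0, 9.0])` groups the first two customers under segment 4 and the third under segment 6 before averaging.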
An even more interesting fact is that segment 4, the largest segment, has the lowest loyalty rate. Computing the loyalty rate is a complex process carried out by the operator based on several factors. Two main factors are considered by operators today: (i) how long the customer stays with the same operator without changing consumption model, and (ii) the social and economic pressure on consumption behavior. These factors can be considered the global, final output of the customer's consumption behavior. Currently, the operator's loyalty rate analysis does not integrate the dynamics of the customer's usage, which is continuously evolving. Our analysis aims to fill this gap and is dedicated specifically to the evolution of customers' use of telecommunication services. Our results can help in understanding how the use of telecommunication services affects loyalty behavior. On the other hand, our analysis works with data in which the social and economic level of the customers is not known, so our contribution cannot be compared with the operator's current analysis, which is mainly based on these aspects. Nevertheless, our study gives relevant information that helps strategic decision makers integrate the impact of the evolution of telecommunication usage into loyalty behavior and thus develop new, efficient customer segmentations. Our segmentation analysis shows that mobile data can be considered a relatively new use. Customers who use data intensively today have not been in service long enough to build a reputation and earn high loyalty rates. Furthermore, it appears that customers in this segment have low revenue-generating profiles, which gives them low loyalty rates. Our recommendation is therefore to monitor the evolution of these new, data-intensive customers and to follow how their data use combines with other uses.
We can claim today that, for the studied population, data plans are based on price and thus require review.

C. Building classification models
The knowledge about segment profiling and user categorization can be a valuable input for other classification technologies used for recommendation. This can be realized by producing a set of training samples consisting of usage data with the predicted segment as the class attribute. These samples are then used to train models that recommend the most appropriate plans to customers. One popular classifier is the decision tree, which is based on a divide-and-conquer strategy. Decision trees are widely used in classification problems (e.g., [25] and [26]). They enjoy several advantages, among which is the ability to process both numerical and categorical data. They also map efficiently to business rules. As input, they require a set of attributes (defining the customer) and their class value (generated by the proposed framework). As mentioned earlier, decision trees allow the generation of a set of human-understandable rules for recommendations.
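To make the divide-and-conquer idea concrete, the sketch below finds the single best threshold split of a labeled data set, which is the basic step a decision tree repeats recursively. The text does not name a specific tree algorithm, so the use of Gini impurity here is an assumption, and the function names and example attributes are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (impurity, attribute index, threshold) of the split
    `row[attribute] <= threshold` that minimizes weighted Gini impurity."""
    best = None
    for attr in range(len(rows[0])):
        for threshold in sorted({row[attr] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[attr] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[attr] > threshold]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, attr, threshold)
    return best
```

Each split found this way reads directly as a rule, for example "if normalized call duration is at most 0.2, recommend the data plan", which is the kind of human-understandable rule the paragraph refers to.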

VI. CONCLUSION
In this paper, we used artificial neural network techniques to segment customers in mobile telecommunication markets. The study was applied to a wide sample of customers of a major mobile operator in Jordan. The study is unique in that it is applied to a market in the Middle East with a high mobile penetration rate (138% in 2012 according to government statistics). The data given by the operator is mainly based on user usage. No information is available about the social and economic level of the users, so our contribution is limited in that we cannot take these static user characteristics into account. We are, however, able to analyze the dynamic and critical variables of user usage behavior that were identified and used in the segmentation process. This kind of analysis of emergent customer usage is not included in the operator's strategies, and it requires an intelligent method of analysis such as self-organizing map computation. We first used SOM to segment our data set and visualized the results using techniques such as the D-matrix and the U-matrix. This approach has the advantage over other segmentation methods of detecting segments without a predefined hypothesis about behavior. Based on the number of segments obtained by the SOM, we used the K-means algorithm to obtain segments with clearly defined borders, which helps analyze the common properties underlying these segments. Our results show that data users constitute around 58% of the market. However, the results reveal that this large portion has the lowest loyalty rate among the obtained segments. This observation emphasizes the importance of effective marketing strategies to bring these customers into a more privileged circle and help retain them.