Customer churn prediction using a hybrid method and censored data

Article history: Received January 22, 2013 Received in revised format 29 March 2013 Accepted 18 April 2013 Available online April 2


Introduction
Customers are believed to be precious assets in any organization, and customer retention (CR) plays an essential role in the success of any business unit in today's competitive environment. Customer relationship management (CRM) aims to understand and measure the true value of customers, and market segmentation is a technique for successful CRM (Judd et al., 1981; Berry & Linoff, 2003). Hung and Tsai (2008), for instance, concentrated on techniques that provide a human manager with a visualized decision-making tool for market segmentation. They used a market segmentation technique, namely the hierarchical self-organizing segmentation model (HSOS), to deal with a real-world data set for market segmentation of multimedia on demand in Taiwan. The proposed HSOS was capable of providing a general idea of market segmentation step by step and was considered an alternative to other hierarchical cluster techniques for market segmentation.

Kim and Yoon (2004) implemented a binomial logit model using a survey of 973 mobile users in Korea and identified determinants of subscriber churn and customer loyalty in the Korean mobile telephony market. The probability that a subscriber would switch carriers depended on the level of satisfaction with alternative-specific service attributes including call quality, tariff level, etc. Nevertheless, only factors such as call quality, handset type, and brand image could influence customer loyalty as measured by the intention/non-intention to describe the service provider to various people. According to Gerpott et al. (2001), CR, customer loyalty (CL), and customer satisfaction (CS) are important objectives for telecommunication network operators. They used information from a sample of 684 residential customers of digital cellular network operators in Germany and reported that CR, CL, and CS must be treated as differential constructs, which are causally inter-linked. In their survey, mobile network operators' perceived customer care performance had no significant effect on CR, and their findings suggested that an important lever for regulators to promote competition in cellular markets was the enforcement of efficient number portability procedures among mobile network operators. Kim and Kwon (2003) investigated whether network size matters when new subscribers choose their service providers and reported that intra-network call discounts and the quality signaling effect were likely to be the sources of the size effect. Madden et al. (1999) developed a model relating the probability of subscriber churn to different service attributes and subscriber characteristics. Their results demonstrated that churn probability was positively associated with monthly ISP expenditure, but inversely associated with household income.

Van den Poel and Larivière (2004) investigated the topic of customer attrition in the context of a European financial services firm and studied predictors of churn incidence as part of CRM. They contributed to the existing literature by combining various kinds of predictors into one comprehensive retention model including several "new" kinds of time-varying covariates associated with actual customer behavior. They also analyzed churn behavior based on a truly random sample of the total population using longitudinal data from a data warehouse. They reported that demographic characteristics, environmental changes and stimulating "interactive and continuous" relationships with customers were of major concern when considering retention. In addition, customer behavior predictors only had a limited effect on attrition in terms of total products owned as well as the interpurchase time. Baesens et al. (2002) concentrated on purchase incidence modeling for a European direct mail firm using response models based on statistical and neural network techniques. They reported that Bayesian neural networks could offer a viable alternative for purchase incidence modeling. Coussement and Van den Poel (2008) used support vector machines in a newspaper subscription context to build a churn model with a higher predictive performance. They reported that support vector machines could outperform traditional logistic regression only when the optimal parameter-selection procedure was applied, whereas random forests outperformed both types of support vector machines. Hung et al. (2006) compared different data mining techniques, which could assign a "propensity-to-churn" score periodically to each subscriber of a mobile operator. They reported that both decision tree and neural network techniques could deliver accurate churn prediction models by using customer demographics, billing information, contract/service status, call detail records, and service change log.
Keramati and Ardabili (2012) identified different factors that influence churn among customers, the single most valuable of an organization's assets, using one year's data from call log files relating to 3150 customers selected randomly from an Iranian mobile operator call-center database. They reported that a customer's dissatisfaction, their amount of service usage and certain demographic characteristics could affect their decision to remain or churn. Ngai et al. (2009), in a survey, provided a roadmap to help future research and facilitated knowledge accumulation and creation concerning the application of data mining techniques in CRM.
According to Tsai and Lu (2009), churn management is a major task for companies that want to retain valuable customers, so the ability to forecast customer churn is necessary. Pendharkar (2009) implemented genetic algorithm based neural network techniques for predicting customer churn in cellular wireless network services. Seo et al. (2008) used a two-level model of customer retention in the US mobile telecommunication service market.

The proposed method
In this paper, we present a hybrid method based on neural network and Cox regression analysis, where the neural network is used to handle outlier data and the Cox regression method is implemented for prediction of future events. The proposed Cox regression function has the following form,

$$h_i(t) = h_0(t)\exp\Big(\sum_{j} \beta_j x_{ij}\Big) \qquad (2)$$

In Eq. (2), $h_i(t)$ is the probability of churn of customer $i$ at time $t$, $h_0(t)$ is the baseline hazard, $x_{ij}$ is the $j$th specification of the $i$th customer and $\beta_j$ represents the coefficients to be estimated. Support for categorical specifications is one of the advantages of the Cox regression function. Note that there are different types of categories in the mobile industry, which are expressed in terms of categorical variables such as age-range, busy/idle, etc. The proposed model of this paper uses a hybrid method described in Fig. 1.
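As a minimal sketch of how a Cox-type hazard combines a baseline hazard with customer specifications, the following Python fragment dummy-codes one categorical specification (age range) and evaluates the relative hazard exp(sum_j beta_j x_ij). The coefficient values, the field names `complaints` and `age_range`, and the age brackets are hypothetical and are not estimates from this study; the baseline hazard is passed in by the caller.

```python
import math

# Hypothetical coefficients beta_j, chosen for illustration only.
BETA = {
    "complaints": 0.30,        # numeric specification
    "age_range=30-45": -0.20,  # dummy variable for one categorical level
    "age_range=45+": -0.40,    # dummy variable for another level
}

def encode(customer):
    """Map a raw record to covariates x_ij, dummy-coding age_range."""
    x = {"complaints": float(customer["complaints"])}
    for level in ("30-45", "45+"):  # "<30" acts as the reference level
        x[f"age_range={level}"] = 1.0 if customer["age_range"] == level else 0.0
    return x

def relative_hazard(customer, beta=BETA):
    """exp(sum_j beta_j * x_ij): hazard relative to the baseline."""
    x = encode(customer)
    return math.exp(sum(beta[name] * value for name, value in x.items()))

def hazard(customer, baseline_hazard, beta=BETA):
    """Baseline hazard at some time t scaled by the customer's relative hazard."""
    return baseline_hazard * relative_hazard(customer, beta)

a = {"complaints": 2, "age_range": "<30"}
b = {"complaints": 0, "age_range": "45+"}
ratio = relative_hazard(a) / relative_hazard(b)  # the baseline cancels out
print(round(ratio, 4))
```

Because the baseline hazard cancels in the ratio, two customers can be compared without knowing it, which is why the coefficients alone are enough to rank customers by churn risk.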

Customer dissatisfaction
The first group of hypotheses is as follows,

H 1a
The number of disconnected calls positively influences customer churn.

H 1b
The number of complaints positively influences customer churn.

Level of service usage
The second group of hypotheses is as follows,

H 2a
The monthly charge positively influences customer churn.

H 2b
The call duration positively influences customer churn.

H 2c
A large number of digits in calling numbers positively influences customer churn.

H 2d
The number of SMS advertisements positively influences customer churn.

H 2e
The number of unwanted calls positively influences customer churn.

Switching costs
There are some occasions where customers wish to switch to another operator and must pay a fine for doing so, and this could increase customer churn. For the case study of this paper, we only had one switching cost, which was associated with the internet option. The following hypothesis is associated with this section,

H 3
The existence of the internet option positively influences customer churn.

Demographic information
The other information that may influence customer churn is associated with customers' personal characteristics such as age. Table 1 demonstrates the personal characteristics of the customers whose information is used for the proposed study of this paper.

H 4
Customer age positively influences customer churn.

Customer status
In this survey, customer status is treated as a mediating effect, and a customer is considered active when he or she has made at least one call within the last two months.

H 5
Customer status positively influences customer churn.
Based on the hypotheses formulated for the proposed study of this paper, we have considered the following model,

Customer dissatisfaction (H 1), Level of service usage (H 2), Switching costs (H 3), Demographic information (H 4) → Customer status (H 5) → Customer churn

Predicted data
The proposed study of this paper considers historical data from a group of 3150 customers who used mobile phones in Iran. Customer churn occurs when a customer does not make a phone call for a period of two months or sells his/her SIM card to another person.
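The churn rule above can be sketched as a small labelling function. This is only an illustration: the function name `has_churned`, its parameters, and the choice of 60 days as the length of "two months" are assumptions and do not come from the study's data set.

```python
from datetime import date, timedelta

# Assumption: "two months" is approximated as 60 days.
INACTIVITY_WINDOW = timedelta(days=60)

def has_churned(last_call, sim_sold, as_of):
    """Churn if the SIM card was sold or no call was made within the window."""
    return sim_sold or (as_of - last_call) > INACTIVITY_WINDOW

as_of = date(2013, 1, 1)  # hypothetical observation date
recent_caller = has_churned(date(2012, 12, 15), False, as_of)
silent_customer = has_churned(date(2012, 10, 1), False, as_of)
sold_sim = has_churned(date(2012, 12, 30), True, as_of)
print(recent_caller, silent_customer, sold_sim)
```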

Censored data
In some experiments, it is not possible to complete the test. In our case, when the observation ends and customer churn has not happened, we say that the data are right-censored.
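To illustrate how right-censored records still contribute information, the sketch below uses the Kaplan-Meier estimator, a standard survival estimator that this paper does not state it uses. A censored customer stays in the risk set up to the last observed time and then leaves it without counting as a churn event. The durations and event flags are made-up values.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each time with at least one observed churn.

    durations: observed time per customer (e.g. months)
    events:    1 if churn was observed, 0 if the record is right-censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        churned = leaving = 0
        # Group all customers whose observation ends at the same time t.
        while i < len(data) and data[i][0] == t:
            churned += data[i][1]
            leaving += 1
            i += 1
        if churned:
            survival *= 1 - churned / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set after t.
        at_risk -= leaving
    return curve

durations = [2, 3, 3, 5, 8, 8]  # hypothetical observation lengths
events = [1, 1, 0, 1, 0, 0]     # 0 marks a right-censored customer
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 4))
```

Dropping the censored rows instead would shrink the risk set and overstate the churn rate, which is the reason survival methods such as Cox regression are preferred for this kind of data.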

Preprocessing
In this survey, we try different numbers of hidden layers and epochs to remove the existing noise in the data. Our experiments indicate that we could get the best results for the two groups of churn and non-churn with 16 hidden layers and 100 repeats. Table 2 shows details of our findings in this stage.

The results
In order to extract the customer churn probability, we have used Cox regression analysis in the SPSS software package, and Table 3 demonstrates the results of our findings. As we can observe from the results of Table 3, in all but two cases we can accept the impact of the factors on customer churn. Table 4 and Table 5 demonstrate the results of testing various hypotheses.

Table 4
The results of testing different hypotheses of the survey

In order to compare the performance of the proposed model, we have used five criteria including prediction accuracy, type I and type II errors, root mean square error (RMSE) and mean absolute deviation (MAD). Table 6 shows the hypotheses for testing the effectiveness of the survey.

Table 6
The hypotheses of the survey for testing the proposed hybrid method

Actual: Non-churners, Churners

RMSE and MAD are calculated from the prediction errors, where $S_c$ and $S_n$ are the survival values for churn and non-churn customers, respectively. In addition, $e_c = (S_c - 0)$ and $e_n = (1 - S_n)$, and finally, $\bar{e}_c$ and $\bar{e}_n$ are calculated based on the average values of $e_c$ and $e_n$, respectively.

Table 7 shows details of our findings on the results of our survey. As we can observe from the results of Table 7, the hybrid method performs better than the pure Cox method.
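The five comparison criteria can be sketched as follows. Several choices here are assumptions rather than the paper's definitions: type I error is taken as the share of non-churners flagged as churners and type II error as the share of churners that are missed, the prediction error is the survival value minus 0 for churners and 1 minus the survival value for non-churners, errors from both groups are pooled, MAD is the mean absolute deviation from the mean error, and a customer is predicted to churn when the survival value falls below a hypothetical threshold of 0.5. The sample values are made up.

```python
import math

def confusion_rates(actual, predicted):
    """actual/predicted use 1 for churner and 0 for non-churner.

    Assumes both classes are present, otherwise a rate divides by zero.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    type_i = fp / (fp + tn)   # assumption: non-churner flagged as churner
    type_ii = fn / (fn + tp)  # assumption: churner predicted to stay
    return accuracy, type_i, type_ii

def survival_errors(actual, survival):
    """Error is S - 0 for churners and 1 - S for non-churners."""
    return [s - 0.0 if a == 1 else 1.0 - s for a, s in zip(actual, survival)]

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mad(errors):
    mean = sum(errors) / len(errors)
    return sum(abs(e - mean) for e in errors) / len(errors)

actual = [1, 1, 0, 0, 0]               # hypothetical ground truth
survival = [0.2, 0.6, 0.9, 0.7, 0.4]   # hypothetical model survival values
predicted = [1 if s < 0.5 else 0 for s in survival]

accuracy, type_i, type_ii = confusion_rates(actual, predicted)
errors = survival_errors(actual, survival)
print(accuracy, type_i, type_ii, rmse(errors), mad(errors))
```

Running the same functions on the output of two models, for example a pure Cox model and a hybrid one, gives a like-for-like comparison on all five criteria.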

Conclusi
In

Fig. 1. The framework of the proposed method

The proposed model of the paper considers different factors influencing customer churn.

Table 2
The results of preprocessing the data

Table 3
The results of Cox regression analysis

Table 7
The results of comparing the performance of Pure Cox and Hybrid method