Taxonomy of Marketing Strategies Using Bank Customers' Clustering

Purpose – The goal of this study is to identify the main clusters of bank customers in order to help commercial banks better identify their customers and design more efficient marketing strategies. Design/methodology/approach – Data from 250 bank customers were analyzed using two-step scalable clustering. Findings – Five clusters of bank customers were identified, namely, favorite customers, creditworthy customers, non-creditworthy customers, passers, and friends. The findings indicated that these clusters differ markedly in loan amount, default risk, account balance, degree of loyalty, and profitability for the bank. Practical implications – The differences observed between these five clusters accentuate the importance of customer clustering and market segmentation in the financial services industry. Customer clustering can help financial institutions augment their competitiveness by shifting from traditional marketing strategies to target marketing and segmentation-based marketing approaches. Originality/value – The most important contribution of this study is the incorporation of a wide range of factors that can potentially affect customer clustering, whereas the majority of previous studies focused on only a limited number of variables to determine customer clusters. Specifically, customer clustering in this study was performed using demographic variables, profitability, loan amount, default risk, account balance, loyalty, account type, account closure history, customer location, and account currency.


Introduction
Marketing strategy is concerned with the effective and harmonious allocation of resources in order to achieve organizational goals in a specific market for a particular product/service. Therefore, a large portion of strategic marketing decisions focus on identifying target markets and performing market segmentation. To improve product/service quality and to augment their competitiveness, companies should identify the key needs of their target customers and develop a strategy to satisfy these needs. The most salient issue in designing a marketing strategy is that not all customers are the same, and they should not all be treated alike. Studies in services marketing have shown that, in most cases, firms should not provide services to all customers in a similar fashion. Therefore, customer segmentation and customer relationship management are among the most important determinants of business viability (Ansari & Riasi, 2016a). In fact, customer relationship management helps firms enhance customer value and enables them to retain their valuable customers.
Due to extreme competition among companies and the diversity of products and services in the markets, identifying and analyzing consumer behavior and adopting the best possible marketing strategies have become inseparable components of customer relationship management. By dividing customers into different clusters, companies can decide how to allocate their limited resources effectively to different groups of consumers based on their value. Companies try very hard to better identify their customers by analyzing their behavior, but consumer behavior analysis requires a variety of industry-specific considerations. In particular, analyzing consumer behavior in the financial services industry is a very challenging task due to the heterogeneity of the customers and differences in their expectations (Ansari & Riasi, 2016b). One of the most useful techniques for analyzing consumer behavior in the financial services industry is customer clustering. With clustering techniques, customers are divided into homogeneous clusters in which customers with similar needs and characteristics are grouped together (Ghazanfari et al., 2010). After identifying the needs and values of its customers, a company can provide better services to its clients, which will in turn lead to enhanced customer satisfaction and higher degrees of perceived service quality (Moslehi et al., 2012). This enhanced customer satisfaction and relationship commitment will also have a positive impact on the firm's brand loyalty and brand awareness (Ansari & Riasi, 2016b, 2016c; Fornell, 1992; Fornell & Wernerfelt, 1987; Gallarza & Saura, 2006; Kim et al., 2008; Kumar et al., 2013; Parasuraman et al., 1991; Pont & McQuilken, 2005; Reichheld & Sasser, 1990; Reynolds & Beatty, 1999; Salmones et al., 2009; Veloutsou et al., 2004).
According to Paswan et al. (2012), marketing strategy should be conceptualized as a mechanism for competing in the marketplace. Every company might have customers that do not generate adequate profits; in other words, the marketing strategies used for targeting certain groups of a firm's customers might be very costly compared to the revenues generated by them. On the other hand, there are always some more profitable groups of customers, and companies strive to retain these customers by offering them various incentives. Identifying these disparate groups of customers and their specific needs can help firms improve their profitability and competitiveness. By using customer clustering, firms can better identify the behavior of their customers and design better marketing strategies. These relationships accentuate the need for using configuration theory in studying consumer behavior. Configurations represent a number of specific and separate attributes that are only meaningful when used collectively (Dess et al., 1993; Rosenberg, 1968; Miller, 1987). They are finite in number and represent a uniquely integrated set of dynamics (Mintzberg, 1973; Miller, 1986, 1987). According to Dess et al. (1993), by using configurations, researchers can express complicated and interrelated relationships among different variables without resorting to artificial oversimplification of the phenomenon of interest. In other words, configurations are a useful tool for providing rich and complex descriptions of organizations (Dess et al., 1993; Hambrick, 1983; Miller & Friesen, 1977; Mintzberg, 1978). Previous researchers suggested that configuration models should be divided into taxonomies and typologies (Miller & Friesen, 1984; Meyer et al., 1993). While both offer multidimensional views of organizations, they differ in their underlying objectives and key characteristics (Bozarth & McDermott, 1998). According to Bozarth and McDermott (1998), typologies are multidimensional models for ideal types and provide a generalizable grand theory and middle-range theories applicable to individual types. Taxonomies, on the other hand, are classification systems that categorize phenomena into mutually exclusive and exhaustive sets and are used for generating insight or advancing a predictive task (Bozarth & McDermott, 1998).

Market Segmentation and Customer Clustering
Amid market upheaval, companies are spending more money on improving their marketing strategies in order to enhance their competitiveness in the marketplace (Riasi & Pourmiri, 2015; Riasi, 2015a, 2015b). These marketing strategies are primarily focused on customer segmentation, customer satisfaction, and customer retention. To determine customer preferences and thus improve marketing decision support, businesses can benefit significantly from analyzing the large volumes of customer data which they have collected (Liu & Shih, 2005). Dividing customers into disparate clusters based on their buying behaviors, needs, demographics, and other parameters can lead to significant growth in profitability because it helps companies optimize their marketing practices. According to Zeithaml et al. (2001), marketing success is equivalent to generating maximum profits from a company's total set of customers when managerial resources are allocated to the groups of customers that can be cultivated most efficiently by the company. Identifying these groups of customers requires advanced customer clustering and market segmentation. Smith (1956) argued that market segmentation is based upon developments on the demand side of the market and that it represents a rational and precise adjustment of product and marketing effort to customer requirements. Customer classification and clustering enable firms to group similar customers together and help managers better understand the customers' needs, because it is much easier to identify and analyze the characteristics of groups of customers than to study each customer individually (Ansari & Riasi, 2016a).

Market Segmentation and Target Marketing
Market segmentation and target marketing are closely related concepts, and the terms are sometimes used interchangeably. Target marketing refers to the identification of a group of buyers sharing common needs or characteristics that a firm intends to serve (Kotler et al., 1991; Aaker et al., 2000). Target marketing has been the driving force behind the success of various well-known companies. In fact, target marketing provides the basis of a user positioning approach, which is a paramount branding strategy closely associated with a specific consumer (Aaker et al., 2000). Despite the similarities between market segmentation and target marketing, the two approaches differ substantially in their objectives. Market segmentation is concerned with dividing the market into different subsections based on the customers' expected response to marketing practices, while target marketing focuses on identifying the most attractive parts of the market and designing marketing strategies for customer acquisition in these markets. Therefore, in order to have a successful segmentation-based marketing strategy, a firm should follow three steps. The first step is market segmentation, in which the market is divided into different segments based on the characteristics of the customers and their needs. In the second step, the different market segments are targeted using appropriate marketing strategies. In other words, the firm should investigate the customers in each segment and determine the best marketing approach based on the characteristics of that particular market segment and the firm's available resources. The last step is to determine the market position. In this step, the firm should examine its competitiveness in each market segment by analyzing the customers' attitude toward the firm's brand in comparison to their attitude toward competing brands.
According to Gavett (2014), companies tend to make three major mistakes in the market segmentation process, particularly when they first start thinking about it. First, many companies that adopt segmentation-based marketing tend to uncover existing segments rather than creating their own segments. Second, many companies think that market segmentation should focus only on demographic characteristics and try to define market segments based on demographics. This is a big mistake, because in many cases customers who have similar demographic characteristics may require different marketing approaches in order to be attracted to a brand. Companies should therefore be aware that demographics and segmentation are two different things. Third, many companies forget to determine their ultimate objective for market segmentation, and as a result they end up spending a lot of money on unnecessary market segmentation. Gavett (2014) believes that before starting the market segmentation, firms should ask themselves why they want to segment and what decisions they will make based on the information obtained from the process. Therefore, it is important to consider these three pitfalls before starting to think about market segmentation and allocating a budget for this purpose. After considering these potential pitfalls, companies should decide whether they want to start segmenting by needs or by behaviors. Once the firm has established that it has the right brands, an appropriate value proposition, and the right product line, segmentation can be initiated (Gavett, 2014).

Characteristics of a Useful Segmentation
According to Boespflug (2013), market segments should be identifiable, reachable, significant, relevant, and properly understood in order to be useful. He believes that behind every successful firm there is a good market segmentation strategy that guides the company during every stage of the commercialization process, and that a company with a useful segmentation model will be able to increase its competitiveness and profitability. Gavett (2014) believes that a useful segmentation should have six characteristics. First, it should be identifiable, meaning that the characteristics of the customers in each market segment should be easily measurable. Second, it should be substantial, meaning that the segments should not be very small, because small market segments are very costly to target. Third, it should be accessible, meaning that the segments should be created in such a way that the firm will be able to reach them via communication and distribution channels. Fourth, it should be stable, meaning that the characteristics used for segmenting the customers should not change so quickly that strategic planning becomes impossible. Fifth, it should be differentiable, meaning that people who are in the same segment should have similar characteristics and needs, and their attributes should be different from those of people who belong to other segments. Sixth, it should be actionable, meaning that the firm should be able to provide products/services to the specified market segments.

Literature Review
Customer segmentation and clustering can be performed according to disparate criteria, including but not limited to profitability (Zeithaml et al., 2001), customer behavior (Ghazanfari et al., 2010; Gough & Sozou, 2005; Neal, 1998), degree of customer loyalty (Ansari & Riasi, 2016a; Cheng & Chen, 2009; Khajvand & Tarokh, 2011; McCarty & Hastak, 2007; Wei et al., 2012), purchase frequency (Ansari & Riasi, 2016a; Khajvand & Tarokh, 2011; McCarty & Hastak, 2007), purchase volume (Ghazanfari et al., 2010; Zeithaml et al., 2001), and demographics (Dehghanpour & Rezvani, 2015; Gough & Sozou, 2005), among others. Wang et al. (2014) proposed a hierarchical analysis structure for customer clustering in order to optimize the logistics network. In their study, the customers' characteristics were represented using linguistic variables under major and minor criteria. In the next step, a fuzzy integration methodology was used to map the sub-criteria into the higher hierarchical criteria based on trapezoidal fuzzy numbers.
Newstead and D'Elia (2010) studied the concept of customer classification in the auto insurance industry using log-linear Poisson regression analysis. They analyzed various variables, including conditions at the time of the crash, vehicle type, crash injury severity, state, demographics, and color of the car, and found a clear, statistically significant relationship between vehicle color and crash risk. In particular, their results showed that, compared to white vehicles, a number of dark colors were associated with higher crash risk. Their findings indicated that auto insurance companies should cluster their customers based on the color of their vehicles.
Zeithaml et al. (2001) introduced the concept of the customer pyramid, a methodology that enables a company to augment its profits by customizing its responses to distinct customer profitability tiers. According to the customer pyramid, consumers can be divided into four segments, namely, the platinum tier, gold tier, iron tier, and lead tier. Customers in the platinum tier are typically the most profitable customers of the company; they are heavy users of the products/services and are committed to the company. The gold tier includes customers who are less profitable than those in the platinum tier and are not as loyal to the firm, even though they are still considered heavy users of the firm's products/services. The important characteristic of these customers is that they want price discounts, which limit the profit margins of the firm. Customers in the iron tier purchase products/services at the volume needed to utilize the firm's capacity, but since they have relatively low loyalty and profitability, they do not qualify for special treatment. Finally, customers in the lead tier are not at all profitable for the firm and even complain about the firm to others. These customers are very costly for the company, and targeting them is not a good strategy.
Neal (1998) proposed a multidimensional segmentation approach based on three variables, namely, customer values, energy expenses, and customer energy management priorities. In this model, customer values were derived from an adaptive conjoint exercise which yielded eight value-based segments. Additionally, energy expenses were used to divide customers into five expenditure groups. Finally, customer energy management priorities were used to cluster the customers into seven different segments.
Ghazanfari et al. (2010) studied market segmentation in the apparel industry using clustering algorithms. To analyze the clusters and to measure the value of the customers in each cluster, they used the RFM (recency, frequency, and monetary value) model. RFM is a very popular model for customer value analysis, and it has been used by many scholars to perform market segmentation (Cheng & Chen, 2009; Khajvand & Tarokh, 2011; McCarty & Hastak, 2007). Since RFM analyzes the behavior of consumers, it can be considered a behavior-based model (Wei et al., 2012; Yeh et al., 2008). Wei et al. (2012) adopted the self-organizing maps (SOM) methodology in order to extend the RFM model. SOM is a neural network method which can be used for clustering problems, visualization, and market screening (Fish & Ruby, 2009; Hanafizadeh & Mirzazadeh, 2011; Hsu et al., 2009; Hung & Tsai, 2008; Kiang & Fisher, 2008; Wang, 2001; Wei et al., 2012). SOM also has the advantage of automatically detecting strong features in large data sets (Wei et al., 2012). Wei et al. (2012) used an extended model called LRFM (length, recency, frequency, and monetary value) to perform customer segmentation in a children's dental clinic in Taiwan.
Ansari and Riasi (2016a) combined fuzzy c-means clustering and genetic algorithms to cluster the customers of the steel industry. They divided the customers into two clusters using the variables of the LRFM model. After comparing the performance of the combined algorithm (i.e., fuzzy c-means clustering and genetic algorithms) with that of fuzzy c-means clustering alone, they found that the combined algorithm had a lower mean squared error (MSE) but a higher run time.
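As a rough illustration of the RFM model discussed above, the following sketch assigns each customer a recency, frequency, and monetary score from 1 (worst) to 3 (best) based on rank. The function names `rank_score` and `rfm_scores`, the three-level scale, and the purchase records are hypothetical and are not taken from any of the cited studies.

```python
from datetime import date


def rank_score(values, higher_is_better=True, levels=3):
    """Map each value to a score in 1..levels according to its rank."""
    # Sort indices so that the worst value comes first and gets the lowest score.
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = 1 + (rank * levels) // len(values)
    return scores


def rfm_scores(customers, today):
    """customers: dict id -> list of (purchase_date, amount). Returns id -> (R, F, M)."""
    ids = list(customers)
    # Recency: days since the most recent purchase (fewer days is better).
    recency = [(today - max(d for d, _ in customers[c])).days for c in ids]
    # Frequency: number of purchases; Monetary: total amount spent.
    frequency = [len(customers[c]) for c in ids]
    monetary = [sum(a for _, a in customers[c]) for c in ids]
    r = rank_score(recency, higher_is_better=False)
    f = rank_score(frequency)
    m = rank_score(monetary)
    return {c: (r[i], f[i], m[i]) for i, c in enumerate(ids)}


# Hypothetical purchase histories for three customers.
purchases = {
    "A": [(date(2015, 5, 1), 120.0), (date(2015, 5, 18), 80.0), (date(2015, 4, 2), 60.0)],
    "B": [(date(2015, 1, 10), 30.0)],
    "C": [(date(2015, 3, 15), 200.0), (date(2015, 4, 20), 40.0)],
}
scores = rfm_scores(purchases, today=date(2015, 5, 20))
print(scores)
```

The LRFM extension mentioned above would add a fourth score for the length of the customer relationship, computed in the same rank-based way.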

Methodology
Customer segmentation and clustering in the banking industry has become an interesting area of research in recent years. Customers are now demanding higher quality services and more customized products from their banks (Popli & Vadgama, 2012); therefore, banks need to perform customer clustering in order to enhance the quality of their services. Since bank customers vary in their complexity and characteristics, they have become the subject of many studies in the field of services marketing. The data set for this study contains information about different characteristics of 250 loan applicants of a commercial bank located in Isfahan province, Iran. The data were collected during a one-month period (from April 20, 2015 to May 20, 2015).
Fraley and Raftery (1998) define cluster analysis as "partitioning data into meaningful subgroups, when the number of subgroups and other information about their composition may be unknown". According to Fraley and Raftery (1998), clustering techniques range from heuristic methods to more formal procedures that are based on statistical models, and they usually follow either a hierarchical strategy or a strategy in which observations are relocated among tentative clusters. The studies by Gordon (1981), Hartigan (1975), Kaufman and Rousseeuw (1990), McLachlan and Basford (1988), and Murtagh (1985) provide a good introduction to clustering algorithms.
In this study, two-step cluster analysis using a scalable clustering algorithm is performed in IBM SPSS Statistics 23. The main advantage of this technique is its ability to analyze both continuous and categorical variables. In the first step, known as pre-clustering, the observations are divided into many small clusters. In this step each observation is first considered as an independent cluster, and then the algorithm considers adding new observations to each cluster and merges some of the small clusters. Pre-clustering is performed by generating a data structure called a clustering feature (CF) tree, which contains the cluster centroids. The cluster feature of cluster $j$ is:
$$CF_j = \{N_j,\; s_{Aj},\; s^2_{Aj},\; N_{Bj}\} \qquad (1)$$
In equation 1, $N_j$ represents the number of observations in cluster $j$, $s_{Aj}$ is the sum of the continuous attributes of the observations in cluster $j$, $s^2_{Aj}$ is the sum of the squared continuous attributes of the observations in cluster $j$, and $N_{Bj} = (N_{Bj1}, N_{Bj2}, \ldots, N_{BjK^B})$ is a $\sum_{k=1}^{K^B}(L_k - 1)$-dimensional vector where the $k$th sub-vector is of $(L_k - 1)$ dimension, given by $N_{Bjk} = (N_{jk1}, \ldots, N_{jk(L_k - 1)})$, in which $N_{jkl}$ represents the number of observations in cluster $j$ whose $k$th categorical attribute takes the $l$th category, $l = 1, \ldots, L_k - 1$ (Chiu et al., 2001).
When two clusters (e.g., clusters $j$ and $s$) are merged, the two corresponding sets of data points are gathered together to form a new cluster. The cluster feature $CF_{\langle j,s \rangle}$ for the new cluster can be calculated as:
$$CF_{\langle j,s \rangle} = \{N_j + N_s,\; s_{Aj} + s_{As},\; s^2_{Aj} + s^2_{As},\; N_{Bj} + N_{Bs}\} \qquad (2)$$
As described earlier, in the pre-clustering step observations are scanned sequentially, and for each of them a decision is made whether the observation should be merged with an existing dense region or not. A CF tree is constructed during this process to store the summary statistics of the dense regions. The CF tree consists of levels of nodes, each of which contains a number of entries, and each entry in a leaf node represents a final sub-cluster. For each observation, the algorithm starts from the root node; at each step the observation finds the closest entry in the node and travels to the corresponding child node. This process continues recursively, and the observation descends along the CF tree until it reaches a leaf node. After reaching a leaf node, the observation finds the closest entry and is absorbed into it if its distance from that entry is within a threshold value. If the distance is not within the threshold, the observation starts a new leaf entry in the leaf node (Chiu et al., 2001).
There is an optional outlier-handling step which can be used in the process of constructing the CF tree. According to Chiu et al. (2001), the outliers in this process are observations that cannot be fitted into any of the clusters. Observations in a leaf entry are considered outliers if the number of records in the entry is less than a certain fraction of the size of the largest leaf entry in the CF tree (Chiu et al., 2001). The default value for this fraction is 25% in SPSS. Additionally, before constructing the CF tree, the algorithm searches for potential irregular observations and sets them aside. After the CF tree is constructed, the algorithm reexamines these observations based on the degree to which they would add to the size of the tree. Finally, observations which are not appropriate to be added to the tree are considered outliers. If the CF tree becomes larger than the maximum allowed size, it is reconstructed using a larger threshold. As a result, the new CF tree is smaller and hence has more room for incoming observations. At the end of step one, a collection of dense regions has been identified and stored in the leaf nodes of the CF tree (Chiu et al., 2001).
The second step is called the clustering step, in which the dense regions identified in step one are merged in order to create the final clusters. If the number of clusters is not known, the method can determine it automatically by using the Bayesian information criterion (BIC) or the Akaike information criterion (AIC). If the number of clusters is pre-specified, observations are clustered using hierarchical clustering such that the observations in each cluster are as similar as possible. In this algorithm, the distance between clusters is calculated using a log-likelihood-based distance measure. Specifically, the distance between two clusters is based on the decrease in log-likelihood that results from merging the two clusters. The distance $d(j, s)$ between clusters $j$ and $s$ is defined as:
$$d(j, s) = \xi_j + \xi_s - \xi_{\langle j,s \rangle} \qquad (3)$$
where
$$\xi_v = -N_v \left( \sum_{k=1}^{K^A} \frac{1}{2} \log\left(\hat{\sigma}_k^2 + \hat{\sigma}_{vk}^2\right) + \sum_{k=1}^{K^B} \hat{E}_{vk} \right)$$
for $v = s, j$, and $\xi_{\langle j,s \rangle}$ is defined in a similar way. Also, $\hat{E}_{vk} = -\sum_{l=1}^{L_k} \frac{N_{vkl}}{N_v} \log \frac{N_{vkl}}{N_v}$ is the entropy of the $k$th categorical attribute in cluster $v$, $K^A$ is the total number of continuous attributes, $K^B$ is the total number of categorical attributes, $\hat{\sigma}_{vk}^2$ is the variance of the $k$th continuous attribute in cluster $v$, $\hat{\sigma}_k^2$ is the variance of the $k$th continuous attribute in the whole data set, and $N_v$ is the number of observations in cluster $v$ (Chiu et al., 2001).
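The cluster feature, its additive merge rule, and the log-likelihood distance described above can be sketched as follows. This is a minimal illustration and not the SPSS implementation: the class name `CF`, the helpers `xi` and `distance`, and the three sample records are hypothetical. The sketch also assumes that the overall variance of each continuous attribute is added inside the logarithm, as in the usual statement of this distance, which keeps single-record clusters from producing log(0).

```python
import math
from dataclasses import dataclass


@dataclass
class CF:
    """Cluster feature: count, sums, sums of squares, and category counts."""
    n: int
    s: list       # sum of each continuous attribute
    s2: list      # sum of squares of each continuous attribute
    counts: list  # one dict {category: count} per categorical attribute

    @classmethod
    def from_record(cls, cont, cat):
        return cls(1, list(cont), [x * x for x in cont], [{c: 1} for c in cat])

    def merge(self, other):
        """Merging two clusters adds their cluster features component-wise."""
        counts = []
        for a, b in zip(self.counts, other.counts):
            merged = dict(a)
            for key, val in b.items():
                merged[key] = merged.get(key, 0) + val
            counts.append(merged)
        return CF(self.n + other.n,
                  [x + y for x, y in zip(self.s, other.s)],
                  [x + y for x, y in zip(self.s2, other.s2)],
                  counts)


def xi(cf, global_var):
    """xi_v = -N_v * (sum_k 0.5 * log(var_k + var_vk) + sum_k entropy_vk)."""
    total = 0.0
    for s, s2, gv in zip(cf.s, cf.s2, global_var):
        # Within-cluster (population) variance recovered from the CF sums.
        var = max(s2 / cf.n - (s / cf.n) ** 2, 0.0)
        total += 0.5 * math.log(gv + var)
    for counts in cf.counts:
        # Entropy of one categorical attribute inside the cluster.
        total -= sum((c / cf.n) * math.log(c / cf.n) for c in counts.values())
    return -cf.n * total


def distance(a, b, global_var):
    """Decrease in log-likelihood caused by merging clusters a and b."""
    return xi(a, global_var) + xi(b, global_var) - xi(a.merge(b), global_var)


# Hypothetical records: (continuous attributes), (categorical attributes).
records = [((10.0, 1.0), ("business",)),
           ((12.0, 1.5), ("business",)),
           ((90.0, 9.0), ("personal",))]
cfs = [CF.from_record(cont, cat) for cont, cat in records]

# Overall variance of each continuous attribute across all records.
n = len(records)
global_var = []
for k in range(2):
    col = [cont[k] for cont, _ in records]
    mean = sum(col) / n
    global_var.append(sum((x - mean) ** 2 for x in col) / n)

d_near = distance(cfs[0], cfs[1], global_var)  # two similar records
d_far = distance(cfs[0], cfs[2], global_var)   # two dissimilar records
print(d_near, d_far)
```

Because every component of the cluster feature is additive, the distance between two candidate clusters can be evaluated from the stored summaries alone, without revisiting the raw observations.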
In order to make sure that the number of clusters determined by this algorithm was appropriate, analysis of variance (ANOVA) was used. After comparing the results for three, four, and five clusters, it was found that using five clusters produces the best results. After performing the cluster analysis, the significance of the differences between clusters was tested using discriminant function analysis. Additionally, in order to further analyze the differences between the identified clusters, post hoc analysis was performed.
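The ANOVA check mentioned above rests on the F statistic, the ratio of between-cluster to within-cluster variance for one continuous variable. The sketch below computes it for a one-way design; the function name `one_way_anova_f` and the sample values are hypothetical and are not the study's data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical default-risk values (in percent) for three clusters.
groups = [[10, 12, 11], [50, 52, 49], [28, 27, 29]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)
```

A large F value relative to the F distribution with (df_between, df_within) degrees of freedom indicates that at least one cluster mean differs from the others; it does not say which ones, which is why a post hoc test is needed afterward.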

Results
As described in the methodology section, five clusters were identified for the bank customers using a two-step scalable clustering algorithm. Demographic characteristics of each cluster are summarized in Table 1, and the values of the categorical attributes are displayed in Table 2. Table 3 displays the mean and standard deviation of the continuous attributes which were used in this study. It can be seen that the most profitable customers are those that belong to clusters one and three, whereas the customers in the fifth cluster have the highest default risk and are the least profitable customers. As mentioned earlier, analysis of variance (ANOVA) was performed to analyze the differences between the clusters. The ANOVA results revealed that the F value is significant for all continuous variables (p < 0.05), which indicates that the cluster means differ significantly on each of these variables. Using discriminant function analysis, a confusion matrix was created, which is displayed in Table 5. The matrix displays the number of customers who have been correctly or incorrectly clustered by the algorithm. Under ideal conditions, only the diagonal of the confusion matrix would contain nonzero entries; the confusion matrix obtained here is very close to this ideal, because it has only one misclassified observation.
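The reading of the confusion matrix described above can be sketched as follows. The helper names `confusion_matrix` and `accuracy` are hypothetical, and the cluster sizes (50 customers each) are an assumption made only for illustration; the text states only that 250 customers were clustered and that one was misclassified.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual clusters, columns are predicted clusters."""
    index = {lab: i for i, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of observations on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


labels = [1, 2, 3, 4, 5]
# Assumed layout: 250 customers, 50 per cluster (illustrative only).
actual = [c for c in labels for _ in range(50)]
predicted = list(actual)
predicted[0] = 2  # a single misclassified observation
matrix = confusion_matrix(actual, predicted, labels)
print(accuracy(matrix))
```

With one off-diagonal entry out of 250 observations, 249 customers are classified correctly regardless of how the cluster sizes are actually distributed.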
Table 5. Confusion matrix

The ANOVA results indicated that the overall difference between the clusters was significant, but they did not indicate which specific clusters differed. Post hoc tests can be used to confirm where the differences between the clusters occurred. Because post hoc tests should only be run when the ANOVA results are significant, they are appropriate for further analyzing the data in this study. Since the data set met the homogeneity of variances assumption, Tukey's HSD test was used for the post hoc analysis. The results of the post hoc analysis are summarized in Table 6.

Specification of the Clusters
The five clusters which were identified using two-step scalable clustering are fully specified in this section.
First cluster or "favorite customers": all of the customers in this cluster are business account holders, and the majority of them are privately-held companies. These customers are extremely profitable (mean profitability = 69.3%) for the bank. They are called "favorite customers" because they have the highest average loan amount (mean loan amount = 153 million Rial) compared to the other four clusters. On average, the customers in this cluster do business with their bank for roughly 10 years and rarely close their bank accounts. Furthermore, they do not obtain financing from other banks and have a relatively low default risk (mean default risk = 10.7%). The average monthly income for the primary account holder in this cluster is relatively high. Almost 44.2% of the primary account holders in this cluster are between 41 and 50 years old. Moreover, long-term saving accounts and checking accounts denominated in foreign currencies are highly preferred by these customers. They usually start their business with the bank by opening a checking account but extend their interactions with the bank over time by opening long-term saving accounts and obtaining financing.
Second cluster or "non-creditworthy customers": similar to the favorite customers (i.e., the first cluster), the non-creditworthy customers are also business account holders; however, the majority of the account holders in this group are publicly-held companies. These customers usually receive small loans (mean loan amount = 47 million Rial) and have a very high default risk (mean default risk = 50.6%). All of the primary account holders in this group are between 30 and 40 years old and have relatively low monthly incomes. The combination of high default risk, low monthly income, and the primary account holders' lack of experience has contributed to the low profitability of these customers. Finally, these customers have average degrees of loyalty and never close their bank accounts.
Third cluster or "creditworthy customers": the customers in this cluster are personal account holders with very high monthly incomes. Since these customers are the most profitable (mean profitability = 70.5%) account holders, they should be the target of marketing strategies. The majority of the customers in this cluster are highly educated engineers and physicians. Since these customers have a relatively low default risk (mean default risk = 28.2%) and are highly profitable for the bank, they receive high amounts of financing (mean loan amount = 147 million Rial). These customers are mostly interested in long-term saving accounts and governmental or private bonds that are distributed by the commercial banks. All of these customers hold bank accounts denominated in domestic currency and have no account closure history. Finally, it was found that there were more male account holders (26.6% of total male account holders in the sample) in this group than female account holders (17.3% of total female account holders in the sample).
Fourth cluster or "passers": the customers in this group are the least loyal (mean loyalty = 1 year) account holders and have very low account balances (mean account balance = 20 million Rial). Additionally, they are more inclined toward short-term saving accounts and checking accounts and are always trying to obtain relatively large loans. The majority of these customers are employees of publicly- and privately-held companies and are extremely sensitive to interest rates. These customers have a high default risk (mean default risk = 43.1%) and are extremely willing to change their banks if a new bank offers them loans with lower interest rates.
Fifth cluster or "friends": these customers have the highest default risk (mean default risk = 64.2%), mainly due to their extremely low monthly incomes. Therefore, they are willing to accept high interest rates. The profitability of these customers is extremely low (mean profitability = 39.9%) for the banks, and as a result the banks do not give them large loans (mean loan amount = 20 million Rial). Although these customers are not very profitable for the banks, they have high degrees of loyalty (mean loyalty = 4 years) and are willing to recommend the bank services they have received when talking with their friends or family. This cluster is labeled "friends" due to these customers' high loyalty and willingness to recommend the bank's services. Another characteristic of the customers in this cluster is their relatively low education. More than half of the customers in this group had a high school degree or below, which makes them the least educated cluster. Finally, the account holders in this cluster are mostly interested in short-term saving accounts and checking accounts.

Discussion and Conclusions
According to Mass et al. (2008), customer value in the financial services industry is a synonym for customer equity, and the true sense of customer value is the benefits of a product, service, or relationship as perceived by the customer. However, Mass et al. (2008) believe that the importance of customer value to the financial services industry is seldom realized. In order to evaluate the value of bank customers and to determine their impact on the bank's performance, it is necessary to identify their key characteristics by using customer clustering. Customer clustering enables banks to identify their most profitable customers and to design marketing strategies for each group of customers based on their attributes. The main limitation of this study is that it used data from the customers of a single bank. Further research is required in order to confirm the robustness of the relationships which were observed. For instance, data from different commercial banks in different countries can be compared and the existence of the suggested five customer clusters can be examined. However, the authors believe that the five clusters of bank customers introduced in this study will probably exist in other commercial banks with minor modifications.

Table 1. Demographic characteristics of survey respondents
* Each number indicates the percentage of the customers in the entire data set which have characteristic m and belong to cluster i.

Table 2. Values of the categorical attributes
* Each number indicates the percentage of the customers in the entire data set which have attribute value n and belong to cluster i.
** Isfahan is the largest city in Isfahan province; it is the provincial capital and a major financial hub.
*** Indicates the percentage of customers from other cities in Isfahan province.

According to Table 2, which displays the values of the categorical attributes, the majority of business accounts belong to customers in cluster one, while the majority of personal accounts are held by customers in cluster five. Furthermore, almost all clusters include customers without any account closure history; however, the majority of the customers with account closure history belong to the fifth cluster. Moreover, customers with accounts denominated in foreign currency are only observable in clusters one and two.

Table 3. Mean and standard deviation of continuous attributes

Table 4. ANOVA results
* DF stands for degrees of freedom.

Table 6. Results of post hoc analysis