Evolution of clustering techniques in designing cellular manufacturing systems: A state-of-art review

Article history: Received July 8 2018; Received in Revised Format July 18 2018; Accepted August 12 2018; Available online August 12 2018

This paper presents a review of clustering and mathematical programming methods and their impacts on cell forming (CF) and scheduling problems. An in-depth analysis is carried out by reviewing 105 dominant research papers from 1972 to 2017 available in the literature. Advantages, limitations and drawbacks of 11 clustering methods, in addition to 8 meta-heuristics, are also discussed. The domains of the studied methods include cell forming, material transferring, voids, exceptional elements, bottleneck machines and uncertain product demands. Since most of the studied models are NP-hard, each section of this research provides an in-depth survey of heuristics and metaheuristics besides the exact methods. The outcomes of this work could identify some existing gaps in the knowledge base and provide directions for the objectives of this research as well as for future research, which would help in clarifying many related questions in cellular manufacturing systems (CMS).

© 2019 by the authors; licensee Growing Science, Canada


Introduction
Designing appropriate cells can reduce system costs and processing time. During the last 2 decades, many cases have reported benefits from shifting from a job-shop based layout to CMS (Agarwal, 2008). Historically, clustering methods have been popular in CMS for their noteworthy advantages in cell forming (CF) problems (CFPs). These advantages stem from their ability to exploit machine-part similarities when generating cells; however, there are still many clustering techniques that have not been applied to CFPs. In addition, during the last decade, hybrids of clustering methods with other powerful search algorithms such as metaheuristics have opened many new areas for designing cellular systems. Hence, given the wide range of clustering methods, it has become necessary to study clustering methods and clustering-based hybrids.

Clustering and Cell Forming in Cellular Manufacturing Systems
Forming cells in which similar parts (based on their design, function or manufacturing process) belong to a group called a cluster is a foundation of cellular manufacturing system studies. Clustering and partitioning techniques are commonly used in forming cells. Theodoridis et al. (2010) defined clustering as follows: given a set of data vectors $X = \{x_1, x_2, \dots, x_N\}$, the vectors are grouped in such a way that 'more similar' vectors are in the same cluster and 'less similar' vectors are in different clusters (Fig. 1). The set containing these clusters is called a clustering of $X$. In CFPs, the use of a binary machine-component incidence matrix (MCIM) is very common, where entry $a_{ij}$ is '1' if machine $i$ is used to produce part $j$ and '0' otherwise.

Fig. 1. Graphical view of cluster simulating
Clustering techniques can be classified from several points of view. One common classification distinguishes crisp clustering, where each part belongs to exactly one part family (PF), from fuzzy clustering, where a part can belong to more than one PF. Another classification separates single clustering, where one expression is applied for measuring similarity, from multiple clustering, which uses more than one function. In turn, the methods fall into one of 2 main groups: hierarchical methods, which repeatedly join similar smaller clusters into larger clusters (or separate larger clusters into smaller ones), and non-hierarchical methods, where datasets are partitioned based on non-hierarchical (or undetermined) relations. The clustering procedure can be based on sequential information, on cost or distance function information (K-means, K-medoids and C-means) or on miscellaneous criteria (such as competitive learning algorithms and spectral clustering).
In this section, some clustering methods and metaheuristic algorithms for solving CFPs are investigated.

Hierarchical Clustering Methods
Hierarchical clustering algorithms use similarities (or dissimilarities) among parts and machines either to split a large cell or PF into smaller clusters that are most dissimilar (divisive methods) or to merge similar machines and parts into larger cells or PFs (agglomerative methods). The Single Linkage Clustering Algorithm (SLINK), like many 'similarity coefficient' algorithms, works by measuring a similarity index between machines (or by joining the clusters with the smallest minimum pairwise distance). Normally, SLINK measures the similarities of machine pairs in two cells in order to join the cells into a larger cell. It was first used by McAuley (1972) to form machine cells. Jaccard's similarity coefficient (JSC) is formulated as follows:

$$S_{JK} = \frac{N_{J \cap K}}{N_{J \cup K}}$$

where $S_{JK}$ is the similarity coefficient between machines $J$ and $K$, $N_{J \cap K}$ is the number of parts that are served by both machines $J$ and $K$, and $N_{J \cup K}$ is the number of parts that are processed by either machine $J$ or machine $K$. Seifoddini (1989) applied SLINK to solve a CFP with 14 parts and 11 machines. One drawback of SLINK is that it ignores dissimilarities between the other machines when joining cells. As a result, the formed cells may contain a number of dissimilar machines that do not contribute to improving the production process. This drawback is called the chaining problem. The risk of chaining rises when the number of machines increases or the number of zeros in the MCIM is too high. Gu (1991) presented a 2-stage SLINK-based approach for clustering PFs and machine groups (MGs). The contribution of that research was to consider multifunctional machines in clustering. Berardi et al. (1999) applied 2 SLINK-based and 4 rank order clustering (ROC)-based methods to evaluate the effectiveness of clustering techniques in providing layouts with shorter distances. They showed that different core clusters may have a significant impact on the total cost. Complete Linkage Clustering (CLINK) calculates the similarity coefficient between two cells based on the minimum similarity of the machine pairs in the two cells (or joins the clusters with the smallest maximum pairwise distance) (Tarsuslugil & Bloor, 1979):

$$S_{ab} = \min_{i \in a,\; j \in b} S_{ij}$$

where $a$ and $b$ are two cells and $S_{ij}$ is the similarity coefficient between machines $i$ and $j$.
Süer and Ortega (1994) presented a modified similarity coefficient and applied it to CLINK, average linkage clustering (ALINK) and SLINK. The new coefficient was called the machine level based similarity coefficient (MLB-SC) and was compared with Jaccard's similarity coefficient. MLB-SC is computed from the machine types that are required for processing both parts of a pair and from the machines needed for processing each part. In addition, they defined another coefficient and applied it to SLINK, ALINK and CLINK; it uses the workstation level $m$ of each operation that serves a product, the priority weight (or cost) $W$ of a workstation for an operation and the number of operations $n$. The outcomes showed similar results for SLINK, ALINK and CLINK when using MLB-SC and MJSC. Süer and Cedeño (1996) addressed the 'sameness machine trap' problem by developing a modified version of MLB-SC that considered machine levels besides machine types by calculating the maximum level of machines for all products (the family) in the cell. In Average Linkage Clustering (ALINK), sometimes known as ALC, the average of the similarities between all machines in two cells is taken into account instead of a single machine pair (Seifoddini, 1988). Hence, the risk of the chaining problem is reduced (or even eliminated), since the similarity between two machines is not the only reason for joining cells. The similarity coefficient for ALINK is:

$$S_{ab} = \frac{\sum_{i \in a} \sum_{j \in b} S_{ij}}{N_a N_b}$$

where $S_{ij}$ is the similarity coefficient between machines $i$ and $j$, and $N_a$ and $N_b$ are the numbers of machines in cells $a$ and $b$. The similarity coefficients in ALC are updated whenever a new machine cell is formed: they are recalculated to obtain the averages for the new machine cells. Therefore, ALC needs more computation time than SLINK but provides more reliable solutions. Won and Kim (1997) developed a new clustering algorithm that used multiple criteria for measuring similarities, called the generalized machine similarity coefficient. The proposed algorithm used SLINK to form a primary cell and then applied relaxed CLINK to assign machines to the primary cell. Then, ALC was used to assign machines to the primary cell considering the routings. Gupta and Seifoddini (1990) used part type, production volume, routing sequence and unit operation time data from the early stages of the grouping process. They compared CLINK, ALINK, SLINK and WLINK (weighted linkage) on 50 problems. They found that with ALINK the number of machines assigned to the larger cells was significantly smaller than with SLINK. Similar results were found when comparing WLINK with ALINK and CLINK with WLINK, respectively. In addition, WLINK and CLINK provided fewer machines in the largest cell compared to ALINK and WLINK, respectively.
Afterward, Gupta (1991) offered a new similarity coefficient for evaluating the severity of the chaining problem. The study reported that SLINK causes a more severe chaining problem than the other investigated methods. In addition, compared to CLINK, SLINK mostly provided smaller cells. Further, Baker & Maropoulos (2000) proposed a 3-stage method for CF, layout design and capability analysis to configure and generate cells and to find approximate positions of the cells and the workstations within them. Irani and Huang (2000) presented a mathematical programming (MP) model for minimizing the sum of inter-modular and machine duplications in large scale problems. Then, a heuristic was developed for matching strings and clustering parts based on similarities. Angra et al. (2008) presented two algorithms for the CFP; the first clustered parts and machines using commonality scores calculated from processing times. The second algorithm calculated the total processing time of jobs according to the number of predicted machines and parts that could be allocated to each cell.
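As an illustration of the linkage rules discussed in this section, the following minimal Python sketch computes Jaccard's coefficient on a small MCIM and merges machine cells under the single, complete or average linkage rule. The 4 x 5 matrix, the function names and the stopping rule (a target number of cells) are assumptions made for this example and are not taken from the reviewed papers.

```python
from itertools import combinations

# Hypothetical 4-machine x 5-part MCIM (rows = machines, columns = parts).
MCIM = [
    [1, 1, 0, 0, 0],  # machine 0
    [1, 1, 1, 0, 0],  # machine 1
    [0, 0, 0, 1, 1],  # machine 2
    [0, 0, 1, 1, 1],  # machine 3
]


def jaccard(row_j, row_k):
    """Parts visiting both machines divided by parts visiting at least one."""
    both = sum(1 for a, b in zip(row_j, row_k) if a and b)
    either = sum(1 for a, b in zip(row_j, row_k) if a or b)
    return both / either if either else 0.0


def cell_similarity(cell_a, cell_b, sim, linkage):
    """Similarity between two cells under the chosen linkage rule."""
    pair_sims = [sim[i][j] for i in cell_a for j in cell_b]
    if linkage == "single":    # SLINK: the most similar pair decides
        return max(pair_sims)
    if linkage == "complete":  # CLINK: the least similar pair decides
        return min(pair_sims)
    if linkage == "average":   # ALINK: mean over all machine pairs
        return sum(pair_sims) / len(pair_sims)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(mcim, n_cells, linkage="single"):
    """Merge the two most similar cells until n_cells remain."""
    m = len(mcim)
    sim = [[jaccard(mcim[i], mcim[j]) for j in range(m)] for i in range(m)]
    cells = [[i] for i in range(m)]  # start with one machine per cell
    while len(cells) > n_cells:
        a, b = max(
            combinations(range(len(cells)), 2),
            key=lambda p: cell_similarity(cells[p[0]], cells[p[1]], sim, linkage),
        )
        merged = cells[a] + cells[b]
        cells = [c for idx, c in enumerate(cells) if idx not in (a, b)] + [merged]
    return sorted(sorted(c) for c in cells)


cells = agglomerate(MCIM, n_cells=2, linkage="single")
print(cells)
```

On this small matrix all three linkage rules give the same two cells; the rules only start to differ when a single strongly similar machine pair bridges two otherwise dissimilar cells, which is the chaining situation described above.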

Non-hierarchical Clustering Methods
In non-hierarchical clustering algorithms, a number of seed points are chosen initially for classifying machines (or parts). The main disadvantage of classic hierarchical clustering methods is that once 2 points (machines) are grouped together, there is no further chance of separating them in later steps. ZODIAC and GRAFICS are among the best-known non-hierarchical clustering methods. Chandrasekharan and Rajagopalan (1987) presented an improved version of the ideal seed method as a zero-one seeding clustering algorithm for generating cells. The proposed algorithm (ZODIAC) forms clusters from the MCIM by grouping machines around fixed seed points. Srinivasan and Narendran (1991) developed a clustering method (GRAFICS) that uses an assignment method to obtain the initial cluster seeds. GRAFICS has 2 stages. In the first stage, an initial set of part-machine groups is created using the MCIM. Then, clustering is done using a maximum density rule. Afterward, Srinivasan (1994) applied the minimum spanning tree (MST) algorithm to the CFP. The proposed method removed edges to find alternative starting seeds for clustering. MST is well known for its fast tracking ability. As another approach, Chen and Heragu (1999) proposed 2 decomposition methods for solving large scale CFPs by decomposing large systems into several subsystems. Then, using a nonlinear mixed integer programming (IP) model, the total cost of inter-cellular movements and resource underutilization was minimized.

Partitioning Methods
Partitioning methods construct patterns with 2 or more partitions, each of which can contain several members. Considering the input data, partitioning methods can be classified into two categories: K-methods, where the number of clusters (K) is taken as an input value (like K-means and K-medoids), and C-methods, which take a threshold value (τ) to determine clusters. Generally, a K-means method (also known as Lloyd's algorithm) determines points called 'centers' that minimize the sum of squared Euclidean distances between all data points and their respective cluster centers. Therefore, in the K-means clustering algorithm, each cluster is represented by the mean value of the objects in the cluster. Al-sultan (1997) proposed a K-means algorithm for large scale problems that was formulated as a mathematical model for minimizing the distance between each part and the representative of its family. The K-harmonic means clustering algorithm (KHM), proposed by Zhang et al. (1999), is another partitioning method that sets clusters by minimizing the harmonic mean of the distances between data points and centers:

$$KHM(X, C) = \sum_{i=1}^{N} \frac{K}{\sum_{l=1}^{K} \frac{1}{\lVert x_i - c_l \rVert^{p}}}$$

where $K$ is the number of clusters, $c_l$ is the center of cluster $l$, $N$ is the number of points and $p$ is an input parameter of the method. Ünler and Güngör (2009) applied KHM to a clustering problem based on the degree of a pre-defined membership function and grouping efficacy. Chitta and Narasimha Murty (2010) developed a two-level K-means algorithm to survey the relation between the number and size of the clusters. Their results showed that the proposed method could effectively solve large scale problems. K-medoids clustering algorithms are similar to K-means, but the selection of cluster representatives is restricted to the existing parts, as indicated by Fig. 2.
The set of vectors (medoids) that structure the clusters is determined so as to minimize a cost function calculated from the distance between each data vector and its closest medoid (Kaufman & Rousseeuw, 2009). The K-medoid algorithm has also been used many times to solve CFPs. It is sometimes interpreted as a problem (which is equivalent to the P-median problem (PMP)) and sometimes as a heuristic algorithm for solving the corresponding problems. PMP is a mathematical programming model for minimizing the distances between machines and the medians of their cells: $\min \sum_{i} \sum_{j} d_{ij} x_{ij}$, where $d_{ij}$ is the distance between machines $i$ and $j$ and $x_{ij} = 1$ if machine $i$ is assigned to the cell whose median is machine $j$ (and 0 otherwise). Historically, Kusiak (1987) developed a zero-one IP to maximize the sum of similarity coefficients defined between pairs of parts. Wang and Roze (1997) proposed a new version of PMP in which upper and lower cell sizes, the maximum number of machines per cell and the number of parts per family were taken into consideration. They compared the classical and the proposed PMPs using 3 different similarity coefficients. Deutsch et al. (1998) developed a PMP in which similarities between all parts were calculated instead of similarities to a center median.
The results demonstrated that considering similarities between all parts provided better solutions than an arbitrary median. Classical PMP is limited in solving large scale problems (Kusiak, 1987). Hence, Won and Chang Lee (2004) proposed two modified PMP approaches with the objective of maximizing the sum of the similarities between machines located in the same cell. Ashayeri et al. (2005) applied an improved version of Teitz and Bart's vertex substitution heuristic for solving facility layout and location problems that were formulated as PMP. Won and Currie (2006) developed a new version of PMP by calculating similarity coefficients in a non-binary MCIM through the clustering process while operation sequences and production volumes were taken into account. Goldengorin et al. (2012) proposed a compact representation of PMP by using a mixed-Boolean pseudo-Boolean formulation for minimizing dissimilarities between the center and the machines within a cell, which reduced computation time.
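To make the p-median idea concrete, the sketch below solves a tiny instance by plain enumeration: every set of p machines is tried as the set of medians, and each machine is assigned to its nearest median. The distance measure (one minus Jaccard's coefficient), the example MCIM and the function names are assumptions for illustration; enumeration is only practical for very small instances, which is consistent with the limitation of classical PMP noted above.

```python
from itertools import combinations

# Hypothetical 4-machine x 5-part MCIM (rows = machines, columns = parts).
MCIM = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]


def jaccard_distance(row_a, row_b):
    """Dissimilarity between two machines: 1 - Jaccard similarity (assumed)."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 1.0 - (both / either if either else 0.0)


def p_median(mcim, p):
    """Enumerate all median sets of size p and keep the cheapest one."""
    m = len(mcim)
    d = [[jaccard_distance(mcim[i], mcim[j]) for j in range(m)] for i in range(m)]
    best_cost, best_medians = float("inf"), None
    for medians in combinations(range(m), p):
        # Objective: each machine contributes its distance to the nearest median.
        cost = sum(min(d[i][j] for j in medians) for i in range(m))
        if cost < best_cost:  # ties are broken by enumeration order
            best_cost, best_medians = cost, medians
    assignment = {i: min(best_medians, key=lambda j: d[i][j]) for i in range(m)}
    return best_medians, assignment, best_cost


medians, assignment, cost = p_median(MCIM, p=2)
print(medians, assignment, round(cost, 3))
```

Several median sets are equally good on this matrix, so the medians themselves depend on the tie-breaking order, while the resulting machine cells ({0, 1} and {2, 3}) do not.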
Krushinsky & Goldengorin (2012) argued that the MCIM does not contain sufficient information for providing efficient cells with exact solutions. They used a straightforward formulation (SF) and an alternative formulation (AF) for minimizing p-cuts in an undirected weighted graph, which is also known as the MINpCUT problem. Paydar & Saidi-Mehrabad (2013) applied a hybrid of the genetic algorithm (GA) with variable neighbourhood search for maximizing grouping efficacy in large scale CFPs. As mentioned before, C-methods take threshold values (τ) to determine clusters. The fuzzy c-means algorithm (FCM) has been applied frequently in CFPs. In FCM, one data point (part) can belong to more than one cluster at the same time but with different membership values (Fig. 3). FCM was introduced by Dunn (1974). It creates a partition matrix from the given data set, whose elements are the membership values of patterns in clusters. Lozano et al. (2002) reported that standard FCM has some drawbacks in choosing appropriate values of the fuzziness index and in defuzzifying the solution. They proposed a modified FCM that worked based on parallel machine grouping and applied an annealing process in order to group components and machines. They used a large weighting exponent (or fuzziness index) at an early stage, which was then reduced gradually until a crisp cluster structure emerged. Josien and Liao (2002) proposed a hybrid fuzzy algorithm that combines the advantages of FCM and fuzzy K-nearest neighbours to provide better grouping efficacy values than standard FCM. Their results also revealed that the weighted distance generally provides better results than the Euclidean distance. Moreover, with the weighted distance, increasing the amount of training data and decreasing the density of the machine-part data structure were preferable. Yang et al. (2004) proposed a modified version of FCM called MVFCM using a modified dissimilarity measure that considered both symbolic and fuzzy feature components. Afterward, Yang et al. (2006) modified MVFCM by considering mixed variable indexes for the MCIM so that even symbolic and fuzzy variables could be easily applied. Izakian and Abraham (2011) applied a hybrid of FCM with Particle Swarm Optimization (PSO), known as FCM-FPSO, to overcome drawbacks of FCM such as local optimum traps and sensitivity to initialization. To improve the fitness of each particle, the algorithm applied fuzzy clustering to the particles in the swarm in every generation.
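The membership idea behind FCM can be sketched with the standard update rules: memberships are computed from relative distances to the cluster centers, and centers are recomputed as membership-weighted means. The code below is a minimal sketch under stated assumptions, not the procedure of any reviewed paper: the five part vectors (columns of a hypothetical MCIM), the fuzziness index m = 2, the fixed number of iterations and the choice of initial centers are all illustrative.

```python
import math


def memberships(points, centers, m, eps=1e-9):
    """u[k][i]: membership of point k in cluster i (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    u = []
    for x in points:
        # eps avoids division by zero when a point coincides with a center.
        d = [max(eps, math.dist(x, v)) for v in centers]
        u.append([
            1.0 / sum((d[i] / d[j]) ** power for j in range(len(centers)))
            for i in range(len(centers))
        ])
    return u


def update_centers(points, u, m, n_clusters):
    """Each center is the mean of all points weighted by u ** m."""
    dim = len(points[0])
    centers = []
    for i in range(n_clusters):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(
            [sum(wk * x[d] for wk, x in zip(w, points)) / total for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m, len(centers))
    return memberships(points, centers, m), centers


# Hypothetical parts described by the machines they visit (4 machines).
parts = [(1, 1, 0, 0), (1, 1, 0, 0), (0, 1, 0, 1), (0, 0, 1, 1), (0, 0, 1, 1)]
u, centers = fuzzy_c_means(parts, centers=[parts[0], parts[4]])
for k, row in enumerate(u):
    print(k, [round(v, 3) for v in row])
```

In this example the third part uses one machine from each group, so it ends up with nearly equal membership in both clusters, which is the kind of bottleneck part that a crisp method would have to force into a single family.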

Array Based Algorithms
Array based algorithms use dataset information such as the MCIM to arrange machines and parts into diagonal forms. Depending on the solving procedure, array based algorithms can be hierarchical or non-hierarchical. Rank Order Clustering (ROC) and its modified versions (M-ROC, ROC2), the Bond-energy Algorithm (BEA) and the Direct Clustering Algorithm (DCA) are among the well-known array based algorithms. ROC is an iterative clustering algorithm introduced by King (1980). The method reads the binary rows and columns of the MCIM as binary numbers, converts them to their decimal equivalents and rearranges them in decreasing order of magnitude until a diagonal pattern emerges. Then, a relaxation procedure is employed to determine the number of duplicated machines required to eliminate the bottleneck points. King and Nakornchai (1982) proposed a modified ROC (called ROC2) that generates diagonal groups in the MCIM by rearranging several rows and columns simultaneously (instead of element by element), which improved grouping efficacy. Boe and Cheng (1991) argued that ROC and Clustering and Data Organization (CDR) cannot create block diagonals efficiently. Hence, they proposed a close neighbour algorithm that first clusters machines using the MCIM and then rearranges the parts of the matrix by linking them to the machines using closeness measures. Kusiak (1991) addressed 3 heuristics for solving unconstrained problems, problems with a constraint on the number of machines per cell, and the identification of bottlenecks in cells. The proposed heuristics were based on a cluster identification algorithm that transforms the MCIM using vertical and horizontal lines. Chow and Hawaleshka (1992) proposed an algorithm to solve the chaining problem by transforming the MCIM into a machine-by-machine matrix using the commonality scores proposed by Wei & Kern (1989). Afterward, the first two machines that achieved the highest commonality score were considered as the first group. The new group was then treated as a single unit when grouping the remaining machines.
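The row and column reordering at the core of ROC can be sketched as follows. Each row (and then each column) is read as a binary number and the rows (columns) are sorted in decreasing order until the ordering stops changing. The example matrix and the function name are assumptions for illustration, and the relaxation step for duplicating bottleneck machines is not included.

```python
def rank_order_clustering(mcim, max_iters=100):
    """Reorder rows and columns by decreasing binary weight until stable."""
    rows = list(range(len(mcim)))
    cols = list(range(len(mcim[0])))
    for _ in range(max_iters):
        # Comparing bit lists lexicographically is equivalent to comparing
        # the binary numbers they spell (leftmost entry = most significant).
        new_rows = sorted(rows, key=lambda r: [mcim[r][c] for c in cols], reverse=True)
        new_cols = sorted(cols, key=lambda c: [mcim[r][c] for r in new_rows], reverse=True)
        if new_rows == rows and new_cols == cols:
            break  # no change in either ordering: the pattern is stable
        rows, cols = new_rows, new_cols
    matrix = [[mcim[r][c] for c in cols] for r in rows]
    return rows, cols, matrix


# Hypothetical scrambled MCIM (rows = machines 0..3, columns = parts 0..4).
MCIM = [
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 1],
]

rows, cols, matrix = rank_order_clustering(MCIM)
print("machine order:", rows)
print("part order:   ", cols)
for line in matrix:
    print(line)
```

After reordering, two diagonal blocks appear (machines 1 and 3 with parts 4, 0, 2; machines 2 and 0 with parts 1, 3), and the single 1 left outside the blocks marks a part that needs a machine from the other cell.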

Miscellaneous Clustering Algorithms
Miscellaneous clustering refers to those algorithms that provide a single clustering but do not fall into the sequential or cost function optimization categories. The spectral clustering algorithm is based on graph theory concepts and on certain optimization criteria that stem from matrix theory. Oliveira et al. (2009) applied a modified spectral clustering algorithm for improving the number of inter-cell movements in the CFP. They used average similarity cluster selection that was bounded by a cell-size constraint. Only a few studies have applied spectral clustering to the CFP, so the performance of the method cannot yet be judged; the reported drawbacks are that it makes more clusters than other methods and that using the normalized cut function provides poor solutions.

Hybrid of Metaheuristics and Clustering Methods
Clustering algorithms show remarkable performance in generating cell layouts when they are employed as part of a hybrid with metaheuristics. The Genetic Algorithm (GA) is an iterative population-based algorithm with robust performance in solving mathematical problems; it dynamically uses deterministic and stochastic rules to obtain better combinations of solutions, improving individual characteristics by constructing and re-constructing chromosome strings. Zhao et al. (1996) used GA for solving a fuzzy clustering method to form cells that was able to consider inexact real data structures. Dimopoulos and Mort (2001) applied a hierarchical clustering approach based on genetic programming (GP) called GP-SLCA. In the proposed method, Jaccard's similarity coefficient was replaced with a GP algorithm that could employ a variety of similarity coefficients. Lee-Post (2000) employed GA for solving an ALINK-based method that was used for the part family identification process. Rogers and Kulkarni (2005) used GA for medium and large scale problems and a typical bivariate clustering for small size problems in a simultaneous approach of grouping the rows and columns of a flow matrix to minimize the sum of dissimilarity measures.
Banerjee and Das (2012) applied a 2-stage modified predator-prey genetic algorithm (PPGA) for generating adaptive clusters and identifying bottleneck parts or machines based on cost measures. Tabu Search (TS) was developed to overcome the shortcomings of neighborhood search. The logic of TS is to use short-term memory to prohibit revisiting solutions that have been rejected before or that are banned by the algorithm for some reason. During the cell forming process, TS can be used as a mechanism for clustering machines by maximizing the similarity coefficient. Kusiak (1987) employed TS for cell forming and used the weighted sum of intra-cellular voids and inter-cellular movements as the objective function, where the number of machines per cell and the number of parts in families were limited. Adenso-Diaz et al. (2005) proposed a 2-stage approach considering limits on the number of parts in families and machines in groups. Ant Colony Optimization (ACO) is a search method inspired by the foraging behavior of real ants.
Kao and Li (2008) applied a recognition system of artificial ants in a clustering algorithm that simulates the vision of real ants in order to use object recognition to form initial part clusters with higher similarities. The clusters were then merged into larger clusters until an appropriate part cluster emerged. Particle Swarm Optimisation (PSO) is inspired by flocking birds and uses swarm intelligence (the feedback of other members) and the past information of each member to find optimum or near-optimum solutions (Eberhart & Kennedy, 1995). Yang et al. (2009) reported that K-means and K-harmonic means can easily be trapped in local optima. To overcome this problem, they presented a hybrid of PSO and KHM called PSOKHM. Nouri et al. (2010) used the bacterial foraging algorithm (BFA) for minimizing the number of voids and inter-cellular material transferring.
Artificial Neural Networks (ANNs) have been studied for many years due to their remarkable information processing ability, high parallelism, fault and noise tolerance and learning capabilities (Basheer & Hajmeer, 2000). In what follows, some ANN algorithms that have been successfully employed in CFPs are described. Self-Organizing Map (SOM) refers to two-layer networks that transform n-dimensional input patterns into data of lower dimension while preserving their content (Kohonen, 1989). Kulkarni and Kiang (1995) found that SOM provides flexible alternatives for multiple grouping and gives users accurate control over the number of cells, but at the same time SOM lacks a procedure to prevent duplicating bottleneck machines. Guerrero et al. (2002) proposed a quadratic assignment problem (QAP) to generate part families according to weighted similarity coefficients. Then a 2-stage SOM was applied for creating the initial clusters. Chattopadhyay et al. (2012) used the quantization and topography errors and the average distortion measure during the SOM training process to set up a criterion for choosing the optimum size of the SOM. This criterion generated the best clustering while preserving topology. Adaptive Resonance Theory (ART) refers to unsupervised learning networks that consist of two incorporated layers: an input layer as short-term memory, which uses feedback weights, and output neurons as long-term memory, which use feed-forward weights (Rooij et al., 1996). Chen and Park (1996) improved the standard form of the ART algorithm and used it for the CFP with bipolar vectors instead of binary vectors. Enke et al. (1998) applied a modified ART1 algorithm that used an optimal vigilance value to ensure that the number of machine and part groups would stay the same through the clustering process. Then, Enke et al. (2000) used modified ART1 in such a way that the input vectors were reordered in preparation for the application of the modified procedure that stored group representation vectors. Pandian and Mahapatra (2009) presented a 2-phase modified ART algorithm that considered operation sequences and times. The proposed algorithm converted the given non-binary data into an MCIM and fed it to the ART1 network. Fuzzy ART networks represent an improvement in the use of ART based algorithms, since both analogue and binary values can be considered as inputs.
Moreover, fuzzy ART has simpler functions than ART2 (another form of ART), which makes it easier to apply. Suresh et al. (1999) designed a fuzzy ART algorithm as a large scale pattern recognition approach to the sequence-dependent clustering problem.
Afterward, Park and Suresh (2003) proposed a modified fuzzy ART for solving large scale problems with similar routing sequences. They developed a new scheme for representing streams, clustering performance measures and experimental procedures. Kuo et al. (2006) proposed using fuzzy sets in ART2 to improve the input vectors in the learning procedure, which led to better part families. Özdemir et al. (2007) proposed a two-stage hierarchical fuzzy clustering method to overcome the proliferation problem. It should be mentioned that most ART procedures, like fuzzy ART, encounter the proliferation problem, which is caused by identifying unnecessary clusters. Such unnecessary clusters emerge as a consequence of losing the connecting weights between some input vectors during the learning process. As a result, new inputs cannot make any connection with old input vectors and are accordingly considered as a new cluster. Yang and Yang (2008) proposed an improved ART1 by modifying the vigilance parameter and the training vector in order to overcome such drawbacks.

Comparison of Clustering Methods
To provide a comprehensive view of the use of clustering methods for CFPs, it is necessary to compare the reviewed studies in order to identify the advantages and drawbacks of the methods and approaches used and the gaps left for future work. Table 2 compares the reviewed papers based on their concepts and objectives. The significant points and contribution of each study are explained in Table 3, and Table 4 shows the contributions of the studies that employed hybrids of metaheuristics and clustering algorithms.
Burke & Kamal (1995): Providing solutions with minimum inter-cell and cellular movements

Common Problems in Designing Cellular Manufacturing Systems
Determining the best combination of machines that can be used in the consecutive operations of a part (during or after cell generation) is the aim of addressing part routing problems. Delgoshaei et al. (2016b) compared the different material transferring models that have been developed for the CMS problem so far. From another perspective in part routing problems, each part can be completed in more than one way because of the existence of parallel machines. In CMS studies, two main work-in-process (WIP) movements can be recognized. Intra-cellular WIP transferring involves transferring materials among machines that are located in the same cell. By contrast, in inter-cellular movements, materials are shifted between cells to perform some operations. Choosing different permutations of various machines inevitably causes different inter- and intra-cellular movements and entails material transferring costs accordingly.

Emerging Exceptional Elements and Voids
In this part, exceptional elements (EEs) and voids, which are recognized as the major drawbacks in cell forming and cell scheduling processes, are explained in detail. EEs and voids usually emerge during the generation of block diagonals. EEs generally prevent the strict rearrangement of the MCIM into clusters (or cells). Such a phenomenon emerges as a result of differences among operation characteristics or machine abilities, in which one or more machines cannot be clustered with other machines and are thus left alone (Fig. 4). The existence of EEs increases the material transferring costs for those parts that are planned to (or must) be served by them. In many cases, the solution is to eliminate the EEs after or during the creation of block diagonals. On the other hand, voids represent the idle positions inside a block diagonal, that is, within clusters. Note that voids should not be mistaken for corridors, because voids represent empty places that should have been filled by machines during cell forming. The existence of voids, which is shown in Fig. 4, causes increased intra-cellular movements for some parts. Hence, minimizing the number of voids and EEs is the main aim of many studies. To solve this problem, two general strategies are followed in the literature.

In the first strategy, the emergence of EEs and voids is avoided through strengthened clustering of similar parts and machines. Taj et al. (1998) advised designing one or more multiple part families to overcome such a drawback. Lozano et al. (2002) proposed a model to minimize the number of voids and inter-cellular movements by using fuzzy indexes. Minimizing the number of EEs is also considered a tool to achieve high-performing cells. Mahdavi et al. (2007) developed a model to minimize the number of voids and EEs and thus achieve high cell utilization. Mahdavi et al. (2009) applied a GA to their previously developed model to achieve high cell utilization performance. Nouri et al. (2010) used the Bacteria Foraging algorithm to minimize the number of voids and inter-cellular material transfers. Arkat et al. (2011) addressed a multi-criteria decision making model to minimize the number of EEs and voids. Mahdavi et al. (2012) proposed a new mathematical model to minimize the number of voids and the makespan by finding the best inter-cellular transfer of workers and parts. The contribution of their study was the addition of workers as the third dimension of the MCIM with the use of a cubic matrix.

The second strategy, which is frequently used, is to minimize the number of EEs after the block diagonals are generated. Mukhopadhyay et al. (1995) attempted to minimize the number of EEs after block diagonals emerge by using a close-neighbor search algorithm. Won (2000) proposed a two-stage method to minimize the number of EEs. In the first stage, a P-median model was developed to minimize the distance of input elements (machines), and in the next stage, the number of bottleneck parts and machines was minimized. Mahdavi et al. (2001) proposed an ANN that leaves the minimum number of EEs in the final layout. A similar approach was employed by Soleymanpour et al. (2002) to group similar parts and dissimilar machines; the aim was to minimize the total number of EEs and voids. Adenso-Diaz et al. (2005) proposed a two-stage approach to minimize the number of voids and the operations inside and outside of cells while the number of cells was not fixed. Chan et al. (2006) tried to minimize intra-cellular WIP transferring by reducing the number of voids inside block diagonals. With the use of the same strategy to minimize EEs outside blocks, the number of inter-cellular WIP transferring moves was also minimized. Venkumar & Haq (2006a) proposed the Kohonen SOM to recognize EEs and bottleneck parts, as well as to measure group efficacy and the effectiveness of fractional cell forming. During the same year, Venkumar and Haq (2006a) applied modified ART to minimize the number of EEs on the basis of MCIM information (Venkumar & Haq, 2006b; Obeid et al., 2018; Jaśkowska et al., 2018).
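To make the two quantities concrete, the following minimal Python sketch counts EEs and voids for a given cell assignment. It assumes the common convention that an EE is a 1 entry of the MCIM lying outside every diagonal block and a void is a 0 entry lying inside a block, and it computes grouping efficacy with the widely used ratio (e - e_o)/(e + e_v), where e is the number of operations, e_o the number of EEs and e_v the number of voids. The function names and the small matrix are hypothetical illustrations and are not taken from any of the cited studies.

```python
def count_ees_and_voids(mcim, machine_cell, part_cell):
    """Count exceptional elements (EEs) and voids of a cell assignment.

    mcim         : list of rows (machines) of 0/1 entries, one column per part
    machine_cell : cell index assigned to each machine
    part_cell    : cell (part family) index assigned to each part
    """
    ees = voids = 0
    for i, row in enumerate(mcim):
        for j, entry in enumerate(row):
            same_block = machine_cell[i] == part_cell[j]
            if entry == 1 and not same_block:
                ees += 1      # operation outside every diagonal block
            elif entry == 0 and same_block:
                voids += 1    # empty position inside a diagonal block
    return ees, voids


def grouping_efficacy(mcim, machine_cell, part_cell):
    """Grouping efficacy (e - e_o) / (e + e_v); assumes at least one operation."""
    ees, voids = count_ees_and_voids(mcim, machine_cell, part_cell)
    operations = sum(sum(row) for row in mcim)
    return (operations - ees) / (operations + voids)


# Hypothetical 4-machine, 5-part incidence matrix with two cells.
mcim = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 0],
]
machine_cell = [0, 0, 1, 1]
part_cell = [0, 0, 1, 1, 1]

print(count_ees_and_voids(mcim, machine_cell, part_cell))  # (2, 2)
print(grouping_efficacy(mcim, machine_cell, part_cell))
```

In this example each cell contains one void and causes one EE, so a rearrangement that removes either would raise the efficacy value toward 1.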

Dynamic Product Demands in Designing Cellular Manufacturing Systems
In most real cases, part demands differ from one planning horizon to another. Such a condition is known as dynamic part demand. Market changes, changes in product designs, and the manufacture of new products are some of the reasons for the change in part demands through different time periods. These conditions may cause imbalances in part routings and bottleneck machines. Because of their importance, they are explained in a separate section. Wang et al. (2001) argued that dynamic demands can increase the complexity of such models. Therefore, they applied a simulated annealing algorithm to solve the problems involved. Tavakkoli-Moghaddam et al. (2005b) employed triangular fuzzy numbers to estimate the uncertain demands of each part type. For this purpose, a fuzzy nonlinear mixed integer programming method was developed with the aim of minimizing constant machine costs, intercellular WIP transferring costs, and reconfiguration costs. Balakrishnan and Cheng (2005) proposed a two-stage procedure to minimize material handling and machine relocation costs in the midst of part uncertainties. In the same year, Tavakkoli-Moghaddam et al. (2005a) minimized material transferring costs under dynamic part demands by using alternative process plans and machine relocation and replication. Defersha and Chen (2006) used parallel machines and outsourcing services to overcome the defects of dynamic part demand in the cell forming process. Jeon and Leep (2006) presented a model for scheduling dynamic cells where machine failures can cause waiting times and reduce system capacity accordingly. Tavakkoli-Moghaddam et al. (2007a) considered dynamic part demands and part mix for a reconfigurable part routing problem; the objective of the proposed model was to minimize operating (constant and variable), machine relocation, and intercellular WIP transferring costs. Tavakkoli-Moghaddam et al. (2007b) considered the normal distribution function to estimate the part demands in a stochastic model; minimizing material transferring movements was the main objective of the method. Safaei and Tavakkoli-Moghaddam (2009a) also argued that machine capacity and part demands should not be considered fixed and showed how such uncertainties can influence the cell configuration over the time horizon.

During the scheduling of a dynamic manufacturing system, the system capacity may be inadequate to meet customer demand at a specific period. Hence, Safaei and Tavakkoli-Moghaddam (2009b) addressed a dynamic scheduling problem to find the trade-off values between in-house production and outsourcing while cells are supposed to be reconfigurable. This time, they considered intercellular movements in addition to intracellular ones. The other solution to address part uncertainties is forming new cells as a result of market changes. This strategy was discussed by Zhang (2011), in which aggregate planning was used while minimizing operation, inventory, and material movement costs. Egilmez et al. (2012) focused on uncertain operation times in D-CMS. The contribution of their model is the consideration of risk level in the process of designing cells in a dynamic environment. A few years later, Egilmez and Süer (2014) evaluated the impact of risk level in an integrated cell forming and scheduling problem using Monte Carlo simulation. Süer et al. (2010) proposed a new model that could determine the dedicated, shared and remainder cells in D-CMS. One important conclusion of their research is that the average flow time and total WIP are not always the lowest when additional machines are used. Delgoshaei et al. (2016a) proposed a new method for scheduling dynamic CMS using a hybrid of Ant Colony Optimization and Simulated Annealing algorithms. Delgoshaei & Gomes (2016) used artificial neural networks for scheduling cellular layouts while preventive maintenance and periodic services are taken into consideration. Ariafar et al. (2014) focused on the impact of dynamic product demand on the facility layout problem. The main objective of the proposed model was minimizing material transferring by arranging the machine cells within the shop floor and the machines within each of the machine cells. Afterward, Renna and Ambrico (2015) proposed three models for designing, reconfiguring and scheduling cells under dynamic product demands. In their models, they considered minimizing system costs, including intercellular movement, machining and reconfiguration costs, as well as maximizing net profit.
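Two of the demand representations mentioned above, triangular fuzzy numbers and normally distributed demand, can be illustrated with a short Python sketch. It is only a generic illustration and does not reproduce any of the cited models: the centroid rule (a + m + b)/3 is one common way to obtain a crisp value from a triangular fuzzy number, the Monte Carlo routine simply estimates how often a normally distributed demand exceeds a fixed capacity, and all names and numbers are hypothetical.

```python
import random


def triangular_centroid(low, mode, high):
    """Crisp (centroid) value of a triangular fuzzy number (low, mode, high)."""
    if not low <= mode <= high:
        raise ValueError("expected low <= mode <= high")
    return (low + mode + high) / 3.0


def shortage_probability(mean, std_dev, capacity, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(demand > capacity) for normal demand."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    shortages = sum(
        1 for _ in range(n_samples) if rng.gauss(mean, std_dev) > capacity
    )
    return shortages / n_samples


# Hypothetical fuzzy demand of one part type over three periods:
# (pessimistic, most likely, optimistic)
fuzzy_demand = [(80, 100, 130), (90, 120, 150), (60, 70, 110)]
crisp_demand = [triangular_centroid(*d) for d in fuzzy_demand]
print(crisp_demand)

# Hypothetical period with demand ~ N(100, 10) and a cell capacity of 110 units.
print(shortage_probability(mean=100, std_dev=10, capacity=110))
```

A shortage probability of this kind is the sort of quantity that could inform a trade-off between in-house production and outsourcing in a given period.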

Discussion on Gaps and Findings in Clustering Methods
Many researchers have used clustering techniques in cellular manufacturing system design over the last three decades. A number of significant conclusions can be drawn from the investigated literature. First, the use of clustering concepts in CFP has shown a significant increasing trend during the last two decades, owing to their ability to be combined with other search algorithms, mostly metaheuristics. Most of the research on clustering in CFP has been developed on the basis of similarities (or dissimilarities) in processing sequences or times, and only a few studies have considered setup times, travelling costs and machine breakdown. Moreover, most of the selected studies dealt with generating new cells and only a few involved improving existing cells. Many traditional works considered static circumstances in cell design. However, less effort has been expended in dynamic cell design to survey the impact of uncertain conditions. Moreover, since the production cycles of many products are shorter than before, the need for designing robust cells is now more urgent. In addition, the desire for customized products with different characteristics sometimes causes unpredictability in production volume. Using hybrids of clustering methods with other search tools for dynamic cell reconfiguration is therefore suggested. To the best of the authors' knowledge, there are no references on using a merging procedure for CFP where fuzzy threshold parameters are taken into account. There are still many clustering methods that have not yet been applied, or have rarely been applied, to CFP; their abilities in the cell configuring or reconfiguring process therefore cannot yet be judged, but applying them can open many new areas in cell design. Some of these methods are: Basic Sequential Algorithmic Scheme (BSAS), Reassignment Procedure, Generalized Mixture Decomposition Algorithmic Scheme (GMDAS), Possibilistic C-means Algorithm (PCM), Competitive Leaky Learning Algorithm (CLLA or LLA), Valley-Seeking Clustering Algorithm (VS), Generalized Agglomerative Scheme and Specific Agglomerative Clustering Algorithms.

Summary
This paper has presented a literature review of clustering methods in cell forming problems, concentrating on the most common programming models, solving methods and procedures successfully used by researchers over the last two decades. In each section, an attempt has been made to explain the drawbacks and shortcomings that emerge during cell forming and the scheduling of the formed cells. Then the most successful solutions for each drawback have been explained. A comprehensive list of related studies has been classified, which enables readers to form a clear picture of cell forming and scheduling problems. Subsequently, the gaps found in this research were listed. Considering stochastic system parameters in forming cells and using advanced computation methods were identified as future directions in this field. Future expansion of the application of clustering algorithms to cellular manufacturing systems is suggested through hybrid meta-heuristic methods.

Fig. 2. Using K-medoids for clustering where cluster centers are chosen among real parts

Fig. 4. Graphical view of voids, exceptional elements, bottleneck machines and bottleneck parts

Table 1 is provided to clarify the advantages and disadvantages of clustering methods employed in CFP.

Method: K-means
Features/Advantages: Unsupervised learning algorithm; minimizes the Euclidean distance of each part to the centre of its labelled cluster; attempts to minimize the mean square error; easy to understand and apply; continually decreases within-cell variation; fast iterative algorithm
Disadvantages: Suboptimal partitions occur in some cases; an accurate choice of the number of clusters is crucial; the program must be run several times and the results compared to find the best k

Method: K-medoid
Features/Advantages: Easy to understand and apply; works with an arbitrary matrix of distances between data points (each cluster centre must be chosen among the points themselves); minimizes a sum of pairwise dissimilarities; always converges
Disadvantages: As mentioned by Theodoridis, Pikrakis et al. (2010), different initializations of the K-medoid algorithm may lead to different final clusters; more time consuming than K-means; needs more computation than K-means; like K-means, the program must be run several times and the results compared to find the best k

Method: Harmonic K-means (KHM)
Features/Advantages: Centre-based clustering algorithm; uses the harmonic mean in its calculation instead of the total within-cluster variance used in K-means; converges faster than K-means when the initialization is far from a local optimum; lower possibility of falling into a local optimum trap than K-means
Disadvantages: Converges slower than K-means when the initialization is close to a local optimum; centres under K-means are more mobile than those under KHM

Method: FCM
Features/Advantages: Allows a data point to belong to more than one cluster; allows non-binary inputs to be considered; more realistic; calculates the degree of membership of each part in a part family
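The K-medoid properties listed in Table 1 (centres chosen among the real parts, an arbitrary dissimilarity matrix, and sensitivity to initialization) can be illustrated with a small Python sketch. It uses a simple alternating assignment and update heuristic rather than any specific algorithm from the cited literature, and it assumes the Jaccard dissimilarity between the machine-usage vectors of parts; the function names and data are hypothetical.

```python
def jaccard_dissimilarity(a, b):
    """1 - |machines used by both parts| / |machines used by either part|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0


def assign(dist, medoids):
    """Label each part with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[p][medoids[c]])
        for p in range(len(dist))
    ]


def k_medoids(dist, initial_medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [p for p, lab in enumerate(labels) if lab == c]
            if not members:           # keep the old medoid for an empty cluster
                new_medoids.append(old)
                continue
            # The new medoid is the member with the smallest total
            # dissimilarity to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][p] for p in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical parts described by the machines (4 machines) they visit.
parts = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1],
]
dist = [[jaccard_dissimilarity(a, b) for b in parts] for a in parts]

medoids, labels = k_medoids(dist, initial_medoids=[2, 5])
print(medoids, labels)  # [0, 3] [0, 0, 0, 1, 1, 1]
```

Running the sketch from several different initial medoids and comparing the resulting total dissimilarity is one way to address the initialization sensitivity noted in the table.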

Table 2
Details of methods used in opted references in clustering


Table 3
Significant points and contribution of the literature (Continued)


Table 4
Observations of hybrids of clustering and metaheuristics in CMS
