K-Means Based Cluster Formation and Head Selection Through Artificial Neural Network in MANET

Abstract: A mobile ad-hoc network (MANET) is a dynamic, self-configuring network of cooperative and intelligent mobile nodes that form a temporary network without any base station. Because of this dynamic nature, the route between source and destination is not fixed and can change over time, which results in routing overhead, congestion and higher energy consumption at the nodes. Under these conditions, clustering becomes highly relevant in dense networks, where the clusterhead initiates routing instead of normal nodes. This paper proposes an efficient method for cluster formation and selection of stable cluster heads with high energy level (E-CFSA). E-CFSA forms node clusters using the k-means algorithm, with the distance to the centroids as the key parameter. An artificial neural network (ANN) is then applied in each cluster to select an efficient clusterhead. The ANN updates the weights of input parameters such as mobility, packet drop, energy and number of neighbor nodes using the back propagation algorithm in order to minimize the error at the target neuron. The overall process identifies and selects as clusterheads the nodes that possess longer stability and higher energy level. This procedure reduces the repetitive selection of cluster heads and the re-affiliation of member nodes in a cluster. Performance is evaluated in terms of overhead, packet delivery ratio, throughput and cluster head stability time under variation in node speed or number of nodes in the MANET. The experimental results obtained with the proposed E-CFSA are compared with those of other existing mechanisms, and the comparison shows that E-CFSA outperforms its peers.


Introduction
A mobile ad-hoc network (MANET) is a collection of nodes that can communicate with each other and move freely. It is a self-directed, infrastructure-less network in which two or more mobile nodes can share data. The working of the network is shown in figure 1, in which a source mobile node (S) can send data to a destination mobile node (D) directly whenever they are within each other's transmission range. Otherwise, intermediate mobile nodes (such as A, B and C) relay data from source to destination. In a MANET, a mobile node can join and leave the network dynamically, so the physical network changes frequently. In addition, node movement differs among mobile nodes, and the operating topology depends on the speed and direction of the hosts [1]. Advantages such as instant connectivity, router-free operation, fault tolerance in case of connection failures and low cost make MANETs more attractive than traditional networks. These advantages, together with free mobility, freedom from localization constraints and communication without attention to infrastructure details, allow MANETs to realize pervasive and ubiquitous computing such as ambient intelligence. Network routing in a MANET is a challenging task because of the dynamic nature of the topology. Many routing protocols exist for MANETs, and they are classified as proactive (table-driven) routing protocols, reactive (on-demand) routing protocols and hybrid routing protocols [2].
The ad hoc on-demand distance vector (AODV) routing protocol is a single-path routing protocol between source and destination. It tends to select a single minimum-cost path even though different paths with different cost metrics exist. Propagating data to the destination through a single path is not sufficient, so multipath routing was introduced to overcome this limitation. In the multipath technique, the best path is selected from several available paths to transfer the data packets. If the selected path encounters an error or remains busy during the data transfer, the next best available path is chosen. In a MANET, when a source node wants to communicate with a destination node, it exchanges control messages such as route request and route reply with its connected neighbor nodes. This activity continues until a route to the destination node is found. When the network has a very large number of nodes or is highly dense, this routing approach generates high control overhead. To overcome the increased overhead, small groups of nodes are formed as clusters, and routing is initiated through the clusterhead instead of normal nodes.
Finally, routes are formed and data transfer takes place through cluster nodes instead of through communication with individual nodes, as represented in figure 2. The mobile nodes are aggregated into clusters and one node in each cluster is elected as cluster head on the basis of its performance. Nodes that are within the communication range of the clusterhead belong to its cluster. In a dynamic network, a clusterhead scheme can cause performance degradation because of frequent clusterhead selections. Recent research on cluster-based routing includes least cluster change, in which the clusterhead changes only when two clusterheads come into the same cluster [3]. Several techniques have been proposed to select a trusted clusterhead, such as the link cluster architecture based lowest-ID and highest-degree techniques and node performance based single-metric and multiple-metric techniques [4]. Some algorithms are based on neural networks, in which feed-forward and multilayer perceptron networks are used to optimize the route and select the cluster head, which can overcome these issues and improve the usability and survivability of the network [5].
The key objective of this work is seamless clusterhead selection in challenging scenarios where nodes constantly change position relative to the other nodes in the MANET. The clusterhead node must have the best credentials in terms of throughput, energy and similar metrics, and it coordinates the members of its cluster. The nodes of a cluster are analyzed on the basis of these characteristics and the clusterhead is selected. Another issue is that, due to the dynamic nature of MANET, the number of members in a cluster can increase suddenly, which increases the size of the cluster and causes overloading. Handling all these nodes then becomes challenging and performance suffers, which can lead to failure in the transmission of data among the clusters. A further challenge arises in highly dynamic clusters, where a mobile node acting as clusterhead changes its position so much that it leaves the cluster; a new clusterhead then has to be identified and put in place to manage the cluster. The proposed coupling of the k-means algorithm with ANN achieves this aim and also scales well in MANET.
Taking all these challenges into account, this paper strives to find an optimal cluster head, reduce overloading within a cluster and establish a new cluster head seamlessly.
The rest of the paper is organized as follows. Section 2 covers related work on cluster head selection algorithms in MANET. Section 3 presents the proposed approach of cluster head selection and weight updating (E-CFSA). Section 4 presents the experimental results and analysis with comparison against the current state of the art, while Section 5 concludes the paper.

Related work
In the last decade, numerous algorithms have been introduced for efficient clustering of nodes, mitigation of network overhead and effective selection of the cluster head. Torkestani and Meybodi [6] proposed MCFA, a fully distributed clustering algorithm based on learning automata and the mobility parameters of the hosts. The performance of MCFA was analyzed in terms of the number of clusters, cluster lifetime, reaffiliation rate and control overhead. Cheng et al. [7] proposed a dynamic genetic algorithm for solving the load-balanced clustering problem. The genetic algorithm maintains a clustering structure based on a load balancing metric, used as its fitness value, as nodes frequently change position. Agarwal et al. [4] reviewed weight based clustering algorithms in which the cluster head is identified on the basis of node performance. An adaptive cluster based energy aware routing protocol [8] was proposed that considers node position and power saving parameters to balance node energy and improve network lifetime. This protocol works on parameters such as network load, energy conservation and the lifetime of nodes and networks. Subsequently, a cluster head technique based on battery power and connectivity level was introduced for selecting the cluster head in MANET [9]. A one-hop clustering algorithm was then proposed by Basurra [10], which introduced zone based routing with a parallel and distributed broadcasting technique to minimize redundant broadcasting and improve the path discovery process. A review of different kinds of routing is given in [13].
In other work, Periyasamy et al. [11] selected the clusterhead with a modified k-means algorithm, while Prince and Kannan [12] used a bat-inspired strategy and introduced optimal clustering algorithms. More recently, other authors have focused on optimal cluster head selection and discussed how to reduce network overhead and improve security within the network through effective cluster head selection. Agarwal et al. [14] presented a trusted weight based clustering approach that uses parameters such as trust, load balancing, energy consumption, mobility and battery consumption. Bisen and Sharma [15] proposed an agent based secure enhanced performance approach for MANET (AB-SEP). Its key advantages include increased node performance, lower energy consumption and identification of malicious activities in the network [16]. Furthermore, a neural network based adaptive neuro-fuzzy inference system [17] and a multilayer perceptron neural network [5] were introduced for selecting the cluster head on the basis of mobility and residual energy, which improved network performance. In another study, Malar et al. [18] introduced bio-inspired methods to improve routing in MANET; they used the ant colony optimization technique to develop energy efficient routing that utilizes energy in a smarter way. On the basis of this state of the art, this paper considers cluster formation and cluster head selection within the network as key challenges and employs the k-means algorithm with an artificial neural network to address them.

Efficient cluster formation and head selection algorithm (E-CFSA)
This section presents an efficient cluster formation and head selection algorithm (E-CFSA) that uses the k-means algorithm and an artificial neural network. In this work, mobile nodes are grouped into clusters on the basis of the distance between nodes, which is calculated from the X and Y coordinates of the nodes. The distance is compared across all neighbors or other nodes. Nodes having the least difference of distance are grouped into one cluster, and the remaining clusters are formed likewise. E-CFSA uses the k-means algorithm [10] for clustering of nodes. K-means is a method of vector quantization that partitions 'n' observations into 'k' clusters, in which each observation belongs to the cluster with the nearest mean, and that mean serves as the prototype of the cluster. Here 'k' represents the number of clusters and the mean is the calculated average value of the parameters [11]. After clustering, the most efficient node within each cluster is selected as cluster head. The cluster head is selected on the basis of parameters such as energy, mobility, number of hops and neighbors. There are various constraints on the selection of the cluster head, and these have been worked out using several routing protocols such as the clusterhead gateway switch routing protocol (CGSR) [19]. The cluster formation procedure is given below in detail.

Cluster formation procedure:
(i) Initially, choose the number of centroids according to the value of k (i.e. the number of clusters to be formed). For example, if the value of k is 2, then two nodes are selected randomly as centroids. The n-dimensional centroid of k n-dimensional points is calculated using equation (1):
$$C = \frac{1}{k} \sum_{i=1}^{k} x_i \qquad (1)$$
The distance of each node from these centroids is calculated and compared with that of the neighbor or other nodes. This distance is the Euclidean distance, calculated using equation (2):
$$d(p, q) = \sqrt{\sum_{j=1}^{n} (p_j - q_j)^2} \qquad (2)$$
(ii) After comparing the distances among nodes, the nodes with the least difference of distance are clustered together.
(iii) Now calculate the mean of the distances and again take the difference between each distance and the mean value. New clusters are formed iteratively, and the same step is repeated until the clusters no longer change. Finally, k groups are formed.
M1 and M2 are the mean values; they are taken randomly from the set D, shown in table 1.
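The cluster formation procedure above can be sketched in a few lines of Python. This is a minimal illustration of standard k-means on two-dimensional node positions, not the authors' implementation: the function names (`euclidean`, `centroid`, `kmeans`), the fixed random seed and the sample coordinates are all assumptions made for the example.

```python
import math
import random

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(nodes, k, max_iter=100, seed=0):
    """Group node positions into k clusters by repeated assign/update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(nodes, k)  # k nodes picked at random as initial centroids
    labels = None
    for _ in range(max_iter):
        # Assign every node to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in nodes
        ]
        if new_labels == labels:  # clusters unchanged, so the procedure has converged
            break
        labels = new_labels
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, l in zip(nodes, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = centroid(members)
    return labels, centroids

# Hypothetical node positions in metres (not taken from the paper's table 1).
nodes = [(10, 20), (15, 25), (12, 18), (800, 900), (820, 880), (790, 910)]
labels, centroids = kmeans(nodes, k=2)
```

With these sample positions the three nodes near the origin end up in one cluster and the three distant nodes in the other, whichever two nodes are drawn as initial centroids.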

Cluster head selection Algorithm and weight updating algorithm
Artificial neural networks (ANNs) are computing systems inspired by the biological neural networks that constitute human brains [20]. An ANN is a collection of connected units or nodes.
Step 2: Weights are now assigned to each input parameter for each artificial neuron. A change in the value of these weights changes the strength of the signals. The sum of these weights should be equal to 1. For selecting the cluster head, the input parameters should be weighted such that the final target value is maximum. According to these input parameters, for the output Y the energy and number of neighbor nodes should be maximum while mobility and packet drop should be minimum.
Step 11: This process is repeated for the output layer neurons, using the outputs of the hidden layer neurons as inputs.
Step 12: Calculating the total error: the error for each output neuron is calculated using the squared error function, and these errors are summed to get the total error. The squared error function is given in equation (10).
Step 13: The total error for this neural network is the sum of the calculated errors, as illustrated in equation (11).
Step 15: The backward pass: after calculating the total error, back propagation is performed to update the weights in the network so that the actual output moves closer to the target value, thereby minimizing the error for each output neuron and for the network as a whole. In back propagation, the partial derivative of the total error is taken with respect to the given weights, here weights w1, w2, w3 and w4.
Step 16: The chain rule is applied at the output layer, as given in equation (13); the same is repeated for weight w10.
Step 17: Similarly, the chain rule is applied at the hidden layer; the same is repeated for the other hidden-layer weights wn, where n = (1, 2, …, 8).
Step 18: The change in total error with respect to the output is calculated first. Next, how much the output (o1) changes with respect to its total net input is calculated. Then the change in the total net input of output o1 with respect to the weights is calculated.
Step 19: Partially differentiating equations (6) and (7) with respect to their respective weights leads to equation (16). Putting the values of equations (14), (15) and (16) into equation (10), the total error can be calculated.
Step 20: Next, to decrease the error, the calculated partial derivative of the total error, scaled by the learning rate µ, is subtracted from the current weight, which gives the new weights in equation (17).
In the above algorithm, the sigmoid function is used as the activation function in the hidden and output layers.
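The forward and backward passes described in the steps above can be illustrated with a small Python sketch. It assumes a 4-2-1 layout (four inputs, two hidden neurons, one output), which is only inferred from the weight indices w1 to w10 in the text. It also assumes no bias terms, the squared error E = 0.5 (target - output)^2, and a plain gradient step with learning rate µ as in Step 20. The input values, initial weights and target are hypothetical.

```python
import math

def sigmoid(x):
    """Sigmoid activation used in the hidden and output layers."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Forward pass; returns the hidden outputs and the final output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    y = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))
    return h, y

def train_step(x, target, w_hidden, w_out, mu=0.5):
    """One back propagation step; returns updated weights and the error before the update."""
    h, y = forward(x, w_hidden, w_out)
    error = 0.5 * (target - y) ** 2
    # Chain rule at the output layer: dE/dy * dy/dnet.
    delta_out = (y - target) * y * (1.0 - y)
    # Chain rule at the hidden layer, using the output weights before they change.
    delta_h = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    # Gradient step: new weight = old weight - mu * dE/dw.
    new_w_out = [w - mu * delta_out * hj for w, hj in zip(w_out, h)]
    new_w_hidden = [
        [w - mu * delta_h[j] * xi for w, xi in zip(w_hidden[j], x)]
        for j in range(len(h))
    ]
    return new_w_hidden, new_w_out, error

# Hypothetical normalized inputs: [mobility, packet drop, energy, neighbors].
x = [0.2, 0.1, 0.9, 0.8]
target = 0.9
w_hidden = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]  # each row sums to 1
w_out = [0.5, 0.5]

errors = []
for _ in range(200):
    w_hidden, w_out, err = train_step(x, target, w_hidden, w_out)
    errors.append(err)
```

In this sketch the recorded error shrinks from the first to the last iteration as the output moves toward the target. A real training run would iterate over many node samples rather than a single one.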
On the basis of the given input samples, the neural network trains itself and updates its weights according to the errors calculated at the output neuron using the back propagation algorithm. Here, the Adam optimizer is used instead of the classical stochastic gradient descent procedure; it computes adaptive learning rates, which helps convergence happen faster and thereby reduces overall training cost and time.
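To make the adaptive learning rate concrete, the following sketch shows the standard Adam update rule for a single scalar weight. The hyperparameter defaults are the commonly used ones and the quadratic test function is purely illustrative; neither is taken from the paper, and the function name `adam_update` is an assumption.

```python
import math

def adam_update(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step for a scalar weight; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Illustrative use: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2.0 * (w - 3.0)
    w, m, v = adam_update(w, grad, m, v, t, lr=0.1)
```

The step size for each weight is scaled by the running estimate of its squared gradient, which is what distinguishes this rule from the fixed learning rate µ used in classical gradient descent.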
Whenever the final output (Y) is approximately equal to the target value for the node, the error is considered minimal and the model is considered correctly trained.

Simulation and Results Analysis
This section presents the experiments with the proposed E-CFSA to observe and validate its findings. The experimental results are compared with the conventional weighted clustering algorithm (WCA) [21] and the agent-based secure enhanced performance approach (AB-SEP) [15], keeping all test conditions the same. Experiments are performed in Python [22] and NS-2.35 [23] with 20-100 nodes placed within a 1000 × 1000 m² area for a simulation time of 200 s. Each node is configured using the random waypoint mobility model with a speed of 0-25 m/s, a 250 m transmission range and 200 joules of initial energy. Data traffic is generated by constant bit rate (CBR) UDP traffic sources, and the data packet size is 512 bytes. Table 4 shows the vital simulation parameters and their values.

Performance parameters
This subsection details the parameters that measure the performance of both the conventional methods and the proposed E-CFSA. Experiments are carried out to find the packet delivery ratio, throughput, routing overhead and stability time period of the cluster head [14] [15] in order to compare E-CFSA with the conventional methods.
• Packet delivery ratio (PDR): the ratio of data packets received by the destinations to those generated by the sources. Mathematically, PDR = S1/S2, where S1 is the sum of data packets received by each destination and S2 is the sum of data packets generated by each source, as given in equation (18):
$$\mathrm{PDR}\,(\%) = \frac{S_1}{S_2} \times 100 \qquad (18)$$
• Throughput: the ratio of the total number of packets delivered to the total simulation time, as stated in equation (19).
• Routing overhead: the number of control packets (hello packets) and routing packets required for overall network communication, expressed as a ratio in equation (20).
• Stability time period of cluster head: the stability of a cluster head node is defined as the time period for which the node worked as the cluster head of the cluster. The average of these time periods is known as the average stability time.
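As a worked illustration of these definitions, the sketch below computes the packet delivery ratio, throughput and average stability time from per-run counters. The function names and all sample numbers are hypothetical and are not results from the paper. Routing overhead is left out because the exact form of equation (20) is not recoverable from the text.

```python
def packet_delivery_ratio(received, generated):
    """PDR in percent: packets received by destinations over packets generated by sources."""
    return 100.0 * sum(received) / sum(generated)

def throughput(packets_delivered, sim_time_s):
    """Packets delivered per second of simulation time."""
    return packets_delivered / sim_time_s

def average_stability_time(head_periods_s):
    """Mean time, in seconds, for which nodes served as cluster head."""
    return sum(head_periods_s) / len(head_periods_s)

# Hypothetical counters for one 200 s simulation run.
received = [450, 480]    # packets received at each destination
generated = [500, 500]   # packets generated at each source
pdr = packet_delivery_ratio(received, generated)
tput = throughput(sum(received), 200)
stability = average_stability_time([40, 60, 50])
```

Here 930 of 1000 generated packets arrive, so the PDR is 93 percent and the throughput is 4.65 packets per second over the 200 s run.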

Experimentation and result analysis
Experiments are performed to measure the performance of the proposed E-CFSA, and the results are compared with the current state of the art. The numerical results for E-CFSA, AB-SEP and WCA are presented in tables 5 and 6. […] to 25 m/s) in the network. These results also highlight the fact that WCA does not calculate a reliability factor for the nodes and prefers the shortest routes by default. Therefore, the network sizes are regularly varied in the routing process, which leads to an overall lower throughput. […] helping E-CFSA to achieve better results than the conventional methods. These experimental results also indicate that k-means helps E-CFSA in cluster head selection, which in turn helps it outperform the conventional state of the art.

Conclusion
This paper introduced an efficient method for cluster formation and selection of stable cluster heads with high energy level (E-CFSA) among nodes in MANET. E-CFSA employs the k-means algorithm to form clusters of nodes based on the distance among the nodes. An artificial neural network is then applied to each cluster to select the cluster head on the basis of mobility, packet drop, energy and number of neighbor nodes. E-CFSA identifies as cluster heads the nodes that have longer stability and higher energy level. This procedure reduces the repetitive selection of cluster heads and the re-affiliation of member nodes in a cluster. These integrations make E-CFSA robust and make communication pervasive, and E-CFSA achieved better packet delivery ratio, throughput and cluster head stability, together with lower routing overhead, than the conventional WCA and AB-SEP. This work can be extended using bio-inspired methods, which offer attractive concepts: the selection of the cluster node can be treated as an intelligent optimization problem for more complex and heterogeneous networks.

Declaration
The Author(s) permits the publisher to publish this research article in this esteemed Journal. The Corresponding Author gives consent from individuals to publish the data associated with this research article.

Funding
The Author(s) declares that this research has not been funded by any agency.