A Hybrid Approach of TLBO and EBPNN for Crop Yield Prediction Using Spatial Feature Vectors

Crop yield prediction is an important and challenging task, because future yield must be estimated from many criteria. Advanced technologies are increasingly incorporated into agricultural processes and improve the efficiency of crop production. Yield can be predicted from agricultural data, which helps in analysis and in making important decisions before and during cultivation. This paper focuses on crop yield prediction, for which two machine learning models are developed. One is a Modified Convolutional Neural Network (MCNN), and the other is TLBO (Teacher Learning Based Optimization), a genetic algorithm that reduces the size of the input data. The spatial information used for analysis consists of the Normalized Difference Vegetation Index (NDVI), the Standard Precipitation Index (SPI) and the Vegetation Condition Index (VCI). TLBO finds the best set of feature values in the data that represents the specific yield of the crop. The selected feature value set is then passed to the Error Back Propagation Neural Network (EBPNN) for learning. Training was done so that every feature set was paired with its yield value as the output. To increase the reliability of the work, the whole experiment was run on a real dataset from the Madhya Pradesh region of India. The results show that the proposed model outperforms previous approaches on several evaluation parameters.


Introduction
The world population is growing day by day, so better food supply and management require comprehensive, reliable and fast data resources. Remote sensing has proved very important in agriculture for monitoring crop growth and irrigation systems. Remote sensing data can be used for various applications, for example crop inventory, crop production and drought prediction, and irrigated land monitoring. The objective of the data mining process is to extract knowledge from an existing data set and transform it into a human-understandable structure for further use [Pritam and Kasabov (2016)]. Data mining is in great demand in other practical databases, including spatial databases, temporal databases, object-oriented databases, multimedia databases and so on. Spatial data mining can be used for crop yield prediction in addition to other real-world applications. Spatial data mining is the process of extracting interesting knowledge from spatial databases. This knowledge can be used to understand spatial and non-spatial data and their relationships. Knowledge discovered from spatial data can take different forms, such as characteristic and discriminant rules, extraction and description of prominent structures or clusters, and spatial associations [Gandhi, Armstrong, Petkar et al. (2016)]. The challenge of extracting knowledge from this raw data has prompted new methods and techniques, for example data mining, that can link knowledge of the data to crop yield estimation [Geeta (2015)]. Crop yield forecasting has been a subject of interest for researchers, specialists and field organizations.
As defined by the Food and Agriculture Organization of the United Nations, crop forecasting is the art of predicting crop yields and production before the harvest takes place, typically a few months in advance [Ramesh and Vardhan (2015)]. Crop yield forecasting is a significant component of national food security assessment and food policy making. Crop growth and yield data are essential for guiding the agricultural development system and for farm operation and management [Medar and Rajpurohit (2014)]. Crop production can be affected directly and indirectly by climate change [Meroni, Fasbender, Balaghi et al. (2016)]. Statistical models and crop models are the two main tools for studying the effects of climate change on crop yields [Garg and Bindu (2017)]. The majority of research in agriculture focuses on biological mechanisms to identify crop growth and improve yield. Crop yield depends mainly on parameters such as the crop variety and the seed type, and on environmental parameters such as sunlight, soil, water, rainfall and humidity [Kaur, Gulati and Kundra (2014)]. Crop yield estimation is important, especially in countries like India, which depend on agriculture as the main source of their economy. Crop yield estimation helps decision makers judge the surplus or shortage of production and allows suitable import and export decisions [Dahikar and Rode (2014)]. Crop identification and yield prediction are the main concerns of remote sensing applications in agriculture [Raorane and Kulkarni (2013)]. Spatiotemporal data mining can be used for crop yield prediction in addition to other real-world applications.
This paper sheds light on crop yield prediction using spatiotemporal data mining and extends it to use the MapReduce framework in order to improve prediction accuracy. Since the data for spatial data mining is very large, data mining can help in extracting comprehensive business intelligence, as the data has the characteristics of volume, velocity and variety. Ramesh et al. [Ramesh and Vardhan (2015)] analyze the results of multiple linear regression and a density-based clustering technique. The data was considered for a specific region, the East Godavari district of Andhra Pradesh in India. Multiple linear regression is applied to the existing data, and the results obtained are verified and analyzed using the density-based clustering technique.

Related work
Raorane et al. [Raorane and Kulkarni (2013)] suggest that several changes in the climate can be analyzed by a Support Vector Machine (SVM, which is capable of classifying data samples into two disjoint clusters); in addition, the K-means method was used to forecast pollution in the atmosphere. Data mining techniques are also used to monitor wine fermentation. Tits et al. [Tits, Somers and Coppin (2012)] proposed a monitoring system for crop production based on a clustering approach. The sub-pixel cover fraction and the unconstrained spectral signature of the plant component were extracted simultaneously from the mixed hyperspectral signal by developing multiple endmember spectral mixture analysis. Lookup tables (LUT) were generated through a radiative transfer model for both soil and plant features. After separating the features, clustering was performed to improve the efficiency of LUT use in the proposed model. Then a Bayesian decision algorithm was presented for selecting the most favourable clusters. However, the effects of layout in the mixture and non-linear mixing effects in orchard systems were not considered. Iordache et al. [Iordache, Tits, Plaza et al. (2014)] developed a dynamic unmixing model for a plant production monitoring system. The observed spectra, such as vegetation and soil, were assumed to be linear mixtures of spectra from available spectral libraries through the linear mixing model (LMM). The unmixing problems were handled by multiple endmember spectral mixture analysis (MEMSA) and sparse unmixing via variable splitting and augmented Lagrangian (SUnSAL). For effective library reduction, a modification of hyperspectral unmixing via multiple signal classification and collaborative sparse regression (MUSIC-CSR) was developed.
This method was used for pruning the dictionary and recovering the high-quality vegetation spectra on the ground by using the pruned dictionary as input to available unmixing methods. However, this method had difficulty obtaining reliable estimates of ground fractional abundances. Sung et al. [Sung, Chung, Chang et al. (2014)] investigated an ant colony algorithm with central data aggregation for an agricultural monitoring system. Plant growth was monitored through environmental parameters collected via Zigbee-based weather stations. All sensors were integrated into the weather stations, and essentially a single monitoring node was used for data aggregation. An energy-efficient central data aggregation method was presented in which the ant colony optimization algorithm was applied to the construction of a level gradient field. A remote online human-machine interface was also developed for monitoring plant production. However, the average delay of this system was high.

Proposed methodology
To build a general model that works on the various available data indices, Teacher Learning Based Optimization (TLBO), a genetic algorithm, was used; it classifies the data without enumerating all possible combinations. In this work, EBPNN is used as the learning model, and the input data is first filtered and refined by the TLBO algorithm. Fig. 1 and

Vegetation condition index (VCI)
As suggested in Sung et al. [Sung, Chung, Chang et al. (2014)], the VCI represents a correlation between the current NDVI value and the minimum NDVI value over a long period. It is calculated as shown in Eq. (2):

$$V_j = 100 \times \frac{N_j - N_{\min}}{N_{\max} - N_{\min}} \qquad (2)$$

where $V_j$ is the VCI value at the j-th position in the vector, $N_j$ is the NDVI value at the j-th position, and $N_{\min}$ and $N_{\max}$ are the minimum and maximum values of the NDVI vector.
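As an illustration, the VCI of an NDVI vector can be computed with a few lines of NumPy. This is a minimal sketch that assumes the usual min-max form V_j = 100 (N_j - N_min) / (N_max - N_min); the function name, the sample values and the handling of a constant series are assumptions made for the example, not part of the original experiment.

```python
import numpy as np

def vegetation_condition_index(ndvi):
    """Return the VCI (0 to 100) for every entry of an NDVI vector."""
    ndvi = np.asarray(ndvi, dtype=float)
    n_min, n_max = ndvi.min(), ndvi.max()
    if n_max == n_min:
        # Constant series: the ratio is undefined, so zeros are returned
        # by convention (an assumption of this sketch).
        return np.zeros_like(ndvi)
    return 100.0 * (ndvi - n_min) / (n_max - n_min)

# Hypothetical NDVI values for one location over three periods.
ndvi_series = [0.2, 0.5, 0.8]
vci = vegetation_condition_index(ndvi_series)
```

The smallest NDVI value maps to a VCI of 0 and the largest to 100, so the index expresses each observation relative to the historical range at that location.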

Pre-processing
The input image data must first be preprocessed to transform it into the form required by the proposed model. This involves cleaning the data and converting it into the format of the working environment. For example, an image of four pixels with dimension 2×2 is represented by a 2×2 matrix whose four cells hold values according to the pixel colour and the representation format. In this step, each image is also resized to a fixed dimension, because different images have different dimensions. To obtain index values from an image, each image has its own colour scale, and a pixel value corresponding to a specific colour is replaced by its index value. For example, if Fig. 3(a) shows an NDVI index image, every dark green portion of the image is considered an extremely wet region; so if dark green is represented by pixel value 65, every pixel of the NDVI image with value 65 is considered an extremely wet region and the NDVI value for that cell is +1.0. Other regions in the image are detected in a similar fashion. The index values of an SPI image are obtained from its own separate colour scale, since the colour regions of that image are different.
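The colour-to-index replacement described above could be sketched as follows. Only the pair "pixel value 65 gives NDVI +1.0" comes from the text; the other entries of the colour scale, the function name and the choice of NaN for unmapped pixels are hypothetical, and the resizing step is left out.

```python
import numpy as np

# Hypothetical colour scale: pixel value -> NDVI index value.
# Only 65 -> +1.0 is taken from the text; the other pairs are made up.
NDVI_SCALE = {65: 1.0, 120: 0.5, 200: 0.0}

def pixels_to_index(image, scale, fill=np.nan):
    """Replace each pixel value by the index value given in its colour scale."""
    image = np.asarray(image)
    out = np.full(image.shape, fill, dtype=float)  # unmapped pixels stay NaN
    for pixel_value, index_value in scale.items():
        out[image == pixel_value] = index_value
    return out

# A 2x2 image, as in the four-pixel example of the text.
image = np.array([[65, 120],
                  [200, 65]])
index_map = pixels_to_index(image, NDVI_SCALE)
```

An SPI image would be converted with the same function and its own scale dictionary, since its colour regions differ.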

Feature selection
In this work, three indices are used as training parameters: NDVI, SPI and VCI. Using three feature vectors for training increases the accuracy of the resulting neural network, as this helps to cover various other geographical parameters of the location. For any geographical location at a particular time, the feature vector therefore holds three values. For example, for the geographical location (xi, yi), the feature vector at time instance ti is [Ni, Vi, Si], while for the same location at time tj it might be [Nj, Vj, Sj]. So for each year of data, the input is a three-dimensional array in which the first and second dimensions give the geographical position and the third gives the index vector for that position.
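One way to picture this three-dimensional input is to stack the three index maps along a last axis. The sketch below assumes the maps have already been brought to the same size; the function name and all numbers are hypothetical.

```python
import numpy as np

def build_feature_cube(ndvi, vci, spi):
    """Stack three equally sized index maps into a (rows, cols, 3) array."""
    ndvi, vci, spi = (np.asarray(a, dtype=float) for a in (ndvi, vci, spi))
    if not (ndvi.shape == vci.shape == spi.shape):
        raise ValueError("index maps must share the same shape")
    # The last axis holds the feature vector [N, V, S] of each position.
    return np.stack([ndvi, vci, spi], axis=-1)

# Hypothetical 2x2 index maps for one year.
ndvi = [[0.8, 0.2], [0.5, 0.1]]
vci = [[90.0, 20.0], [55.0, 5.0]]
spi = [[1.2, -0.4], [0.3, -1.5]]

cube = build_feature_cube(ndvi, vci, spi)
feature_at_0_1 = cube[0, 1]  # [N, V, S] at row 0, column 1
```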

TLBO (Teacher learning based optimization)
In this model, TLBO (Teachers Learning Based Optimization Algorithm) was used for classifying the feature values of NDVI, VCI and SPI. This reduces the size of the dataset, because during learning the presence of similar data with multiple output values increases confusion and deflects the resulting value. TLBO, a genetic algorithm, is used in this work because it learns in two phases. The main motive of this model is to reduce the dataset size and increase the learning accuracy of the neural network. The teacher and student phases were iterated until two similar solutions were obtained.

Generate population
In this step, the cluster center sets are generated, each of which represents the whole input block of the image. Each cluster center set acts as a chromosome, and the collection of all sets is termed the population. A cluster center set can be written as C = [c1, c2, …, cm] and the population as P = [Cc1, Cc2, …, Ccn], where m is the number of clusters and n is the number of chromosomes. In this work, each block is divided into m clusters, where each cluster center holds the NDVI, VCI and SPI values of the selected pixel position.
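A population of this kind might be generated as in the sketch below. It assumes that each chromosome is built by picking m distinct pixels of the block at random and using their [NDVI, VCI, SPI] values as cluster centers; the text does not state how the initial centers are chosen, so the random selection, the function name and the block data are assumptions.

```python
import numpy as np

def generate_population(block, m, n, rng):
    """Build n chromosomes, each holding m cluster centers.

    block: (s, 3) array with one [NDVI, VCI, SPI] row per pixel.
    Returns an array of shape (n, m, 3).
    """
    s = block.shape[0]
    population = np.empty((n, m, block.shape[1]))
    for k in range(n):
        # Pick m distinct pixel positions and use their values as centers.
        positions = rng.choice(s, size=m, replace=False)
        population[k] = block[positions]
    return population

rng = np.random.default_rng(0)
block = rng.random((16, 3))          # hypothetical 4x4 block, flattened
population = generate_population(block, m=2, n=5, rng=rng)
```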

Fitness function
To find good chromosomes among the available sets, each candidate cluster center set is passed to the fitness function, which returns its fitness value. The Euclidean distance formula in Eq. (2) is used to compute it. For example, suppose the fitness of the cluster set Cc = [C1, C2] is required. First, the distance between the cluster center pixel values and every non-cluster-center pixel of the block is evaluated. Each non-cluster-center pixel is then assigned to the cluster center at minimum distance. The sum of these minimum distances over all non-cluster-center pixels is the fitness value, as shown in Eq. (3):

$$F(C_c) = \sum_{i=1}^{s} \min_{1 \le j \le m} d(p_i, c_j) \qquad (3)$$

where $d(p_i, c_j)$ is the Euclidean distance between pixel $p_i$ and cluster center $c_j$, m is the number of cluster centers and s is the number of pixels in the block of the image.
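The fitness computation described above can be sketched in NumPy. The sketch sums over all pixels of the block rather than only the non-cluster-center ones, which gives the same value when the centers are themselves pixels of the block, because their distance to themselves is zero. The function name and the small data set are hypothetical.

```python
import numpy as np

def fitness(centers, block):
    """Sum, over all pixels, of the distance to the nearest cluster center.

    centers: (m, 3) array of cluster centers.
    block:   (s, 3) array of pixel feature vectors.
    A lower value means the centers represent the block better.
    """
    # Pairwise Euclidean distances, shape (s, m).
    distances = np.linalg.norm(block[:, None, :] - centers[None, :, :], axis=2)
    # Each pixel is assigned to its nearest center.
    return float(distances.min(axis=1).sum())

# Hypothetical block of four pixels with two of them used as centers.
block = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [5.0, 5.0, 5.0]])
centers = np.array([[0.0, 0.0, 0.0],
                    [5.0, 5.0, 5.0]])
value = fitness(centers, block)   # 0 + 1 + 2 + 0
```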

Teacher phase
This phase performs crossover of the chromosomes with the single best solution in the population. The best solution acts as the teacher and is selected by minimum fitness value. For the crossover operation, the cluster center value at a random position is copied from the teacher chromosome and replaces the value at that position in a non-teacher chromosome. This improves the population quality. For example, if the best solution is Ccb, the crossover operation is done through Eq. (4).
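A minimal sketch of this teacher phase follows. It assumes that the fitness is the sum of nearest-center distances, that one random position is copied per non-teacher chromosome, and that the copy is made unconditionally, since no acceptance test is described for this phase. All names and data are hypothetical.

```python
import numpy as np

def fitness(centers, block):
    """Sum of each pixel's distance to its nearest cluster center."""
    d = np.linalg.norm(block[:, None, :] - centers[None, :, :], axis=2)
    return float(d.min(axis=1).sum())

def teacher_phase(population, block, rng):
    """Copy one randomly chosen center of the teacher into every other chromosome."""
    scores = [fitness(chromosome, block) for chromosome in population]
    best = int(np.argmin(scores))            # teacher: minimum fitness value
    teacher = population[best].copy()
    new_population = population.copy()
    for k in range(len(population)):
        if k == best:
            continue                         # the teacher itself is unchanged
        position = rng.integers(teacher.shape[0])
        new_population[k, position] = teacher[position]
    return new_population

rng = np.random.default_rng(1)
block = rng.random((20, 3))                        # hypothetical pixel features
population = block[rng.choice(20, size=(4, 2))]    # 4 chromosomes, 2 centers each
new_population = teacher_phase(population, block, rng)
```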

Student phase
In this phase, random groups of chromosomes are formed automatically, and each group is used for crossover of its chromosomes with the single best solution in that group. The best solution acts as the teacher for the other chromosomes in the group and is selected by minimum fitness value. For the crossover operation, the cluster center value at a random position is copied from the teacher chromosome and replaces the value in the non-teacher chromosome using Eq. (4). Each new chromosome is then checked to see whether its fitness value has improved over the previous one: if it has, the new chromosome is included in the population and the older one is removed; otherwise the older chromosome is kept.
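The student phase might look like the sketch below, which adds the acceptance check to the same crossover. The group size, the way groups are drawn (a random permutation cut into consecutive slices) and all names are assumptions of this example.

```python
import numpy as np

def fitness(centers, block):
    """Sum of each pixel's distance to its nearest cluster center."""
    d = np.linalg.norm(block[:, None, :] - centers[None, :, :], axis=2)
    return float(d.min(axis=1).sum())

def student_phase(population, block, group_size, rng):
    """Group-wise crossover that keeps a new chromosome only if it improves."""
    population = population.copy()
    order = rng.permutation(len(population))
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        scores = [fitness(population[k], block) for k in group]
        best = group[int(np.argmin(scores))]       # teacher of this group
        for k in group:
            if k == best:
                continue
            candidate = population[k].copy()
            position = rng.integers(candidate.shape[0])
            candidate[position] = population[best][position]
            # Accept the new chromosome only when its fitness value is lower.
            if fitness(candidate, block) < fitness(population[k], block):
                population[k] = candidate
    return population

rng = np.random.default_rng(2)
block = rng.random((20, 3))                        # hypothetical pixel features
population = block[rng.choice(20, size=(6, 2))]    # 6 chromosomes, 2 centers each
new_population = student_phase(population, block, group_size=3, rng=rng)
```

Because a candidate replaces its predecessor only on improvement, no chromosome ends this phase with a worse fitness value than it started with.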

Training vector
The output of the genetic algorithm reduces the feature vector, as each block is represented by the cluster center values of NDVI, SPI and VCI. Each cluster center acts as the input values, and the crop yield of that area in that year is the desired output. The input and output parameters are combined to generate the training data for the Error Back Propagation Neural Network.

Training of error back propagation neural network (EBPNN)
• Let us assume a three-layer neural network.
• Consider i as the input layer of the network and j as the hidden layer.
• Finally, k is considered as the resultant output layer of the network.
• Let $w_{ij}$ represent the weight between nodes of consecutive layers.
• The output of each node then depends on the sigmoidal function shown in Eq. (5):

$$y_j = \frac{1}{1 + e^{-X_j}} \qquad (5)$$

where $X_j = \sum_{i=1}^{n} x_i w_{ij} - \theta_j$; n is the number of inputs to node j, and $\theta_j$ is the threshold for node j.
To understand the above steps, consider an example in which Wij holds some weight values W11, W12, W13, ... Let an input vector of three values, taken from the NDVI, VCI and SPI images, be passed to the first layer of the neural network. These values are multiplied by the above weight values, and the result acts as the input H1input to the next layer of hidden neurons. Some biasing is also possible here, but it is neglected in this example. The weight values of the neurons for the next level are likewise assumed to be given as a matrix. Each value obtained from the previous weight matrix multiplication is passed through the sigmoidal function of Eq. (5), so this function makes a small variation in the output value.
In a similar fashion, the other values can be calculated to find the other set of derivatives for the sigmoid of Eq. (7).
Here the derivative value may vary as per the output.
For each input to a neuron, the derivative is calculated with respect to each weight. The final derivatives are then obtained by the chain rule, by multiplying the outputs obtained from Eqs. (6)-(8). The overall update is obtained from the resulting values, and every weight that needs to be updated is changed by the corresponding matrix of values.
• The error corresponding to the input data is estimated as the difference between the desired output and the output obtained from the output layer: $e_k(n) = d_k(n) - y_k(n)$
• The EBPNN weight update is done with the above matrix of changes: $w_{ij} = w_{ij} + \Delta w_{ij}$
The iteration steps end when the error obtained from the output layer gets close to zero or falls to some constant such as 0.001.
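The training procedure above can be illustrated with a small three-layer network in NumPy. This is a sketch under stated assumptions, not the configuration used in the experiments: the hidden layer size, learning rate, epoch limit, the use of mean squared error as the stopping measure, the scaling of inputs and yields to the range 0 to 1, and all data values are hypothetical. The node output is the sigmoid of the weighted sum minus a threshold, the error is e = d - y, and each weight is updated as w = w + Δw with Δw obtained by the chain rule.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(X, params):
    W1, th1, W2, th2 = params
    h = sigmoid(X @ W1 - th1)        # hidden layer j
    y = sigmoid(h @ W2 - th2)        # output layer k
    return h, y

def train_ebpnn(X, d, hidden=4, lr=0.5, tol=1e-3, max_epochs=5000, seed=0):
    """Train input -> hidden -> output weights by error back propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    th1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    th2 = np.zeros(1)
    history = []
    for _ in range(max_epochs):
        h, y = forward(X, (W1, th1, W2, th2))
        e = d - y                                    # e_k = d_k - y_k
        history.append(float(np.mean(e ** 2)))
        if history[-1] < tol:                        # stop near zero error
            break
        # Chain rule: error times the sigmoid derivative y(1 - y).
        delta_out = e * y * (1.0 - y)
        delta_hid = (delta_out @ W2.T) * h * (1.0 - h)
        # w = w + delta_w, averaged over the training samples.
        W2 += lr * h.T @ delta_out / len(X)
        th2 -= lr * delta_out.mean(axis=0)
        W1 += lr * X.T @ delta_hid / len(X)
        th1 -= lr * delta_hid.mean(axis=0)
    return (W1, th1, W2, th2), history

# Hypothetical cluster-center features [NDVI, VCI, SPI] scaled to 0..1,
# paired with hypothetical normalized yields.
X = np.array([[0.1, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.6, 0.6, 0.7],
              [0.9, 0.8, 0.9]])
d = np.array([[0.2], [0.4], [0.6], [0.8]])

params, history = train_ebpnn(X, d)
predictions = forward(X, params)[1]
```

Testing then amounts to calling `forward` with the feature vectors of unseen images and comparing the returned values with the ground truth yields.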

Testing of EBPNN
The trained neural network obtained from the above steps is tested by passing it testing input images of NDVI and SPI. These images are preprocessed to generate feature values as in the training part. Finally, the feature vectors are passed to the EBPNN, which gives the predicted crop yield. The predicted crop yield values are compared on various parameters to assess the fitness of the proposed work.

Experiment and results
MATLAB 2012a is used to conduct the experiments and measure the evaluation results. This section describes the experimental setup and results. The tests were performed on a 2.27 GHz Intel Core i3 machine with 4 GB of RAM, running Windows 7 Professional.

Dataset
The experiment is done on a real dataset, with SPI obtained from the web source https://iridl.ldeo.columbia.edu/maproom/Global/Precipitation/SPI.html and NDVI values obtained from http://iridl.ldeo.columbia.edu/SOURCES. Each source image is extracted for latitude 21N-26N and longitude 74E-82E, where the selected region is the state of Madhya Pradesh in India. The average of different crop yields is obtained from http://www.mospi.gov.in/statistical-year-book-india/2016/177. Ground truth values for testing the trained models are also taken from the same site, http://www.mospi.gov.in.

Results
The results are compared with the Spiking Neural Network (SNN) of Pritam et al. [Pritam and Kasabov (2016)], which is termed the previous work in this paper. Tab. 2 shows that the proposed GA+EBPNN crop yield prediction using multiple features works well compared with the previous method adopted in Pritam et al. [Pritam and Kasabov (2016)] and with EBPNN. The RMSE value of the proposed work is lower than that of the spiking neural network model. The use of the TLBO algorithm to reduce the input feature data also increases accuracy. Tab. 3 shows that the relative error of the proposed crop yield prediction is lower than that of the previous method adopted in Pritam et al. [Pritam and Kasabov (2016)]. One more important observation is that crop production also rises with the change in climate. The table above shows that combining two soft computing techniques, genetic and neural network, has reduced the relative error value. Tab. 4 shows that the crop yield predicted by the proposed work is much nearer to the ground truth value than that of the previous method adopted in Pritam et al. [Pritam and Kasabov (2016)]. Because TLBO performs dimension reduction, a good set of cluster centers is selected, which reduces the confusion of the neural network during training. The input layer of the EBPNN depends on the number of features used, while in the spiking neural network the number of neurons in the input layer depends on the training block size. For GA+EBPNN, finding the cluster centers was assumed to be a pre-processing step before the neural network training. On that basis, the execution time of the proposed work was found to be lower than that of other existing methods.
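For readers who want to reproduce this kind of comparison, the evaluation measures can be sketched as below. The section does not give the exact formula for the relative error, so the mean of |predicted - actual| / |actual| is an assumption here, as are the function names and the sample numbers, which are not taken from the reported tables.

```python
import numpy as np

def rmse(truth, predicted):
    """Root mean square error between ground truth and predicted yields."""
    truth = np.asarray(truth, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - truth) ** 2)))

def mean_relative_error(truth, predicted):
    """Mean of |predicted - truth| / |truth| (assumed definition)."""
    truth = np.asarray(truth, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - truth) / np.abs(truth)))

def percent_reduction(baseline, proposed):
    """Percentage by which the proposed value is below the baseline value."""
    return 100.0 * (baseline - proposed) / baseline

# Hypothetical ground truth and predicted yields.
truth = [100.0, 200.0]
predicted = [110.0, 190.0]

error_rmse = rmse(truth, predicted)
error_relative = mean_relative_error(truth, predicted)
```

A statement such as "the relative error was reduced by some percentage compared with SNN" corresponds to `percent_reduction` applied to the two models' error values.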

Conclusion
Crop yield prediction through machine learning techniques can help agricultural departments plan strategies for improving agriculture. Various techniques for such prediction are now coming into existence. Spatiotemporal data mining is one such solution that can be employed for crop yield prediction. Accordingly, this paper has focused on geospatial crop yield prediction using different indices such as NDVI, SPI and VCI. Many researchers have already done a lot of work based on neural network classification. In this model, TLBO (Teachers Learning Based Optimization Algorithm) was used for classifying the feature values of NDVI, VCI and SPI. The dataset size is reduced during the learning phase, because the presence of similar data with multiple values increases confusion and deflects the resulting value. TLBO, a genetic algorithm, is used in this work because it learns in two phases. The main motive of this model is to reduce the dataset size and increase the learning accuracy of the neural network, since in an ANN various input values mapping to the same output increase confusion and disturb the overall output. The results show that the proposed work improves prediction accuracy by reducing the RMSE and the relative error. Also, through proper training and a rich input vector, the resulting neural network takes 57.47% less time than SNN. The proposed work reduced the relative error by 80.93% compared with SNN, while RMSE was reduced by 80.85%. Hence the overall accuracy of the proposed work was also improved.