ABSTRACT

In the last few decades, there has been enormous growth in the volume of data generated across various domains, presenting a major challenge to researchers in processing and analyzing these data. In many machine-learning problems, the number of features involved now reaches into the thousands, making the data difficult to handle and consuming substantial computational power during feature evaluation and classification. A large number of features can also degrade model performance and increase learning time. Hence, it is important to select a minimal set of relevant features rather than using all of them. The objective of this chapter is to determine an optimal subset of the original features that can effectively represent the underlying data. Such feature selection removes irrelevant, redundant, and insignificant features, which improves learning accuracy, reduces computational cost and learning time, and yields a more interpretable learning model. In this chapter, a feature selection (FS) approach based on teaching-learning-based optimization (TLBO), called FSTLBO, is proposed for finding the optimal features. The approach iteratively updates weak features with strong features. Experimental results show that FSTLBO achieves significant performance improvements over other models in selecting optimal features, providing an effective method to determine the optimal features for classification problems and to support data-driven decision making.
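The mechanism summarized above, in which weak solutions are updated using strong ones through a teacher phase and a learner phase, can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration rather than the chapter's FSTLBO implementation: it encodes candidate feature subsets as continuous positions in [0, 1], binarizes them with a hypothetical 0.5 threshold, and scores each subset by the cross-validated accuracy of a k-NN classifier on a synthetic dataset. The dataset, classifier, population size, and iteration count are all illustrative choices, not those used in the chapter's experiments.

```python
# Illustrative TLBO-driven feature selection sketch (hypothetical parameters;
# not the chapter's exact FSTLBO implementation).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=30, n_informative=8,
                           random_state=0)

def fitness(mask):
    """Mean 3-fold CV accuracy of a k-NN classifier on the selected features."""
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(n_neighbors=5),
                           X[:, mask], y, cv=3).mean()

def binarize(pos):
    """Map continuous positions in [0, 1] to a binary feature mask."""
    return pos > 0.5

n_learners, n_features, n_iters = 10, X.shape[1], 20
pop = rng.random((n_learners, n_features))           # continuous positions
fit = np.array([fitness(binarize(p)) for p in pop])

for _ in range(n_iters):
    # Teacher phase: each learner moves toward the best solution (teacher)
    # and away from the class mean, scaled by a teaching factor TF in {1, 2}.
    teacher, mean = pop[fit.argmax()], pop.mean(axis=0)
    for i in range(n_learners):
        tf = rng.integers(1, 3)
        cand = np.clip(pop[i] + rng.random(n_features) * (teacher - tf * mean),
                       0, 1)
        f = fitness(binarize(cand))
        if f > fit[i]:                               # greedy replacement
            pop[i], fit[i] = cand, f
    # Learner phase: each learner interacts with a random peer, moving toward
    # a stronger peer or away from a weaker one.
    for i in range(n_learners):
        j = rng.integers(n_learners)
        if j == i:
            continue
        step = pop[j] - pop[i] if fit[j] > fit[i] else pop[i] - pop[j]
        cand = np.clip(pop[i] + rng.random(n_features) * step, 0, 1)
        f = fitness(binarize(cand))
        if f > fit[i]:
            pop[i], fit[i] = cand, f

best = binarize(pop[fit.argmax()])
print(f"selected {best.sum()} of {n_features} features, "
      f"CV accuracy {fit.max():.3f}")
```

The greedy replacement step is what realizes the "update weak features with strong features" idea in this sketch: a candidate position replaces a learner only when its selected feature subset scores a higher fitness, so weak subsets are progressively displaced by stronger ones.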