Abstract
The architecture of a deep feedforward neural network (DFNN) strongly affects its accuracy and convergence speed, and training a DFNN becomes increasingly tedious as the depth of the network grows. A DFNN can be tuned through several structural parameters, such as the number of hidden layers, the number of neurons in each hidden layer, and the number of connections between layers. In practice, the architecture is usually chosen by trial and error, which amounts to an exponentially large combinatorial search and is a tedious task. Addressing this problem requires an algorithm that can automatically design an optimal architecture with improved generalization ability. This work proposes a new methodology that simultaneously optimizes the number of hidden layers of a DFNN and the number of neurons in each of them, combining the strengths of Tabu search with the gradient-descent-with-momentum backpropagation training algorithm. The proposed approach has been tested on four classification benchmark datasets, and the resulting optimized networks show better generalization ability.
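To make the idea concrete, the following Python sketch shows one plausible way to pair Tabu search with momentum-based training for architecture selection. It is a minimal illustration under stated assumptions, not the authors' exact algorithm: the dataset (scikit-learn's digits), the neighborhood moves, the tabu-list length, and all training hyperparameters are assumptions introduced here for demonstration.

```python
# Sketch: Tabu search over DFNN architectures. Each candidate is a tuple of
# hidden-layer sizes; fitness is validation accuracy after training with
# gradient descent + momentum (scikit-learn's SGD solver). Illustrative only.
import random
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

def fitness(arch):
    """Train a DFNN with SGD + momentum and return validation accuracy."""
    net = MLPClassifier(hidden_layer_sizes=arch, solver="sgd",
                        momentum=0.9, learning_rate_init=0.01,
                        max_iter=200, random_state=0)
    net.fit(X_tr, y_tr)
    return net.score(X_va, y_va)

def neighbors(arch, rng):
    """Candidate moves: resize one layer, append a layer, or drop the last."""
    out = []
    for i in range(len(arch)):
        for delta in (-8, 8):
            if arch[i] + delta >= 4:
                out.append(arch[:i] + (arch[i] + delta,) + arch[i + 1:])
    if len(arch) < 4:                        # grow: add a new hidden layer
        out.append(arch + (rng.choice([16, 32, 64]),))
    if len(arch) > 1:                        # shrink: remove the last layer
        out.append(arch[:-1])
    return out

rng = random.Random(0)
current = best = (32,)                       # start with one hidden layer
best_fit = fitness(best)
tabu, tabu_size = [current], 5               # short-term memory of visited archs

for it in range(10):                         # a few Tabu-search iterations
    cands = [a for a in neighbors(current, rng) if a not in tabu]
    if not cands:
        break
    fit, current = max((fitness(a), a) for a in cands)  # best non-tabu move
    tabu = (tabu + [current])[-tabu_size:]   # update the tabu list
    if fit > best_fit:                       # track the global best found
        best_fit, best = fit, current
    print(f"iter {it}: arch={current} val_acc={fit:.3f}")

print("best architecture:", best, "val acc:", round(best_fit, 3))
```

The tabu list acts as short-term memory that forbids revisiting recent architectures, so the search can escape local optima even when every neighbor is worse than the current solution; tracking the best solution found so far preserves the global optimum across such downhill moves.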
Cite this article
Gupta, T.K., Raza, K. Optimizing Deep Feedforward Neural Network Architecture: A Tabu Search Based Approach. Neural Process Lett 51, 2855–2870 (2020). https://doi.org/10.1007/s11063-020-10234-7