Abstract
We introduce the fractional-order global optimal backpropagation machine, which is trained by an improved fractional-order steepest descent method (FSDM). The result is a fractional-order backpropagation neural network (FBPNN), a state-of-the-art fractional-order branch of the family of backpropagation neural networks (BPNNs), distinct from the majority of previous classic first-order BPNNs, which are trained by the traditional first-order steepest descent method. The reverse incremental search of the proposed FBPNN proceeds in the negative directions of the approximate fractional-order partial derivatives of the square error. First, the theoretical concept of an FBPNN trained by an improved FSDM is described mathematically. Then, the mathematical proof of fractional-order global optimal convergence, an assumption about the structure, and the fractional-order multi-scale global optimization of the FBPNN are analyzed in detail. Finally, we perform three types of experiments comparing the performance of an FBPNN with that of a classic first-order BPNN: example function approximation, fractional-order multi-scale global optimization, and a comparison of global search and error-fitting abilities on real data. The stronger search ability of an FBPNN in locating the global optimal solution is the major advantage that makes it superior to a classic first-order BPNN.
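To make the training rule concrete, the sketch below contrasts the classic first-order gradient step with a fractional-order step in which the first-order partial derivative of the square error is replaced by a truncated Caputo approximation, D_c^α E(w) ≈ E′(w)·|w − c|^(1−α)/Γ(2 − α) for 0 < α < 1. This is a minimal illustration under assumed choices (a one-dimensional square error, order α = 0.9, a fixed learning rate, and a lower terminal c tracking the previous iterate), not the authors' exact FSDM.

```python
# Minimal sketch of a fractional-order steepest descent (FSDM-style) update.
# Assumptions (not from the paper): a 1-D square error E(w) = (w - 3)^2,
# order alpha = 0.9, a fixed learning rate, and a truncated Caputo
# approximation whose lower terminal c tracks the previous iterate.
from math import gamma

def caputo_grad(grad, w, c, alpha, eps=1e-12):
    # Truncated Caputo fractional derivative of order alpha in (0, 1):
    #   D_c^alpha E(w) ~= E'(w) * |w - c|^(1 - alpha) / Gamma(2 - alpha)
    # As alpha -> 1 this reduces to the ordinary first-order gradient.
    return grad * (abs(w - c) + eps) ** (1.0 - alpha) / gamma(2.0 - alpha)

def fsdm_minimize(dE, w0, alpha=0.9, lr=0.1, steps=200):
    # Reverse incremental search in the negative direction of the
    # approximate fractional-order derivative of the square error.
    w, c = w0, w0 - 1.0  # assumed initial lower terminal
    for _ in range(steps):
        w, c = w - lr * caputo_grad(dE(w), w, c, alpha), w
    return w

# Square error E(w) = (w - 3)^2 with dE/dw = 2 * (w - 3)
print(fsdm_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0))  # converges near 3.0
```

Because the step size is modulated by |w − c|^(1−α), the effective search scale changes with the order α, which loosely illustrates how varying the fractional order can yield the multi-scale global search behavior described above.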
Author information
Contributions
Yi-fei PU designed the research and drafted the manuscript. Jian WANG helped organize the manuscript. Yi-fei PU and Jian WANG processed the data, and revised and finalized the paper.
Additional information
Compliance with ethics guidelines
Yi-fei PU and Jian WANG declare that they have no conflict of interest.
Project supported by the National Key Research and Development Program of China (No. 2018YFC0830300) and the National Natural Science Foundation of China (No. 61571312).
Cite this article
Pu, Y.F., Wang, J. Fractional-order global optimal backpropagation machine trained by an improved fractional-order steepest descent method. Front Inform Technol Electron Eng 21, 809–833 (2020). https://doi.org/10.1631/FITEE.1900593
Key words
- Fractional calculus
- Fractional-order backpropagation algorithm
- Fractional-order steepest descent method
- Mean square error
- Fractional-order multi-scale global optimization