KAF + RSigELU: a nonlinear and kernel-based activation function for deep neural networks

Kiliçarslan, Serhat; Celik, Mete

doi:10.1007/s00521-022-07211-7

Original Article
Published: 18 April 2022

Volume 34, pages 13909–13923, (2022)

Neural Computing and Applications

Abstract

Activation functions (AFs) are fundamental to the neural network architectures used in real-world problems, allowing them to accurately model and learn complex relationships between variables. They process the input information arriving at the network and produce the corresponding output. The kernel-based activation function (KAF) is an extended version of the ReLU and sigmoid AFs. As a result, KAF faces the problems of bias shift originating from the negative region, vanishing gradients, limited adaptability and flexibility, and neuron death in its parameters during the learning process. In this study, the hybrid KAF + RSigELUS and KAF + RSigELUD AFs, which are extended versions of KAF, are proposed. The proposed AFs use the Gaussian kernel function and are effective in the positive, negative, and linear activation regions. Their performance was evaluated on the MNIST, Fashion MNIST, CIFAR-10, and SVHN benchmark datasets. The experimental evaluations show that the proposed AFs overcome these problems and outperform the ReLU, LReLU, ELU, PReLU, and KAF AFs.
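To make the kernel-based idea concrete, the following is a minimal sketch of a generic Gaussian-kernel activation, written as a weighted sum of kernel evaluations against a fixed dictionary of points. It is not the authors' hybrid KAF + RSigELU function, whose definition is not given in the abstract. The function name `gaussian_kaf`, the dictionary range, the bandwidth `gamma`, and the random mixing coefficients are all illustrative assumptions.

```python
import numpy as np


def gaussian_kaf(s, alpha, dictionary, gamma):
    """Generic kernel activation: f(s) = sum_i alpha_i * exp(-gamma * (s - d_i)^2).

    s          : array of pre-activations (any shape)
    alpha      : mixing coefficients, shape (D,); these would be the trainable part
    dictionary : fixed kernel centres d_i, shape (D,)
    gamma      : Gaussian kernel bandwidth (assumed value, not from the paper)
    """
    s = np.asarray(s, dtype=float)
    # Compare every pre-activation with every dictionary point: shape (..., D).
    diff = s[..., np.newaxis] - dictionary
    kernel = np.exp(-gamma * diff ** 2)
    # Weighted sum over the dictionary axis gives one output per input element.
    return kernel @ alpha


# Hypothetical settings chosen only for illustration.
D = 20
dictionary = np.linspace(-3.0, 3.0, D)      # assumed uniform grid around zero
gamma = 1.0                                  # assumed bandwidth
rng = np.random.default_rng(0)
alpha = rng.normal(scale=0.3, size=D)        # assumed random initial coefficients

x = np.linspace(-2.0, 2.0, 5)
y = gaussian_kaf(x, alpha, dictionary, gamma)
print(y.shape)  # one activation value per input element
```

Because the shape of the function is set by the coefficients rather than by a fixed formula, such an activation can adapt during training, which is the flexibility the abstract refers to.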

Author information

Authors and Affiliations

Software Engineering Department, Bandırma Onyedi Eylül University, Bandırma, Balıkesir, Turkey
Serhat Kiliçarslan
Dept. of Computer Engineering, Erciyes University, 38039, Kayseri, Turkey
Mete Celik

Corresponding author

Correspondence to Serhat Kiliçarslan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest and no financial or personal relationships with other people or organizations that could inappropriately influence the presented work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kiliçarslan, S., Celik, M. KAF + RSigELU: a nonlinear and kernel-based activation function for deep neural networks. Neural Comput & Applic 34, 13909–13923 (2022). https://doi.org/10.1007/s00521-022-07211-7

Received: 29 July 2021
Accepted: 29 March 2022
Published: 18 April 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s00521-022-07211-7
