
KAF + RSigELU: a nonlinear and kernel-based activation function for deep neural networks

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

Activation functions (AFs) are fundamental to the neural network architectures used to accurately model and learn complex relationships between variables in real-world problems. They process the input information arriving at the network and produce the corresponding output. The kernel-based activation function (KAF) offers an extended version of the ReLU and sigmoid AFs. However, KAF faces the problems of bias shift originating from the negative region, vanishing gradients, limited adaptability and flexibility of its parameters, and neuron death during the learning process. In this study, the hybrid KAF + RSigELUS and KAF + RSigELUD AFs, which are extended versions of KAF, are proposed. The proposed AFs use the Gaussian kernel function and are effective in the positive, negative, and linear activation regions. Their performance was evaluated on the MNIST, Fashion MNIST, CIFAR-10, and SVHN benchmark datasets. The experimental evaluations show that the proposed AFs overcome the existing problems and outperform the ReLU, LReLU, ELU, PReLU, and KAF AFs.
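To make the kernel-based idea concrete, the sketch below implements a Gaussian-kernel activation in the spirit of KAF (Scardapane et al., Kafnets): the neuron output is a learned mixture of Gaussian kernels centred on a fixed dictionary of points. The dictionary range, bandwidth heuristic, and random mixing coefficients are illustrative assumptions, and the RSigELU terms that the proposed KAF + RSigELUS and KAF + RSigELUD functions add on top are not reproduced here; see the article for the exact formulations.

```python
# Minimal sketch (not the authors' exact implementation): a Gaussian-kernel
# KAF forward pass. The dictionary centres d_i are fixed on a grid; the mixing
# coefficients alpha_i would be the trainable parameters in a real network.
import numpy as np

def kaf_gaussian(x, alpha, dictionary, gamma):
    """Kernel activation: f(x) = sum_i alpha_i * exp(-gamma * (x - d_i)^2)."""
    diffs = x[:, None] - dictionary[None, :]   # (batch, D) distances to centres
    kernels = np.exp(-gamma * diffs ** 2)      # Gaussian kernel matrix
    return kernels @ alpha                     # mix kernels with learned alphas

if __name__ == "__main__":
    D = 20
    dictionary = np.linspace(-3.0, 3.0, D)     # fixed, evenly spaced centres
    spacing = dictionary[1] - dictionary[0]
    gamma = 1.0 / (2.0 * spacing ** 2)         # bandwidth tied to grid spacing
    alpha = np.random.randn(D) * 0.1           # would be learned by backprop
    x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])  # example pre-activations
    print(kaf_gaussian(x, alpha, dictionary, gamma))
```

In a deep network, each neuron (or each layer) would hold its own alpha vector and update it by backpropagation, which is what gives kernel-based activations their per-neuron flexibility.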




Author information


Corresponding author

Correspondence to Serhat Kiliçarslan.

Ethics declarations

Conflict of interest

We, the authors, declare that we have no conflict of interest and no financial or personal relationships with other people or organizations that could inappropriately influence the work presented here.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Kiliçarslan, S., Celik, M. KAF + RSigELU: a nonlinear and kernel-based activation function for deep neural networks. Neural Comput & Applic 34, 13909–13923 (2022). https://doi.org/10.1007/s00521-022-07211-7

