Improved swarm intelligence algorithms with time-varying modified Sigmoid transfer function for Amphetamine-type stimulants drug classification

doi:10.1016/j.chemolab.2022.104574

Chemometrics and Intelligent Laboratory Systems

Volume 226, 15 July 2022, 104574

https://doi.org/10.1016/j.chemolab.2022.104574

Highlights

• A time-varying modified Sigmoid transfer function with two time-varying updating strategies is introduced.
• The proposed transfer functions are applied to five standard swarm intelligence (SI) algorithms.
• The new binary SI algorithms are evaluated on the descriptor selection problem for ATS drug classification.
• Experimental results confirm that the proposed algorithms promote fast convergence and better classification accuracy.
• Statistical analyses show a significant improvement of the proposed algorithms over the comparative ones.

Abstract

Swarm intelligence (SI) algorithms have received considerable attention for solving binary optimization problems such as feature selection. In this article, a new time-varying modified Sigmoid transfer function with two time-varying updating schemes is proposed as the binarization method for particle swarm optimization (PSO), grey wolf optimization (GWO), the whale optimization algorithm (WOA), Harris hawks optimization (HHO), and manta ray foraging optimization (MRFO). The resulting binary algorithms, BPSO, BGWO, BWOA, BHHO, and BMRFO, are used to solve the descriptor selection problem in a supervised Amphetamine-type Stimulants (ATS) drug classification task. The goal of this study is to improve convergence speed and classification accuracy. To evaluate the proposed methods, experiments were carried out on a chemical dataset containing molecular descriptors of ATS and non-ATS drugs. The results show that the proposed methods are promising in terms of near-optimal convergence, fast computation, increased classification accuracy, and a substantial reduction in the number of descriptors.

Introduction

Amphetamine-type stimulant (ATS) drugs are among the most popular synthetic drugs of abuse. These substances were originally developed for pharmacological research, but underground chemists keep modifying the chemical structure of these compounds to evade legal regulation. The novelty of these substances makes them undetectable by traditional drug testing methods [1].

Various devices and test kits are now available on the market for ATS drug testing [2–4]. However, these methods have several drawbacks: long preparation and execution times, costly apparatus and equipment, complex testing processes, the need for well-trained technicians, unreliable and inconsistent outputs from different test kits, and outdated analytical methods. Computational methods have proven to be promising techniques in cheminformatics, for example in drug design and discovery [5], drug/non-drug classification [6], molecular similarity analysis [7], toxicity prediction [8,9], and quantitative structure-activity/property relationship (QSAR/QSPR) analysis [10]. Computational methods also offer much cheaper and faster procedures. Molecular descriptors are an important component underpinning many modern computational models. They are numerical indexes encoded from a molecular structure representation of a given dimensionality (0D, 1D, 2D, 3D, or 4D). The higher the dimensionality, the more information about the molecular features is stored in the descriptors.

Due to the rapid growth of chemical data, machine learning has become a promising tool for processing big data at high volume, veracity, and velocity, and with enormous flexibility [11]. In a previous study, Pratama et al. proposed an approach for ATS drug identification that employs a newly developed 3D image pre-processing technique called 3D Exact Legendre Moment Invariants (3D ELMI) as the feature extraction algorithm, together with several classification algorithms [12,13]. 3D ELMI is a molecular descriptor algorithm that was used to calculate descriptors for 7190 drug compounds (equal numbers of ATS and non-ATS drugs). It generates 1185 descriptors for each drug compound, which are then used as input to the classification algorithms that perform the identification task. The experiments found that the random forest (RF) classifier achieved the highest classification accuracy. According to Ref. [14], molecular classification involves three steps: feature extraction, feature selection, and classification. This study aims to improve ATS drug classification performance by carrying out feature selection as a data-preprocessing step before the classification task.

The emergence of new molecular descriptor algorithms that generate high-dimensional descriptors has made the feature (descriptor) selection step necessary in computational modeling. Descriptor selection is a well-known NP-hard combinatorial search problem, since the number of possible feature subsets grows exponentially with the dimensionality. Traditional feature selection techniques handle medium or large descriptor sets inefficiently. SI algorithms are therefore one of the core technologies for addressing this issue. SI algorithms are population-based optimization algorithms with several advantages [15]:

• Ease of implementation.
• Fewer operators compared to evolutionary approaches.
• Fewer parameters to tune.
• Retention of information about the search space throughout the iterations.
• Regular use of memory to save the best solution obtained so far.
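To make the exponential growth of the descriptor-subset search space concrete, the following minimal sketch counts the non-empty subsets of a descriptor set. The function name `subset_count_digits` is hypothetical and introduced only for illustration; the figure of 1185 descriptors is the 3D ELMI descriptor count reported in this paper.

```python
def subset_count_digits(n_features: int) -> int:
    """Number of decimal digits in 2**n_features - 1, the count of non-empty subsets."""
    return len(str(2 ** n_features - 1))

n_descriptors = 1185  # 3D ELMI descriptors per compound, as reported in the text
digits = subset_count_digits(n_descriptors)
print(f"Non-empty subsets of {n_descriptors} descriptors: a {digits}-digit number")
```

An exhaustive search over a space of this size is out of reach, which is the practical motivation for heuristic search with SI algorithms.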

Some of the successful works that implemented SI algorithms in the descriptors selection problem are outlined in Table 1.

Descriptor selection has become a popular research topic in the cheminformatics domain, where researchers attempt to identify the smallest feasible number of descriptors that still provides good predictive performance [16]. Incorporating SI algorithms into the feature selection problem remains an active research area. According to the No Free Lunch (NFL) theorem, there is no universal algorithm that suits all optimization problems [17]. There is therefore always room for new metaheuristic-based feature selection algorithms that improve how feature selection problems are solved, and numerous new approaches have appeared in the literature [18–22]. Motivated by this, this research proposes ten new SI-based feature selection algorithms to obtain significant and discriminative 3D ELMI descriptors that enhance classification performance.

The original SI algorithms produce continuous solutions and are applicable only to continuous optimization problems. A binary version of the SI algorithm is therefore required to generate binary solutions for binary optimization problems such as feature selection [23] and the traveling salesman problem (TSP) [24]. The common practice is to use a transfer function as the conversion method [25–27]. A transfer function is straightforward to implement and does not increase the complexity of the original algorithm. In addition, a suitable transfer function provides a good balance between the exploration and exploitation phases of the SI algorithm, resulting in better convergence and good classification accuracy. Several popular transfer functions used in the literature are listed in Table 2.
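The conversion step described above can be sketched with the standard S-shaped Sigmoid transfer function commonly used in binary PSO. This is a generic illustration, not the modified function proposed in this paper; the helper names `sigmoid_transfer` and `binarize`, the seed, and the sample position values are assumptions made for the example.

```python
import math
import random

def sigmoid_transfer(x: float) -> float:
    """Standard S-shaped transfer function: maps a real value into (0, 1)."""
    x = max(-60.0, min(60.0, x))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-x))

def binarize(position, rng):
    """Set bit d to 1 with probability sigmoid_transfer(position[d])."""
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
continuous_position = [-6.0, -0.5, 0.0, 0.5, 6.0]  # illustrative values only
mask = binarize(continuous_position, rng)
print(mask)  # 1 = descriptor kept, 0 = descriptor dropped
```

Because the transfer function only post-processes each continuous position, the underlying update rules of the SI algorithm stay unchanged, which is why this approach adds little complexity.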

In the present research, we introduce a time-varying modified Sigmoid transfer function with a linear time-varying updating strategy. We also adopt the transfer function that we proposed in Ref. [28]. To evaluate the efficiency of these transfer functions, we integrate them into five continuous SI algorithms: the particle swarm optimization algorithm (PSO), whale optimization algorithm (WOA), grey wolf optimization algorithm (GWO), Harris hawks optimization algorithm (HHO), and manta ray foraging optimization algorithm (MRFO). The characteristics of these SI algorithms are summarized in Table 3, while Tables 4 and 5 present the transfer functions used to produce their binary versions.
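The idea of a linear time-varying updating strategy can be illustrated with a control parameter that shrinks linearly over the iterations and rescales the Sigmoid input. The sketch below is an assumption for illustration only: the exact form of the proposed function is given in the full text (Tables 4 and 5), and the names `linear_schedule`, `time_varying_sigmoid`, `tau_max`, and `tau_min`, as well as their default values, are hypothetical.

```python
import math

def linear_schedule(t: int, max_iter: int, tau_max: float = 4.0, tau_min: float = 0.01) -> float:
    """Control parameter that decreases linearly from tau_max to tau_min (assumed values)."""
    frac = t / max_iter
    return (1.0 - frac) * tau_max + frac * tau_min

def time_varying_sigmoid(x: float, t: int, max_iter: int) -> float:
    """Sigmoid whose steepness grows as the iteration counter t advances."""
    tau = linear_schedule(t, max_iter)
    z = max(-700.0, min(700.0, x / tau))  # clamp to avoid overflow in math.exp
    return 1.0 / (1.0 + math.exp(-z))

max_iter = 100
for t in (0, 50, 100):
    # The same continuous value maps to a more decisive probability later in the run.
    print(t, round(time_varying_sigmoid(1.0, t, max_iter), 4))
```

Under this assumed schedule, a large parameter early in the run flattens the curve so that bits flip almost at random (exploration), while a small parameter late in the run makes the mapping nearly deterministic (exploitation).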

In short, this paper introduces a new approach to the descriptor selection problem based on the PSO, WOA, GWO, HHO, and MRFO algorithms. Its main contributions can be summarized as follows:

1. A novel time-varying modified Sigmoid transfer function with a linear time-varying updating scheme (TV1) is introduced as a binarization technique for metaheuristic algorithms.
2. Ten new binary variants of the PSO, WOA, GWO, HHO, and MRFO algorithms are developed by employing TV1 and our recently proposed transfer function (TV2) from Ref. [28]: BPSO_TV1, BPSO_TV2, BWOA_TV1, BWOA_TV2, BGWO_TV1, BGWO_TV2, BHHO_TV1, BHHO_TV2, BMRFO_TV1, and BMRFO_TV2.
3. The proposed SI algorithms are adapted as the feature search strategy for wrapper feature selection in a supervised binary classification task that differentiates ATS from non-ATS drugs.
4. The final results are assessed on several performance metrics, including the average fitness, average classification accuracy, and average number of selected features, together with their respective standard deviations.
5. The significance of the proposed algorithms is validated against competitive algorithms using the Wilcoxon rank-sum non-parametric statistical test at a significance level of α = 0.05.
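Wrapper feature selection, as listed in the contributions above, scores each candidate descriptor subset by training and testing a classifier on it. A common formulation in the wrapper feature selection literature combines the classification error with the fraction of selected features; whether this paper uses exactly this form is not stated in the text available here, so the function `wrapper_fitness`, the weight `alpha`, and its default value are assumptions for illustration.

```python
def wrapper_fitness(error_rate: float, mask, alpha: float = 0.99) -> float:
    """Weighted sum of classification error and subset size; lower is better.

    error_rate: classifier error on held-out data using only the selected descriptors.
    mask: binary list, 1 where a descriptor is selected.
    alpha: assumed trade-off weight between error and subset size.
    """
    n_total = len(mask)
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")  # an empty subset cannot be used for classification
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total

# Two hypothetical subsets with the same error: the smaller one scores better.
small = wrapper_fitness(0.10, [1, 0, 0, 0])
large = wrapper_fitness(0.10, [1, 1, 1, 1])
print(small, large)
```

With a weight close to 1, the error term dominates and the subset size mainly breaks ties between subsets of similar accuracy.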

This study has several limitations:

1. The transfer functions are validated on only five SI-based optimization algorithms.
2. The algorithms are applied only to the descriptor selection problem in the drug analysis domain.
3. Only one chemical dataset is used to evaluate the algorithms.
4. No data preprocessing is applied to the dataset.

The remainder of this paper is structured as follows. Section 2 explains the concepts behind the proposed transfer functions and their implementation in the PSO, GWO, WOA, HHO, and MRFO algorithms. Section 3 describes the materials and methods used in the experiments. Results and discussion are presented in Section 4. Finally, Section 5 concludes the paper and highlights future work.

Section snippets

The binary version of PSO, GWO, WOA, HHO, and MRFO

[23,24,50]. In BPSO, BWOA, BGWO, BHHO, and BMRFO, the search agents (solutions) update their positions continuously to any point in the search space based on the best search agent discovered so far. The real-valued position of each search agent is then converted to binary values using the proposed time-varying modified Sigmoid transfer function. This technique forces search agents to move in a binary space by defining a probability that updates each element (feature) in the solution (feature subset) to

Dataset

In this study, a dedicated chemical dataset was used in the experiments. The dataset contains 1187 attributes representing the 3D molecular structure of 3595 ATS and 3595 non-ATS drug compounds. The descriptors were generated using the novel 3D Exact Legendre Moment Invariants molecular descriptor algorithm developed by Pratama in Refs. [12,13]. The attributes comprise the molecule id, 1185 moment invariant values, and a binary class label: 0 (non-ATS drug) or 1 (ATS drug). During

Experimental results and discussion

Table 8 presents the detailed experimental results. The average fitness value achieved by the proposed algorithms is much lower than that of their existing binary versions, except for BGWO_TV1. This is confirmed by the convergence curves illustrated in Figs. 2–6. This reveals the potential of TV2 in improving the convergence of BPSO, BWOA, BGWO, and BHHO. Meanwhile, TV1 helps the BMRFO algorithm escape from local optima more efficiently. BMRFO_TV1 is seen to provide the lowest

Conclusions and future work

In this paper, a time-varying modified Sigmoid transfer function with two time-varying update strategies is introduced as the binarization method for the PSO, GWO, WOA, HHO, and MRFO algorithms. The proposed binary SI algorithms are used in the descriptor selection problem to improve ATS drug classification accuracy. Among the best proposed binary algorithms, BMRFO_TV1 was seen to be superior and significant in selecting relevant and discriminative descriptors and obtained high classification

Author statement

Norfadzlia Mohd Yusof: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Writing – Original draft, Visualization. Azah Kamilah Muda: Conceptualization, Supervision, Project administration, Funding acquisition. Satrya Fajri Pratama: Conceptualization, Data curation, Resources, Supervision. Ramon Carbo-Dorca: Writing – Review & Editing. Ajith Abraham: Writing – Review & Editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by Universiti Teknikal Malaysia Melaka through the Fundamental Research Grant Scheme [FRGS/1/2020/FTMK-CACT/F00461] from the Ministry of Higher Education, Malaysia.

References (64)

K. Tamama. Synthetic drugs of abuse. Adv. Clin. Chem. (2021)
E.H. Houssein et al. A novel hybrid Harris hawks optimization and support vector machines for drug design and discovery. Comput. Chem. Eng. (2020)
S. Korkmaz et al. Drug/nondrug classification using Support Vector Machines with various feature selection strategies. Comput. Methods Progr. Biomed. (2014)
Y.-C. Lo et al. Machine learning in chemoinformatics and drug discovery. Drug Discov. Today (2018)
S. Mirjalili et al. Grey wolf optimizer. Adv. Eng. Software (2014)
Z.Y. Algamal et al. High-dimensional QSAR/QSPR classification modeling based on improving pigeon optimization algorithm. Chemometr. Intell. Lab. Syst. (2020)
S. Mirjalili et al. The whale optimization algorithm. Adv. Eng. Software (2016)
A.A. Heidari et al. Harris hawks optimization: algorithm and applications. Future Generat. Comput. Syst. (2019)
W. Zhao et al. Manta ray foraging optimization: an effective bio-inspired optimizer for engineering applications. Eng. Appl. Artif. Intell. (2020)
E. Emary et al. Binary grey wolf optimization approaches for feature selection. Neurocomputing (2016)
M. Mafarja et al. Binary dragonfly optimization for feature selection using time-varying transfer functions. Knowl. Base Syst. (2018)
S. Mirjalili et al. S-shaped versus V-shaped transfer functions for binary Particle Swarm Optimization. Swarm Evol. Comput. (2013)
Z. Beheshti. A time-varying mirrored S-shaped transfer function for binary particle swarm optimization. Inf. Sci. (2020)
M. Mafarja et al. Whale optimization approaches for wrapper feature selection. Appl. Soft Comput. (2018)
M. Mafarja et al. Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert Syst. Appl. (2019)
E. Emary et al. Binary ant lion approaches for feature selection. Neurocomputing (2016)
B. Turkoglu et al. Binary artificial algae algorithm for feature selection. Appl. Soft Comput. (2022)
A.A. Abd El-Mageed et al. Improved binary adaptive wind driven optimization algorithm-based dimensionality reduction for supervised classification. Comput. Ind. Eng. (2022)
L. Harper et al. An overview of forensic drug testing methods and their suitability for harm reduction point-of-care services. Harm Reduct. J. (2017)
E. Lendoiro et al. An LC-MS/MS methodological approach to the analysis of hair for amphetamine-type-stimulant (ATS) drugs, including selected synthetic cathinones and piperazines. Drug Test. Anal. (2017)
H. Chung et al. Amphetamine-type stimulants in drug testing. Mass Spectrom Lett (2019)
M.D. Krasowski et al. Using cheminformatics to predict cross reactivity of “designer drugs” to their currently available immunoassays. J. Cheminf. (2014)
A. Karim et al. Efficient toxicity prediction via simple features using shallow neural networks and decision trees. ACS Omega (2019)
G. Idakwo et al. A review on machine learning methods for in silico toxicity prediction. J. Environ. Sci. Health Part C Environ. Carcinog. Ecotoxicol. Rev. (2018)
Y. Wang et al. Incorporating PLS model information into particle swarm optimization for descriptor selection in QSAR/QSPR. J. Chemom. (2015)
S.F. Pratama. Three-Dimensional Exact Legendre Moment Invariants for Amphetamine-type Stimulants Molecular Structure Representation. (2017)
S.F. Pratama et al. Preparation of ATS drugs 3D molecular structure for 3D moment invariants-based molecular descriptors.
A. Elsawy et al. A hybridised feature selection approach in molecular classification using CSO and GA. Int. J. Comput. Appl. Technol. (2019)
Z.Y. Algamal et al. QSAR model for predicting neuraminidase inhibitors of influenza A viruses (H1N1) based on adaptive grasshopper optimization algorithm. SAR QSAR Environ. Res. (2020)
D.H. Wolpert et al. No free lunch theorems. IEEE Trans. Evol. Comput. (1997)
F.S. Gharehchopogh et al. Chaotic Vortex Search Algorithm: Metaheuristic Algorithm for Feature Selection. (2021)
H. Mohammadzadeh et al. A novel hybrid whale optimization algorithm with flower pollination algorithm for feature selection: case study Email spam detection. Comput. Intell. (2020)

