Abstract
This paper proposes an improved method for analyzing the effectiveness of amphetamine-type stimulant (ATS) drug identification using several feature selection methods: Sequential Forward Floating Selection (SFFS), Sequential Forward Selection (SFS), Sequential Backward Floating Selection (SBFS), Sequential Backward Selection (SBS) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE). The main aim of this paper is to determine which feature selection methods give better classification accuracy when identifying ATS drugs in a large dataset. A comprehensive verification using WEKA is carried out to measure classification accuracy. This is done by comparing several classifiers trained on all features (without feature selection) and on selected features (with feature selection). The experiments show that classification accuracy with the selected features is similar to the accuracy obtained with all features. This indicates that feature selection methods help to speed up identification and to obtain better accuracy performance. The results also indicate that SFFS is the best feature selection method to embed with SVM-RFE, while J48, IBk and Random Forest (RF) are the three best classifiers to use for future evaluation.
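The comparison described in the abstract, scoring a classifier on all features and then on a selected subset, can be sketched in a few lines. The sketch below is illustrative only and is not the authors' WEKA pipeline: it implements plain Sequential Forward Selection (SFS, without the floating step of SFFS), uses a leave-one-out 1-nearest-neighbour scorer as a stand-in for a classifier such as IBk, and runs on a made-up toy dataset. All function names and data values are assumptions introduced for illustration.

```python
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[float]]


def loo_1nn_accuracy(X: Matrix, y: Sequence[int], features: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `features`."""
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for j in range(len(X)):
            if i == j:
                continue  # hold sample i out of the training set
            # Squared Euclidean distance restricted to the chosen features.
            dist = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += int(best_label == y[i])
    return correct / len(X)


def sequential_forward_selection(
    X: Matrix,
    y: Sequence[int],
    k: int,
    score: Callable[[Matrix, Sequence[int], Sequence[int]], float] = loo_1nn_accuracy,
) -> List[int]:
    """Greedy SFS: start empty and repeatedly add the feature that scores best."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: feature 0 separates the classes, feature 1 is
# misleading, feature 2 is weakly informative.
X = [
    [0.0, 5.0, 1.0],
    [0.1, 1.0, 0.0],
    [0.2, 3.0, 1.0],
    [1.0, 4.9, 0.0],
    [1.1, 1.1, 1.0],
    [1.2, 3.1, 0.0],
]
y = [0, 0, 0, 1, 1, 1]

all_features = list(range(len(X[0])))
selected = sequential_forward_selection(X, y, k=2)

print("selected features:", selected)
print("accuracy, all features:     ", loo_1nn_accuracy(X, y, all_features))
print("accuracy, selected features:", loo_1nn_accuracy(X, y, selected))
```

A floating variant such as SFFS would add a conditional backward step after each inclusion, and a backward variant (SBS, SBFS) would start from the full feature set and remove features instead.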
Acknowledgements
The authors would like to acknowledge Universiti Teknikal Malaysia Melaka through the Fundamental Research Grant Scheme [FRGS/1/2020/FTMK-CACT/F00461] from the Ministry of Higher Education, Malaysia.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Knight, P.E., Muda, A.K., Pratama, S.F. (2022). Analysis of Feature Selection Method for 3D Molecular Structure of Amphetamine-Type Stimulants (ATS) Drugs. In: Abraham, A., et al. Proceedings of the 13th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2021). SoCPaR 2021. Lecture Notes in Networks and Systems, vol 417. Springer, Cham. https://doi.org/10.1007/978-3-030-96302-6_11
DOI: https://doi.org/10.1007/978-3-030-96302-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96301-9
Online ISBN: 978-3-030-96302-6
eBook Packages: Intelligent Technologies and Robotics (R0)