Total ranking models by the genetic algorithm variable subset selection (GA–VSS) approach for environmental priority settings

  • Review
  • Published in: Analytical and Bioanalytical Chemistry

Abstract

Total order ranking (TOR) strategies, which are based on elementary methods of discrete mathematics, appear to be attractive and simple tools for data analysis. Order-ranking strategies also seem useful not only for data exploration but also for developing order ranking models, a possible alternative to conventional quantitative structure–activity relationship (QSAR) methods. When the data are characterised by uncertainties, order methods can be used as an alternative to statistical methods such as multilinear regression (MLR), because they do not require specific functional relationships between the independent variables and the dependent variables (responses). A ranking model is a relationship between a set of dependent attributes, which are investigated experimentally, and a set of independent attributes, i.e. model attributes, which are calculated. As in regression and classification models, variable selection is one of the main steps in finding predictive models. In this work the genetic algorithm–variable subset selection (GA–VSS) approach is proposed as the variable selection method for searching for the best ranking models within a wide set of variables. The models based on the selected subsets of variables are compared with the experimental ranking and evaluated by Spearman’s rank index. A case study is presented for a TOR model developed for polychlorinated biphenyl (PCB) compounds, which have been analysed according to several of their physicochemical properties that play an important role in their environmental impact.
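
The workflow outlined in the abstract (search over variable subsets with a genetic algorithm, build a ranking from each subset, compare it with the experimental ranking by Spearman's rank index) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the ranking function or the GA settings, so the mean-rank aggregation in `score`, the function names `score`, `fitness` and `ga_vss`, and the parameters `pop_size`, `generations` and `p_mut` are all assumptions made for illustration, and the data are synthetic.

```python
import random

def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def score(X, subset):
    """Assumed ranking function: mean rank of each object over the selected variables."""
    cols = [ranks([row[j] for row in X]) for j in subset]
    return [sum(c[i] for c in cols) / len(cols) for i in range(len(X))]

def fitness(X, y, mask):
    """Spearman agreement between the model ranking and the experimental response."""
    subset = [j for j, bit in enumerate(mask) if bit]
    return spearman(score(X, subset), y) if subset else -1.0

def ga_vss(X, y, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Toy GA over 0/1 variable masks (needs at least 2 variables, pop_size >= 4)."""
    rng = random.Random(seed)
    p = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(p)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        parents = pop[: pop_size // 2]          # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, p)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(X, y, m))
    return best, fitness(X, y, best)

# Synthetic example: 6 objects described by 4 calculated variables.
X = [[1, 9, 3, 2], [2, 1, 5, 4], [3, 7, 1, 6], [4, 3, 6, 1], [5, 8, 2, 5], [6, 2, 4, 3]]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]             # synthetic "experimental" response
best_mask, best_rho = ga_vss(X, y)
print(best_mask, round(best_rho, 3))
```

The mask returned by `ga_vss` marks which variables enter the ranking model. A real application would replace `score` with the chosen total order ranking function and would validate the selected subset on objects not used in the search.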

Acknowledgements

Financial support from the Commission of the European Union (R&D project “Beam”, EVK1-CT1999-00012) is acknowledged.

Author information

Corresponding author

Correspondence to R. Todeschini.

About this article

Cite this article

Pavan, M., Mauri, A. & Todeschini, R. Total ranking models by the genetic algorithm variable subset selection (GA–VSS) approach for environmental priority settings. Anal Bioanal Chem 380, 430–444 (2004). https://doi.org/10.1007/s00216-004-2762-3

  • DOI: https://doi.org/10.1007/s00216-004-2762-3