The correctness problem: evaluating the ordering of binary features in rankings

Javed, Kashif; Saeed, Mehreen; Babri, Haroon A.

doi:10.1007/s10115-013-0631-0

Regular Paper
Published: 22 March 2013

Knowledge and Information Systems, Volume 39, pages 543–563 (2014)

Kashif Javed¹,
Mehreen Saeed² &
Haroon A. Babri¹

Abstract

In machine learning, feature ranking (FR) algorithms rank features by their relevance to the class variable. FR algorithms have mostly been investigated for the feature selection problem and less for the problem of ranking itself. This paper focuses on the latter. In FR terminology, the ranking problem raises the following question: since different FR criteria estimate the relationship between a feature and the class variable differently on the same data, can we determine which criterion better captures the “true” feature-to-class relationship and thus generates the most “correct” order of individual features? We term this the “correctness” problem. Answering it requires a reference ordering against which the ranks assigned to features by an FR algorithm can be directly compared. For real-life data, the reference ranking is generally unknown. In this paper, we show through theoretical and empirical analysis that, for two-class classification tasks represented with binary data, the ordering of binary features by their individual predictive power can serve as a benchmark, which allows us to test how correct the ordering produced by an FR algorithm is. Based on these ideas, we propose an evaluation method termed the FR evaluation strategy (FRES). Rankings of three FR criteria (relief, mutual information, and the diff-criterion) are investigated on five artificially generated and four real-life binary data sets. The results indicate that FRES works equally well for synthetic and real-life data and that the diff-criterion generates the most correct orderings for binary data.
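The idea of checking an FR ordering against a benchmark ordering can be illustrated with a small sketch. This is not the FRES procedure itself, whose details the abstract does not give. It rests on three assumptions made only for illustration: "individual predictive power" is approximated by the accuracy of the best classifier that uses a single binary feature, the FR criterion under test is empirical mutual information, and agreement between the two orderings is measured with Kendall's tau. The helper names (`make_feature`, `single_feature_accuracy`, `kendall_tau`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def make_feature(labels, flips):
    """Hypothetical toy generator: copy the label, then flip the first
    `flips` samples of each class so the feature becomes noisier."""
    feature = list(labels)
    flipped = {0: 0, 1: 0}
    for i, y in enumerate(labels):
        if flipped[y] < flips:
            feature[i] = 1 - y
            flipped[y] += 1
    return feature


def mutual_information(feature, labels):
    """Empirical mutual information (in bits) between two binary sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px, py = Counter(feature), Counter(labels)
    # Only observed (x, y) pairs are iterated, so log(0) never occurs.
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def single_feature_accuracy(feature, labels):
    """Assumed stand-in for individual predictive power: accuracy of the rule
    that predicts the majority class for each value of the feature."""
    joint = Counter(zip(feature, labels))
    correct = sum(max(joint[(x, 0)], joint[(x, 1)]) for x in (0, 1))
    return correct / len(labels)


def ranking(scores):
    """Feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def kendall_tau(order_a, order_b):
    """Kendall's tau between two orderings of the same items (no ties)."""
    pos_a = {item: r for r, item in enumerate(order_a)}
    pos_b = {item: r for r, item in enumerate(order_b)}
    items = list(order_a)
    concordant = discordant = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            a = pos_a[items[i]] - pos_a[items[j]]
            b = pos_b[items[i]] - pos_b[items[j]]
            if a * b > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = len(items) * (len(items) - 1) / 2
    return (concordant - discordant) / pairs


# Toy two-class binary data: 50 samples per class, four features
# with 20, 0, 10 and 5 flipped samples per class respectively.
labels = [0] * 50 + [1] * 50
features = [make_feature(labels, k) for k in (20, 0, 10, 5)]

reference_order = ranking([single_feature_accuracy(f, labels) for f in features])
mi_order = ranking([mutual_information(f, labels) for f in features])
tau = kendall_tau(reference_order, mi_order)

print("reference order:", reference_order)
print("mutual information order:", mi_order)
print("Kendall's tau:", tau)
```

On this toy data the two orderings coincide, so the agreement score is at its maximum. On real data a criterion may place features differently from the benchmark, and the score then drops accordingly.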

Notes

From here onwards, the discussion is from a machine learning perspective unless stated otherwise.
Also known as variables or attributes.
Also known as examples, observations or samples.
We focus only on FR algorithms in this paper, although work related to this issue can be found in other domains, such as [7, 36], which employ non-FR algorithms for ranking.

References

Agarwal S, Dugar D, Sengupta S (2010) Ranking chemical structures for drug discovery: a new machine learning approach. J Chem Inf Model 50(5):716–731
Agarwal S, Sengupta S (2009) Ranking genes by relevance to a disease. In: Proceedings of the 8th annual international conference on computational systems bioinformatics
AIMS (2010) The mathematics of ranking. http://www.aimath.org/ARCC/workshops/mathofranking.html
Arauzo-Azofra A, Aznarte J, Benitez J (2011) Empirical study of feature selection methods based on individual feature evaluation for classification problems. Expert Syst Appl 38(7):8170–8177
Bhamidipati N, Pal S (2009) Comparing scores intended for ranking. IEEE Trans Knowl Data Eng 21(1):21–34
Bishop C (2006) Pattern recognition and machine learning. Springer, Berlin
Boldi P (2005) TotalRank: ranking without damping. In: Special interest tracks and posters of the 14th international conference on world wide web, WWW ’05, pp 898–899
Bolon-Canedo V, Sanchez-Marono N, Alonso-Betanzos A (2013) A review of feature selection methods on synthetic data. Knowl Inf Syst 34(3):483–519
Clemencon S, Lugosi G, Vayatis N (2008) Ranking and empirical minimization of U-statistics. Ann Stat 36:844–874
Cohen W, Schapire R, Singer Y (1999) Learning to order things. J Artif Intell Res 10:240–270
Conover W (1999) Practical nonparametric statistics, 3rd edn. Wiley, New York
Cover T, Thomas J (1991) Elements of information theory. Wiley, New York
Duch W (2006) Feature extraction: foundations and applications. In: Guyon I, Nikravesh M, Gunn S, Zadeh L (eds) Foundations and applications. Springer, Berlin, pp 89–117
Duda R, Hart P, Stork D (2000) Pattern classification, 2nd edn. Wiley, New York
Dwork C, Kumar R, Naor M et al (2001) Rank aggregation methods for the web. In: Proceedings of the tenth international conference on World wide web (WWW10), pp 613–622
Fagin R, Kumar R, Sivakumar D (2003) Comparing top k lists. In: ACM SIAM symposium on discrete algorithms, pp 28–36
Frank A, Asuncion A (2010) UCI machine learning repository. http://archive.ics.uci.edu/ml
Freund Y, Iyer R, Schapire R et al (2003) An efficient boosting algorithm for combining preferences. J Mach Learn Res 4:933–969
Gleich D, Langville A (2010) Suggested problems for discussion. http://www.stat.uchicago.edu/lekheng/meetings/mathofranking/problems/david-amy.txt
Golub T, Slonim D, Tamayo P et al (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531–537
Gustafson A, Snitkin E, Parker S et al (2006) Towards the identification of essential genes using targeted genome sequencing and comparative analysis. BMC Bioinform 7. http://www.biomedcentral.com/1471-2164/7/265/
Guyon I, Aliferis C, Cooper G et al (2008) Design and analysis of the causation and prediction challenge. In: JMLR workshop and conference proceedings: causation and prediction challenge (WCCI 2008), vol. 3, pp 1–33
Guyon I, Cawley G, Dror G et al (eds) (2011) Hands-on pattern recognition: challenges in machine learning, vol. 1. Microtome Publishing, Brookline. http://www.mtome.com/Publications/CiML/ciml.html
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
Guyon I, Saffari A, Dror G et al (2007) Agnostic learning vs. prior knowledge challenge. In: Proceedings of international joint conference on neural networks (IJCNN), pp 829–834
Guyon I, Weston J, Barnhill S et al (2002) Gene selection for cancer classification using support vector machines. Mach Learn 46:389–422
Hall M, Holmes G (2003) Benchmarking attribute selection techniques for discrete class data mining. IEEE Trans Knowl Data Eng 15(6):1437–1447
Javed K (2012) Development of feature selection algorithms for high-dimensional binary data. Ph.D. thesis, Department of Electrical Engineering, University of Engineering and Technology, Lahore, Pakistan
Javed K, Babri H, Saeed M (2012a) Evaluating rankings of mutual information and diff-criterion for high-dimensional binary data. In: Proceedings of the first Taibah University international conference on computing and information technology, pp 18–23
Javed K, Babri H, Saeed M (2012b) Feature selection based on class-dependent densities for high-dimensional binary data. IEEE Trans Knowl Data Eng 24(3):465–477
John G, Kohavi R, Pfleger K (1994) Irrelevant feature and the subset selection problem. In: Proceedings of the 11th international conference on machine learning (ICML), pp 121–129
Jr EH, Ebecken N (2007) Towards efficient variables ordering for Bayesian networks classifier. Data Knowl Eng 63(2):258–269
Kalousis A, Prados J, Hilario M (2007) Stability of feature selection algorithms: a study on high-dimensional spaces. Knowl Inf Syst 12(1):95–116
Kira K, Rendell L (1992) A practical approach to feature selection. In: Proceedings of the 9th international conference on machine learning (ICML), pp 249–256
Langville A, Meyer C (2004) Deeper inside pagerank. Internet Math 1(3):335–380
Lapata M (2006) Automatic evaluation of information ordering: Kendall’s Tau. Comput Linguist 32(4):471–484
Lazar C, Taminau J, Meganck S et al (2012) A survey on filter techniques for feature selection in gene expression microarray analysis. IEEE/ACM Trans Comput Biol Bioinf 9(4):1106–1119
Li H (2011) A short introduction to learning to Rank. IEICE Trans 94-D(10):1854–1862
Minka T (2003) A comparison of numerical optimizers for logistic regression. http://research.microsoft.com/minka/papers/
Rosa KD, Metsis V, Athitsos V (2012) Boosted ranking models: a unifying framework for ranking predictions. Knowl Inf Syst 30(3):543–568
Ruiz R, Aguilar-Ruiz J, Riquelme J et al (2005) Analysis of feature rankings for classification. In: Proceedings of the 6th international symposium on intelligent data analysis, pp 362–372
Saeys Y, Inza I, Larrañaga P (2007) A review of feature selection techniques in bioinformatics. Bioinformatics 23(19):2507–2517
Saffari A, Guyon I (2006) Quick start guide for challenge learning object package (CLOP), Technical report, Graz University of Technology and Clopinet. http://clopinet.com/clop/
Slavkov I, Zenko B, Dzeroski S (2010) Evaluation method for feature rankings and their aggregations for biomarker discovery. In: JMLR workshop and conference proceedings: machine learning in systems biology, vol. 8. pp 122–135
Su Y, Murali T, Pavlovic V et al (2003) RankGene: identification of diagnostic genes based on expression data. Bioinformatics 19(12):1578–1579
Wang B, Tang J, Fan W et al (2013) Query-dependent cross-domain ranking in heterogeneous network. Knowl Inf Syst 34(1):109–145
Xia F, Liu T-Y, Wang J et al (2008) Listwise approach to learning to rank: theory and algorithm. In: Proceedings of the 25th international conference on machine learning (ICML), pp 1192–1199
Xiao YHYY, Segal MR (2005) Identifying differentially expressed genes from microarray experiments via statistic synthesis. Bioinformatics 21(7):1084–1093

Author information

Authors and Affiliations

Department of Electrical Engineering, University of Engineering and Technology, Lahore, Pakistan
Kashif Javed & Haroon A. Babri
Department of Computer Science, National University of Computer and Emerging Sciences, Lahore, Pakistan
Mehreen Saeed

Corresponding author

Correspondence to Kashif Javed.

About this article

Cite this article

Javed, K., Saeed, M. & Babri, H.A. The correctness problem: evaluating the ordering of binary features in rankings. Knowl Inf Syst 39, 543–563 (2014). https://doi.org/10.1007/s10115-013-0631-0

Received: 09 July 2012
Revised: 08 November 2012
Accepted: 08 March 2013
Published: 22 March 2013
Issue Date: June 2014
DOI: https://doi.org/10.1007/s10115-013-0631-0
