Abstract
In this paper we propose extraRelief, a modified and improved version of the Relief feature selection algorithm introduced by Kira and Rendell in 1992. Although Relief and its extensions have been found superior to many other feature selection methods, we show that Relief can be improved further. In its main loop, Relief selects a number of instances by simple random sampling (SRS); for each selected instance it determines the nearest hit (nearest neighbour of the same class) and the nearest miss (nearest neighbour of a different class), and uses them to assign weights to the features. SRS fails to represent the whole dataset properly when the sampling ratio is small (i.e., when the data is large) and/or when the data is noisy. In extraRelief we replace SRS with a more efficient instance-selection method, based on the idea that a good sample should have a distribution similar to that of the whole dataset; we approximate the data distribution by the frequencies of attribute values. Experimental comparison with Relief shows that extraRelief performs significantly better, particularly on large and/or noisy domains.
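The weight-update loop of classic Relief that the abstract refers to can be sketched as follows. This is a minimal illustration, not the authors' implementation: the L1 distance, the sampling without replacement, and the function and parameter names (`relief_weights`, `m`) are assumptions for the sketch.

```python
import numpy as np

def relief_weights(X, y, m=20, rng=None):
    """Sketch of the classic Relief update (Kira & Rendell, 1992).

    X : (n_samples, n_features) numeric array; y : class labels.
    m : number of instances drawn by simple random sampling (SRS),
        the step extraRelief replaces with distribution-aware selection.
    Distance metric and normalisation choices here are assumptions.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    w = np.zeros(d)
    for i in rng.choice(n, size=m, replace=False):
        dist = np.abs(X - X[i]).sum(axis=1)  # L1 distance to every instance
        dist[i] = np.inf                     # exclude the instance itself
        same = (y == y[i])
        hit = np.where(same, dist, np.inf).argmin()    # nearest same-class
        miss = np.where(~same, dist, np.inf).argmin()  # nearest other-class
        # a feature gains weight if it separates the miss,
        # and loses weight if it differs on the hit
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / m
```

On a toy dataset where only the first feature tracks the class, the first feature receives the larger weight; the abstract's point is that when `m / n` is small or the data is noisy, the SRS draw inside this loop may misrepresent the data, which is what extraRelief's frequency-based instance selection addresses.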
References
Arauzo-Azofra, A., Benitez-Sanchez, J.M., Castro-Pena, J.L.: A feature set measure based on relief. In: RASC. Proceedings of the 5th International Conference on Recent Advances in Soft Computing, pp. 104–109 (2004)
Brönnimann, H., Chen, B., Dash, M., Haas, P., Scheuermann, P.: Efficient data reduction with EASE. In: SIGKDD. Proceedings of 9th International Conference on Knowledge Discovery and Data Mining, pp. 59–68 (2003)
Chen, B., Haas, P., Scheuermann, P.: A new two-phase sampling based algorithm for discovering association rules. In: SIGKDD. Proceedings of International Conference on Knowledge Discovery and Data Mining (2002)
Dash, M., Liu, H.: Feature selection for classification. International Journal of Intelligent Data Analysis 1(3) (1997)
Draper, B., Kaito, C., Bins, J.: Iterative relief. In: Proceedings of Workshop on Learning in Computer Vision and Pattern Recognition (2003)
Kira, K., Rendell, L.A.: The feature selection problem: Traditional methods and a new algorithm. In: AAAI. Proceedings of Ninth National Conference on AI (1992)
Kononenko, I.: Estimating attributes: Analysis and extension of RELIEF. In: Bergadano, F., De Raedt, L. (eds.) ECML 1994. LNCS, vol. 784, pp. 171–182. Springer, Heidelberg (1994)
Liu, H., Hussain, F., Tan, C.L., Dash, M.: Discretization: An enabling technique. Data Mining and Knowledge Discovery 6(4), 393–423 (2002)
Robnik-Sikonja, M., Kononenko, I.: Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning Journal 53, 23–69 (2003)
Valiant, L.G.: A theory of the learnable. Communications of the Association for Computing Machinery 27(11), 1134–1142 (1984)
Yu, L., Liu, H.: Efficient feature selection via analysis of relevance and redundancy. Journal of Machine Learning Research 5, 1205–1224 (2004)
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Dash, M., Cher Yee, O. (2007). extraRelief: Improving Relief by Efficient Selection of Instances. In: Orgun, M.A., Thornton, J. (eds) AI 2007: Advances in Artificial Intelligence. AI 2007. Lecture Notes in Computer Science(), vol 4830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76928-6_32
DOI: https://doi.org/10.1007/978-3-540-76928-6_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76926-2
Online ISBN: 978-3-540-76928-6