Abstract
In this paper we introduce a margin-based feature selection criterion and apply it to measure the quality of sets of features. Using margins, we devise novel selection algorithms for multi-class categorization problems and provide a theoretical generalization bound. We also study the well-known Relief algorithm and show that it resembles a gradient ascent over our margin criterion. We report promising results on various datasets.
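For readers unfamiliar with Relief, the following is a minimal sketch of a Relief-style weight update, not the chapter's own algorithm. It assumes the classic formulation in which each feature weight grows with the squared difference to the nearest point of another class (the "near miss") and shrinks with the squared difference to the nearest point of the same class (the "near hit"), with neighbours found by Euclidean distance. The function name `relief_weights` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def relief_weights(X, y, n_iter=None, seed=None):
    """Relief-style feature weights (illustrative sketch, hypothetical API).

    Assumes every class has at least two points and that at least two
    classes are present, so a near hit and a near miss always exist.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape
    rng = np.random.default_rng(seed)
    n_iter = n_samples if n_iter is None else n_iter

    w = np.zeros(n_features)
    for _ in range(n_iter):
        i = rng.integers(n_samples)  # pick a random training point
        dist = np.sum((X - X[i]) ** 2, axis=1)
        dist[i] = np.inf  # a point is never its own neighbour
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class point
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class point
        # Reward features that separate classes, penalise those that
        # spread points of the same class apart.
        w += (X[i] - X[miss]) ** 2 - (X[i] - X[hit]) ** 2
    return w / n_iter


# Toy data: feature 0 separates the two classes, feature 1 is noise.
X_toy = [[0.0, 0.3], [0.1, 0.9], [1.0, 0.2], [1.1, 0.8]]
y_toy = [0, 0, 1, 1]
weights = relief_weights(X_toy, y_toy, n_iter=20, seed=0)
```

On this toy data the separating feature receives the larger weight. Each update is the per-feature change in a squared-distance gap between near miss and near hit, which is the intuition behind reading Relief as an ascent on a margin-like quantity.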
References
P. Bartlett. The size of the weights is more important than the size of the network. IEEE Transactions on Information Theory, 44(2):525–536, 1998.
R. Bellman. Adaptive Control Processes: A Guided Tour. Princeton University Press, 1961.
B. Boser, I. Guyon, and V. Vapnik. Optimal margin classifiers. In Fifth Annual Workshop on Computational Learning Theory, pages 144–152, 1992.
T.M. Cover and P.E. Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13:21–27, 1967.
K. Crammer, R. Gilad-Bachrach, A. Navot, and N. Tishby. Margin analysis of the LVQ algorithm. In Proc. 17th Conference on Neural Information Processing Systems (NIPS), 2002.
E. Fix and J. Hodges. Discriminatory analysis. Nonparametric discrimination: Consistency properties. Technical Report 4, USAF School of Aviation Medicine, 1951.
Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.
I. Guyon and A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, pages 1157–1182, Mar 2003.
I. Guyon and S. Gunn. NIPS feature selection challenge. http://www.nipsfsc.ecs.soton.ac.uk/, 2003.
K. Kira and L. Rendell. A practical approach to feature selection. In Proc. 9th International Workshop on Machine Learning, pages 249–256, 1992.
T. Kohonen. Self-Organizing Maps. Springer-Verlag, 1995.
I. Kononenko. Estimating attributes: Analysis and extensions of RELIEF. In Proc. European Conference on Machine Learning, pages 171–182, 1994. URL citeseer.nj.nec.com/kononenko94estimating.html.
A.M. Martinez and R. Benavente. The AR face database. Technical report, CVC Tech. Rep. #24, 1998. http://rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html.
R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, 1998.
J. Shawe-Taylor, P.L. Bartlett, R.C. Williamson, and M. Anthony. Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory, 44(5):1926–1940, 1998.
J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, and V. Vapnik. Feature selection for SVMs. In Proc. 15th Conference on Neural Information Processing Systems (NIPS), pages 668–674, 2000. URL citeseer.nj.nec.com/article/weston01feature.html.
J. Weston, A. Elisseeff, G. Bakır, and F. Sinz. The spider, 2004. http://www.kyb.tuebingen.mpg.de/bs/people/spider/.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Gilad-Bachrach, R., Navot, A., Tishby, N. (2006). Large Margin Principles for Feature Selection. In: Guyon, I., Nikravesh, M., Gunn, S., Zadeh, L.A. (eds) Feature Extraction. Studies in Fuzziness and Soft Computing, vol 207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-35488-8_30
DOI: https://doi.org/10.1007/978-3-540-35488-8_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35487-1
Online ISBN: 978-3-540-35488-8
eBook Packages: Engineering, Engineering (R0)