Attribute-Efficient Learning

  • Reference work entry
Encyclopedia of Algorithms

Years and Authors of Summarized Original Work

  • 1987; Littlestone

Problem Definition

This entry gives a basic formulation in the online mistake-bound model, which Littlestone [9] used in his seminal work.

Fix a class C of Boolean functions over n variables. To start a learning scenario, a target function f∗ ∈ C is chosen but not revealed to the learning algorithm. Learning then proceeds in a sequence of trials. At trial t, an input \(\boldsymbol{x}_{t} \in \{ 0,1\}^{n}\) is first given to the learning algorithm. The learning algorithm then produces its prediction \(\hat{y}_{t}\), which is its guess as to the unknown value f∗(\(\boldsymbol{x}_{t}\)). The correct value \(y_{t}\) = f∗(\(\boldsymbol{x}_{t}\)) is then revealed to the learner. If \(y_{t}\neq \hat{y}_{t}\), the learning algorithm made a mistake. The learning algorithm learns C with mistake-bound m, if the number of mistakes never exceeds m, no matter how many trials are made and how f∗ and \(\boldsymbol{x}_{1},\boldsymbol{x}_{2},\ldots\) are chosen.
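The trial protocol above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class C is taken to be monotone disjunctions, and the learner is a simple elimination strategy chosen for brevity, not the algorithm of [9]. The names `EliminationLearner`, `run_trials`, `relevant`, and the random input sequence are all hypothetical.

```python
import random
from typing import Callable, List, Sequence


class EliminationLearner:
    """Toy learner for monotone disjunctions (an assumption, not from the entry)."""

    def __init__(self, n: int) -> None:
        # Start by treating every variable as possibly relevant.
        self.candidates = set(range(n))

    def predict(self, x: Sequence[int]) -> int:
        # Predict 1 if any remaining candidate variable is set in x.
        return int(any(x[i] for i in self.candidates))

    def update(self, x: Sequence[int], y: int) -> None:
        # A variable that is 1 on a negative example cannot be in the target.
        if y == 0:
            self.candidates -= {i for i, xi in enumerate(x) if xi == 1}


def run_trials(
    learner: EliminationLearner,
    target: Callable[[Sequence[int]], int],
    inputs: List[List[int]],
) -> int:
    """Run the online protocol and return the number of mistakes."""
    mistakes = 0
    for x in inputs:
        y_hat = learner.predict(x)   # learner commits to a prediction
        y = target(x)                # correct value is then revealed
        if y_hat != y:
            mistakes += 1
        learner.update(x, y)
    return mistakes


n = 20
relevant = [2, 7, 11]  # hypothetical target: x_2 OR x_7 OR x_11


def target(x: Sequence[int]) -> int:
    return int(any(x[i] for i in relevant))


rng = random.Random(0)
inputs = [[rng.randint(0, 1) for _ in range(n)] for _ in range(500)]
learner = EliminationLearner(n)
mistakes = run_trials(learner, target, inputs)
print("mistakes:", mistakes)
```

For this toy learner every mistake occurs on a negative example and removes at least one irrelevant variable, so the number of mistakes is at most n minus the number of relevant variables, however the inputs are chosen. That bound grows linearly in n.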

Variable (or...

Recommended Reading

  1. Auer P, Warmuth MK (1998) Tracking the best disjunction. Mach Learn 32(2):127–150

  2. Blum A, Hellerstein L, Littlestone N (1995) Learning in the presence of finitely or infinitely many irrelevant attributes. J Comput Syst Sci 50(1):32–40

  3. Bshouty N, Hellerstein L (1998) Attribute-efficient learning in query and mistake-bound models. J Comput Syst Sci 56(3):310–319

  4. Dhagat A, Hellerstein L (1994) PAC learning with irrelevant attributes. In: Proceedings of the 35th Annual Symposium on Foundations of Computer Science, Santa Fe. IEEE Computer Society, Los Alamitos, pp 64–74

  5. Gentile C, Warmuth MK (1999) Linear hinge loss and average margin. In: Kearns MJ, Solla SA, Cohn DA (eds) Advances in Neural Information Processing Systems, vol 11. MIT, Cambridge, pp 225–231

  6. Khardon R, Roth D, Servedio RA (2005) Efficiency versus convergence of boolean kernels for on-line learning algorithms. J Artif Intell Res 24:341–356

  7. Kivinen J, Warmuth MK (1997) Exponentiated gradient versus gradient descent for linear predictors. Inf Comput 132(1):1–64

  8. Klivans AR, Servedio RA (2006) Toward attribute efficient learning of decision lists and parities. J Mach Learn Res 7:587–602

  9. Littlestone N (1988) Learning quickly when irrelevant attributes abound: a new linear threshold algorithm. Mach Learn 2(4):285–318

  10. Littlestone N, Warmuth MK (1994) The weighted majority algorithm. Inf Comput 108(2):212–261

  11. Martin RK, Sethares WA, Williamson RC, Johnson CR Jr (2002) Exploiting sparsity in adaptive filters. IEEE Trans Signal Process 50(8):1883–1894

  12. Mossel E, O’Donnell R, Servedio RA (2004) Learning functions of k relevant variables. J Comput Syst Sci 69(3):421–434

  13. Ng AY (2004) Feature selection, L1 vs. L2 regularization, and rotational invariance. In: Greiner R, Schuurmans D (eds) Proceedings of the 21st International Conference on Machine Learning, Banff. The International Machine Learning Society, Princeton, pp 615–622

  14. Vovk V (1990) Aggregating strategies. In: Fulk M, Case J (eds) Proceedings of the 3rd Annual Workshop on Computational Learning Theory, Rochester. Morgan Kaufmann, San Mateo, pp 371–383

Author information

Correspondence to Jyrki Kivinen.

Copyright information

© 2016 Springer Science+Business Media New York

Cite this entry

Kivinen, J. (2016). Attribute-Efficient Learning. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_43
