Abstract
Within the framework of pac-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size \(k\) for a concept class \(C \subseteq 2^X\) consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in \(C\) and chooses a subset of \(k\) examples as the compression set. The reconstruction function forms a hypothesis on \(X\) from a compression set of \(k\) examples. For any sample set of a concept in \(C\), the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a fixed-size sample compression scheme for a class \(C\) is sufficient to ensure that \(C\) is pac-learnable.
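As a concrete illustration (not taken from the paper itself), the following sketch gives a sample compression scheme of size 2 for the class of closed intervals on the real line: the compression function keeps only the leftmost and rightmost positive examples, and the reconstruction function returns the interval they span. The function names `compress` and `reconstruct` are illustrative.

```python
def compress(sample):
    """Choose at most 2 examples that determine a consistent interval.

    `sample` is a list of (x, label) pairs assumed consistent with
    some closed interval [a, b] (label True iff a <= x <= b).
    """
    positives = [x for x, label in sample if label]
    if not positives:
        return []  # empty compression set -> empty-interval hypothesis
    return [(min(positives), True), (max(positives), True)]

def reconstruct(compression_set):
    """Build a hypothesis (a membership predicate) from a compression set."""
    if not compression_set:
        return lambda x: False
    lo = min(x for x, _ in compression_set)
    hi = max(x for x, _ in compression_set)
    return lambda x: lo <= x <= hi

# The reconstructed hypothesis is consistent with the whole original sample:
sample = [(-3.0, False), (-1.0, True), (0.5, True), (2.0, True), (4.0, False)]
h = reconstruct(compress(sample))
assert all(h(x) == label for x, label in sample)
```

Consistency holds because the reconstructed interval \([\min, \max]\) of the positives is contained in the target interval, so it accepts every positive example and rejects every negative one.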
Previous work has shown that a class is pac-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class is finite. In the second half of this paper we explore the relationship between sample compression schemes and the VC dimension. We define maximum and maximal classes of VC dimension \(d\). For every maximum class of VC dimension \(d\), there is a sample compression scheme of size \(d\), and for sufficiently large maximum classes there is no sample compression scheme of size less than \(d\). We briefly discuss classes of VC dimension \(d\) that are maximal but not maximum. It is an open question whether every class of VC dimension \(d\) has a sample compression scheme of size \(O(d)\).
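Continuing the interval example above, the VC dimension of a class can be checked on small point sets by brute force: a set of points is shattered if every labeling of it is realized by some concept. For intervals it suffices to consider candidate intervals whose endpoints are drawn from the points themselves (any consistent interval can be shrunk to its extremal positives), which shows two points are shattered but three are not, i.e. the class has VC dimension 2. The helper names below are illustrative.

```python
def interval_concepts(points):
    """Candidate intervals with endpoints among `points`, plus the empty concept.

    For the interval class, these realize every labeling that any
    interval can realize on `points`.
    """
    cs = [lambda x, lo=lo, hi=hi: lo <= x <= hi
          for lo in points for hi in points]
    cs.append(lambda x: False)
    return cs

def shatters(points, concepts):
    """True iff `concepts` realize all 2^n labelings of `points`."""
    labelings = {tuple(c(x) for x in points) for c in concepts}
    return len(labelings) == 2 ** len(points)

# Two points are shattered; three are not (the labeling +,-,+ is unrealizable):
assert shatters([0.0, 1.0], interval_concepts([0.0, 1.0]))
assert not shatters([0.0, 1.0, 2.0], interval_concepts([0.0, 1.0, 2.0]))
```

Note the `lo=lo, hi=hi` default arguments in the lambda: they bind the loop variables at definition time, which is required for the list of concepts to behave correctly in Python.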
Additional information
S. Floyd was supported in part by the Director, Office of Energy Research, Scientific Computing Staff, of the U.S. Department of Energy under Contract No. DE-AC03-76SF00098.
M. Warmuth was supported by ONR grants N00014-K-86-K-0454 and N00014-91-J-1162 and NSF grant IRI-9123692.
Cite this article
Floyd, S., Warmuth, M. Sample compression, learnability, and the Vapnik-Chervonenkis dimension. Mach Learn 21, 269–304 (1995). https://doi.org/10.1007/BF00993593