Abstract
We describe a slightly subexponential time algorithm for learning parity functions in the presence of random classification noise, a problem closely related to several cryptographic and coding problems. Our algorithm runs in polynomial time for the case of parity functions that depend on only the first O(log n log log n) bits of input, which provides the first known instance of an efficient noise-tolerant algorithm for a concept class that is not learnable in the Statistical Query model of Kearns [1998]. Thus, we demonstrate that the set of problems learnable in the statistical query model is a strict subset of those problems learnable in the presence of noise in the PAC model.
In coding-theory terms, what we give is a poly(n)-time algorithm for decoding linear k × n codes in the presence of random noise for the case of k = c log n log log n for some c > 0. (The case of k = O(log n) is trivial since one can just individually check each of the 2^k possible messages and choose the one that yields the closest codeword.)
A natural extension of the statistical query model is to allow queries about statistical properties that involve t-tuples of examples, as opposed to just single examples. The second result of this article is to show that any class of functions learnable (strongly or weakly) with t-wise queries for t = O(log n) is also weakly learnable with standard unary queries. Hence, this natural extension to the statistical query model does not increase the set of weakly learnable functions.
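The trivial k = O(log n) case mentioned in the abstract can be illustrated with a short sketch. It covers only the exhaustive search over all 2^k messages, not the article's algorithm for larger k. The function names `encode` and `brute_force_decode`, and the representation of the code as a k × n generator matrix of 0/1 lists, are assumptions made for this example.

```python
import itertools


def encode(message, generator):
    """Encode a k-bit message with a k x n generator matrix over GF(2).

    Both arguments are plain lists of 0/1 integers (an assumed representation).
    """
    n = len(generator[0])
    return [
        sum(bit & row[j] for bit, row in zip(message, generator)) % 2
        for j in range(n)
    ]


def brute_force_decode(received, generator):
    """Check each of the 2^k messages and keep the one whose codeword is
    closest to the received word in Hamming distance."""
    k = len(generator)
    best_message, best_distance = None, None
    for candidate in itertools.product((0, 1), repeat=k):
        codeword = encode(candidate, generator)
        distance = sum(c != r for c, r in zip(codeword, received))
        if best_distance is None or distance < best_distance:
            best_message, best_distance = list(candidate), distance
    return best_message, best_distance


# Toy code with k = 2, n = 6 (hypothetical example data).
toy_generator = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
noisy_word = [1, 0, 1, 0, 0, 0]  # codeword of [1, 0] with one bit flipped
print(brute_force_decode(noisy_word, toy_generator))
```

The loop performs 2^k encodings of cost O(kn) each, so the running time is polynomial in n exactly when k = O(log n), which is why that case is described as trivial.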
- Ajtai, M., Kumar, R., and Sivakumar, D. 2001. A sieve algorithm for the shortest lattice vector problem. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing. ACM, New York.
- Angluin, D., and Laird, P. 1988. Learning from noisy examples. Mach. Learn. 2, 4, 343--370.
- Aslam, J. A., and Decatur, S. E. 1998a. General bounds on statistical query learning and PAC learning with noise via hypothesis boosting. Inf. Comput. 141, 2 (Mar.), 85--118.
- Aslam, J. A., and Decatur, S. E. 1998b. Specification and simulation of statistical query algorithms for efficiency and noise tolerance. J. Comput. Syst. Sci. 56, 2 (Apr.), 191--208.
- Blum, A., Furst, M., Jackson, J., Kearns, M., Mansour, Y., and Rudich, S. 1994. Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing (May). ACM, New York, pp. 253--262.
- Decatur, S. E. 1993. Statistical queries and faulty PAC oracles. In Proceedings of the 6th Annual ACM Workshop on Computational Learning Theory. ACM, New York.
- Decatur, S. E. 1996. Learning in hybrid noise environments using statistical queries. In Learning from Data: Artificial Intelligence and Statistics V, D. Fisher and H.-J. Lenz, Eds. Springer-Verlag, New York.
- Helmbold, D., Sloan, R., and Warmuth, M. 1992. Learning integer lattices. SIAM J. Comput. 21, 2, 240--266.
- MacWilliams, F., and Sloane, N. 1977. The Theory of Error-Correcting Codes. North-Holland, Amsterdam, The Netherlands.
- Jackson, J. 2000. On the efficiency of noise-tolerant PAC algorithms derived from statistical queries. In Proceedings of the 13th Annual Workshop on Computational Learning Theory.
- Kearns, M. 1998. Efficient noise-tolerant learning from statistical queries. J. ACM 45, 6 (Nov.), 983--1006.
- Kumar, R., and Sivakumar, D. 2001. On polynomial approximations to the shortest lattice vector length. In Proceedings of the 12th Annual Symposium on Discrete Algorithms.
- Littlestone, N. 1988. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Mach. Learn. 2, 285--318.
- Littlestone, N. 1989. From online to batch learning. In Proceedings of the 2nd Annual ACM Conference on Computational Learning Theory. ACM, New York, pp. 269--284.
- Wagner, D. 2002. A generalized birthday problem. In Proceedings in Advances in Cryptology---Crypto 2002. Lecture Notes in Computer Science, vol. 2442. Springer-Verlag, New York, pp. 288--304.
Index Terms
- Noise-tolerant learning, the parity problem, and the statistical query model