Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 84)

Abstract

We discuss a new paradigm for supervised learning, called active learning, that aims at improving the efficiency of neural network training procedures. The starting point for active learning is the observation that the traditional approach of randomly selecting training samples leads to large, highly redundant training sets. This redundancy is not always desirable: especially when the acquisition of training data is expensive, small, informative training sets are preferable. Such training sets can be obtained if the learner is enabled to select the training data it expects to be most informative. The learner is then no longer a passive recipient of information but takes an active role in the selection of the training data.
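The chapter itself surveys a range of query strategies; purely as a minimal, self-contained sketch of the paradigm (not the authors' procedure), the following Python loop implements pool-based uncertainty sampling. All names, the synthetic data, and the simple logistic model (a stand-in for the neural network) are illustrative assumptions: the learner repeatedly queries the label of the pool sample it is currently least certain about.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unlabeled pool (for illustration only): two Gaussian
# clusters whose binary labels an "oracle" reveals on request.
X_pool = np.vstack([rng.normal(-1.0, 1.0, (200, 2)),
                    rng.normal(+1.0, 1.0, (200, 2))])
y_pool = np.array([0] * 200 + [1] * 200)

def train_logistic(X, y, epochs=500, lr=0.1):
    """Fit a logistic model by plain gradient descent (network stand-in)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

# Start from a small random seed set, then query actively.
labeled = [int(i) for i in rng.choice(len(X_pool), size=4, replace=False)]
unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

for step in range(20):
    w = train_logistic(X_pool[labeled], y_pool[labeled])
    # Uncertainty sampling: query the pool point whose predicted
    # probability is closest to 0.5, i.e. the least certain one.
    probs = predict_proba(w, X_pool[unlabeled])
    query = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
    labeled.append(query)      # the oracle labels the queried sample
    unlabeled.remove(query)

acc = np.mean((predict_proba(w, X_pool) > 0.5) == y_pool)
print(f"accuracy after {len(labeled)} labels: {acc:.2f}")
```

Compared with drawing the same number of samples at random, querying near the current decision boundary concentrates labels where they change the fit most. This conveys the intuition only; the specific query-construction and committee-based algorithms treated in the chapter differ in detail.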




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hasenjäger, M., Ritter, H. (2002). Active Learning in Neural Networks. In: Jain, L.C., Kacprzyk, J. (eds) New Learning Paradigms in Soft Computing. Studies in Fuzziness and Soft Computing, vol 84. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1803-1_5

  • DOI: https://doi.org/10.1007/978-3-7908-1803-1_5

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-2499-5

  • Online ISBN: 978-3-7908-1803-1

  • eBook Packages: Springer Book Archive
