Abstract
Supervised classifiers are limited by the amount of annotated corpora available. Active learning is a way to circumvent this bottleneck by reducing the number of annotated examples required. In this paper, we analyze the benefits of active learning combined with bagging, applied to the Quotation Start, Noun Phrase Chunking and Text Chunking tasks. We employ query-by-committee as the query strategy to actively select the examples to be annotated. With these techniques, we reduce the annotation effort by up to 62.50%, depending on the task, while obtaining the same quality as passive supervised learning.
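To make the combination of bagging and query-by-committee concrete, the following is a minimal sketch and not the method used in the paper. The abstract does not state the base learner, the disagreement measure or the committee size, so all of these are assumptions: the base learner is a toy 1-D nearest-centroid classifier, disagreement is measured with vote entropy, and the names `bootstrap`, `train_centroid`, `vote_entropy` and `select_query` are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def bootstrap(labeled, rng):
    """Draw a bootstrap replicate: sample with replacement, same size as the input."""
    return [rng.choice(labeled) for _ in labeled]


def train_centroid(labeled):
    """Toy base learner (assumption): the mean of a 1-D feature per class."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in labeled:
        sums[y] += x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in counts}


def predict(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


def vote_entropy(votes):
    """Entropy of the committee's label votes; 0 means full agreement."""
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in Counter(votes).values())


def select_query(labeled, pool, committee_size=5, seed=0):
    """Train a bagged committee and return the pool item it disagrees on most."""
    rng = random.Random(seed)
    committee = [train_centroid(bootstrap(labeled, rng)) for _ in range(committee_size)]
    return max(pool, key=lambda x: vote_entropy([predict(m, x) for m in committee]))


if __name__ == "__main__":
    labeled = [(0.0, "O"), (1.0, "O"), (9.0, "B"), (10.0, "B")]
    pool = [0.5, 5.0, 9.5]
    print("next example to annotate:", select_query(labeled, pool))
```

In an active learning loop, the selected example would be sent to an annotator, added to the labeled set, and the committee retrained, repeating until the annotation budget or a target quality is reached.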
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this paper
Cite this paper
Milidiú, R.L., Schwabe, D., Motta, E. (2012). Active Learning with Bagging for NLP Tasks. In: Wyld, D., Zizka, J., Nagamalai, D. (eds) Advances in Computer Science, Engineering & Applications. Advances in Intelligent Systems and Computing, vol 167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30111-7_14
DOI: https://doi.org/10.1007/978-3-642-30111-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30110-0
Online ISBN: 978-3-642-30111-7
eBook Packages: Engineering, Engineering (R0)