Combining Classifiers with Informational Confidence

Jaeger, Stefan; Ma, Huanfeng; Doermann, David

doi:10.1007/978-3-540-76280-5_7

Part of the book series: Studies in Computational Intelligence (SCI, volume 90)

We propose a new statistical method for learning normalized confidence values in multiple classifier systems. Our main idea is to adjust confidence values so that their nominal values equal the information they actually convey. To do so, we assume that this information depends on the actual performance of each confidence value on an evaluation set. As the information measure, we use Shannon's well-known logarithmic notion of information. With the confidence values matching their informational content, the classifier combination scheme reduces to the simple sum-rule, which theoretically justifies this elementary combination scheme. In experimental evaluations for script identification, and for both handwritten and printed character recognition, we achieve a consistent improvement on the best single recognition rate. We hope that our information-theoretical framework helps fill the theoretical gap that remains in classifier combination, putting the excellent practical performance of multiple classifier systems on a more solid basis.
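
The abstract does not state the formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes raw confidence values lie in [0, 1], that the "performance" of a confidence value is the empirical accuracy of its bin on an evaluation set, and that the information assigned to a value is -ln(1 - p) for that accuracy p. The function names, the binning scheme, and the toy data are all hypothetical.

```python
import math
from bisect import bisect_right


def learn_confidence_map(confidences, correct, n_bins=5, eps=1e-3):
    """Learn one informational value per confidence bin from an evaluation set.

    confidences: raw confidence values, assumed to lie in [0, 1]
    correct:     booleans, True where the classifier's decision was right
    Returns (interior bin edges, informational value per bin).
    """
    edges = [i / n_bins for i in range(1, n_bins)]
    hits = [0] * n_bins
    totals = [0] * n_bins
    for c, ok in zip(confidences, correct):
        b = bisect_right(edges, c)
        totals[b] += 1
        hits[b] += int(ok)
    info = []
    for h, t in zip(hits, totals):
        # Empirical accuracy of the bin; empty bins fall back to 0.5 (assumption).
        p = h / t if t else 0.5
        # Clamp so the logarithm stays finite for bins that are always right.
        p = min(max(p, eps), 1.0 - eps)
        # Assumed measure: Shannon information of the error event, -ln(1 - p).
        info.append(-math.log(1.0 - p))
    return edges, info


def informational_confidence(raw, edges, info):
    """Replace a raw confidence value by the learned value of its bin."""
    return info[bisect_right(edges, raw)]


def combine_sum_rule(per_classifier_scores):
    """Sum the per-label values over all classifiers and pick the largest."""
    totals = {}
    for scores in per_classifier_scores:
        for label, value in scores.items():
            totals[label] = totals.get(label, 0.0) + value
    return max(totals, key=totals.get), totals


# Toy evaluation set: low confidences are right half the time,
# high confidences three times out of four.
edges, info = learn_confidence_map(
    [0.1, 0.1, 0.9, 0.9, 0.9, 0.9],
    [False, True, True, True, True, False],
)

# Two toy classifiers share one map here only to keep the example short;
# in practice each classifier would get its own learned map.
winner, totals = combine_sum_rule([
    {"latin": informational_confidence(0.9, edges, info),
     "arabic": informational_confidence(0.1, edges, info)},
    {"latin": informational_confidence(0.1, edges, info),
     "arabic": informational_confidence(0.1, edges, info)},
])
print(winner, totals)
```

Because the learned values are logarithmic, adding them across classifiers corresponds to multiplying the underlying error probabilities, which is one way to read the abstract's claim that the combination reduces to the sum-rule.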

Author information

Authors and Affiliations

Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA
Stefan Jaeger, Huanfeng Ma & David Doermann
Laboratory for Language and Media Processing, Institute for Advanced Computer Studies, 3451 AV Williams Building, University of Maryland, College Park, Maryland, 20742
David Doermann

Editor information

Editors and Affiliations

Dipartimento di Sistemi e Informatica, University of Florence, Via S. Marta, 3, 50139, Firenze, Italy
Simone Marinai
Hitachi Central Research Laboratory, 1-280, Higashi-Koigakubo, Kokubunji-shi, Tokyo, 185-8601, Japan
Hiromichi Fujisawa

About this chapter

Cite this chapter

Jaeger, S., Ma, H., Doermann, D. (2008). Combining Classifiers with Informational Confidence. In: Marinai, S., Fujisawa, H. (eds) Machine Learning in Document Analysis and Recognition. Studies in Computational Intelligence, vol 90. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76280-5_7

DOI: https://doi.org/10.1007/978-3-540-76280-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76279-9
Online ISBN: 978-3-540-76280-5
eBook Packages: Engineering; Engineering (R0)
