Named Entity Recognizer for Konkani Text

  • Conference paper
ICT with Intelligent Applications

Abstract

A named entity recognizer (NER), an essential tool for natural language processing (NLP), is presented for the first time for the Konkani language. One of the linguistic resources generated through this work is a gold data set of 1000 NER-tagged Konkani sentences containing 1068 named entities. A conditional random field (CRF) classifier trained on 800 sentences of the corpus (794 named entities) achieved 96% accuracy and a 72% F-score on the training data. On the test data set of 200 sentences of the corpus (274 named entities), it achieved 86% accuracy and a 66% F-score. When the training and test data were complemented with a lookup table of 12 months, 53 locations, 44 person names and 23 numerals, together with their synonyms, the figures improved to 99% accuracy and a 90% F-score on the training data set, and to 89% accuracy and a 73% F-score on the test data set. To place this research in perspective, a summary is presented of the NER literature for world languages and for Indian languages, including NER for Indian languages using CRF.
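The abstract does not spell out the feature set or how the F-score is computed, so the following is only a minimal sketch of how lookup-table membership might be exposed as CRF features, and of why accuracy can sit well above the F-score when most tokens are not entities. The lookup entries, feature names, the `O` label for non-entity tokens, and the token-level micro-averaged F-score are all assumptions made for illustration; they are not taken from the paper. Feature dictionaries of this shape are the kind of input that CRF toolkits such as sklearn-crfsuite accept.

```python
from typing import Dict, List, Set

# Hypothetical lookup table with romanized placeholder entries.
# The paper's actual table (12 months, 53 locations, 44 person names,
# 23 numerals, plus synonyms) is not reproduced here.
LOOKUP: Dict[str, Set[str]] = {
    "MONTH": {"month_placeholder"},
    "LOCATION": {"goa", "panaji"},
    "PERSON": {"person_placeholder"},
    "NUMERAL": {"numeral_placeholder"},
}


def token_features(sentence: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for token i, including lookup-membership flags."""
    word = sentence[i]
    lower = word.lower()
    feats: Dict[str, object] = {
        "bias": 1.0,
        "word.lower": lower,
        "suffix3": lower[-3:],
        "is_digit": word.isdigit(),
        "BOS": i == 0,                   # beginning of sentence
        "EOS": i == len(sentence) - 1,   # end of sentence
    }
    # One boolean feature per lookup category (assumed design).
    for category, entries in LOOKUP.items():
        feats[f"in_lookup.{category}"] = lower in entries
    # Neighbouring words as simple context features.
    if i > 0:
        feats["prev.lower"] = sentence[i - 1].lower()
    if i < len(sentence) - 1:
        feats["next.lower"] = sentence[i + 1].lower()
    return feats


def sentence_features(sentence: List[str]) -> List[Dict[str, object]]:
    """Feature dicts for every token of one sentence."""
    return [token_features(sentence, i) for i in range(len(sentence))]


def accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of tokens whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f_score(gold: List[str], pred: List[str], outside: str = "O") -> float:
    """Token-level micro-averaged F-score over entity (non-outside) labels."""
    tp = sum(g == p and g != outside for g, p in zip(gold, pred))
    pred_pos = sum(p != outside for p in pred)
    gold_pos = sum(g != outside for g in gold)
    if tp == 0:
        return 0.0
    precision = tp / pred_pos
    recall = tp / gold_pos
    return 2 * precision * recall / (precision + recall)


# Toy labels: eight non-entity tokens, two entities, one of them missed.
gold_labels = ["O"] * 8 + ["LOCATION", "PERSON"]
pred_labels = ["O"] * 8 + ["LOCATION", "O"]
print(accuracy(gold_labels, pred_labels), f_score(gold_labels, pred_labels))
```

In the toy labels, nine of ten tokens are correct, yet only one of the two entities is found, so the F-score is noticeably lower than the accuracy. The same effect would be expected on any corpus dominated by non-entity tokens.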

Acknowledgements

We wish to acknowledge the help provided by Mrs. Anju Sakardande, Head, Department of Indian Languages at Dhempe College of Arts and Science, Panaji, Goa, and Mr. Sharat K. Raikar, language interpreter for Konkani and Hindi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Rajan, A., Salgaonkar, A. (2022). Named Entity Recognizer for Konkani Text. In: Senjyu, T., Mahalle, P.N., Perumal, T., Joshi, A. (eds) ICT with Intelligent Applications. Smart Innovation, Systems and Technologies, vol 248. Springer, Singapore. https://doi.org/10.1007/978-981-16-4177-0_69
