Named Entity Recognizer for Konkani Text

Rajan, Annie; Salgaonkar, Ambuja

doi:10.1007/978-981-16-4177-0_69

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 248)

Abstract

A named entity recognizer (NER), an essential tool for natural language processing (NLP), is presented for the first time for the Konkani language. One of the linguistic resources generated through this work is a gold data set of 1000 NER-tagged Konkani sentences containing 1068 named entities. A conditional random field (CRF) classifier built on a training set of 800 sentences from the corpus (794 named entities) achieved 96% accuracy and a 72% F-score. On a test set of 200 sentences from the corpus (274 named entities), it achieved 86% accuracy and a 66% F-score. When the training and test data were complemented with a lookup table of 12 months, 53 locations, 44 person names, and 23 numerals, together with their synonyms, the figures improved to 99% accuracy and a 90% F-score on the training set, and to 89% accuracy and a 73% F-score on the test set. To place our research in perspective, we summarize the NER literature for world languages and for Indian languages, including CRF-based NER for Indian languages.
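
One plausible way to make the lookup table described in the abstract available to a CRF is to expose it as a per-token feature. The sketch below is illustrative only and is not the authors' implementation: the table entries are English placeholders, and the names `LOOKUP`, `lookup_category`, `token_features`, and `sentence_features` are hypothetical. It produces one feature dictionary per token, a format that CRF toolkits such as sklearn-crfsuite commonly accept as input.

```python
from typing import Dict, List, Set

# Hypothetical, tiny stand-in for the lookup table of months, locations,
# person names and numerals. The entries are English placeholders chosen
# for illustration; they are NOT the authors' data.
LOOKUP: Dict[str, Set[str]] = {
    "MONTH": {"january", "february"},
    "LOCATION": {"goa", "panaji"},
    "PERSON": {"annie"},
    "NUMERAL": {"one", "two"},
}


def lookup_category(token: str) -> str:
    """Return the lookup-table category of a token, or "NONE" if absent."""
    key = token.lower()  # no effect on Devanagari text, harmless for Latin
    for category, entries in LOOKUP.items():
        if key in entries:
            return category
    return "NONE"


def token_features(sentence: List[str], i: int) -> Dict[str, object]:
    """Build a feature dictionary for the i-th token of a sentence."""
    token = sentence[i]
    features: Dict[str, object] = {
        "bias": 1.0,
        "token": token.lower(),
        "suffix3": token[-3:].lower(),
        "is_digit": token.isdigit(),
        "lookup": lookup_category(token),  # the lookup-table feature
    }
    if i > 0:
        features["prev_token"] = sentence[i - 1].lower()
        features["prev_lookup"] = lookup_category(sentence[i - 1])
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(sentence) - 1:
        features["next_token"] = sentence[i + 1].lower()
        features["next_lookup"] = lookup_category(sentence[i + 1])
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence_features(sentence: List[str]) -> List[Dict[str, object]]:
    """Feature dictionaries for every token in the sentence."""
    return [token_features(sentence, i) for i in range(len(sentence))]


example = ["Annie", "visited", "Panaji", "in", "January"]
feats = sentence_features(example)
for tok, f in zip(example, feats):
    print(tok, f["lookup"])
```

Including the category of the neighbouring tokens as well as the current one is a design choice made for this sketch; the abstract does not say which features the classifier used.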

Acknowledgements

We wish to acknowledge the help provided by Mrs. Anju Sakardande, Head, Department of Indian Languages at Dhempe College of Arts and Science, Panaji, Goa, and Mr. Sharat K. Raikar, language interpreter for Konkani and Hindi.

Author information

Authors and Affiliations

DCT’s Dhempe College of Arts and Science, University of Mumbai, Mumbai, Maharashtra, 400032, India
Annie Rajan
University of Mumbai, Mumbai, Maharashtra, 400032, India
Ambuja Salgaonkar

Editor information

Editors and Affiliations

University of the Ryukyus, Okinawa, Japan
Tomonobu Senjyu
Sinhgad Technical Education Society, SKNCOE, Pune, India
Parikshit N. Mahalle
Computer Science, Faculty of CS and IT, Universiti Putra Malaysia, Seri Kembangan, Malaysia
Thinagaran Perumal
Global Knowledge Research Foundation, Ahmedabad, India
Amit Joshi

About this paper

Cite this paper

Rajan, A., Salgaonkar, A. (2022). Named Entity Recognizer for Konkani Text. In: Senjyu, T., Mahalle, P.N., Perumal, T., Joshi, A. (eds) ICT with Intelligent Applications. Smart Innovation, Systems and Technologies, vol 248. Springer, Singapore. https://doi.org/10.1007/978-981-16-4177-0_69

DOI: https://doi.org/10.1007/978-981-16-4177-0_69
Published: 06 December 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-4176-3
Online ISBN: 978-981-16-4177-0
eBook Packages: Intelligent Technologies and Robotics (R0)
