Interactive Text Graph Mining with a Prolog-Based Dialog Engine

Published online by Cambridge University Press:  07 October 2020

PAUL TARAU
Affiliation:
Department of Computer Science and Engineering, University of North Texas, 1155 Union Circle, Denton, Texas 76203, USA (e-mails: paul.tarau@unt.edu, eduardo.blanco@unt.edu)
EDUARDO BLANCO
Affiliation:
Department of Computer Science and Engineering, University of North Texas, 1155 Union Circle, Denton, Texas 76203, USA (e-mails: paul.tarau@unt.edu, eduardo.blanco@unt.edu)

Abstract

On top of a neural network-based dependency parser and a graph-based natural language processing module, we design a Prolog-based dialog engine that explores interactively a ranked fact database extracted from a text document. We reorganize dependency graphs to focus on the most relevant content elements of a sentence and integrate sentence identifiers as graph nodes. Additionally, after ranking the graph, we take advantage of the implicit semantic information that dependency links and WordNet bring in the form of subject–verb–object, “is-a” and “part-of” relations. Working on the Prolog facts and their inferred consequences, the dialog engine specializes the text graph with respect to a query and reveals interactively the document’s most relevant content elements. The open-source code of the integrated system is available at https://github.com/ptarau/DeepRank.
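
To make the ranking step in the abstract concrete, here is a minimal sketch in Python of how a text graph that mixes word nodes and sentence-identifier nodes might be ranked with a PageRank-style power iteration. It is not the authors' implementation. The function names `build_text_graph` and `pagerank`, the edge directions (arguments point to their verb, verbs point to their sentence identifier), the damping factor and the toy triples are all assumptions made for illustration.

```python
def build_text_graph(sentences):
    """Build directed edges from per-sentence (subject, verb, object) triples.

    Assumed layout (hypothetical, for illustration only): subjects and
    objects point to their verb, and each verb points to the integer
    identifier of the sentence it occurs in.
    """
    edges = set()
    for sent_id, triples in enumerate(sentences):
        for subj, verb, obj in triples:
            edges.add((subj, verb))
            edges.add((obj, verb))
            edges.add((verb, sent_id))
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a set of (source, target) edges."""
    nodes = {node for edge in edges for node in edge}
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Toy input: two sentences, one triple each (invented for this sketch).
toy_sentences = [
    [("cat", "chase", "mouse")],
    [("dog", "chase", "cat")],
]
ranks = pagerank(build_text_graph(toy_sentences))

# Words are strings and sentence identifiers are integers, so the two
# kinds of node can be ranked separately.
top_words = sorted((k for k in ranks if isinstance(k, str)),
                   key=ranks.get, reverse=True)
top_sentences = sorted((k for k in ranks if isinstance(k, int)),
                       key=ranks.get, reverse=True)
```

Because sentence identifiers are ordinary nodes in this sketch, a single ranking pass scores both candidate keywords and candidate summary sentences, which is the property the abstract highlights.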

Type
Rapid Communication
Copyright
© The Author(s), 2020. Published by Cambridge University Press

Footnotes

* We are thankful to the anonymous reviewers of PADL’2020 for their careful reading and constructive suggestions.
