ACM Conferences · WWW Conference Proceedings
DOI: 10.1145/3589334.3645584
research-article
Open Access
Artifacts Available / v1.1

Taxonomy Completion via Implicit Concept Insertion

Published: 13 May 2024

ABSTRACT

High-quality taxonomies play a critical role in domains such as e-commerce, web search, and ontology engineering. While there has been extensive work on expanding taxonomies from externally mined data, less attention has been paid to enriching taxonomies by exploiting the concepts and structure already present within them. In this work, we show the usefulness of this kind of enrichment and explore its viability with a new taxonomy completion system, ICON (Implicit CONcept Insertion). ICON generates new concepts by identifying implicit concepts based on the existing concept structure, generating names for such concepts, and inserting them at appropriate positions within the taxonomy. ICON integrates techniques from entity retrieval, text summarization, and subsumption prediction; this modular architecture offers high flexibility while achieving state-of-the-art performance. We have evaluated ICON on two e-commerce taxonomies, and the results show that it offers significant advantages over strong baselines, including recent taxonomy completion models and the large language model ChatGPT.
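The three stages named in the abstract (identify an implicit concept, name it, insert it) can be illustrated with a deliberately simplified sketch. This is not ICON's implementation: simple token-overlap heuristics stand in for the entity retrieval, text summarization, and subsumption prediction components, and the taxonomy, function names, and data layout are all hypothetical.

```python
from itertools import combinations

# Toy taxonomy as a child -> parent map. Every name is invented for illustration.
TAXONOMY = {
    "red wine glasses": "glasses",
    "white wine glasses": "glasses",
    "beer glasses": "glasses",
    "glasses": "kitchenware",
}


def tokens(name):
    """Lower-cased word set of a concept name."""
    return set(name.lower().split())


def retrieve_sibling_pairs(taxonomy):
    """Stand-in for entity retrieval (assumption: token overlap, not a learned retriever).

    Yields (parent, (a, b)) for sibling pairs that share at least one word
    that is not already in the parent's name.
    """
    by_parent = {}
    for child, parent in taxonomy.items():
        by_parent.setdefault(parent, []).append(child)
    for parent, children in by_parent.items():
        for a, b in combinations(sorted(children), 2):
            if (tokens(a) & tokens(b)) - tokens(parent):
                yield parent, (a, b)


def name_concept(pair):
    """Stand-in for text summarization: keep the words common to both names."""
    common = tokens(pair[0]) & tokens(pair[1])
    return " ".join(w for w in pair[0].split() if w.lower() in common)


def subsumed_by(child, parent):
    """Stand-in for subsumption prediction: every word of the parent occurs in the child."""
    return tokens(parent) <= tokens(child)


def insert_concept(taxonomy, parent, new_name):
    """Place new_name under parent and re-attach the children it subsumes."""
    updated = dict(taxonomy)
    if new_name in updated or not subsumed_by(new_name, parent):
        return updated
    for child, current_parent in taxonomy.items():
        if current_parent == parent and subsumed_by(child, new_name):
            updated[child] = new_name
    updated[new_name] = parent
    return updated


def complete(taxonomy):
    """Run the three toy stages over every candidate sibling pair."""
    result = dict(taxonomy)
    for parent, pair in retrieve_sibling_pairs(taxonomy):
        result = insert_concept(result, parent, name_concept(pair))
    return result
```

On the toy input, `complete(TAXONOMY)` adds an intermediate "wine glasses" concept under "glasses" and re-attaches the two wine-glass concepts beneath it, leaving "beer glasses" where it was. Because each stage is a separate function, any one of them could be swapped for a learned model without touching the others, which is the kind of flexibility the abstract attributes to a modular design.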

Supplemental Material

rfp1605.mp4: Supplemental video (mp4, 4.2 MB)

