research-article

Open Access

Taxonomy Completion via Implicit Concept Insertion

Authors:
Jingchuan Shi, University of Oxford, Oxford, United Kingdom (ORCID 0000-0002-4615-5636)
Hang Dong, University of Oxford, Oxford, United Kingdom (ORCID 0000-0001-6828-6891)
Jiaoyan Chen, The University of Manchester, Manchester, United Kingdom (ORCID 0000-0003-4643-6750)
Zhe Wu, eBay Inc., San Jose, USA (ORCID 0009-0003-6109-3386)
Ian Horrocks, University of Oxford, Oxford, United Kingdom (ORCID 0000-0002-2685-7462)

WWW '24: Proceedings of the ACM on Web Conference 2024, May 2024, Pages 2159–2169, https://doi.org/10.1145/3589334.3645584

Published: 13 May 2024

ABSTRACT

High-quality taxonomies play a critical role in domains such as e-commerce, web search and ontology engineering. While there has been extensive work on expanding taxonomies with externally mined data, less attention has been paid to enriching taxonomies by exploiting the concepts and structure already present within them. In this work, we show the usefulness of this kind of enrichment, and explore its viability with a new taxonomy completion system, ICON (Implicit CONcept Insertion). ICON generates new concepts by identifying implicit concepts based on the existing concept structure, generating names for such concepts, and inserting them in appropriate positions within the taxonomy. ICON integrates techniques from entity retrieval, text summarisation, and subsumption prediction; this modular architecture offers high flexibility while achieving state-of-the-art performance. We have evaluated ICON on two e-commerce taxonomies, and the results show that it offers significant advantages over strong baselines, including recent taxonomy completion models and the large language model ChatGPT.
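
The identify, name, insert flow described above can be pictured with a small toy sketch. This is not ICON's implementation: the abstract does not specify the models used, so token overlap stands in for entity retrieval, common-token extraction stands in for text summarisation, and a token-subset check stands in for subsumption prediction. The taxonomy data and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy taxonomy: concept name -> set of parent names.
taxonomy = {
    "computers": set(),
    "gaming laptop": {"computers"},
    "business laptop": {"computers"},
    "desktop pc": {"computers"},
}

def tokens(name):
    """Split a concept name into a set of lowercase tokens."""
    return set(name.lower().split())

def retrieve_clusters(taxonomy):
    """Stand-in for entity retrieval: pair concepts that share a parent
    and at least one name token."""
    clusters = []
    for a, b in combinations(sorted(taxonomy), 2):
        if taxonomy[a] & taxonomy[b] and tokens(a) & tokens(b):
            clusters.append((a, b))
    return clusters

def name_cluster(cluster):
    """Stand-in for text summarisation: keep the tokens common to all
    members, in the order they appear in the first member."""
    common = set.intersection(*(tokens(c) for c in cluster))
    return " ".join(t for t in cluster[0].lower().split() if t in common)

def insert_concept(taxonomy, name, cluster):
    """Insert `name` between the cluster and its shared parents.
    The token-subset test is a toy stand-in for a subsumption predictor."""
    if not name or name in taxonomy:
        return taxonomy  # nothing new to insert
    result = {c: set(parents) for c, parents in taxonomy.items()}
    shared = set.intersection(*(taxonomy[c] for c in cluster))
    result[name] = set(shared)
    for c in cluster:
        if tokens(name) <= tokens(c):
            result[c] = (result[c] - shared) | {name}
    return result

def complete(taxonomy):
    """Run the three toy stages over every retrieved cluster."""
    result = taxonomy
    for cluster in retrieve_clusters(taxonomy):
        result = insert_concept(result, name_cluster(cluster), cluster)
    return result

completed = complete(taxonomy)
for concept in sorted(completed):
    print(concept, "->", sorted(completed[concept]))
```

In this toy run the two laptop concepts are grouped, the implicit concept "laptop" is named from their shared token, and it is placed under "computers" with both laptop concepts as children. Each stage is a separate function, so a stronger retriever, summariser, or subsumption model could be swapped in independently, which is the kind of modularity the abstract refers to.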

Index Terms

Computing methodologies → Artificial intelligence → Knowledge representation and reasoning → Ontology engineering

Published in
WWW '24: Proceedings of the ACM on Web Conference 2024
May 2024
4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334
General Chairs: Tat-Seng Chua (National University of Singapore), Chong-Wah Ngo (Singapore Management University)
Proceedings Chair: Roy Ka-Wei Lee (Singapore University of Technology and Design)
Program Chairs: Ravi Kumar (Google), Hady W. Lauw (Singapore Management University)
Copyright © 2024 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Artifacts Available / v1.1
Author Tags
ontology engineering
pre-trained language model
taxonomy completion
taxonomy enrichment
text summarisation
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%