ABSTRACT
Continual Relation Extraction (CRE) underpins many web applications, such as search engines. A central challenge in this task is catastrophic forgetting, where models lose previously learned relations as they are trained on new ones. Existing approaches predominantly rely on memory-based methods to alleviate catastrophic forgetting, but they overlook two inherent challenges: different relations require different amounts of memory, and the memory needs a suitable refreshing strategy. Drawing inspiration from the mechanisms of Dynamic Random Access Memory (DRAM), we introduce a novel CRE architecture with an asynchronous refreshing strategy to tackle these challenges. We first design a DRAM-like architecture comprising three key modules: a perceptron, a controller, and a refresher. This architecture allocates memory dynamically, consolidating well-remembered relations while granting additional memory for revisiting poorly learned ones. Furthermore, we propose an asynchronous refreshing strategy that seeks the pivot between over-memorization and overfitting by attending to the current learning task and mixed memory data asynchronously. We also reinterpret existing CRE refreshing strategies from the DRAM perspective. Experiments on two benchmarks show that our method outperforms ConPL, the state-of-the-art method, by an average of 1.50% in accuracy, demonstrating the effectiveness of the proposed architecture and refreshing strategy.
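This excerpt contains no reference implementation, so the following minimal Python sketch only illustrates the two ideas the abstract describes: a controller that sizes per-relation memory budgets dynamically (larger budgets for poorly remembered relations), and a training loop that interleaves current-task steps with asynchronous refresh steps on mixed memory data. All names (DynamicMemoryController, asynchronous_training, forgetting_score, refresh_every) are hypothetical stand-ins, not the paper's actual modules.

```python
import random
from collections import defaultdict


class DynamicMemoryController:
    """Hypothetical sketch of DRAM-inspired dynamic memory allocation:
    relations the model remembers poorly receive a larger exemplar budget."""

    def __init__(self, base_budget=5, max_budget=20):
        self.base_budget = base_budget
        self.max_budget = max_budget
        self.store = defaultdict(list)  # relation label -> stored exemplars

    def budget_for(self, forgetting_score):
        # forgetting_score in [0, 1]: 0 = well remembered, 1 = badly forgotten.
        # Well-remembered relations keep the small base budget (consolidation);
        # poorly learned ones get proportionally more slots for revisiting.
        return self.base_budget + int(
            forgetting_score * (self.max_budget - self.base_budget)
        )

    def allocate(self, relation, samples, forgetting_score):
        # Random selection stands in for whatever exemplar-selection rule
        # the paper actually uses.
        k = self.budget_for(forgetting_score)
        self.store[relation] = random.sample(samples, min(k, len(samples)))

    def mixed_batch(self, batch_size):
        # Draw a batch that mixes exemplars across all stored relations.
        pool = [x for samples in self.store.values() for x in samples]
        return random.sample(pool, min(batch_size, len(pool)))


def asynchronous_training(train_step, task_batches, memory, refresh_every=3):
    """Interleave current-task steps with separate refresh steps on mixed
    memory data, rather than folding memory samples into every batch."""
    for step, batch in enumerate(task_batches):
        train_step(batch)  # focus on the current learning task
        if (step + 1) % refresh_every == 0 and memory.store:
            train_step(memory.mixed_batch(len(batch)))  # asynchronous refresh
```

Under these assumptions, refresh_every is the knob that positions training between over-memorization (refreshing the stored exemplars too often) and overfitting to the current task (refreshing too rarely).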
REFERENCES
- Amr Azzam, Christian Aebeloe, Gabriela Montoya, Ilkcan Keles, Axel Polleres, and Katja Hose. 2021. WiseKG: Balanced access to web knowledge graphs. In Proceedings of the Web Conference 2021. 1422--1434.
- Arslan Chaudhry, Marc'Aurelio Ranzato, Marcus Rohrbach, and Mohamed Elhoseiny. 2018. Efficient lifelong learning with A-GEM. ICLR (2018).
- Xiudi Chen, Hui Wu, and Xiaodong Shi. 2023. Consistent Prototype Learning for Few-Shot Continual Relation Extraction. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 7409--7422.
- Li Cui, Deqing Yang, Jiaxin Yu, Chengwei Hu, Jiayang Cheng, Jingjie Yi, and Yanghua Xiao. 2021. Refining sample embeddings with relation prototypes to enhance continual relation extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 232--243.
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
- Li Dong, Furu Wei, Ming Zhou, and Ke Xu. 2015. Question answering over Freebase with multi-column convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 260--269.
- Chrisantha Fernando, Dylan Banarse, Charles Blundell, Yori Zwols, David Ha, Andrei A. Rusu, Alexander Pritzel, and Daan Wierstra. 2017. PathNet: Evolution channels gradient descent in super neural networks. CoRR (2017).
- Robert M. French. 1999. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, Vol. 3, 4 (1999), 128--135.
- Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, and Jie Zhou. 2020. Continual relation learning via episodic memory activation and reconsolidation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 6429--6440.
- Xu Han, Pengfei Yu, Zhiyuan Liu, Maosong Sun, and Peng Li. 2018a. Hierarchical relation extraction with coarse-to-fine grained attention. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2236--2245.
- Xu Han, Hao Zhu, Pengfei Yu, Ziyun Wang, Yuan Yao, Zhiyuan Liu, and Maosong Sun. 2018b. FewRel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation. arXiv preprint arXiv:1810.10147 (2018).
- Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, Vol. 33 (2020), 6840--6851.
- Chengwei Hu, Deqing Yang, Haoliang Jin, Zhen Chen, and Yanghua Xiao. 2022. Improving continual relation extraction through prototypical contrastive learning. arXiv preprint arXiv:2210.04513 (2022).
- Heechul Jung, Jeongwoo Ju, Minju Jung, and Junmo Kim. 2016. Less-forgetting learning in deep neural networks. arXiv preprint arXiv:1607.00122 (2016).
- James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al. 2017. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, Vol. 114, 13 (2017), 3521--3526.
- Na Lei, Yang Guo, Dongsheng An, Xin Qi, Zhongxuan Luo, Shing-Tung Yau, and Xianfeng Gu. 2019. Mode collapse and regularity of optimal transportation maps. arXiv preprint arXiv:1902.02934 (2019).
- Yankai Lin, Shiqi Shen, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. 2016. Neural relation extraction with selective attention over instances. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2124--2133.
- Xialei Liu, Marc Masana, Luis Herranz, Joost Van de Weijer, Antonio M. Lopez, and Andrew D. Bagdanov. 2018. Rotate your networks: Better weight consolidation and less catastrophic forgetting. In 2018 24th International Conference on Pattern Recognition (ICPR). IEEE, 2262--2268.
- David Lopez-Paz and Marc'Aurelio Ranzato. 2017. Gradient episodic memory for continual learning. Advances in Neural Information Processing Systems, Vol. 30 (2017).
- James L. McClelland, Bruce L. McNaughton, and Randall C. O'Reilly. 1995. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychological Review, Vol. 102, 3 (1995), 419.
- Michael McCloskey and Neal J. Cohen. 1989. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, Vol. 24. Elsevier, 109--165.
- Franck Michel, Fabien Gandon, Valentin Ah-Kane, Anna Bobasheva, Elena Cabrio, Olivier Corby, Raphaël Gazzotti, Alain Giboin, Santiago Marro, Tobias Mayer, et al. 2020. Covid-on-the-Web: Knowledge graph and services to advance COVID-19 research. In The Semantic Web--ISWC 2020: 19th International Semantic Web Conference, Athens, Greece, November 2--6, 2020, Proceedings, Part II. Springer, 294--310.
- Seyedali Mirjalili. 2019. Genetic algorithm. In Evolutionary Algorithms and Neural Networks: Theory and Applications. Springer, 43--55.
- Makoto Miwa and Mohit Bansal. 2016. End-to-end relation extraction using LSTMs on sequences and tree structures. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1105--1116.
- Mohsen Paniri, Mohammad Bagher Dowlatshahi, and Hossein Nezamabadi-Pour. 2020. MLACO: A multi-label feature selection algorithm based on ant colony optimization. Knowledge-Based Systems, Vol. 192 (2020), 105285.
- Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, Vol. 32 (2019).
- Chengwei Qin and Shafiq Joty. 2022. Continual few-shot relation learning via embedding space regularization and data augmentation. arXiv preprint arXiv:2203.02135 (2022).
- Roger Ratcliff. 1990. Connectionist models of recognition memory: constraints imposed by learning and forgetting functions. Psychological Review, Vol. 97, 2 (1990), 285.
- Hippolyt Ritter, Aleksandar Botev, and David Barber. 2018. Online structured Laplace approximations for overcoming catastrophic forgetting. Advances in Neural Information Processing Systems, Vol. 31 (2018).
- Andrei A. Rusu, Neil C. Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, and Raia Hadsell. 2016. Progressive neural networks. CoRR (2016).
- Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3--7, 2018, Proceedings. Springer, 593--607.
- Sebastian Thrun. 1998. Lifelong learning algorithms. In Learning to Learn. Springer, 181--209.
- Sebastian Thrun and Lorien Pratt. 2012. Learning to Learn. Springer Science & Business Media.
- Eli Verwimp, Matthias De Lange, and Tinne Tuytelaars. 2021. Rehearsal revealed: The limits and merits of revisiting samples in continual learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 9385--9394.
- Hong Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, and William Yang Wang. 2019. Sentence embedding alignment for lifelong relation extraction. arXiv preprint arXiv:1903.02588 (2019).
- Tongtong Wu, Xuekai Li, Yuan-Fang Li, Gholamreza Haffari, Guilin Qi, Yujin Zhu, and Guoqiang Xu. 2021. Curriculum-meta learning for order-robust continual relation extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 10363--10369.
- Tianjun Xiao, Jiaxing Zhang, Kuiyuan Yang, Yuxin Peng, and Zheng Zhang. 2014. Error-driven incremental learning in deep convolutional neural network for large-scale image classification. In Proceedings of the 22nd ACM International Conference on Multimedia. 177--186.
- Chenyan Xiong, Russell Power, and Jamie Callan. 2017. Explicit semantic ranking for academic search via knowledge graph embedding. In Proceedings of the 26th International Conference on World Wide Web. 1271--1279.
- Friedemann Zenke, Ben Poole, and Surya Ganguli. 2017. Continual learning through synaptic intelligence. In International Conference on Machine Learning. PMLR, 3987--3995.
- Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, and Christopher D. Manning. 2017. Position-aware attention and supervised data improve slot filling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.
- Kang Zhao, Hua Xu, Jiangong Yang, and Kai Gao. 2022. Consistent representation learning for continual relation extraction. arXiv preprint arXiv:2203.02721 (2022).
- Zhi-Hua Zhou. 2022. Rehearsal: learning from prediction to decision. Frontiers of Computer Science, Vol. 16, 4 (2022), 164352.