K-Adapter: Infusing Knowledge into Pre-trained Models with Adapters

ACL/IJCNLP (2021)

Abstract
We study the problem of injecting knowledge into large pre-trained models like BERT and RoBERTa. Existing methods typically update the original parameters of pre-trained models when injecting knowledge. However, when multiple kinds of knowledge are injected, they may suffer from catastrophic forgetting. To address this, we propose K-Adapter, which keeps the original parameters of the pre-trained model fixed and supports continual knowledge infusion. Taking RoBERTa as the pre-trained model, K-Adapter has a neural adapter for each kind of infused knowledge, like a plug-in connected to RoBERTa. There is no information flow between different adapters, so different adapters can be trained efficiently in a distributed way. We inject two kinds of knowledge: factual knowledge obtained from automatically aligned text-triplets on Wikipedia and Wikidata, and linguistic knowledge obtained from dependency parsing. Results on three knowledge-driven tasks (six datasets in total), including relation classification, entity typing and question answering, demonstrate that each adapter improves performance and that combining both adapters brings further improvements. Probing experiments further show that K-Adapter captures richer factual and commonsense knowledge than RoBERTa.
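The architecture sketched in the abstract (frozen backbone, one independent adapter per knowledge source) can be illustrated with a minimal code example. This is an assumption of how such a setup might be wired up, not the authors' released implementation: in the actual paper, adapters contain their own transformer layers and tap hidden states from several intermediate RoBERTa layers, whereas the sketch below collapses this to a simple bottleneck over the final hidden states. The module names, sizes, and the `KAdapterSketch` class are illustrative.

```python
# Minimal sketch of the K-Adapter idea: a frozen RoBERTa plus independent,
# per-knowledge adapters. Names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import RobertaModel


class Adapter(nn.Module):
    """A small bottleneck module trained for one kind of knowledge."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 128):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen model's representation intact.
        return hidden_states + self.up(self.act(self.down(hidden_states)))


class KAdapterSketch(nn.Module):
    def __init__(self, num_adapters: int = 2):
        super().__init__()
        self.backbone = RobertaModel.from_pretrained("roberta-base")
        # Freeze the pre-trained parameters: only adapters receive gradients,
        # which is what avoids catastrophic forgetting across knowledge sources.
        for p in self.backbone.parameters():
            p.requires_grad = False
        # One independent adapter per kind of knowledge (e.g. factual, linguistic);
        # they never exchange information, so each can be trained separately.
        self.adapters = nn.ModuleList(Adapter() for _ in range(num_adapters))

    def forward(self, input_ids, attention_mask=None):
        hidden = self.backbone(input_ids, attention_mask=attention_mask).last_hidden_state
        # Concatenate the frozen representation with each adapter's output
        # as the feature for downstream task heads.
        features = [hidden] + [adapter(hidden) for adapter in self.adapters]
        return torch.cat(features, dim=-1)
```

Because no gradients reach the backbone and the adapters share no parameters, each adapter can be trained on its own pre-training task (e.g. relation prediction for factual knowledge, dependency-head prediction for linguistic knowledge) and plugged in or left out at fine-tuning time.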
Keywords
models, knowledge, k-adapter, pre-trained