Coarse-to-Fine Pre-training for Named Entity Recognition

EMNLP 2020, pp. 6345–6354 (2020)


Abstract

More recently, Named Entity Recognition has achieved great advances aided by pre-training approaches such as BERT. However, current pre-training techniques focus on building language modeling objectives to learn a general representation, ignoring the named entity-related knowledge. To this end, we propose a NER-specific pre-training framework ...

Introduction
  • Named Entity Recognition (NER) is the task of discovering information entities and identifying their corresponding categories, such as mentions of people, organizations, locations, temporal and numeric expressions (Freitag, 2004).
  • Despite refreshing the state-of-the-art performance of NER, current pre-training techniques are not directly optimized for NER.
  • These models build unsupervised training objectives to capture dependencies between words and learn a general language representation (Tian et al., 2020), while rarely incorporating named entity information, which can provide rich knowledge for NER.
  • This strategy naturally introduces a natural language query that encodes significant prior knowledge about the entity (a toy query-construction sketch follows this list).
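To make the MRC-style formulation concrete, here is a minimal sketch of pairing each entity type with a natural language query. The query wordings and the build_mrc_inputs helper are illustrative placeholders, not the exact questions listed in Table 1.

```python
# Sketch: building MRC-style (query, context) pairs for NER.
# The query strings below are hypothetical examples; the paper's actual
# questions per entity type are given in Table 1.
TYPE_QUERIES = {
    "PER": "Which words refer to a person?",
    "LOC": "Which words refer to a location?",
    "ORG": "Which words refer to an organization?",
}

def build_mrc_inputs(sentence: str):
    """Pair the sentence with one query per entity type for span extraction."""
    return [(etype, query, sentence) for etype, query in TYPE_QUERIES.items()]

for etype, query, context in build_mrc_inputs("Jim bought 300 shares of Acme Corp."):
    print(f"[{etype}] Q: {query} | C: {context}")
```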
Highlights
  • Named Entity Recognition (NER) is the task of discovering information entities and identifying their corresponding categories, such as mentions of people, organizations, locations, temporal and numeric expressions (Freitag, 2004)
  • We propose a Coarse-to-Fine Entity knowledge Enhanced (CoFEE) pre-training framework for NER task, aiming to gather and utilize knowledge related to named entities
  • We explore the influence of our proposed pre-training tasks by removing entity span identification pre-training (-ESI) and fine-grained entity typing pre-training (-FET) from CoFEE-MRC (machine reading comprehension)
  • We can observe that our CoFEE-MRC pre-training performs remarkably better than MRC-NER, establishing an impressive new state of the art for supervised NER on OntoNotes and Twitter of 82.64% and 73.86%, respectively
  • We can observe that increasing the size of the gazetteers generally improves the performance of our proposed CoFEE-MRC model, and performance grows in line with the performance of “Matching”, indicating that in addition to gazetteer size, the matching degree has a crucial influence on model performance
  • We investigated coarse-to-fine entity knowledge enhanced pre-training for named entity recognition, which integrates three kinds of entity knowledge with different granularity levels
Methods
  • The authors introduce the overall framework of the coarse-to-fine pre-training.
  • As the entity information of a text is seldom explicitly studied, it is hard to expect such pre-trained general representations to capture entity-centric knowledge.
  • In order to better capture entity information and learn NER-specific representation, the authors propose the first pre-training task named Entity Span Identification (ESI).
  • By integrating the general-typed named entity knowledge into the pre-training process, the learned representation incorporates structural information that is of crucial importance for NER (a rough sketch of a span-identification pre-training head follows this list)
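As a rough illustration of the Entity Span Identification (ESI) objective described above, the sketch below adds a token-level BIO tagging head on top of encoder outputs. The BIO label scheme, hidden size, and loss masking are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class SpanIdentificationHead(nn.Module):
    """Toy ESI-style pre-training head: tag every token as O/B/I of some entity span."""

    def __init__(self, hidden_size: int, num_labels: int = 3):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_labels)
        # label -100 marks padding / unlabeled positions and is ignored by the loss
        self.loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

    def forward(self, token_states: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_size) from a BERT-style encoder
        logits = self.classifier(token_states)
        return self.loss_fn(logits.view(-1, logits.size(-1)), labels.view(-1))

# Toy usage with random "encoder" outputs and distantly derived BIO labels.
head = SpanIdentificationHead(hidden_size=768)
states = torch.randn(2, 8, 768)
labels = torch.randint(0, 3, (2, 8))
print(head(states, labels))
```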
Results
  • Following the evaluation metrics in previous work (Li et al., 2020), the authors apply the entity-level standard micro Precision (P), Recall (R), and F1 score to evaluate the results (a minimal computation sketch follows this list).

  • Table 2 contains results for models tuned on human-labeled NER data (Section 5.6, Overall Performance).
  • The authors can observe that the CoFEE-MRC pre-training performs remarkably better than MRC-NER, establishing an impressive new state of the art for supervised NER on OntoNotes and Twitter of 82.64% and 73.86%, respectively.
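For reference, here is a minimal sketch of the entity-level micro Precision/Recall/F1 mentioned above, comparing predicted and gold entities as exact (start, end, type) spans; it follows the standard definition rather than the authors' exact evaluation script.

```python
# Sketch: entity-level micro precision, recall, and F1.
# Each entity is an exact (start, end, type) tuple; counts are pooled over sentences.
def micro_prf(pred_per_sent, gold_per_sent):
    tp = fp = fn = 0
    for pred, gold in zip(pred_per_sent, gold_per_sent):
        pred, gold = set(pred), set(gold)
        tp += len(pred & gold)
        fp += len(pred - gold)
        fn += len(gold - pred)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# One correct entity, one spurious prediction, one missed entity -> P = R = F1 = 0.5
print(micro_prf([[(0, 1, "PER"), (5, 7, "ORG")]],
                [[(0, 1, "PER"), (9, 10, "LOC")]]))
```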
Conclusion
  • The authors investigated coarse-to-fine entity knowledge enhanced pre-training for named entity recognition, which integrates three kinds of entity knowledge with different granularity levels.
  • The authors' framework is highly effective and easy to implement.
  • On three popular NER benchmarks, the authors found consistent improvements over both state-of-the-art supervised and weakly-supervised methods.
  • Further analysis verifies the necessity of utilizing NER knowledge for pre-training models
Tables
  • Table 1: Natural language questions for each entity type used in our model
  • Table 2: Model performance (%) for supervised NER on three benchmark datasets. Bold marks the highest number among all models
  • Table 3: Model performance (%) for weakly supervised NER on three benchmark datasets. Bold marks the highest number among all models
Related work
Funding
  • This research is supported by the National Key Research and Development Program of China (grant No. 2016YFB0801003) and the Strategic Priority Research Program of the Chinese Academy of Sciences (grant No. XDC02040400)
Study subjects and analysis
public NER datasets: 3
Then we leverage the gazetteer-based distant supervision strategy to train the model to extract coarse-grained typed entities. Finally, we devise a self-supervised auxiliary task to mine the fine-grained named entity knowledge via clustering. Empirical studies on three public NER datasets demonstrate that our framework achieves significant improvements against several pre-trained baselines, establishing new state-of-the-art performance on three benchmarks. Besides, we show that our framework gains promising results without using human-labeled training data, demonstrating its effectiveness in label-few and low-resource scenarios.
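As a rough sketch of the gazetteer-based distant supervision step described above, the snippet below greedily longest-matches gazetteer entries against a token sequence to assign coarse-grained types; the toy gazetteer and the lowercase longest-match strategy are assumptions for illustration, not necessarily the authors' procedure.

```python
# Sketch: gazetteer-based distant supervision for coarse-grained entity types.
GAZETTEER = {                      # hypothetical toy gazetteer: surface form -> type
    ("new", "york"): "LOC",
    ("acme", "corp."): "ORG",
}

def distant_label(tokens):
    """Greedy longest-match lookup; returns [start, end) spans with coarse types."""
    lowered = [t.lower() for t in tokens]
    max_len = max(len(k) for k in GAZETTEER)
    spans, i = [], 0
    while i < len(lowered):
        for length in range(min(max_len, len(lowered) - i), 0, -1):
            key = tuple(lowered[i:i + length])
            if key in GAZETTEER:
                spans.append((i, i + length, GAZETTEER[key]))
                i += length
                break
        else:
            i += 1
    return spans

print(distant_label("Acme Corp. opened a store in New York".split()))
# -> [(0, 2, 'ORG'), (6, 8, 'LOC')]
```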

tweets: 4000
(3) Twitter is an English NER dataset (Qi et al., 2018). Following Peng et al. (2019), we only use textual information to perform NER and detect entities of the types PER, LOC and ORG. It contains 4,000 tweets for training and 3,257 tweets for testing.

Reference
  • Alan Ritter, Sam Clark, Oren Etzioni, et al. 2011. Named Entity Recognition in Tweets: An Experimental Study. In ACL, pages 1524–1534.
  • Bogdan Babych and Anthony Hartley. 2003. Improving Machine Translation Quality with Automatic Named Entity Recognition. In EAMT Workshop, pages 1–8.
  • Yixin Cao, Zikun Hu, Tat-Seng Chua, Zhiyuan Liu, and Heng Ji. 2019. Low-Resource Name Tagging Learned with Weakly Labeled Data. In EMNLP-IJCNLP, pages 261–270.
  • Olivier Chapelle, Bernhard Scholkopf, and Alexander Zien. 2009. Semi-Supervised Learning (Chapelle, O. et al., Eds.; 2006) [Book Reviews]. IEEE Transactions on Neural Networks, 20(3):542–542.
  • Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, pages 2493–2537.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL, pages 4171–4186.
  • Ruixue Ding, Pengjun Xie, Xiaoyan Zhang, Wei Lu, Linlin Li, and Luo Si. 2019. A Neural Multi-digraph Model for Chinese NER with Gazetteers. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1462–1467.
  • Dominic Seyler, Tatiana Dembelova, Luciano Del Corro, Johannes Hoffart, and Gerhard Weikum. 2018. A Study of the Importance of External Knowledge in the Named Entity Recognition Task. In ACL, pages 241–246.
  • Dayne Freitag. 2004. Trained Named Entity Recognition Using Distributional Clusters. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pages 262–269.
  • Tao Gui, Ruotian Ma, Qi Zhang, Lujun Zhao, Yu-Gang Jiang, and Xuanjing Huang. 2019a. CNN-Based Chinese NER with Lexicon Rethinking. In IJCAI, pages 4982–4988.
  • Tao Gui, Yicheng Zou, Qi Zhang, Minlong Peng, Jinlan Fu, Zhongyu Wei, and Xuanjing Huang. 2019b. A Lexicon-Based Graph Neural Network for Chinese NER. In EMNLP, pages 1039–1049.
  • Luyao Huang, Chi Sun, Xipeng Qiu, and Xuanjing Huang. 2019. GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge. In EMNLP-IJCNLP, pages 3509–3514.
  • Zhiheng Huang, Wei Xu, and Kai Yu. 2015. Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv: Computation and Language.
  • Chen Jia, Xiaobo Liang, and Yue Zhang. 2019. Cross-Domain NER Using Cross-Domain Language Modeling. In ACL, pages 2464–2474.
  • Yoav Levine, Barak Lenz, Or Dagan, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, and Yoav Shoham. 2020. SenseBERT: Driving Some Sense into BERT. In ACL.
  • Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Fei Wu, and Jiwei Li. 2020. A Unified MRC Framework for Named Entity Recognition. In ACL.
  • Bill Yuchen Lin and Wei Lu. 2018. Neural Adaptation Layers for Cross-domain Named Entity Recognition. In EMNLP, pages 2012–2022.
  • Shengyu Liu, Buzhou Tang, Qingcai Chen, and Xiaolong Wang. 2015. Effects of Semantic Features on Machine Learning-Based Drug Name Recognition Systems: Word Embeddings vs. Manually Constructed Dictionaries. Information, 6(4):848–865.
  • Tianyu Liu, Jin-Ge Yao, and Chin-Yew Lin. 2019. Towards Improving Neural Named Entity Recognition with Gazetteers. In ACL, pages 5301–5307.
  • Xuezhe Ma and Eduard Hovy. 2016. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF. In ACL, pages 1064–1074.
  • Jian Ni, Georgiana Dinu, and Radu Florian. 2017. Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection. In ACL, pages 1470–1480.
  • Minlong Peng, Xiaoyu Xing, Qi Zhang, Jinlan Fu, and Xuanjing Huang. 2019. Distantly Supervised Named Entity Recognition Using Positive-Unlabeled Learning. In ACL, pages 2409–2419.
  • Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep Contextualized Word Representations. In NAACL, pages 2227–2237.
  • Qi Zhang, Jinlan Fu, Xiaoyu Liu, and Xuanjing Huang. 2018. Adaptive Co-attention Network for Named Entity Recognition in Tweets. In AAAI, pages 5674–5681.
  • Ralph Weischedel, Sameer Pradhan, Lance Ramshaw, Martha Palmer, Nianwen Xue, Mitchell Marcus, Ann Taylor, Craig Greenberg, Eduard Hovy, Robert Belvin, et al. 2011. OntoNotes Release 4.0. LDC2011T03. Philadelphia, Penn.: Linguistic Data Consortium.
  • Cícero dos Santos and Victor Guimaraes. 2015. Boosting Named Entity Recognition with Neural Character Embeddings. In Proceedings of the Fifth Named Entity Workshop, pages 25–33.
  • Dianbo Sui, Yubo Chen, Kang Liu, Jun Zhao, and Shengping Liu. 2019. Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network. In EMNLP, pages 3821–3831.
  • Hao Tian, Can Gao, Xinyan Xiao, Hao Liu, Bolei He, Hua Wu, Haifeng Wang, and Feng Wu. 2020. SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis. arXiv preprint arXiv:2005.05635.
  • Jiateng Xie, Zhilin Yang, Graham Neubig, Noah A. Smith, and Jaime Carbonell. 2018. Neural Cross-Lingual Named Entity Recognition with Minimal Resources. In EMNLP, pages 369–379.
  • Mengge Xue, Weiming Cai, Jinsong Su, Linfeng Song, Yubin Ge, Yubao Liu, and Bin Wang. 2019a. Neural Collective Entity Linking Based on Recurrent Random Walk Network Learning. In IJCAI, pages 5327–5333.
  • Mengge Xue, Bowen Yu, Tingwen Liu, Erli Meng, and Bin Wang. 2019b. Porous Lattice Transformer Encoder for Chinese NER. arXiv: Computation and Language.
  • Yaosheng Yang, Wenliang Chen, Zhenghua Li, Zhengqiu He, and Min Zhang. 2018. Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning. In COLING, pages 2159–2169.
  • Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding. In Advances in Neural Information Processing Systems, pages 5754–5764.
  • Zhilin Yang, Ruslan Salakhutdinov, and William W. Cohen. 2017. Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks. In ICLR.
  • Bowen Yu, Zhenyu Zhang, Tingwen Liu, Bin Wang, Sujian Li, and Quangang Li. 2019. Beyond Word Attention: Using Segment Attention in Neural Relation Extraction. In IJCAI, pages 5401–5407.
  • Yue Zhang and Jie Yang. 2018. Chinese NER Using Lattice LSTM. In ACL, pages 1554–1564.
  • Zhengyan Zhang, Xu Han, Zhiyuan Liu, Xin Jiang, Maosong Sun, and Qun Liu. 2019. ERNIE: Enhanced Language Representation with Informative Entities. In ACL, pages 1441–1451.
  • Joey Tianyi Zhou, Hao Zhang, Di Jin, Hongyuan Zhu, Meng Fang, Rick Siow Mong Goh, and Kenneth Kwok. 2019. Dual Adversarial Neural Transfer for Low-Resource Named Entity Recognition. In ACL, pages 3461–3471.