Tuning Multilingual Transformers For Named Entity Recognition On Slavic Languages

Mikhail Arkhipov,Maria Trofimova,Yuri Kuratov,Alexey Sorokin

7TH WORKSHOP ON BALTO-SLAVIC NATURAL LANGUAGE PROCESSING (BSNLP'2019)（2019）

引用 17|浏览3

暂无评分

摘要

Our paper addresses the problem of multilingual named entity recognition on the material of 4 languages: Russian, Bulgarian, Czech and Polish. We solve this task using the BERT model. We use a hundred languages multilingual model as base for transfer to the mentioned Slavic languages. Unsupervised pre-training of the BERT model on these 4 languages allows to significantly outperform baseline neural approaches and multilingual BERT. Additional improvement is achieved by extending BERT with a word-level CRF layer. Our system was submitted to BSNLP 2019 Shared Task on Multilingual Named Entity Recognition and took the 1st place in 3 competition metrics out of 4 we participated in. We open-sourced NER models and BERT model pre-trained on the four Slavic languages.

查看译文

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要