Enhancing Cross-lingual Medical Concept Alignment by Leveraging Synonyms and Translations of the Unified Medical Language System.

HPCC/DSS/SmartCity/DependSys (2022)

Abstract
Well-developed medical terminology systems such as the Unified Medical Language System (UMLS) improve the ability of language models to handle medical entity linking tasks. However, such comprehensive terminology systems are available for only a few languages, such as English. For Chinese, both simplified and traditional, the lack of a well-developed terminology system remains a major obstacle to unifying Chinese medical terminologies by linking medical entities to concepts. In this study, we propose a translation-enhanced contrastive learning scheme that leverages the translations and synonyms of the UMLS to infuse knowledge into the language model, and present a cross-lingual pre-trained language model called TeaBERT that aligns Chinese and English medical synonyms at the semantic level. Compared with previous cross-lingual language models, TeaBERT significantly outperforms on the evaluation datasets, achieving Top-5 accuracies of 93.21%, 89.89%, and 76.45% on the ICD10-CN, CHPO, and RealWorld datasets respectively, and reaches new state-of-the-art performance without task-specific fine-tuning. Our contrastive learning scheme can be used not only to enhance Chinese-English medical concept alignment but also for other languages facing the same challenge.
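At its core, the scheme described above is contrastive learning over cross-lingual synonym and translation pairs mined from the UMLS. The sketch below is a minimal illustration using a standard in-batch InfoNCE objective over English-Chinese term pairs; the encoder checkpoint, [CLS] pooling, and temperature value are assumptions made for this example, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# Illustrative setup: a multilingual BERT-style encoder stands in for the
# model actually trained in the paper.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")

def encode(terms):
    """Embed a list of terms using the [CLS] vector (an assumed pooling choice)."""
    batch = tokenizer(terms, padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]  # shape: (batch, hidden)

def info_nce_loss(en_terms, zh_terms, temperature=0.05):
    """Pull each English term toward its paired Chinese synonym/translation
    and push it away from the other terms in the batch."""
    en = F.normalize(encode(en_terms), dim=-1)
    zh = F.normalize(encode(zh_terms), dim=-1)
    logits = en @ zh.T / temperature          # cosine-similarity logits
    labels = torch.arange(len(en_terms))      # matched pairs lie on the diagonal
    return F.cross_entropy(logits, labels)

# Toy pairs in the style of UMLS synonym/translation tables (illustrative only).
loss = info_nce_loss(["myocardial infarction", "hypertension"],
                     ["心肌梗死", "高血压"])
loss.backward()
```

After training with such an objective, the embeddings of a Chinese term and its English synonym land close together, which is what allows entity linking without task-specific fine-tuning.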
Keywords
NLP, pre-trained language model, cross-lingual medical entity linking, UMLS