Evaluation of Five Sentence Similarity Models on Electronic Medical Records.

BCB (2019)

Abstract
Capturing the semantic similarity between sentences plays a vital role in several primary applications in the biomedical and clinical domains: biomedical sentence search, evidence attribution, question answering, and text summarization. In this pilot study, we evaluated the effectiveness of five representative sentence similarity models in the clinical domain, ranging from traditional machine learning methods to the latest bidirectional transformers. The evaluation was performed on a dataset of over 1K sentence pairs from EMRs, the largest public dataset in this domain by far. The results show that embeddings trained on large biomedical corpora are the most effective methods. They also demonstrate that CNN and BERT are effective at capturing sentence similarity with relatively small datasets.
Keywords
Natural language processing, EHR, textual similarity