Can I find information about rare diseases in some other language?

PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) (2018)

Abstract
Natural Language Processing (NLP) is a field that combines computer science and linguistics in an attempt to artificially mimic human language understanding. This paper applies NLP in the medical domain. The research was motivated by an expert who, while reading an article about a rare disease, wanted to find related documents. Because language boundaries often limit, unnecessarily, the amount of information found, the goal of our work is to retrieve information without relying on translation methods. Semantic similarity approaches offer a framework to represent related words and sentences in a dense space. In this work, we turned to cross-lingual dense spaces to represent bilingual documents in a shared dense space. Our approach helped to retrieve both intra- and cross-lingual documents, relying on only a few parallel documents from which to infer the optimal mapping. From the experimental results we learned that it is important to keep the mapping space and the cross-lingual search space aligned. Cosine similarity outperforms both Euclidean and Manhattan distance. The results of our preliminary experiments suggest that, although there is room for improvement, our approach performs satisfactorily, achieving a P@10 of 71.72 when querying with English documents and returning related Spanish documents, and 70.80 in the opposite direction.
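To make the retrieval step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the documents have already been mapped into the shared dense space (the abstract does not say how the mapping is inferred, so that step is left out), ranks candidates with the three measures the abstract compares, and computes precision at k, conventionally the fraction of the top k retrieved documents that are relevant. The vectors, the document ids (`es_1`, ...), and the function names are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either is a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line (L2) distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def rank_documents(query, docs, measure="cosine"):
    """Return document ids ordered from most to least related to the query.

    `docs` maps a document id to its vector in the shared space.
    Similarity is negated so that one ascending sort serves all measures.
    """
    if measure == "cosine":
        def score(vec):
            return -cosine_similarity(query, vec)
    elif measure == "euclidean":
        def score(vec):
            return euclidean_distance(query, vec)
    elif measure == "manhattan":
        def score(vec):
            return manhattan_distance(query, vec)
    else:
        raise ValueError(f"unknown measure: {measure}")
    return sorted(docs, key=lambda doc_id: score(docs[doc_id]))


def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


# Hypothetical toy data: an English query vector and three Spanish
# document vectors, all assumed to live in the same shared space.
query = [1.0, 0.0]
docs = {
    "es_1": [10.0, 1.0],  # nearly the same direction as the query, large norm
    "es_2": [0.1, 0.1],   # small norm, 45 degrees away from the query
    "es_3": [0.0, 1.0],   # orthogonal to the query
}
relevant = {"es_1"}  # assumed relevance judgment, for illustration only

for measure in ("cosine", "euclidean", "manhattan"):
    ranking = rank_documents(query, docs, measure)
    print(measure, ranking, precision_at_k(ranking, relevant, k=2))
```

The toy vectors are constructed so that vector length, rather than direction, drives the two distance-based rankings; they illustrate how the measures can disagree and say nothing about the paper's reported scores.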
Keywords
Clinical text mining, Cross-lingual information-retrieval, Natural language processing