Finding Relevant Data in a Sea of Languages

Michael Coury, Elizabeth Salesky, Jennifer Drexler

(2016)

Abstract
A cross-language search engine combines language identification, machine translation, information retrieval, and query-biased summarization techniques to enable monolingual English-speaking analysts to find foreign-language documents relevant to their investigations. "About 6,000 languages are currently spoken in the world today," says Elizabeth Salesky of Lincoln Laboratory's Human Language Technology (HLT) Group. "Within the law enforcement community, there are not enough multilingual analysts who possess the necessary level of proficiency to understand and analyze content across these languages," she continues. This problem of too many languages and too few specialized analysts is one that Salesky and her colleagues are now working to solve for law enforcement agencies, and their work also has potential applications for the Department of Defense and the Intelligence Community. The research team is taking advantage of major advances in language recognition, speaker recognition, speech recognition, machine translation, and information retrieval to automate language processing tasks so that the limited number of linguists available for analyzing foreign-language text and speech can be used more efficiently. "With HLT, an equivalent of 20 times more foreign language analysts are at your disposal," says Salesky. One area in which Laboratory researchers are focusing their efforts is cross-language information retrieval (CLIR). The Cross-LAnguage Search Engine, or CLASE, is a CLIR tool developed by the HLT Group for the Federal Bureau of Investigation (FBI). CLASE is a fusion of Laboratory research in language identification, machine …