Cross-language Phonetic Similarity Measure on Terms Appeared in Asian Languages

Ohnmar Htun, Shigeaki Kodama, Yoshiki Mikami

International Conference on Intelligent Information Processing (2011)


Abstract

This study develops a method for measuring phonetic similarity across Asian languages. The method, a cross-language similarity algorithm, aggregates language-specific Romanization, International Phonetic Alphabet transcription, the Soundex algorithm, and Levenshtein distance. To evaluate the algorithm, we ran an experiment on ninety-two chemical element names in nine languages, calculating similarity scores for the names between a source language and each target language. For each language, a threshold on the similarity scores divides the names into two groups: phonetic adoption and semantic adoption. Measured by precision, recall, and F-measure, the method successfully differentiates the phonetic and semantic groups through these thresholds in all of the Asian languages tested, with the exception of Chinese. These results indicate that the method has potential for cross-language information retrieval and various linguistic studies.
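The abstract names the building blocks but not how they are combined, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes the terms are already romanized, encodes each with standard American Soundex, and scores a pair by a length-normalized Levenshtein distance between the two codes. The function names (`phonetic_similarity`, `evaluate_threshold`), the normalization, and the "score at or above threshold means phonetic adoption" rule are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: the paper's actual aggregation is not given in the abstract.

SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(word):
    """Standard American Soundex code (one letter plus three digits)."""
    letters = [c for c in word.upper() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = SOUNDEX_CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate letters that share a code; vowels do.
        if c not in "HW":
            prev = digit
    return (code + "000")[:4]


def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def phonetic_similarity(source, target):
    """Assumed score in [0, 1]: 1 minus normalized distance between Soundex codes."""
    a, b = soundex(source), soundex(target)
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - levenshtein(a, b) / longest


def evaluate_threshold(scores, is_phonetic, threshold):
    """Precision, recall, and F-measure when score >= threshold predicts 'phonetic'."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, is_phonetic))
    fp = sum(p and not y for p, y in zip(predicted, is_phonetic))
    fn = sum((not p) and y for p, y in zip(predicted, is_phonetic))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return precision, recall, f_measure
```

With an illustrative romanized pair, `phonetic_similarity("helium", "heriumu")` compares the codes `H450` and `H650` and returns 0.75. Sweeping `threshold` in `evaluate_threshold` over hand-labeled pairs is one way to look for a per-language cut-off of the kind the abstract describes.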


Keywords

languages, similarity, cross-language
