Discovering Latent Country Words: A Step Towards Cross-Cultural Emotional Communication

COLLABORATION TECHNOLOGIES AND SOCIAL COMPUTING (CRIWG+COLLABTECH 2019)(2019)

Abstract
Knowing which concepts are substantial to each country can help enhance emotional communication between two countries. As a concrete example of identifying substantial country concepts, we focus on the task of finding latent country words in cross-cultural texts from two countries. We do this by combining word embedding and tensor decomposition: common words that appear in both countries' texts are selected; their country-specific word embeddings are learned; a three-way tensor consisting of a word factor, a word embedding factor, and a country factor is constructed; and CANDECOMP/PARAFAC decomposition is performed on the three-way tensor while fixing the country factor values of the decomposed result. We tested our method on a motivating example of finding latent country words in J-pop lyrics from Japan and K-pop lyrics from South Korea. We found that J-pop lyrics words feature nature-related motifs such as 'petal', 'cloud', 'universe', 'star', and 'sky', whereas K-pop lyrics words highlight human-body-related motifs such as 'style', 'shirt', 'head', 'foot', and 'skin'.
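
The decomposition step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain alternating least squares (ALS) solver written in NumPy, random numbers standing in for the learned country-specific embeddings, placeholder tokens `w0`..`w5` as the vocabulary, and a hypothetical fixed country factor `C` with one component per country plus one shared component. The abstract does not state the rank or the fixed values actually used.

```python
import numpy as np

def cp_als_fixed_country(X, C, n_iter=50, seed=0):
    """CP-ALS on a (words x dims x countries) tensor with the country factor C held fixed."""
    rng = np.random.default_rng(seed)
    n_words, n_dims, _ = X.shape
    rank = C.shape[1]
    A = rng.standard_normal((n_words, rank))  # word factor
    B = rng.standard_normal((n_dims, rank))   # embedding-dimension factor
    for _ in range(n_iter):
        # Least-squares update of the word factor with B and C fixed.
        gram = (B.T @ B) * (C.T @ C)
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv(gram)
        # Least-squares update of the embedding factor with A and C fixed.
        gram = (A.T @ A) * (C.T @ C)
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv(gram)
        # C is never updated: the country factor stays at its fixed values.
    return A, B

def relative_error(X, A, B, C):
    """Relative Frobenius error of the rank-R CP reconstruction."""
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# Placeholder vocabulary and random "embeddings" (assumptions, not real data).
vocab = [f"w{i}" for i in range(6)]
rng = np.random.default_rng(42)
X = rng.standard_normal((len(vocab), 8, 2))  # words x embedding dims x 2 countries

# Hypothetical fixed country factor: component 0 is active only for country 0,
# component 1 only for country 1, and component 2 is shared by both.
C = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

A, B = cp_als_fixed_country(X, C)

# Rank words by loading magnitude on the country-0-only component
# (using magnitude is an assumption, since CP components have a sign ambiguity).
top_country0 = [vocab[i] for i in np.argsort(-np.abs(A[:, 0]))[:3]]
print(top_country0, relative_error(X, A, B, C))
```

Because `C` is fixed, each component keeps a known country interpretation, so words with large loadings on a single-country component can be read as candidate latent words for that country.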
Keywords
Cross-cultural text analysis, Tensor decomposition, Word embedding