A Vietnamese Language Model Based On Recurrent Neural Network

2016 EIGHTH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE)(2016)

Abstract
Language modeling plays a critical role in many natural language processing (NLP) tasks such as text prediction, machine translation, and speech recognition. Traditional statistical language models (e.g., n-gram models) can only predict words that have been seen before and cannot capture long word contexts. Neural language models offer a promising way to overcome these shortcomings of statistical language models. This paper investigates Recurrent Neural Network (RNN) language models for Vietnamese, at the character and syllable levels. Experiments were conducted on a large dataset of 24M syllables, constructed from 1,500 movie subtitles. The experimental results show that our RNN-based language models yield reasonable performance on the movie subtitle dataset. Concretely, our models outperform n-gram language models in terms of perplexity score.
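The abstract evaluates models by perplexity at the character level. As a minimal sketch (not the authors' implementation), the following shows the forward pass of a character-level RNN language model and how per-character perplexity is computed from its next-character probabilities; the toy vocabulary, hidden size, and random weights are illustrative assumptions, and training (backpropagation through time) is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy character vocabulary; a real model would be built from the corpus.
chars = sorted(set("xin chao"))
char_to_id = {c: i for i, c in enumerate(chars)}
V, H = len(chars), 16                    # vocabulary size, hidden size (assumed)

# Randomly initialised parameters; training is omitted in this sketch.
Wxh = rng.normal(0, 0.1, (H, V))
Whh = rng.normal(0, 0.1, (H, H))
Why = rng.normal(0, 0.1, (V, H))
bh = np.zeros(H)
by = np.zeros(V)

def softmax(z):
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def perplexity(text):
    """Per-character perplexity of `text` under the RNN: exp of the
    average negative log-probability assigned to each next character."""
    h = np.zeros(H)
    total_log_prob = 0.0
    for prev, nxt in zip(text[:-1], text[1:]):
        x = np.zeros(V)
        x[char_to_id[prev]] = 1.0                    # one-hot input character
        h = np.tanh(Wxh @ x + Whh @ h + bh)          # recurrent state update
        p = softmax(Why @ h + by)                    # next-character distribution
        total_log_prob += np.log(p[char_to_id[nxt]])
    return float(np.exp(-total_log_prob / (len(text) - 1)))

print(perplexity("xin chao"))
```

With untrained random weights, the perplexity stays close to the vocabulary size (a near-uniform distribution); training lowers it, which is the metric on which the paper compares RNN and n-gram models.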
Keywords
Vietnamese language model, natural language processing tasks, NLP, statistical language models, neural language model, RNN-based language models, recurrent neural network language model, movie subtitle dataset