ArbEngVec: Arabic-English Cross-Lingual Word Embedding Model

Raki Lachraf, El Moatez Billah Nagoudi, Youcef Ayachi, Ahmed Abdelali, Didier Schwab

Fourth Arabic Natural Language Processing Workshop (WANLP 2019), 2019

Abstract

Word embeddings (WE) are increasingly popular and widely applied in Natural Language Processing (NLP) because they capture semantic properties of words effectively; Machine Translation (MT), Information Retrieval (IR), and Information Extraction (IE) are among the areas that use them. In this paper, we propose ArbEngVec, an open-source resource that provides several Arabic-English cross-lingual word embedding models. To train our bilingual models, we use a large dataset of more than 93 million Arabic-English parallel sentence pairs. In addition, we perform both extrinsic and intrinsic evaluations of the different word embedding model variants. The extrinsic evaluation assesses model performance on cross-language Semantic Textual Similarity (STS), while the intrinsic evaluation is based on the Word Translation (WT) task.
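
As a rough illustration of what a Word Translation evaluation can look like in a shared cross-lingual embedding space, the sketch below retrieves the nearest target-language word by cosine similarity and scores precision@k against a gold dictionary. This is not the authors' evaluation code: the function names (`cosine`, `translate`, `precision_at_k`), the three-dimensional toy vectors, and the transliterated Arabic keys are all assumptions made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def translate(word, src_vecs, tgt_vecs, k=1):
    """Return the k target words closest to `word` in the shared space."""
    query = src_vecs[word]
    ranked = sorted(tgt_vecs, key=lambda t: cosine(query, tgt_vecs[t]), reverse=True)
    return ranked[:k]

def precision_at_k(gold, src_vecs, tgt_vecs, k=1):
    """Fraction of source words whose gold translation is in the top k."""
    hits = sum(1 for s, t in gold.items() if t in translate(s, src_vecs, tgt_vecs, k))
    return hits / len(gold)

# Toy, hand-made vectors (hypothetical; real models use hundreds of dimensions).
# Arabic words are transliterated here purely for readability.
ar_vecs = {"kitab": [0.9, 0.1, 0.0], "qitta": [0.0, 0.2, 0.9]}
en_vecs = {"book": [1.0, 0.0, 0.1], "cat": [0.1, 0.1, 1.0], "house": [0.0, 1.0, 0.0]}
gold = {"kitab": "book", "qitta": "cat"}

score = precision_at_k(gold, ar_vecs, en_vecs, k=1)
print(translate("kitab", ar_vecs, en_vecs), score)
```

With real embeddings, the same nearest-neighbour lookup would run over the full target vocabulary, typically with a vectorized library rather than pure Python.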
