$f_{BGD}$: Learning Embeddings From Positive-Only Data with BGD

Uncertainty in Artificial Intelligence (2018)

Abstract
Learning embeddings from sparse positive-only data is a fundamental task in several domains, such as natural language processing (NLP), computer vision (CV), and information retrieval (IR). By far, the most widely used optimization methods rely on stochastic gradient descent (SGD) with negative sampling (NS), particularly for learning from large-scale data. However, the convergence and effectiveness of SGD depend heavily on the sampling distribution of negative examples. Moreover, SGD suffers from dramatic fluctuation due to its one-sample-at-a-time learning scheme. To address these common issues of existing embedding methods, we present a generic batch gradient descent optimizer ($f_{BGD}$) that learns embeddings from \emph{all} training examples without sampling. Our main contribution is that we accelerate $f_{BGD}$ by several orders of magnitude, bringing its time complexity down to the same level as NS-based SGD. We evaluate $f_{BGD}$ on three well-known tasks across domains, namely word embedding (NLP), image classification (CV), and item recommendation (IR). Experiments show that $f_{BGD}$ significantly outperforms NS-based SGD models on all three tasks with comparable efficiency. Code will be made available.
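The acceleration the abstract alludes to comes from reorganizing the full-data gradient so that the sum over all unobserved pairs collapses into a small cached matrix. Below is a minimal NumPy sketch of that idea for the item-recommendation setting; it is an illustration under assumptions, not the authors' released implementation. In particular, the uniform weight `w0` on unobserved entries, the squared loss, and the names `fbgd_epoch`, `P`, `Q` are assumptions made for the sketch.

```python
# Minimal sketch (assumed setup): matrix factorization with squared loss,
# weight 1 on observed pairs and a uniform weight w0 on ALL unobserved pairs.
# The k x k cache Sq = Q^T Q lets the gradient over every unobserved item be
# computed in O(k^2) per user instead of O(|I| k), which is the kind of
# speed-up the abstract refers to.
import numpy as np

def fbgd_epoch(P, Q, observed, ratings, w0=0.01, lr=0.05):
    """One full-batch gradient step on the user embeddings P.

    P: (n_users, k) user embeddings
    Q: (n_items, k) item embeddings
    observed: dict user -> array of observed item indices
    ratings: dict user -> array of observed ratings (1.0 for implicit data)
    """
    n_users, _ = P.shape
    # Cache over *all* items, computed once per epoch in O(|I| k^2).
    Sq = Q.T @ Q

    grad_P = np.zeros_like(P)
    for u in range(n_users):
        items = observed.get(u, np.array([], dtype=int))
        Qu = Q[items]                          # (|R_u|, k) observed item vectors
        ru = ratings.get(u, np.zeros(0))
        pred = Qu @ P[u]                       # predictions on observed pairs
        # Observed part (weight 1) minus the w0 already counted in the cache,
        # plus the cached term that covers every item (observed or not).
        grad_obs = Qu.T @ ((1.0 - w0) * pred - ru)
        grad_all = w0 * (Sq @ P[u])
        grad_P[u] = 2.0 * (grad_obs + grad_all)

    P -= lr * grad_P
    return P
```

In this sketch the item embeddings would be updated symmetrically with a cache over users; the point is only that no negative sampling is needed because the whole-data term never requires iterating over individual unobserved pairs.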