Accelerating Recommender Model Training by Dynamically Skipping Stale Embeddings
CoRR (2024)
Abstract
Training recommendation models poses significant challenges regarding resource
utilization and performance. Prior research has proposed an approach that
categorizes embeddings into popular and non-popular classes to reduce the
training time for recommendation models. We observe that, even among the
popular embeddings, certain embeddings undergo rapid training and exhibit
minimal subsequent variation, resulting in saturation. Consequently, updates to
these embeddings lack any contribution to model quality. This paper presents
Slipstream, a software framework that identifies stale embeddings on the fly
and skips their updates to enhance performance. This capability enables
Slipstream to achieve substantial speedup, optimize CPU-GPU bandwidth usage,
and eliminate unnecessary memory access. Slipstream demonstrates training time
reductions of 2x, 2.4x, 1.2x, and 1.175x across real-world datasets and
configurations, compared to Baseline XDL, Intel-optimized DLRM, FAE, and
Hotline, respectively.
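
The core mechanism, detecting embedding rows that have stopped changing and skipping their updates, can be sketched in a few lines. Below is a minimal, hypothetical PyTorch illustration (the class name, threshold, and gradient-masking strategy are assumptions for exposition, not Slipstream's actual implementation): it periodically snapshots embedding rows, flags rows whose change since the last snapshot falls below a threshold as stale, and zeroes their gradients so the optimizer leaves them untouched.

import torch

# Hypothetical sketch of threshold-based staleness filtering; not the
# paper's actual API. Rows whose weights barely move between checks are
# marked stale and excluded from subsequent gradient updates.
class StaleEmbeddingFilter:
    def __init__(self, embedding: torch.nn.Embedding, threshold: float = 1e-3):
        self.embedding = embedding
        self.threshold = threshold
        # Snapshot of every row at the previous check, plus a staleness mask.
        self.prev = embedding.weight.detach().clone()
        self.stale = torch.zeros(
            embedding.num_embeddings,
            dtype=torch.bool,
            device=embedding.weight.device,
        )

    def refresh(self) -> None:
        # Mark rows whose L2 change since the last snapshot is below threshold.
        delta = (self.embedding.weight.detach() - self.prev).norm(dim=1)
        self.stale |= delta < self.threshold
        self.prev.copy_(self.embedding.weight.detach())

    def mask_gradients(self) -> None:
        # Zero the gradients of stale rows so the optimizer skips them,
        # saving update bandwidth and memory traffic for those rows.
        if self.embedding.weight.grad is not None:
            self.embedding.weight.grad[self.stale] = 0.0

In a training loop, refresh() would run every few iterations, and mask_gradients() would run after backward() and before optimizer.step(), so stale rows stop consuming update bandwidth while active rows train normally.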