Data Driven Resource Allocation for Distributed Learning

AAAI Workshops(2015)

引用 15|浏览193
暂无评分
摘要
In distributed machine learning, data is dispatched to multiple machines for processing. Motivated by the fact that similar data points are often belonging to the same or similar classes, and more generally, classification rules of high accuracy tend to be "locally simple but globally complex", we propose data dependent dispatching that takes advantage of such structures. Our main technical contribution is to provide algorithms with provable guarantees for data-dependent dispatching, that partition the data in a way that satisfies important conditions for accurate distributed learning, including fault tolerance and balancedness. We show the effectiveness of our method over the widely used random partitioning scheme in several real world image and advertising datasets.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要