Graphene: Packing And Dependency-Aware Scheduling For Data-Parallel Clusters

OSDI'16: Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation(2016)

引用 251|浏览285
暂无评分
摘要
We present a new cluster scheduler, GRAPHENE, aimed at jobs that have a complex dependency structure and heterogeneous resource demands. Relaxing either of these challenges, i.e., scheduling a DAG of homogeneous tasks or an independent set of heterogeneous tasks, leads to NP-hard problems. Reasonable heuristics exist for these simpler problems, but they perform poorly when scheduling heterogeneous DAGs. Our key insights are: (1) focus on the long-running tasks and those with tough to -pack resource demands, (2) compute a DAG schedule, offline, by first scheduling such troublesome tasks and then scheduling the remaining tasks without violating dependencies. These offline schedules are distilled to a simple precedence order and are enforced by an online component that scales to many jobs. The online component also uses heuristics to compactly pack tasks and to trade-off fairness for faster job completion. Evaluation on a zoo-server cluster and using traces of production DAGs at Microsoft, shows that GRAPHENE improves median job completion time by 25% and cluster throughput by 30%.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要