Robust and Probabilistic Failure-Aware Placement
ACM Transactions on Parallel Computing, 2018.
Abstract:
Motivated by the growing complexity and heterogeneity of modern data centers, and the prevalence of commodity component failures, this article studies the failure-aware placement problem of placing tasks of a parallel job on machines in the data center with the goal of increasing availability. We consider two models of failures: adversari...