Automated HPC Workload Generation Combining Statistical Modeling and Autoregressive Analysis

BENCHMARKING, MEASURING, AND OPTIMIZING, BENCH 2023(2024)

Abstract
Understanding workload characteristics is essential to improving the management of a High Performance Computing (HPC) cluster. However, because of privacy and confidentiality restrictions, real HPC workloads are rarely available for study. Generating synthetic workloads that mimic real ones can facilitate related research, such as cluster planning and scheduling, so automated HPC workload generation has long been an active research topic. In this paper, we introduce a workload modeling approach that combines statistical modeling and autoregressive analysis. The resulting model can generate complex, realistic HPC workloads whose features describe the scheduling process: job arrival time and the other job attributes that affect scheduling, such as job run time and requested resources. Job arrivals in HPC clusters are generally represented by stochastic processes. In our approach, job arrivals are generated by a statistical model that consists of multiple Poisson processes with constraints provided by a Gamma distribution. We then perform autoregressive analysis on the changing trends of job attributes to extract sequence information from historical workloads; this information reflects user habits and scheduling habits in the cluster. For each job in the generated arrival sequence, our approach generates job attributes based on the extracted sequence information. We evaluate the proposed approach using multiple metrics as well as a real-world use case. Experiments on real workloads from four supercomputing centers validate the effectiveness of the proposed method.
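The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the Gamma distribution constrains the Poisson processes or which autoregressive model is used, so everything concrete here is an assumption made for illustration: one homogeneous Poisson process per hourly window with its rate drawn from a Gamma distribution, an AR(1) model on log run time, and all function names and parameter values (`generate_arrivals`, `generate_log_runtimes`, `shape`, `scale`, `phi`, `sigma`) are hypothetical.

```python
import math
import random


def generate_arrivals(n_windows, window_len, shape, scale, rng):
    """Arrival times from one homogeneous Poisson process per window.

    Assumption: each window's rate is drawn from Gamma(shape, scale).
    This is one possible reading of the abstract, not the paper's
    exact construction.
    """
    arrivals = []
    for w in range(n_windows):
        rate = rng.gammavariate(shape, scale)  # jobs per second in this window
        start = w * window_len
        end = start + window_len
        t = start + rng.expovariate(rate)  # exponential inter-arrival gap
        while t < end:
            arrivals.append(t)
            t += rng.expovariate(rate)
    return arrivals


def generate_log_runtimes(n_jobs, mean, phi, sigma, rng):
    """AR(1) sequence: x_t = mean + phi * (x_{t-1} - mean) + noise."""
    values = []
    prev = mean
    for _ in range(n_jobs):
        prev = mean + phi * (prev - mean) + rng.gauss(0.0, sigma)
        values.append(prev)
    return values


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Stage 1: arrival sequence over 24 one-hour windows (illustrative values).
arrivals = generate_arrivals(
    n_windows=24, window_len=3600.0, shape=2.0, scale=0.005, rng=rng
)

# Stage 2: one attribute (run time in seconds) per generated arrival.
log_runtimes = generate_log_runtimes(
    len(arrivals), mean=math.log(600.0), phi=0.8, sigma=0.5, rng=rng
)
jobs = [(t, math.exp(x)) for t, x in zip(arrivals, log_runtimes)]
```

Modeling the logarithm of run time keeps every generated run time positive. In practice the Gamma parameters and the autoregressive coefficients would be fitted to a historical trace rather than chosen by hand.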
Keywords
Workload generation, Workload characterization, Cluster scheduling, Statistical modeling, Autoregressive analysis