Optimizing HPC I/O Performance with Regression Analysis and Ensemble Learning

2023 IEEE International Conference on Cluster Computing (CLUSTER), 2023

Abstract
Improving parallel I/O performance requires optimizing the tunable parameters across the different layers of the I/O software stack. Finding an optimal configuration for different scenarios is hampered by the complex interactions among these parameters and the large parameter space. Previous research efforts have tuned these parameters with single, independent algorithms; however, such approaches suffer from unstable performance results and slow convergence. This paper introduces OPRAEL, an auto-tuning approach for parallel I/O tasks that combines ensemble learning with regression-based performance modeling. To evaluate its effectiveness, we applied the approach on the Tianhe-II supercomputer using a well-known I/O benchmark (IOR) and two I/O kernels (S3D-I/O and BT-I/O), leveraging predictive modeling to tune the I/O stack parameters. Our experimental results show a 10.2X speedup in write performance for the optimization task with BT-I/O and a 500x500x500 input. We also compared a single search algorithm against a reinforcement-learning-based search for I/O parameter auto-optimization; OPRAEL outperforms the traditional approach, achieving up to an 8.4X improvement in write performance for the 128-process IOR optimization.
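
To make the general idea concrete, the following is a minimal, hypothetical Python sketch of regression-based performance modeling with an ensemble learner for I/O parameter tuning: a random-forest regressor is trained on a few measured configurations and then used as a cheap surrogate to rank untried configurations. This is not the OPRAEL implementation; the parameter names, value ranges, and the synthetic measurement function are illustrative assumptions only.

```python
# Minimal illustrative sketch (not the OPRAEL implementation): use an ensemble
# regressor as a surrogate model of parallel-I/O write bandwidth over tunable
# stack parameters, then rank untried configurations with it.
# All parameter names, value ranges, and the measurement stand-in below are
# hypothetical assumptions for this sketch.
from itertools import product

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical tunables spanning the I/O stack (e.g., file-system striping and
# MPI-IO collective buffering settings).
param_grid = {
    "stripe_count": [4, 8, 16, 32],
    "stripe_size_mb": [1, 4, 16, 64],
    "cb_nodes": [2, 4, 8],
    "cb_buffer_size_mb": [4, 16, 64],
}
names = list(param_grid)
candidates = np.array(list(product(*param_grid.values())), dtype=float)

rng = np.random.default_rng(0)

def measure_write_bandwidth(config):
    """Stand-in for a real benchmark run (e.g., IOR); returns a synthetic
    bandwidth value so the sketch runs end to end."""
    return (
        50.0 * np.log1p(config["stripe_count"])
        + 10.0 * np.sqrt(config["stripe_size_mb"])
        + 5.0 * config["cb_nodes"]
        + rng.normal(0.0, 5.0)
    )

# Benchmark a small random sample of configurations to obtain training data.
train_idx = rng.choice(len(candidates), size=24, replace=False)
X_train = candidates[train_idx]
y_train = np.array(
    [measure_write_bandwidth(dict(zip(names, row))) for row in X_train]
)

# Fit the ensemble model and use it as a cheap surrogate for real runs.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Pick the configuration with the highest predicted bandwidth; in practice
# this candidate would be validated with an actual benchmark run.
best = dict(zip(names, candidates[np.argmax(model.predict(candidates))]))
print("predicted-best configuration:", best)
```

In a real tuning loop, the surrogate's top candidates would be re-measured on the target system and fed back into the training set, so the model improves as the search proceeds.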
Keywords
HPC, Parallel I/O, Performance Optimization, Auto-tuning, Ensemble Learning