A Configurable Model-Based Reinforcement Learning Framework for Disaggregated Storage Systems.

IEEE Access(2023)

引用 0|浏览5
暂无评分
摘要
With the rapid growth of data-intensive jobs and the use of different hardware in storage, disaggregated storage architecture systems are being used to improve the operational cost efficiency of data centers. The hardware heterogeneity and mixed configurations of disaggregated storage systems, along with the diversity of workloads, often make it difficult for administrators to operate them optimally. In this work, we investigate model-based reinforcement learning (RL) schemes to develop automated system operations and maintain the storage performance across various system settings and workloads in self-managed storage systems. Specifically, we propose a novel configurable model structure in which a system environment is abstracted with a two-level hierarchy of storage devices and a platform and thus the environment can be reconfigured according to a given system specification. Using that novel model structure, we implement a configurable model-based RL framework CoMoRL by which RL agents are trained through model variants that represent a variety of storage system specifications; thus, their learned management policy can be highly robust to the diverse operation conditions of real-world storage systems. We evaluate our CoMoRL framework using a storage cluster that relies on NVMe-oF devices and demonstrate that the framework can be adapted to different scenarios such as volume placement scenarios with Kubernetes and primary affinity control scenarios with Ceph. The learned management policy outperforms an IOPS-based heuristic method and a model-based method by 0.7%similar to 5.1% and 11.8%similar to 29.7%, respectively, for various Kubernetes system specifications, and by 1.6%similar to 5.6% and 8.2%similar to 16.5%, respectively, for various Ceph system specifications, without requiring model and policy retraining. This zero-shot adaptation superiority of our framework makes it possible to realize RL-based self-managing storage systems in data centers with frequent system changes.
更多
查看译文
关键词
Model-based reinforcement learning,configurable model,meta learning,policy adaptation,data placement,disaggregated storage,heterogeneous storage
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要