DRAM Characterization under Relaxed Refresh Period Considering System Level Effects within a Commodity Server

2018 IEEE 24th International Symposium on On-Line Testing And Robust System Design (IOLTS) (2018)

Abstract
Today's rapid generation of data and the increased need for higher memory capacity have triggered many studies on aggressive scaling of the refresh period, which is currently set according to rare worst-case conditions. Such studies analysed in detail the data-dependent circuit-level factors and indicated the need for online DRAM characterization due to the variable cell retention time. They did so by executing a few test data patterns on FPGAs under controlled temperatures using thermal testbeds, which, however, are not available in the field. Moreover, the existing studies were not able to reveal any system-level effects, which may be excited by the execution of workloads on real systems and directly or indirectly affect DRAM reliability. In this paper, we develop an experimental framework based on a state-of-the-art 64-bit ARM-based server with Linux OS, in which we enable DRAM characterization under a relaxed refresh period by executing conventional test data patterns as well as popular HPC and Cloud workloads. Our results indicate that common test patterns are ineffective in identifying error-prone locations at low DRAM temperatures. Furthermore, we reveal that there is a strong correlation between SOC utilization and DRAM reliability. By exploiting these findings, we developed a benchmark that can indirectly raise the DRAM temperature and can thus be used for characterization in the field without any complicated thermal equipment. Our study shows that the refresh period can be relaxed by a factor of 35 on such a commodity system, with all errors being corrected by the available error-correcting codes, resulting in 11.5% power savings on average.
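
To give a sense of scale for the 35-fold relaxation, the following minimal sketch computes the resulting refresh period and per-command refresh interval. The abstract does not state the nominal values, so the 64 ms nominal refresh period and the 8192 refresh commands per period are assumptions taken from typical DDR3/DDR4 settings. The function name and parameters are hypothetical and not part of the paper's framework.

```python
def relaxed_refresh(nominal_period_ms=64.0, factor=35.0, refresh_commands=8192):
    """Return refresh timing before and after relaxing the refresh period.

    nominal_period_ms and refresh_commands are ASSUMED typical DDR3/DDR4
    values; the abstract only reports the relaxation factor (35).
    """
    relaxed_period_ms = nominal_period_ms * factor
    # Average interval between refresh commands, in microseconds.
    nominal_interval_us = nominal_period_ms * 1000.0 / refresh_commands
    relaxed_interval_us = relaxed_period_ms * 1000.0 / refresh_commands
    return {
        "relaxed_period_ms": relaxed_period_ms,
        "nominal_interval_us": nominal_interval_us,
        "relaxed_interval_us": relaxed_interval_us,
    }


timing = relaxed_refresh()
print(timing)
```

Under these assumed nominal values, the relaxed refresh period is on the order of a couple of seconds rather than tens of milliseconds. The exact settings used on the server in the paper may differ.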
Keywords
commodity server,data-dependent circuit level factors,online DRAM characterization,variable cell retention time,thermal testbeds,DRAM reliability,DRAM temperature,memory capacity,relaxed refresh period,ARM based server,system level effects,Linux OS,FPGA,SOC utilization