Leveraging ECC to Mitigate Read Disturbance, False Reads and Write Faults in STT-RAM

2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)(2016)

引用 25|浏览88
暂无评分
摘要
Designing reliable systems using scaled Spin-Transfer Torque Random Access Memory (STT-RAM) has become a significant challenge as the memory technology feature size is scaled down. The introduction of a more prominent read disturbance is a key contributor in this reliability challenge. However, techniques to address read disturbance are often considered in a vacuum that assumes other concerns like transient read errors (false reads) and write faults do not occur. This paper studies several techniques that leverage ECC to mitigate persistent errors resulting from read disturbance and write faults of STT-RAM while still considering the impact of transient errors of false reads. In particular, we study three policies to enable better-than-conservative read disturbance mitigation. The first policy, write after error (WAE), uses ECC to detect errors and write back data to clear persistent errors. The second policy, write after persistent error (WAP), filters out false reads by reading a second time when an error is detected leading to trade-off between write and read energy. The third policy, write after error threshold (WAT), leaves cells with incorrect data behind (up to a threshold) when the number of errors is less than the ECC capability. To evaluate the effectiveness of the different schemes and compare with the simple previously proposed scheme of writing after every read (WAR), we model these policies using Markov processes. This approach allows the determination of appropriate bit error rates in the context of both persistent and transient errors to accurately estimate the system reliability and the energy consumption of different error correction approaches. Our evaluations show that each of these policies provides benefits for different error scenarios. Moreover some approaches can save energy by an average of 99.5%, while incurring the same reliability as other approaches.
更多
查看译文
关键词
STT-RAM,Markov Model,Persistent Error,Reliability,ECC
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要