Improving Reliability of Soft Real-Time Embedded Systems on Integrated CPU and GPU Platforms

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2020)

Abstract
Multiprocessor systems on a chip consisting of integrated CPUs and GPUs are suitable platforms for real-time embedded applications requiring massively parallel processing. For such applications, lifetime reliability due to permanent faults and soft-error reliability due to transient faults are major concerns. Detailed execution profiling has revealed that a CUDA task’s CPU execution time significantly increases if the task executes on a different core than the operating system (OS). Based on this observation, an extended task model is introduced to consider the execution time dependencies among tasks and the OS. A hybrid framework is proposed to improve soft-error reliability while satisfying a lifetime reliability constraint for soft real-time systems executing on integrated CPU and GPU platforms. This framework: 1) reduces the total utilization of cores and improves soft-error reliability via off-line task mapping; 2) achieves a higher lifetime reliability through task migration at run time; and 3) improves soft-error reliability by dynamically scaling frequencies of CPU and GPU cores. The experimental results show that the proposed framework leads to a system that can execute without soft errors for at least 4 days (4 times) and 6 days (6 times) longer, on average, than existing approaches.
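The dependency between a task's CPU execution time and the core that hosts the OS can be illustrated with a minimal sketch. It is not the paper's task model or mapping algorithm: the `Task` fields, the `core_utilizations` helper, and all timing values are hypothetical, chosen only to show why co-locating tasks with the OS can lower total core utilization.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All fields are illustrative assumptions, not the paper's notation.
    name: str
    period_ms: float
    cpu_time_with_os_ms: float  # CPU execution time on the core that runs the OS
    cpu_time_off_os_ms: float   # longer CPU execution time on any other core


def core_utilizations(tasks, mapping, os_core, num_cores):
    """Return per-core utilization for a given task-to-core mapping."""
    util = [0.0] * num_cores
    for task in tasks:
        core = mapping[task.name]
        # The execution time depends on whether the task shares the OS core.
        if core == os_core:
            exec_time = task.cpu_time_with_os_ms
        else:
            exec_time = task.cpu_time_off_os_ms
        util[core] += exec_time / task.period_ms
    return util


# Hypothetical task set: two periodic tasks on a two-core CPU, OS on core 0.
tasks = [
    Task("a", period_ms=100.0, cpu_time_with_os_ms=10.0, cpu_time_off_os_ms=15.0),
    Task("b", period_ms=50.0, cpu_time_with_os_ms=5.0, cpu_time_off_os_ms=10.0),
]
co_located = core_utilizations(tasks, {"a": 0, "b": 0}, os_core=0, num_cores=2)
spread = core_utilizations(tasks, {"a": 0, "b": 1}, os_core=0, num_cores=2)
print("co-located total utilization:", sum(co_located))
print("spread total utilization:", sum(spread))
```

In this toy setting, placing both tasks on the OS core yields a lower total utilization than spreading them across cores. An actual mapping would also have to respect per-core schedulability and the lifetime reliability constraint, which this sketch ignores.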
Keywords
Graphics processing units,Task analysis,Central Processing Unit,Real-time systems,Circuit faults,Integrated circuit reliability