Data-driven modeling of reconfigurable multi-accelerator systems under dynamic workloads

Microprocessors and Microsystems (2024)

Abstract
Reconfigurable multi-accelerator systems used as computation offloading platforms in edge-cloud continuum scenarios usually have to deal with highly dynamic workloads and operating conditions. To take full advantage of their parallel processing capabilities and increase execution performance for a given workload, these systems need to continuously adapt their configuration (i.e., number and type of accelerators) at run time. When working at the edge, additional requirements such as energy efficiency must also be met. In this paper, Machine Learning techniques are applied to extract predictive models of the execution of different combinations of hardware accelerators on a reconfigurable multi-accelerator platform, with the aim of satisfying these continuous optimization needs. One of the key benefits of the proposed approach is that its data-driven models can transparently estimate the impact of the complex interactions between hardware accelerators caused by run-time resource contention among them and with the rest of the system. Traditional modeling approaches (e.g., analytical models) cannot include that information in an easy and scalable way. The proposed models are complemented with a complete infrastructure to generate, execute and monitor dynamic workloads in FPGA-based systems. This infrastructure has been used to (i) quantitatively analyze resource contention in reconfigurable multi-accelerator systems and (ii) produce the training and evaluation datasets for the ML-based models using annotated power consumption and execution performance traces. Experimental results obtained with a reconfigurable multi-accelerator platform based on the ARTICo3 framework running the MachSuite benchmarks show that the proposed modeling approach is highly effective, with a relative prediction error of less than 5% on average for both power consumption and execution performance. Results also show that the ML-based models achieve high accuracy levels when predicting the impact of resource contention and accelerator interaction on both metrics, with a mean relative prediction error of less than 0.6% and a standard deviation below 4.7% for the worst case.
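The accuracy figures above are relative prediction errors. The abstract does not give the exact definition used, so the following is only a minimal sketch, assuming a signed per-sample relative error (predicted minus measured, divided by measured) summarized by its mean and standard deviation. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def relative_errors(predicted, measured):
    """Signed relative error, in percent, for each (prediction, measurement) pair.

    Assumption: error = (predicted - measured) / measured. The paper may use
    a different definition (e.g. absolute values).
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    return [100.0 * (p - m) / m for p, m in zip(predicted, measured)]


def summarize(predicted, measured):
    """Mean signed error, mean absolute error, and population std, in percent."""
    errs = relative_errors(predicted, measured)
    return {
        "mean_signed_pct": mean(errs),
        "mean_abs_pct": mean(abs(e) for e in errs),
        "std_pct": pstdev(errs),
    }


# Hypothetical power measurements in watts (illustrative only).
measured_power_w = [2.0, 4.0, 5.0]
predicted_power_w = [2.1, 3.8, 5.0]

stats = summarize(predicted_power_w, measured_power_w)
print(stats)
```

With a signed error, over- and under-predictions cancel in the mean, which is one way a mean well below 1% can coexist with a standard deviation of several percent. The mean absolute error is reported alongside it for comparison.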
Keywords
Multi-accelerator systems, Reconfigurable computing, Dynamic workloads, Run-time monitoring, Machine learning, System modeling