Towards Real-Time Inference Offloading with Distributed Edge Computing: the Framework and Algorithms

IEEE Transactions on Mobile Computing (2023)

Abstract
By combining edge computing and parallel computing, distributed edge computing has emerged as a new paradigm for exploiting the booming IoT devices at the edge. To accelerate computation at the edge, i.e., the inference tasks of DNN-driven applications, the parallelism of both computation and communication must be considered, which motivates the problem of Minimum Latency joint Communication and Computation Scheduling (MLCCS). However, existing works rely on rigid assumptions that the communication time of each device is fixed and that the workload can be split arbitrarily small. To make the work more practical and general, this paper studies the MLCCS problem without these assumptions. First, the MLCCS problem under a general model is formulated and proved to be NP-hard. Second, a pyramid-based computing model is proposed that jointly exploits the parallelism of communication and computation; it achieves an approximation ratio of $1+\delta$, where $\delta$ depends on the devices' communication rates. An interesting property of this computing model is identified and proved: when all devices share the same communication rate, the optimal latency is obtained under an arbitrary scheduling order. When the workload cannot be split arbitrarily, an approximation algorithm with a ratio of at most $2\cdot(1+\delta)$ is proposed. Additionally, several algorithms are proposed for dynamically changing network scenarios. Finally, theoretical analysis and simulation results verify that the proposed algorithms achieve low latency. Two testbed experiments show that the proposed method outperforms existing methods, reducing latency by up to 29.2% for inference tasks at the edge.
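The abstract does not spell out the scheduling model, so the sketch below illustrates the order-invariance property under a common divisible-load assumption: a source sends workload shares to devices one after another over a shared link, and each device starts computing as soon as its full share arrives. This is a minimal sketch under that assumption, not the paper's pyramid-based model or its approximation algorithm; the function names (`makespan`, `greedy_shares`, `min_latency`), the equal-finish-time split rule, and all numeric parameters are illustrative choices.

```python
def makespan(shares, rates, speeds):
    """Latency of one send order: shares are transmitted sequentially over a
    shared link, and each device computes as soon as its share has arrived."""
    t_link, finish = 0.0, 0.0
    for w, r, s in zip(shares, rates, speeds):
        t_link += w / r                       # transmissions are serialized
        finish = max(finish, t_link + w / s)  # computations run in parallel
    return finish

def greedy_shares(deadline, rates, speeds):
    """Largest shares (for the given send order) so every device finishes by
    the deadline: each device gets the maximal w satisfying
    t_link + w/r + w/s <= deadline (the classic equal-finish-time split)."""
    t_link, shares = 0.0, []
    for r, s in zip(rates, speeds):
        slack = max(0.0, deadline - t_link)
        w = slack / (1.0 / r + 1.0 / s)
        t_link += w / r
        shares.append(w)
    return shares

def min_latency(total_work, rates, speeds, tol=1e-9):
    """Bisect on the deadline for the smallest latency at which the
    deliverable work reaches total_work under the fixed send order."""
    lo, hi = 0.0, total_work * (1.0 / min(rates) + 1.0 / min(speeds))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(greedy_shares(mid, rates, speeds)) < total_work:
            lo = mid
        else:
            hi = mid
    return hi

# Equal communication rates: the optimal latency is the same for any device
# order, matching the property stated in the abstract (numbers are made up).
rates, speeds = [10.0, 10.0, 10.0], [4.0, 2.0, 1.0]
T = min_latency(100.0, rates, speeds)
print(T)                                                   # ~21.79
print(min_latency(100.0, rates[::-1], speeds[::-1]))       # same, reversed order
print(makespan(greedy_shares(T, rates, speeds), rates, speeds))  # ~T
```

With equal communication rates the two `min_latency` calls agree regardless of device order, numerically matching the property stated in the abstract; with unequal rates the order generally matters, which is part of what makes the general MLCCS problem hard.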
Keywords
Distributed edge computing, parallel computing, edge inference, task assignment