FrankenSplit: Efficient Neural Feature Compression with Shallow Variational Bottleneck Injection for Mobile Edge Computing
IEEE Transactions on Mobile Computing (2023)
Abstract
The rise of mobile AI accelerators allows latency-sensitive applications to
execute lightweight Deep Neural Networks (DNNs) on the client side. However,
critical applications require powerful models that edge devices cannot host and
must therefore offload requests, where the high-dimensional data will compete
for limited bandwidth. This work proposes shifting away from focusing on
executing shallow layers of partitioned DNNs. Instead, it advocates
concentrating the local resources on variational compression optimized for
machine interpretability. We introduce a novel framework for resource-conscious
compression models and extensively evaluate our method in an environment
reflecting the asymmetric resource distribution between edge devices and
servers. Our method achieves a 60% lower bitrate than the state-of-the-art
split computing (SC) method without decreasing accuracy and is up to 16x faster
than offloading with existing codec standards.
Keywords
Data compression, distributed inference, edge computing, edge intelligence, feature compression, knowledge distillation, learned image compression, neural data compression, split computing