SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments

Proceedings of the VLDB Endowment (2023)

Abstract
Many IoT applications require the use of multiple deep neural networks (DNNs) to perform various tasks on low-cost edge devices with limited computation resources. However, existing DNN model serving platforms, such as TensorFlow Serving and TorchServe, are resource-intensive and require high-performance GPUs that are often not available on low-cost edge devices. In this paper, we propose SmartLite, a lightweight DBMS that addresses these challenges by storing the parameters and structural information of neural networks as database tables and implementing neural network operators inside the DBMS engine. SmartLite quantizes model parameters as binarized values, applies neural pruning techniques to compress the models, and transforms tensor manipulations into value lookup operations of the DBMS to reduce computation overhead. Experimental results show that SmartLite requires 98% less memory while achieving about a 134% performance speedup compared to TorchServe. Our proposed solution addresses the challenges of running multiple DNN models on low-cost edge devices and provides a significant contribution to the field of IoT applications.
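To make the idea of storing binarized parameters as tables and replacing tensor arithmetic with value lookups more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is an illustration under stated assumptions, not SmartLite's actual schema or operators: the table names (`weights`, `input`, `popcount`), the 8-bit chunk size, and the helper functions `pack`, `build_db`, and `binary_layer` are all hypothetical. It relies on the standard identity for vectors with entries in {-1, +1}: the dot product equals `n - 2 * (number of positions where the signs differ)`, and the number of differing positions is read from a precomputed lookup table rather than computed arithmetically.

```python
import sqlite3

CHUNK = 8  # bits per stored chunk (an assumption made for this sketch)


def pack(signs):
    """Pack a list of +1/-1 values into 8-bit integers (+1 -> bit set)."""
    chunks = []
    for start in range(0, len(signs), CHUNK):
        byte = 0
        for offset, s in enumerate(signs[start:start + CHUNK]):
            if s > 0:
                byte |= 1 << offset
        chunks.append(byte)
    return chunks


def build_db(weight_rows):
    """Create hypothetical tables holding binarized weights and a popcount lookup."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE weights (neuron INTEGER, chunk INTEGER, bits INTEGER)")
    conn.execute("CREATE TABLE input (chunk INTEGER PRIMARY KEY, bits INTEGER)")
    conn.execute("CREATE TABLE popcount (byte INTEGER PRIMARY KEY, ones INTEGER)")
    # Lookup table: number of set bits for every possible 8-bit value.
    conn.executemany(
        "INSERT INTO popcount VALUES (?, ?)",
        [(b, bin(b).count("1")) for b in range(256)],
    )
    for neuron, signs in enumerate(weight_rows):
        conn.executemany(
            "INSERT INTO weights VALUES (?, ?, ?)",
            [(neuron, i, bits) for i, bits in enumerate(pack(signs))],
        )
    return conn


def binary_layer(conn, x_signs):
    """Return the +1/-1 dot product of the input with every stored neuron."""
    n = len(x_signs)
    conn.execute("DELETE FROM input")
    conn.executemany("INSERT INTO input VALUES (?, ?)", list(enumerate(pack(x_signs))))
    # SQLite has no XOR operator, so XOR is written as (a | b) - (a & b).
    # Padding bits are 0 on both sides, so they never count as mismatches.
    rows = conn.execute(
        """
        SELECT w.neuron, ? - 2 * SUM(p.ones) AS dot
        FROM weights AS w
        JOIN input AS x ON x.chunk = w.chunk
        JOIN popcount AS p ON p.byte = (w.bits | x.bits) - (w.bits & x.bits)
        GROUP BY w.neuron
        ORDER BY w.neuron
        """,
        (n,),
    ).fetchall()
    return [dot for _, dot in rows]
```

Calling `build_db` with a list of weight rows and then `binary_layer(conn, x)` returns one integer per neuron, matching what an ordinary dot product over the ±1 values would give. The design choice illustrated here is that the per-element multiply-accumulate is replaced by a join against a small lookup table, which is the general flavor of the "value lookup" transformation the abstract describes; the paper's own operators may differ substantially.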