Support NNEF execution model for NNAPI

The Journal of Supercomputing(2021)

引用 3|浏览3
暂无评分
摘要
With growing applications such as image recognition, speech recognition, ADAS, and AIoT, artificial intelligence (AI) frameworks are becoming popular in various industries. Currently, many choices for neural network frameworks exist for executing AI models in applications, especially for training/inference purposes, including TensorFlow, Caffe, MXNet, PyTorch, Core ML, TensorFlow Lite, and NNAPI. With so many different emerging frameworks, exchange formats are needed for different AI frameworks. Given this requirement, the Khronos group created a standard draft known as the Neural Network Exchange Format (NNEF). However, because NNEF is new, conversion tools for various AI frameworks that would allow the exchange of various AI frameworks remain missing. In this work, we fill this gap by devising NNAPI conversion tools for NNEF. Our work allows NNEF to execute inference tasks on host and Android platforms and flexibly invokes Android neural networks through the API (NNAPI) on the Android platform to speed up inference operations. We invoke NNAPI by dividing the input NNEF model into multiple submodels and let NNAPI execute these submodels. We develop an algorithm named BFSelector that is based on a classic breadth-first search and includes cost constraints to determine how to divide the input model. Our preliminary experimental results show that our support of NNEF on NNAPI can obtain a speedup of 1.32 to 22.52 times over the baseline for API 27 and of 4.56 to 211 times over the baseline for API 28, where the baseline is the NNEF-to-Android platform conversion without invoking NNAPI. The experiment includes AI models such as LeNet, AlexNet, MobileNet_V1, MobileNet_V2, VGG-16, and VGG-19.
更多
查看译文
关键词
AI model compilers, NNEF, NNAPI
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要