ParalOS: A Scheduling & Memory Management Framework for Heterogeneous VPUs
2021 24th Euromicro Conference on Digital System Design (DSD), 2021
Abstract
Embedded systems today face the challenge of rapidly growing application diversity, accompanied by increased programming and computational complexity. Customised heterogeneous System-on-Chip (SoC) processors emerge as an attractive HW solution in various application domains; however, they still require sophisticated SW development to provide efficient implementations, at the expense of slower adaptation to algorithmic changes. In this context, the current paper proposes a framework for accelerating the SW development of computationally intensive applications on Vision Processing Units (VPUs), while still enabling the exploitation of their full HW potential via low-level kernel optimisations. Our framework is tailored for heterogeneous architectures and integrates a dynamic task scheduler, a novel scratchpad memory management scheme, I/O and inter-process communication techniques, as well as a visual profiler. We evaluate our work on the Intel Movidius Myriad VPUs using synthetic benchmarks and real-world applications, which range from Convolutional Neural Networks (CNNs) to computer vision algorithms. In terms of execution time, our results range from a limited ~8% performance overhead vs. optimised CNN programs to a 4.2× performance gain in content-dependent applications. We achieve up to a 33% decrease in scratchpad memory usage vs. well-established memory allocators and up to 6× shorter inter-process communication time.
Key words
Vision Processing Unit, System on Chip, Myriad, Heterogeneous Computing, Framework, Scheduling, Scratchpad Memory Management