
YOLOMH: You Only Look Once for Multi-Task Driving Perception with High Efficiency

Liu Fang, Sun Bowen, Miao Jianxi, Su Weixing

Machine Vision and Applications (2024)

Abstract
Aiming at the requirements for high accuracy, light weight, and real-time performance in panoptic driving perception systems, this paper proposes an efficient multi-task network (YOLOMH). The network uses a shared encoder and three independent decoding heads to simultaneously complete the three major panoptic driving perception tasks: traffic object detection, drivable area segmentation, and lane line segmentation. These results stem from our design of the YOLOMH network structure: first, we design an appropriate information input structure based on the differing information requirements of the tasks; second, we propose a Hybrid Deep Atrous Spatial Pyramid Pooling (HDASPP) module to efficiently perform feature fusion in the neck network; and finally, effective approaches such as an anchor-free detection head and depthwise separable convolution are introduced, making the network more efficient while remaining lightweight. Experimental results show that our model achieves competitive accuracy and speed on the challenging BDD100K dataset. In particular, its inference speed on an NVIDIA Tesla V100 reaches 107 frames per second (FPS), far exceeding the 49 FPS of the YOLOP network under the same experimental settings. This meets the requirements of autonomous vehicles for high system accuracy and low latency.
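One of the lightweighting techniques the abstract names is depthwise separable convolution, which replaces a standard convolution with a per-channel (depthwise) convolution followed by a 1x1 pointwise convolution. A minimal sketch of the parameter savings, using hypothetical layer sizes not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    # Parameter count of a standard k x k convolution (bias omitted):
    # each of the c_out filters spans all c_in input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise k x k convolution (one filter per input channel)
    # followed by a 1 x 1 pointwise convolution mixing channels.
    return k * k * c_in + c_in * c_out

# Hypothetical example: 3x3 layer with 64 input and 128 output channels.
standard = conv_params(3, 64, 128)                   # 73728
separable = depthwise_separable_params(3, 64, 128)   # 8768
print(standard, separable, round(standard / separable, 1))
```

For this example the separable variant uses roughly 8x fewer parameters, which is the kind of reduction that lets a multi-task network stay lightweight without changing the receptive field.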
Key words
Panoptic driving perception, Multi-task network, HDASPP, YOLOMH