
Pedestrian Crossing Intention Prediction Based on Cross-Modal Transformer and Uncertainty-Aware Multi-Task Learning for Autonomous Driving

IEEE Transactions on Intelligent Transportation Systems (2024)

Abstract
Accurate prediction of whether pedestrians will cross the street is widely recognized as an indispensable function of autonomous driving systems, especially in urban environments. One of the major challenges is how to exploit the complementary information present in different types of data (modalities). This paper makes the first attempt to develop a cross-modal transformer-based crossing intention prediction model that uses only bounding boxes and ego-vehicle speed as input features. The cross-modal transformer leverages self-attention and cross-modal attention to mine modality-specific and complementary correlations. A bottleneck feature fusion is presented to obtain a compressed feature representation. To facilitate network training, we further put forward a novel uncertainty-aware multi-task learning method that jointly predicts the future bounding box and the crossing action, so that the commonalities and differences between the two tasks can be exploited. To evaluate the proposed method, extensive comparative experiments and ablation studies are performed on two benchmark datasets. The results demonstrate that, using only the bounding box and ego-vehicle speed as input features, our model is on a par with other state-of-the-art approaches that rely on more inputs, and even achieves superior performance in most cases. The source code will be released at https://github.com/xbchen82/PedCMT.
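The abstract does not state the exact form of the uncertainty-aware multi-task loss. As a rough illustration only, the sketch below assumes the commonly used homoscedastic-uncertainty weighting, in which each task loss L_i is scaled by exp(-s_i) and regularized by s_i, where s_i = log(sigma_i^2) is a learnable log-variance. The function names, the two example loss values, and the learning rate are hypothetical and are not taken from the paper or its repository.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with assumed homoscedastic-uncertainty weights.

    total = sum_i exp(-s_i) * L_i + s_i, with s_i = log(sigma_i^2).
    This is an assumed formulation, not necessarily the paper's.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("task_losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def grad_wrt_log_vars(task_losses, log_vars):
    """Gradient of the combined loss with respect to each s_i.

    d/ds [exp(-s) * L + s] = 1 - exp(-s) * L
    """
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


def fit_log_vars(task_losses, steps=500, lr=0.1):
    """Gradient descent on s_i with the task losses held fixed (toy setting)."""
    log_vars = [0.0] * len(task_losses)
    for _ in range(steps):
        grads = grad_wrt_log_vars(task_losses, log_vars)
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars


if __name__ == "__main__":
    # Hypothetical loss values: crossing-action classification and
    # future-bounding-box regression.
    losses = [0.5, 4.0]
    s = fit_log_vars(losses)
    weights = [math.exp(-si) for si in s]
    print("log-variances:", s)
    print("task weights:", weights)
```

With the losses held fixed, each s_i settles at log(L_i), so the task with the larger loss receives the smaller weight. In actual training the losses change with the network parameters and the s_i would be optimized jointly with them.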
Key words
Crossing intention prediction, cross-modal transformer, multi-task learning, homoscedastic uncertainty