SiTPose: A Siamese Convolutional Transformer for Relative Camera Pose Estimation

Kai Leng, Cong Yang, Wei Sui, Jie Liu, Zhijun Li

2023 IEEE International Conference on Multimedia and Expo (ICME 2023)

Abstract
Relative Camera Pose Estimation (RCPE) aims to calculate the translation and rotation between two frames with overlapping regions, a task that is crucial to computer vision and robotics. This paper presents a novel siamese convolutional transformer model, SiTPose, that regresses the relative camera pose directly. SiTPose is distinguished in three aspects: (1) With a cross-attention feature extractor and a compact transformer encoder, extreme rotation errors (> 150 degrees) are significantly reduced: from 9.7% with the state-of-the-art 8-Points to 1. on the 7Scenes dataset. (2) SiTPose is also robust to narrow-baseline cases (slight rotation angle and large translation between neighboring frames), whereas existing RCPE methods mainly focus on wide-baseline cases. (3) SiTPose can be flexibly extended to geometry-based vSLAM (namely SiT-SLAM) in a multi-threaded way to prevent tracking loss and scale ambiguity problems. Results on multiple datasets show that SiT-SLAM yields a marked improvement in robustness and localization accuracy in complex scenarios, e.g., RMSE is reduced from 26.36m with the classic ORB-SLAM3 method to 6.94m on KITTI-09.
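The quantities named in the abstract (relative rotation and translation between two frames, the share of rotation errors above 150 degrees, and trajectory RMSE) can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code. It assumes a world-to-camera pose convention (x_cam = R x_world + t), the standard geodesic angle as the rotation error, and an RMSE computed on positions without any trajectory alignment. All function names here are hypothetical.

```python
import numpy as np


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    t = np.deg2rad(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def relative_pose(R1, t1, R2, t2):
    """Pose mapping camera-1 coordinates to camera-2 coordinates.

    Assumption: each absolute pose is world-to-camera, x_cam = R @ x_world + t.
    """
    R_rel = R2 @ R1.T
    t_rel = t2 - R_rel @ t1
    return R_rel, t_rel


def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle (degrees) between two rotation matrices."""
    cos_angle = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.rad2deg(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def extreme_error_rate(errors_deg, threshold=150.0):
    """Fraction of rotation errors strictly above `threshold` degrees."""
    errors = np.asarray(errors_deg, dtype=float)
    return float(np.mean(errors > threshold))


def position_rmse(pred_xyz, gt_xyz):
    """Root-mean-square Euclidean distance between matched positions."""
    diff = np.asarray(pred_xyz, dtype=float) - np.asarray(gt_xyz, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


if __name__ == "__main__":
    # Two illustrative (made-up) absolute poses.
    R1, t1 = rot_z(10.0), np.array([0.1, 0.0, 0.0])
    R2, t2 = rot_z(40.0), np.array([0.3, 0.2, 0.0])
    R_rel, t_rel = relative_pose(R1, t1, R2, t2)
    print("relative rotation angle (deg):", rotation_error_deg(R_rel, np.eye(3)))
    print("relative translation:", t_rel)
```

A figure such as the 9.7% quoted for 8-Points would correspond to `extreme_error_rate` applied to the per-pair rotation errors over a test set, under the assumptions above.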
Keywords
Relative Pose Estimation, SLAM, Camera Pose Estimation, Cross Attention