A Scene Graph Encoding and Matching Network for UAV Visual Localization

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2024)

Abstract
This paper tackles the visual localization of unmanned aerial vehicles (UAVs) when multi-source and cross-view images are involved. We present a lightweight end-to-end scene graph encoding and matching network that finds the best matches for the airborne camera views from the reference image maps. The scene graph addresses the challenges of encoding the semantic scene by aggregating the image convolutional features into global and structured semi-global descriptors. The principal contributions of this paper are as follows. First, we develop a new network architecture that embeds a non-local block and a modified vector of locally aggregated descriptors network (NetVLAD) into a backbone convolutional neural network (CNN). The main component of the modified NetVLAD is a cluster similarity masking graph (CSMG) encoder, which replaces the feature-cluster residual computation in NetVLAD with cluster consensus feature aggregation and structure-aware scene graph extraction. In addition, a non-local block extracts a discriminative global feature descriptor to label each image. Second, we develop a new triplet loss for the network training procedure to learn the features at different semantic levels. The proposed global descriptor and CSMG encoder are trained together according to a weighted sum of cosine triplet losses. Third, the global descriptor from the non-local block and the semi-global descriptor from the CSMG encoder work hierarchically for coarse-to-fine image retrieval, achieving real-time efficiency and favorable accuracy when searching and matching against the reference image map. We train and test the model on two challenging benchmark datasets. We also test the pre-trained model on a dataset collected by a fixed-wing UAV to further evaluate the model's generalizability.
The benchmark evaluations and ablation experiments show that the developed method outperforms state-of-the-art methods and achieves superior performance in the real-time matching of UAV images and reference image maps for UAV visual localization. Open-source code is available on GitHub: https://github.com/rduan036/scene-graph-matching-demo.git .
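The two mechanisms the abstract names, a weighted sum of cosine triplet losses and hierarchical coarse-to-fine retrieval, can be illustrated with a minimal sketch. It is not taken from the linked repository. The function names, the margin, the loss weights, and `top_k` are all hypothetical, and the semi-global descriptor is treated here as a flat vector, whereas the paper describes it as a structured scene graph descriptor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge loss on cosine distance d = 1 - cos (margin is an assumed value)."""
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def weighted_triplet_loss(global_triplet, semi_triplet,
                          w_global=0.5, w_semi=0.5, margin=0.1):
    """Weighted sum of the two per-level losses (weights are assumptions).

    Each triplet is an (anchor, positive, negative) tuple of descriptors.
    """
    loss_global = cosine_triplet_loss(*global_triplet, margin=margin)
    loss_semi = cosine_triplet_loss(*semi_triplet, margin=margin)
    return w_global * loss_global + w_semi * loss_semi


def coarse_to_fine_retrieve(query, references, top_k=2):
    """Two-stage retrieval over a reference map.

    Coarse stage: keep the top_k references by global-descriptor similarity.
    Fine stage: pick the shortlist entry with the most similar
    semi-global descriptor.
    """
    shortlist = sorted(
        references,
        key=lambda ref: cosine_similarity(query["global"], ref["global"]),
        reverse=True,
    )[:top_k]
    best = max(
        shortlist,
        key=lambda ref: cosine_similarity(query["semi"], ref["semi"]),
    )
    return best["id"]


# Toy reference map with hypothetical tile names and 2-D descriptors.
references = [
    {"id": "tile_a", "global": [1.0, 0.0], "semi": [0.0, 1.0]},
    {"id": "tile_b", "global": [0.9, 0.1], "semi": [1.0, 0.0]},
    {"id": "tile_c", "global": [0.0, 1.0], "semi": [1.0, 0.0]},
]
query = {"global": [1.0, 0.0], "semi": [1.0, 0.0]}
match = coarse_to_fine_retrieve(query, references, top_k=2)
```

In this toy example the coarse stage discards `tile_c` even though its semi-global descriptor matches the query, which shows why the shortlist size trades speed against recall: the fine stage can only choose among candidates that the global descriptor has already admitted.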
Keywords
End-to-end network, image matching, scene graph, UAV visual localization