Semantic Object Alignment and Region-Aware Learning for Change Captioning.

Weidong Tian, Quan Ren, Zhongqiu Zhao, Ruihua Tian

IJCNN (2023)


Abstract

Change captioning, a downstream task of image captioning, is an emerging deep learning task. It aims to generate sentences that describe the differences between two images: an original image and a changed version of it. Many current methods for generating disparity descriptions are based on the encoder-decoder model: grid features are first extracted from each image with a pre-trained ResNet and encoded, and a decoder then produces the sentence. However, applying encoding methods designed for region features directly to grid features can degrade performance. We therefore propose a new model with a novel design that describes variations between image pairs more robustly, together with added pseudo-region features that compensate for the information missing from the grid features. Extensive experiments demonstrate that the proposed model achieves state-of-the-art performance.


Keywords

change captioning task, disparity descriptions, downstream task, emerging deep learning task, encoder-decoder model, encoding area features, grid features, image captioning task, image pairs, model achieves state-of-the-art performance, output sentences, pre-trained ResNet, pseudo-region features, region-aware learning, semantic object alignment
