Semantic Object Alignment and Region-Aware Learning for Change Captioning.

IJCNN (2023)

Abstract
Change captioning, a downstream task of image captioning, is an emerging deep learning task. It aims to generate sentences that describe the differences between two images: an original image and a changed version of it. Many current methods for generating such difference descriptions follow the encoder-decoder paradigm: grid features are first extracted from each image with a pre-trained ResNet and encoded, and a decoder then produces the sentence. However, applying traditional encoding methods designed for region-level (area) features directly to grid features can degrade performance. We therefore propose a new model with a novel design that describes changes between image pairs more robustly, together with pseudo-region features that compensate for the information missing from the grid features. Extensive experiments show that the proposed model achieves state-of-the-art performance.
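The abstract does not say how the pseudo-region features are built, so the following is only an illustrative sketch, not the authors' method. It assumes that a pseudo-region can be approximated by average-pooling a window of neighbouring grid cells, and that a change signal can be read from the element-wise difference between the pooled features of the two images. The names `pool_pseudo_regions` and `difference_features` are hypothetical, and the toy grids stand in for real ResNet feature maps.

```python
from typing import List

Grid = List[List[List[float]]]  # H x W x C grid of feature vectors


def pool_pseudo_regions(grid: Grid, window: int) -> List[List[float]]:
    """Average-pool non-overlapping window x window blocks of grid cells.

    Assumption: each pooled block acts as one "pseudo-region" vector.
    Regions are returned in row-major order.
    """
    height, width, channels = len(grid), len(grid[0]), len(grid[0][0])
    regions = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            cells = [
                grid[y][x]
                for y in range(top, min(top + window, height))
                for x in range(left, min(left + window, width))
            ]
            regions.append(
                [sum(cell[k] for cell in cells) / len(cells) for k in range(channels)]
            )
    return regions


def difference_features(
    before: List[List[float]], after: List[List[float]]
) -> List[List[float]]:
    """Element-wise (after - before) for each pair of pseudo-region vectors."""
    return [
        [a - b for b, a in zip(region_before, region_after)]
        for region_before, region_after in zip(before, after)
    ]


# Toy 4 x 4 grids with 2 channels; only one cell differs between the images.
before_grid = [[[0.0, 0.0] for _ in range(4)] for _ in range(4)]
after_grid = [[cell[:] for cell in row] for row in before_grid]
after_grid[0][1] = [4.0, 8.0]

diff = difference_features(
    pool_pseudo_regions(before_grid, window=2),
    pool_pseudo_regions(after_grid, window=2),
)
print(diff)  # only the first pseudo-region (top-left block) is non-zero
```

In this toy setting the change in a single grid cell shows up in exactly one pooled vector, which is the kind of coarser, region-like signal that grid features alone do not expose directly.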
Keywords
change captioning task, disparity descriptions, downstream task, emerging deep learning task, encoder-decoder model, encoding area features, grid features, image captioning task, image pairs, model achieves state-of-the-art performance, output sentences, pre-trained ResNet, pseudoregion features, region-aware learning, semantic object alignment