Generating captions without looking beyond objects.

arXiv: Computer Vision and Pattern Recognition (2016)

Abstract
This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparable caption generation performance by translating from a set of nouns to full captions. This implies that, in image captioning, all word categories other than nouns can be evoked by a strong language model without sacrificing performance on the precision-oriented BLEU metric. The paper also investigates lower and upper bounds on how much individual word categories in the captions contribute to the final BLEU score, finding that the largest potential improvements lie with nouns, verbs, and prepositions.
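
The kind of analysis the abstract describes can be illustrated with a minimal sketch (not the authors' code): use a POS tagger to pull out the nouns of a reference caption, treat them as the input of a noun-to-caption translation step, and score a hypothesis caption against the reference with BLEU. NLTK is an assumed toolchain here, and the example captions are invented for illustration.

```python
# Minimal sketch, assuming NLTK; illustrates noun extraction and BLEU scoring,
# not the paper's actual pipeline.
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Tokenizer and tagger models (download names may vary across NLTK versions).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def extract_nouns(caption: str) -> list[str]:
    """Return only the noun tokens (Penn Treebank NN* tags) of a caption."""
    tokens = nltk.word_tokenize(caption.lower())
    return [tok for tok, tag in nltk.pos_tag(tokens) if tag.startswith("NN")]

# Invented reference and hypothesis captions.
reference = "a man is riding a horse on the beach"
hypothesis = "a man rides a horse along a sandy beach"

# The noun set a noun-to-caption model would receive as input.
print(extract_nouns(reference))      # e.g. ['man', 'horse', 'beach']

# BLEU of the hypothesis against the reference (smoothed for short sentences).
smooth = SmoothingFunction().method1
score = sentence_bleu([reference.split()], hypothesis.split(),
                      smoothing_function=smooth)
print(f"BLEU: {score:.3f}")
```

Restricting the hypothesis to different word categories (nouns only, nouns plus verbs, and so on) and re-scoring in this way gives the sort of per-category lower and upper BLEU bounds the abstract refers to.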