VisualNews: A Large Multi-source News Image Dataset

arXiv (2021)

Abstract
We introduce VisualNews, a large-scale dataset collected from four news agencies consisting of more than one million news images along with associated news articles, image captions, author information, and other metadata. We also propose VisualNews-Captioner, a model for the task of news image captioning. Unlike the standard image captioning task, news images depict situations where people, locations, and events are of paramount importance. Our proposed method is able to effectively combine visual and textual features to generate captions with richer information such as events and entities. More specifically, we propose an Entity-Aware module along with an Entity-Guide attention layer to encourage more accurate predictions for named entities. Our method achieves new state-of-the-art results on both GoodNews and VisualNews datasets while having significantly fewer parameters than competing methods. Our larger and more diverse VisualNews dataset further highlights the remaining challenges in news image captioning.
Keywords
benchmark, news, image, challenges