Hierarchical Scene Graph Encoder-Decoder for Image Paragraph Captioning
MM '20: The 28th ACM International Conference on Multimedia, Seattle, WA, USA, October 2020, pp. 4181-4189.
Abstract:
When we humans describe an image in a long paragraph, we usually first implicitly compose a mental "script" and then follow it to generate the paragraph. Inspired by this, we endow the modern encoder-decoder based image paragraph captioning model with this ability by proposing the Hierarchical Scene Graph Encoder-Decoder (HSGED) for generati...