OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Visual Contexts

Shuhe Wang
Qinghong Han
Xiaofei Sun

Abstract:

When humans converse, what a speaker says next depends significantly on what they see. Unfortunately, existing dialogue models generate utterances based only on the preceding textual context, and visual contexts are rarely considered. This is due to the lack of a large-scale multi-modal dialogue dataset with utterances paired wi...
