DSTC8-AVSD: Multimodal Semantic Transformer Network with Retrieval Style Word Generator


Abstract:

Audio Visual Scene-aware Dialog (AVSD) is the task of generating a response to a question given a scene, video, audio, and the history of previous turns in the dialog. Existing systems for this task employ transformer-based or recurrent neural network-based architectures within the encoder-decoder framework. Even though these techniqu...
