How To Evaluate Your Dialogue System: Probe Tasks as an Alternative for Token-level Evaluation Metrics
Abstract:
Though generative dialogue modeling is widely seen as a language modeling task, it requires an agent to have a complex natural language understanding of its input text in order to carry on a meaningful interaction with a user. The automatic metrics in use evaluate the quality of the generated text as a proxy for the holistic interaction of the a...