Evaluating Question Answering Evaluation

MRQA@EMNLP, pp. 119-124, 2019.

DOI: https://doi.org/10.18653/v1/D19-5817

Abstract:

As the complexity of question answering (QA) datasets evolves, moving away from restricted formats like span extraction and multiple choice (MC) toward free-form answer generation, it is imperative to understand how well current metrics perform in evaluating QA. This is especially important as existing metrics (BLEU, ROUGE, METEOR, and F1) are...
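The F1 metric mentioned in the abstract is, in span-extraction QA, conventionally computed at the token level between the predicted and reference answers. A minimal sketch of that computation is below; it assumes simple whitespace tokenization and lowercasing, and omits the punctuation and article stripping that the official SQuAD evaluation script additionally applies.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted answer and a reference answer.

    A simplified sketch of the SQuAD-style metric: tokens are obtained by
    lowercasing and whitespace splitting (no punctuation/article handling).
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either answer is empty, F1 is 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `token_f1("the cat sat", "the cat")` gives precision 2/3 and recall 1, hence F1 = 0.8. Such overlap-based scoring is exactly what becomes questionable for free-form generated answers, which is the concern the paper raises.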
