Show Your Work: Improved Reporting of Experimental Results
EMNLP/IJCNLP (1), pp. 2185-2194, 2019.
Research in natural language processing proceeds, in part, by demonstrating that new models achieve superior performance (e.g., accuracy) on held-out test data, compared to previous results. In this paper, we demonstrate that test-set performance scores alone are insufficient for drawing accurate conclusions about which model performs b...