Evaluating the Construct Validity of an Automated Writing Evaluation System with a Randomization Algorithm

INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE IN EDUCATION (2022)

Abstract
This study evaluated the construct validity of six scoring traits of an automated writing evaluation (AWE) system called MI Write. Persuasive essays (N = 100) written by students in grades 7 and 8 were randomized at the sentence level using a script written with Python's NLTK module. Each persuasive essay was randomized 30 times (n = 3,000 total randomizations), and the mean trait scores for each set of randomized iterations were compared to those of the control text across all traits. We were specifically interested in the effects of randomization on the high-level traits of idea development and organization. Given the rubrics and qualitative feedback provided by MI Write, we hypothesized that these high-level traits should be sensitive to sentence-level randomization (i.e., scores should decrease). Overall, complete randomization did not consistently have a significant effect on scores for these high-level traits. In fact, more than a third of the essays received significantly higher scores on one or both high-level traits despite randomization, indicating a disconnect between MI Write's formative feedback and its underlying constructs. Findings have implications for consumers and developers of AWE.
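The abstract describes the randomization procedure only at a high level (sentence-level shuffling with NLTK, 30 iterations per essay). The sketch below is a minimal, hypothetical reconstruction of that procedure, assuming NLTK's Punkt sentence tokenizer and a simple in-memory shuffle; the authors' actual script, its segmentation settings, and how shuffled texts were submitted to MI Write are not specified in the abstract.

```python
import random
from nltk.tokenize import sent_tokenize  # requires the NLTK 'punkt' data package

def randomize_sentences(essay, n_iterations=30, seed=None):
    """Return n_iterations copies of an essay with sentence order shuffled.

    Hypothetical reconstruction of the sentence-level randomization the
    abstract describes; tokenizer choice and shuffling details are assumptions.
    """
    rng = random.Random(seed)
    sentences = sent_tokenize(essay)  # split the essay into sentences
    randomized = []
    for _ in range(n_iterations):
        shuffled = sentences[:]       # copy so the original order is preserved
        rng.shuffle(shuffled)         # permute sentence order
        randomized.append(" ".join(shuffled))
    return randomized

# Example usage matching the reported design:
# 100 essays x 30 randomizations = 3,000 randomized texts for scoring.
# randomized_corpus = [randomize_sentences(e, n_iterations=30) for e in essays]
```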
Keywords
Automated essay scoring, Automated writing evaluation, Feedback, Writing assessment, Validity