ASSESSING INTER-RATER AGREEMENT ABOUT ITEM-WRITING FLAWS IN MULTIPLE-CHOICE QUESTIONS OF CLINICAL ANATOMY

Beatriz G Guimaraes, J Pais, Edmeia De Almeida Cardoso Coelho, Ana C P Da Silva, Ana Povo, I Lourinho, M Severo, Marcio Alcântara Ferreira

EDULEARN Proceedings (2013)

Abstract
Multiple-choice questions (MCQs) are regularly used in examinations to assess students in the health sciences. Despite this, MCQ items often contain item-writing flaws, and few educators have formal instruction in writing MCQs. The main purpose of our study was to estimate the inter-rater agreement on classifying items as either standard or flawed. To achieve this goal, four judges (two teachers and two students), blinded to all item performance data, independently classified each of 920 test items from 10 examinations as either standard or flawed. If an item was flawed, the exact type of flaw or flaws present in the question stem and the respective options was recorded. In this study, a standard item was operationally defined as any item that did not violate any of the 31 principles noted in a review article summarizing current educational measurement recommendations on item writing. Fleiss' kappa was used to evaluate the inter-rater agreement among the four judges before the consensus process. Agreement on item classification as either standard or flawed was fair (kappa = 0.3). Although agreement was substantial for the more prevalent principles, the results generally showed many disagreements among the judges about item classification before the consensus process. A future investigation should assess whether the presence of one or more flaws in an MCQ item affects its quality, namely, whether it interferes with the item's difficulty and discrimination indices.
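
As a minimal sketch of the agreement statistic the abstract describes, the following Python snippet computes Fleiss' kappa for four raters who each classify items as standard (0) or flawed (1), using statsmodels. The ratings matrix here is synthetic and purely illustrative, not the study's data.

```python
# Sketch: Fleiss' kappa for four judges rating items as standard/flawed.
# The ratings below are made up for illustration only.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = items, columns = the four judges; 0 = standard, 1 = flawed
ratings = np.array([
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
])

# aggregate_raters turns per-rater codes into per-item category counts,
# the input format fleiss_kappa expects (items x categories)
table, _ = aggregate_raters(ratings)
kappa = fleiss_kappa(table, method="fleiss")
print(f"Fleiss' kappa: {kappa:.2f}")
```

In the study, the same computation would be applied to the 920-item by 4-judge classification matrix collected before the consensus process; a value of 0.3 falls in the conventionally "fair" agreement band.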
Keywords
Multiple-choice questions, assessment, medical education and clinical anatomy