Discerning the Quality of Questions in Educational Q&A using Textual Features

CHIIR (2017)

Abstract
In an information-seeking episode, attributes such as relevance, quality, and the nature of the information sought or obtained are directly related to the nature and quality of the query or question that represents an information need. It is, therefore, imperative that we identify potential problems with such representations to make the information-seeking outcome and experience more successful. In this paper, we investigate question quality on the educational community question answering (CQA) site Brainly by examining 2,000 questions, of which 1,000 were answered and 1,000 were unanswered. Two human assessors rated the quality of each question on a scale of 1 to 5 based on factors such as ambiguity, poor syntax, lack of information, complexity, inappropriateness, and inconsistency. We then identified the textual features needed to detect question quality. A logistic regression classifier was built to categorize questions based on the rating scale and the textual features present in each question. The results show higher ROC values for ambiguity, lack of information, inappropriateness, complexity, and excessive information, and lower ROC values for poor syntax and inconsistency. The findings demonstrate that the classifier failed to perform when faced with ill-framed or inconsistent phrases in a question. The work described here presents a method for identifying high- and low-quality questions, knowledge of which could be instrumental in helping reformulate users' questions and present them to a system or a community for more successful processes and outcomes.
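The pipeline the abstract describes, extracting textual features from questions, fitting a logistic regression classifier, and evaluating with ROC, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy questions, labels, and the use of TF-IDF as a stand-in for the paper's hand-crafted textual features are all assumptions.

```python
# Hypothetical sketch of the abstract's approach; the data and the
# TF-IDF features are illustrative stand-ins, not the paper's own.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Toy questions labeled 1 (higher quality) / 0 (lower quality).
questions = [
    "What is the derivative of x^2 with respect to x?",
    "How does photosynthesis convert light energy into chemical energy?",
    "Solve for x: 2x + 3 = 11, showing each step.",
    "Explain the main causes of World War I in Europe.",
    "help me plz???",
    "idk what this means do it for me",
    "???",
    "answer fast!!!",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Word-level TF-IDF stands in for the paper's textual features.
X = TfidfVectorizer().fit_transform(questions)

# Fit the logistic regression classifier and score each question.
clf = LogisticRegression().fit(X, labels)
scores = clf.predict_proba(X)[:, 1]

# Training-set ROC AUC, for illustration only; the paper would use
# held-out data for a meaningful estimate.
auc = roc_auc_score(labels, scores)
print(f"ROC AUC: {auc:.2f}")
```

In practice one would replace the TF-IDF matrix with features targeting the rated factors (ambiguity, poor syntax, lack of information, and so on) and evaluate on a held-out split rather than the training data.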
Keywords
Community Q&A, question quality, machine learning, logistic regression