Towards Effectively Test Report Classification to Assist Crowdsourced Testing.

ESEM (2016)

Citations: 64 | Views: 105
Abstract
Context: Automatic classification of crowdsourced test reports is important because of their sheer volume and large proportion of noise. Most existing approaches to this problem focus on examining the performance of different machine learning or information retrieval techniques, and most are evaluated on open source datasets. However, our observations reveal that these approaches perform poorly and unstably on real industrial crowdsourced testing data. We further analyze the underlying cause and find that industrial data have significant local bias, which degrades existing approaches. Goal: We aim to design an approach that overcomes the local bias in industrial data and automatically classifies true faults among large numbers of crowdsourced reports. Method: We propose a cluster-based classification approach, which first clusters similar reports together and then builds classifiers based on the most similar clusters with an ensemble method. Results: Evaluation is conducted on 15,095 test reports from 35 industrial projects on China's largest crowdsourced testing platform, and the results are promising, with 0.89 precision and 0.97 recall on average. In addition, our approach improves on the existing baselines by 17% to 63% in average precision and 15% to 61% in average recall. Conclusions: The results imply that our approach can effectively discriminate true faults from large numbers of crowdsourced reports, which can reduce the effort required for manual inspection and facilitate project management in crowdsourced testing. To the best of our knowledge, this is the first work to address the test report classification problem in real industrial crowdsourced testing practice.
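The cluster-then-ensemble idea in the Method statement can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, the per-cluster classifier, or the ensemble rule. The sketch assumes Jaccard similarity over word sets, a greedy single-pass clustering, a 1-nearest-neighbour classifier inside each cluster, and a similarity-weighted vote over the `k` most similar clusters. All function names, parameters, and sample reports are hypothetical.

```python
from collections import defaultdict

def tokens(text):
    """Represent a report as a set of lowercase words (assumed representation)."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_reports(reports, threshold=0.2):
    """Greedy single-pass clustering: a report joins its most similar
    cluster if the similarity reaches `threshold`, else starts a new one."""
    clusters = []  # each cluster: {"vocab": set, "members": [(token_set, label)]}
    for text, label in reports:
        toks = tokens(text)
        best = max(clusters, key=lambda c: jaccard(toks, c["vocab"]), default=None)
        if best is not None and jaccard(toks, best["vocab"]) >= threshold:
            best["members"].append((toks, label))
            best["vocab"] |= toks
        else:
            clusters.append({"vocab": set(toks), "members": [(toks, label)]})
    return clusters

def cluster_predict(cluster, toks):
    """Per-cluster classifier (assumed): label of the nearest member report."""
    _, label = max(cluster["members"], key=lambda m: jaccard(toks, m[0]))
    return label

def classify(clusters, text, k=3):
    """Ensemble step (assumed): similarity-weighted vote of the classifiers
    built on the k clusters most similar to the new report."""
    toks = tokens(text)
    ranked = sorted(clusters, key=lambda c: jaccard(toks, c["vocab"]), reverse=True)
    votes = defaultdict(float)
    for c in ranked[:k]:
        votes[cluster_predict(c, toks)] += jaccard(toks, c["vocab"])
    return max(votes, key=votes.get)

# Hypothetical labelled reports: True marks a true fault, False marks noise.
train = [
    ("app crashes when tapping login button", True),
    ("app crashes after tapping login twice", True),
    ("login page looks fine no problem", False),
    ("payment screen freezes on submit", True),
    ("payment screen works fine no problem", False),
]
clusters = cluster_reports(train)
is_fault = classify(clusters, "app crashes on tapping login")
is_noise_fault = classify(clusters, "login page looks fine")
```

Weighting each vote by cluster similarity keeps distant clusters from outvoting the closest one, which is one simple way to limit the influence of reports that follow a different local distribution.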
Keywords
Crowdsourced testing, Report classification, Cluster