Towards Privacy-Preserving Evaluation For Information Retrieval Models Over Industry Data Sets

Information Retrieval Technology, AIRS 2017 (2017)

Abstract
The development of Information Retrieval (IR) techniques depends heavily on empirical studies over real-world data collections. Unfortunately, these real-world data sets are often unavailable to researchers because of privacy concerns. In fact, the lack of publicly available industry data sets has become a serious bottleneck for IR research. To address this problem, we propose to bridge the gap between academic research and industry data sets through a privacy-preserving evaluation platform. The novelty of the platform lies in its "data-centric" mechanism: the data sit on a secure server, and the IR algorithms to be evaluated are uploaded to that server. The platform runs the uploaded code and returns the evaluation results. Preliminary experiments with retrieval models reveal new observations and insights about state-of-the-art retrieval models, demonstrating the value of an industry data set.
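The "data-centric" mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' platform: the names `Ranker`, `evaluate_on_private_data`, and `term_overlap_ranker`, the toy data, the choice of mean average precision as the metric, and the decision to return only aggregate scores are all assumptions made for illustration. A real deployment would also need sandboxing and output auditing, which this sketch omits.

```python
from typing import Callable, Dict, List, Set

# Hypothetical interface: an uploaded algorithm maps a query string and a
# document collection (doc id -> text) to a ranked list of doc ids.
Ranker = Callable[[str, Dict[str, str]], List[str]]


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / len(relevant)


def evaluate_on_private_data(
    ranker: Ranker,
    queries: Dict[str, str],
    docs: Dict[str, str],
    qrels: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Run the uploaded ranker server-side and return aggregate metrics.

    Assumption: only aggregates leave the server; per-query rankings and
    the raw queries and documents are never returned to the caller.
    """
    scores = [
        average_precision(ranker(text, docs), qrels.get(qid, set()))
        for qid, text in queries.items()
    ]
    mean_ap = sum(scores) / len(scores) if scores else 0.0
    return {"num_queries": float(len(scores)), "map": mean_ap}


def term_overlap_ranker(query: str, docs: Dict[str, str]) -> List[str]:
    """Toy 'uploaded' algorithm: rank documents by shared query terms."""
    q_terms = set(query.lower().split())
    scored = [
        (len(q_terms & set(text.lower().split())), doc_id)
        for doc_id, text in docs.items()
    ]
    # Highest overlap first; ties broken by doc id for determinism.
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Toy stand-in for the private collection held on the server.
toy_docs = {
    "d1": "privacy preserving evaluation",
    "d2": "neural ranking models",
    "d3": "evaluation of ranking models",
}
toy_queries = {"q1": "ranking models", "q2": "privacy evaluation"}
toy_qrels = {"q1": {"d2", "d3"}, "q2": {"d1"}}

report = evaluate_on_private_data(
    term_overlap_ranker, toy_queries, toy_docs, toy_qrels
)
print(report)
```

In this sketch the researcher supplies only `term_overlap_ranker`, while the server owns the queries, documents, and relevance judgments, so the researcher sees the returned report and nothing else.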
Keywords
Test collections, Privacy, Evaluation