
Idomaar: A Framework for Multi-dimensional Benchmarking of Recommender Algorithms.

RecSys Posters, (2016)

Abstract

In real-world scenarios, recommenders face non-functional requirements of a technical nature and must handle dynamic data in the form of sequential streams. Evaluation of recommender systems must take these issues into account in order to be maximally informative. In this paper, we present Idomaar—a framework that enables the efficient multi-dimensional…

Introduction
  • Increasingly, the authors witness a shift of recommender system research toward large-scale systems developed for industry settings.
  • Given commercial systems’ complexity and the demand for high performance, evaluation is subject to additional requirements: contribution of complementary information, reliability in handling large-scale problems, and use of different methods and metrics.
  • The authors introduce Idomaar to address this challenge. It enables researchers to evaluate different algorithms with respect to multiple criteria.
  • The framework uses large-scale static data sets to simulate live data streams, bringing offline evaluation closer to online A/B testing (see the sketch below).
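To make the stream-simulation idea concrete, here is a minimal Python sketch of replaying a static, timestamped event log as a simulated live stream. The event layout, the `speedup` parameter, and `load_log` are illustrative assumptions, not Idomaar's actual API.

```python
import time
from typing import Iterable, Iterator, Tuple

# Hypothetical event record: (timestamp in seconds, user id, item id).
Event = Tuple[float, int, int]

def replay_as_stream(events: Iterable[Event], speedup: float = 60.0) -> Iterator[Event]:
    """Replay a static dataset as a simulated live stream.

    Events are emitted in timestamp order; the gaps between consecutive
    events are scaled down by `speedup` so long historical logs replay fast.
    """
    last_ts = None
    for event in sorted(events, key=lambda e: e[0]):
        ts = event[0]
        if last_ts is not None:
            time.sleep(max(0.0, (ts - last_ts) / speedup))
        last_ts = ts
        yield event

# Illustrative usage (load_log and recommender are hypothetical):
# for ts, user, item in replay_as_stream(load_log("events.csv")):
#     recommender.observe(user, item, ts)
```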
Highlights
  • Increasingly, we witness a shift of recommender system research toward large-scale systems developed for industry settings.
  • By comparing the performance of recommender algorithms operating in a live system and in these simulated data streams, the framework brings offline evaluation closer to online A/B testing.
  • Idomaar can be considered a suitable tool for recommender system research: reuse of code speeds up prototyping, and standardization of datasets helps merge different data sources.
  • We present the Idomaar framework, which enables the efficient, reproducible evaluation of recommender algorithms in real-world stream-based scenarios.
Results
  • By comparing the performance of recommender algorithms operating in a live system and in these simulated data streams, the framework brings offline evaluation closer to online A/B testing.
  • Idomaar enables multi-dimensional evaluation that simultaneously measures the performance of algorithms with respect to both precision-related and technical aspects.
  • The reference framework Idomaar is a tool to evaluate recommendation services in real-world settings.
  • Idomaar mimics the workflow of such real-world scenarios by using state-of-the-art technologies (e.g., Apache Flume and Apache Kafka) to manage data streaming.
  • Part of the data bootstraps the recommender system for training algorithms, while most of the remaining data feeds the recommender system in real time while it serves incoming recommendation requests for test purposes.
  • The remaining subset of the data, the ground truth, is hidden from the recommender system and used to evaluate the quality of the service in terms of user metrics.
  • The orchestrator coordinates all processes, including launching and provisioning the computing environment, instructing the evaluator to split the data into training, test, and ground truth, feeding the recommender system with incoming messages in accordance with their timestamps, collecting the generated recommendations, and computing the quality metrics (see the sketch after this list).
  • Moving from an offline toward an online scenario means either replacing the Apache Flume source with another one or ingesting the data directly into the Apache Kafka queue.
  • Various frameworks have been proposed to facilitate evaluating recommender systems.
  • Although the presented tools support evaluation, all of them measure quality only in terms of predictive performance.
  • The authors propose Idomaar, a language-agnostic framework with cloud support and the ability to measure time and space complexity.
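As one plausible reading of the orchestrator's data handling, the following Python sketch splits a timestamped log into the three roles described above and shows how the test slice could be pushed into a Kafka queue. The split fractions, topic name, and broker address are illustrative assumptions, not values from the paper.

```python
from typing import List, Tuple

Event = Tuple[float, int, int]  # (timestamp, user id, item id)

def temporal_split(events: List[Event],
                   train_frac: float = 0.2,
                   ground_truth_frac: float = 0.1):
    """Split a timestamp-ordered log into the three roles described above:
    a bootstrap slice for training, a slice replayed as the live test
    stream, and a hidden slice that serves as ground truth."""
    events = sorted(events, key=lambda e: e[0])
    n = len(events)
    train_end = int(n * train_frac)
    gt_start = int(n * (1.0 - ground_truth_frac))
    training = events[:train_end]              # bootstraps the recommender
    test_stream = events[train_end:gt_start]   # fed in real time
    ground_truth = events[gt_start:]           # hidden; drives quality metrics
    return training, test_stream, ground_truth

# Illustrative ingestion of the test stream into a Kafka queue, assuming a
# broker at localhost:9092 and the kafka-python client:
#
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="localhost:9092")
# for ts, user, item in test_stream:
#     producer.send("recommendation-events", f"{ts},{user},{item}".encode())
```

Whether the ground truth is the temporally last slice or an interleaved held-out subset is not specified in the summary above; the tail split here is just one consistent choice.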
Conclusions
  • Idomaar can be considered a suitable tool for recommender system research: reuse of code speeds up prototyping, and standardization of datasets helps merge different data sources.
  • Support for generic objects and additional evaluation functions promises to establish Idomaar as a standard research tool for recommender systems.
  • The authors present the Idomaar framework, which enables the efficient, reproducible evaluation of recommender algorithms in real-world stream-based scenarios.
  • Idomaar simplifies multi-dimensional evaluation, taking into account precision-related metrics as well as technical aspects.
Related Work
  • Various frameworks have been proposed to facilitate evaluating recommender systems. Ekstrand et al. [3] introduce LensKit to increase the comparability of recommender system evaluation. Mahout is a scalable machine learning toolkit implemented in Java. Both frameworks ship with a selection of recommendation algorithms and some evaluators. Gantner et al. [4] created MyMediaLite as a lightweight recommender system framework. It comprises some recommendation algorithms along with predefined evaluation protocols. Said and Bellogín [8] proposed RiVal to facilitate comparing various recommendation algorithms. The framework’s architecture supports cross-framework comparisons. The variety in frameworks emphasizes the demand for tools to evaluate recommender systems. Although the presented tools support evaluation, they measure quality only in terms of predictive performance. Operating recommender systems face additional challenges; for instance, they might be subject to response-time restrictions or experience heavy load. Finally, running the above-mentioned frameworks on different hardware still yields inconsistent results. For these reasons, we propose Idomaar, a language-agnostic framework with cloud support and the ability to measure time and space complexity.
Funding
  • The research leading to these results was performed in the CrowdRec project, which has received funding from the EU 7th Framework Programme FP7/2007-2013 under grant agreement No. 610594.
References
  • [1] X. Amatriain. Building industrial-scale real-world recommender systems. In RecSys ’12, pages 7–8, 2012.
  • [2] T. Brodt and F. Hopfgartner. Shedding light on a living lab: the CLEF NEWSREEL open recommendation platform. In IIiX ’14, pages 223–226, 2014.
  • [3] M. D. Ekstrand, M. Ludwig, J. A. Konstan, and J. T. Riedl. Rethinking the recommender research ecosystem: Reproducibility, openness, and LensKit. In RecSys ’11, 2011.
  • [4] Z. Gantner, S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. MyMediaLite: A free recommender system library. In RecSys ’11, pages 305–308. ACM, 2011.
  • [5] Y. Koh, R. Knauerhase, P. Brett, M. Bowman, Z. Wen, and C. Pu. An analysis of performance interference effects in virtual environments. In ISPASS ’07. IEEE, 2007.
  • [6] M. Levy. Offline evaluation of recommender systems: all pain and no gain? In RecSys ’13, page 1, 2013.
  • [7] A. Said and A. Bellogín. Comparative recommender system evaluation: Benchmarking recommendation frameworks. In RecSys ’14, pages 129–136. ACM, 2014.
  • [8] A. Said and A. Bellogín. RiVal: A toolkit to foster reproducibility in recommender system evaluation. In RecSys ’14, pages 371–372, 2014.
  • [9] A. Said, B. Loni, R. Turrin, and A. Lommatzsch. An extended data model format for composite recommendation. In RecSys ’14 (Posters), 2014.
  • [10] O. Tickoo, R. Iyer, R. Illikkal, and D. Newell. Modeling virtual machine performance: challenges and approaches. SIGMETRICS Performance Evaluation Review, 37(3):55–60, 2010.