Multimodal Pretraining and Generation for Recommendation: A Tutorial
Companion Proceedings of the ACM Web Conference 2024 (2024)
Abstract
Personalized recommendation stands as a ubiquitous channel for users to
explore information or items aligned with their interests. Nevertheless,
prevailing recommendation models predominantly rely on unique IDs and
categorical features for user-item matching. While this ID-centric approach has
witnessed considerable success, it falls short in comprehensively grasping the
essence of raw item contents across diverse modalities, such as text, image,
audio, and video. This underutilization of multimodal data poses a limitation
to recommender systems, particularly in the realm of multimedia services like
news, music, and short-video platforms. The recent surge in pretraining and
generation techniques presents both opportunities and challenges in the
development of multimodal recommender systems. This tutorial seeks to provide a
thorough exploration of the latest advancements and future trajectories in
multimodal pretraining and generation techniques within the realm of
recommender systems. The tutorial comprises three parts: multimodal
pretraining, multimodal generation, and industrial applications and open
challenges in the field of recommendation. Our target audience encompasses
scholars, practitioners, and other parties interested in this domain. By
providing a succinct overview of the field, we aspire to facilitate a swift
understanding of multimodal recommendation and foster meaningful discussions on
the future development of this evolving landscape.