Dataset and Models for Item Recommendation Using Multi-Modal User Interactions
arXiv (2024)
Abstract
While recommender systems with multi-modal item representations (image,
audio, and text) have been widely explored, learning recommendations from
multi-modal user interactions (e.g., clicks and speech) remains an open
problem. We study the case of multi-modal user interactions in a setting where
users engage with a service provider through multiple channels (website and
call center). In such cases, incomplete modalities naturally occur, since not
all users interact through all the available channels. To address these
challenges, we publish a real-world dataset that allows progress in this
under-researched area. We further present and benchmark various methods for
leveraging multi-modal user interactions for item recommendations, and propose
a novel approach that specifically deals with missing modalities by mapping
user interactions to a common feature space. Our analysis reveals important
interactions between the different modalities and shows that a frequently
occurring modality can enhance learning from a less frequent one.