AlignRec: Aligning and Training in Multimodal Recommendations
CoRR (2024)
Abstract
With the development of multimedia systems, multimodal recommendations are
playing an essential role, as they can leverage rich contexts beyond
interactions. Existing methods mainly treat multimodal information as auxiliary,
using it to help learn ID features. However, there are semantic gaps between
multimodal content features and ID-based features, so directly using multimodal
information as an auxiliary signal leads to misaligned representations of users
and items. In this paper, we first
systematically investigate the misalignment issue in multimodal
recommendations, and propose a solution named AlignRec. In AlignRec, the
recommendation objective is decomposed into three alignments, namely alignment
within contents, alignment between content and categorical ID, and alignment
between users and items. Each alignment is characterized by a specific
objective function and is integrated into our multimodal recommendation
framework. To train AlignRec effectively, we propose to first pre-train the
first alignment to obtain unified multimodal features and then train the
remaining two alignments together, taking these features as input. Since it is
essential to analyze whether each multimodal feature helps training and to
accelerate the iteration cycle of recommendation models, we design three new
classes of metrics to evaluate intermediate performance. Our extensive
experiments on three real-world datasets consistently verify the superiority of
AlignRec compared to nine baselines. We also find that the multimodal features
generated by AlignRec outperform those currently in use; we will open-source
them in our repository at https://github.com/sjtulyf123/AlignRec_CIKM24.
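
To make the three-alignment decomposition and the two-stage training schedule concrete, below is a minimal sketch in PyTorch. It assumes InfoNCE-style contrastive losses, mean-fused content features, and unit loss weights; the encoder outputs (`img_emb`, `txt_emb`, `id_emb`, `user_emb`, `item_emb`) and the `info_nce` helper are hypothetical placeholders, not the authors' exact objective functions.

```python
import torch
import torch.nn.functional as F

def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over paired embeddings (an assumed loss form,
    not necessarily the one used in AlignRec)."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature                     # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)   # positive pairs on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

# Hypothetical batch of embeddings; names and shapes are illustrative only.
B, d = 32, 64
img_emb, txt_emb = torch.randn(B, d), torch.randn(B, d)    # content encoders' outputs
content_emb = 0.5 * (img_emb + txt_emb)                    # fused multimodal feature (assumed fusion)
id_emb = torch.randn(B, d)                                 # categorical ID embeddings
user_emb, item_emb = torch.randn(B, d), torch.randn(B, d)  # user/item representations

# Stage 1 (pre-training): alignment within contents, e.g. image <-> text,
# yielding unified multimodal features.
loss_content = info_nce(img_emb, txt_emb)

# Stage 2 (joint training): the remaining two alignments, trained together
# with the pre-trained multimodal features as input.
loss_content_id = info_nce(content_emb, id_emb)  # content <-> categorical ID
loss_user_item = info_nce(user_emb, item_emb)    # user <-> item
loss_stage2 = loss_content_id + loss_user_item   # assumed unit weights
```

The two-stage structure mirrors the abstract: the within-content loss is optimized first in isolation, and only afterwards are the content-ID and user-item losses optimized jointly on top of the resulting features.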