Multimodal Representation Learning for Real-World Applications

Proceedings of the 2022 International Conference on Multimodal Interaction (ICMI 2022)

Abstract
Multimodal representation learning has shown tremendous improvements in recent years. An extensive body of work on fusing multiple modalities has shown promising results on public benchmarks. However, most prominent works target unrealistic settings or toy datasets, and a considerable gap remains between existing methods and their real-world implications. In this work, we aim to bridge the gap between well-defined benchmark settings and real-world use cases. We explore architectures inspired by existing promising approaches that have the potential to be deployed in real-world instances. Moreover, we also try to move the research forward by addressing questions that can be solved using multimodal approaches and that have a considerable impact on the community. With this work, we attempt to leverage multimodal representation learning methods that apply directly to real-world settings.
Key words
Multimodal Representations, Multimodal Fusion, Cross-modal Processing, Deep Learning Architectures, Machine Learning
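
The abstract refers to fusing multiple modalities without specifying an architecture. As a minimal sketch of the general idea (not the paper's method), the following illustrates late fusion of pre-extracted image and text features by concatenation; all names, dimensions, and the classification head are assumptions chosen for illustration.

# Minimal late-fusion sketch for two modalities (illustrative only; not the
# architecture proposed in the paper). Feature dimensions are assumptions.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, image_dim=2048, text_dim=768, hidden_dim=512, num_classes=10):
        super().__init__()
        # Project each modality's pre-extracted features into a shared space.
        self.image_proj = nn.Sequential(nn.Linear(image_dim, hidden_dim), nn.ReLU())
        self.text_proj = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.ReLU())
        # Fuse by concatenation, then classify.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, image_feats, text_feats):
        fused = torch.cat([self.image_proj(image_feats), self.text_proj(text_feats)], dim=-1)
        return self.classifier(fused)

# Usage with random stand-in features for a batch of 4 samples.
model = LateFusionClassifier()
logits = model(torch.randn(4, 2048), torch.randn(4, 768))
print(logits.shape)  # torch.Size([4, 10])

Concatenation-based late fusion is only one of the families of approaches the keywords allude to; attention-based or early-fusion designs would replace the concatenation step with a learned cross-modal interaction.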