Toward Generic Cross-Modal Transmission Strategy

Xin Wei, Junqi Liao, Liang Zhou, Hikmet Sari, Weihua Zhuang

IEEE Transactions on Communications (2024)

Abstract

Multi-modal services, which integrate modalities such as audio, visual, and haptic, have emerged as leading multimedia applications in the 5G era and beyond. To meet their demands for low latency, high reliability, and large capacity, cross-modal transmission schemes have been proposed. Typically, these schemes emphasize either the audio-visual or the haptic modality, and prioritize flawless transmission of one modality to assist the streaming of the other. However, this prerequisite and assumption do not hold for generic multi-modal services and communication environments, where determining the priority of each modality and guaranteeing flawless transmission become challenging. To address this fundamental problem, we introduce a strategy toward generic cross-modal transmission that enables the visual and haptic modalities to assist each other as needed. The strategy comprises a visual-haptic mutual stream delivery mechanism at the sender and a visual-haptic mutual signal reconstruction approach at the receiver. The former aims to eliminate redundancy in the visual and haptic streams through mutual assistance, while the latter adaptively handles impaired, missing, or delayed visual or haptic signals by leveraging modality-aware knowledge transfer and semantic-aware signal generation techniques. The proposed strategy demonstrates excellent performance in experiments conducted on a standard multi-modal dataset and a practical visual-haptic communication platform.

Keywords

Cross-modal transmission, mutual aided, stream delivery, signal reconstruction
