End-to-End Neural Speech Coding for Real-Time Communications.

Xue Jiang, Xiulian Peng, Chengyu Zheng, Huaying Xue, Yuan Zhang, Yan Lu

(2022)

Abstract

Deep-learning-based methods have shown advantages over traditional ones in audio coding, but limited attention has been paid to real-time communications (RTC). This paper proposes TFNet, an end-to-end neural speech codec with low latency for RTC. It adopts an encoder-temporal filtering-decoder paradigm that has seldom been investigated in audio coding. An interleaved structure is proposed for temporal filtering to capture both short-term and long-term temporal dependencies. Furthermore, with end-to-end optimization, TFNet is jointly optimized with speech enhancement and packet loss concealment, yielding a single network for all three tasks. Both subjective and objective results demonstrate the efficiency of the proposed TFNet.

Keywords

communications, speech, end-to-end, real-time
