Exploiting time-frequency patterns with LSTM-RNNs for low-bitrate audio restoration
Neural Computing and Applications (2019)
Abstract
Perceptual audio coding is widely and successfully used for audio compression. However, perceptual audio coders may introduce audible coding artifacts when encoding audio at low bitrates. Low-bitrate audio restoration is a challenging problem that aims to recover a high-quality audio sample close to the uncompressed original from a low-quality encoded version. In this paper, we propose a novel data-driven method for audio restoration in which temporal and spectral dynamics are explicitly captured by deep time-frequency LSTM recurrent neural networks. Leveraging the captured temporal and spectral information facilitates learning a nonlinear mapping from the magnitude spectrogram of low-quality audio to that of high-quality audio. The proposed method substantially attenuates audible artifacts caused by codecs and is conceptually straightforward. Extensive experiments show that for low-bitrate audio at 96 kbps (mono), 64 kbps (mono), and 96 kbps (stereo), the proposed method can efficiently generate improved-quality audio that is competitive with, or even superior in perceptual quality to, the audio produced by other state-of-the-art deep neural network methods and the LAME-MP3 codec.
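To make the spectrogram-to-spectrogram framing concrete, the following is a minimal sketch of the data flow only, not the authors' implementation. The frame length, hop size, Hann window, and the choice to reuse the decoded signal's phase are all assumptions for illustration; `restore_magnitude` is a hypothetical placeholder standing in for the trained time-frequency LSTM network, which the abstract does not specify in detail.

```python
import numpy as np

def stft(signal, frame_len=512, hop=256):
    """Frame the signal, apply a Hann window, and take the real FFT per frame.

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Result shape: (n_frames, frame_len // 2 + 1), i.e. time x frequency
    return np.fft.rfft(frames, axis=1)

def restore_magnitude(mag_low):
    """Hypothetical placeholder for the learned nonlinear mapping.

    A trained network would map low-quality magnitudes to estimates of the
    high-quality magnitudes; here the identity is used so the sketch runs.
    """
    return mag_low

# A synthetic 1-second 440 Hz tone stands in for decoded low-bitrate audio.
sr = 16000
t = np.arange(sr) / sr
decoded = np.sin(2 * np.pi * 440 * t)

spec = stft(decoded)
mag_low, phase = np.abs(spec), np.angle(spec)

# Map the magnitude spectrogram, then recombine with the decoded phase
# (reusing the phase is an assumption made for this sketch).
mag_restored = restore_magnitude(mag_low)
restored_spec = mag_restored * np.exp(1j * phase)
```

In this framing, the model sees a time-by-frequency magnitude array, which is what allows a recurrent network to scan along both the time axis and the frequency axis.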
Keywords
Audio restoration, LSTM, MP3, Deep learning