Utilizing Latent Diffusion Model to Accelerate Sampling Speed and Enhance Text Generation Quality

Chenyang Li, Long Zhang, Qiusheng Zheng

Electronics (2024)

Abstract
Diffusion models have achieved tremendous success in modeling continuous data modalities, such as images, audio, and video, yet their application to discrete data domains (e.g., natural language) has been limited. Existing methods primarily represent discrete text in a continuous diffusion space, incurring significant computational overhead during training and resulting in slow sampling. This paper introduces LaDiffuSeq, a latent diffusion-based text generation model built on an encoder–decoder structure. Specifically, it first employs a pretrained encoder to map sequences composed of attributes and their corresponding text into a low-dimensional latent vector space. It then runs the diffusion process directly in this latent space, without classifier guidance. Finally, a pretrained decoder decodes the newly generated latent vectors, producing target texts that are topically relevant and exhibit multiple emotional granularities. Compared to the benchmark model, DiffuSeq, this model achieves BERTScore improvements of 0.105 and 0.009 on two public real-world datasets (ChnSentiCorp and a debate dataset), respectively; perplexity falls by 3.333 and 4.562; and it effectively quadruples the text generation sampling speed.
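The three-stage pipeline the abstract describes (pretrained encoder → diffusion in a low-dimensional latent space → pretrained decoder) can be sketched as follows. This is a minimal illustrative sketch only: the `encode`, `denoise_step`, and `decode` functions, the latent dimension, and the step count are all toy stand-ins assumed for illustration, not the paper's actual networks or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8   # low-dimensional latent space (assumed size)
T = 50           # number of reverse diffusion steps (assumed)

def encode(tokens):
    """Toy stand-in for the pretrained encoder: hash tokens into a latent vector."""
    vec = np.zeros(LATENT_DIM)
    for tok in tokens:
        vec[hash(tok) % LATENT_DIM] += 1.0
    return vec / max(len(tokens), 1)

def denoise_step(z, t):
    """Toy denoiser: shrink the latent a little each step.
    A real model would predict the noise with a trained network."""
    return z * 0.95

def decode(z):
    """Toy stand-in for the pretrained decoder: map the latent back to
    placeholder 'token ids' (indices of the largest latent dimensions)."""
    return np.argsort(-z)[:3]

# 1) Encode an attribute + text sequence into the latent space.
z0 = encode(["positive", "the", "movie", "was", "great"])
# 2) Corrupt with Gaussian noise, then reverse it step by step
#    (classifier-free: no external classifier guides the denoising).
z = z0 + rng.normal(scale=1.0, size=LATENT_DIM)
for t in reversed(range(T)):
    z = denoise_step(z, t)
# 3) Decode the new latent vector into target text.
out = decode(z)
```

The design point the sketch illustrates is why sampling speeds up: the diffusion loop operates on an 8-dimensional vector rather than on a full token-length embedding sequence, so each reverse step is far cheaper.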
Keywords
diffusion model, sequence diffusion, pretrained models, prompt, text generation, controllable emotion generation, fine-grained emotion