EmphAssess : a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models
CoRR(2023)
摘要
We introduce EmphAssess, a prosodic benchmark designed to evaluate the
capability of speech-to-speech models to encode and reproduce prosodic
emphasis. We apply this to two tasks: speech resynthesis and speech-to-speech
translation. In both cases, the benchmark evaluates the ability of the model to
encode emphasis in the speech input and accurately reproduce it in the output,
potentially across a change of speaker and language. As part of the evaluation
pipeline, we introduce EmphaClass, a new model that classifies emphasis at the
frame or word level.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要