You Talk Too Much: Limiting Privacy Exposure Via Voice Input

2019 IEEE Security and Privacy Workshops (SPW)（2019）

引用 22|浏览29

暂无评分

摘要

Voice synthesis uses a voice model to synthesize arbitrary phrases. Advances in voice synthesis have made it possible to create an accurate voice model of a targeted individual, which can then in turn be used to generate spoofed audio in his or her voice. Generating an accurate voice model of target's voice requires the availability of a corpus of the target's speech. This paper makes the observation that the increasing popularity of voice interfaces that use cloud-backed speech recognition (e.g., Siri, Google Assistant, Amazon Alexa) increases the public's vulnerability to voice synthesis attacks. That is, our growing dependence on voice interfaces fosters the collection of our voices. As our main contribution, we show that voice recognition and voice accumulation (that is, the accumulation of users' voices) are separable. This paper introduces techniques for locally sanitizing voice inputs before they are transmitted to the cloud for processing. In essence, such methods employ audio processing techniques to remove distinctive voice characteristics, leaving only the information that is necessary for the cloud-based services to perform speech recognition. Our preliminary experiments show that our defenses prevent state-of-the-art voice synthesis techniques from constructing convincing forgeries of a user's speech, while still permitting accurate voice recognition.

查看译文

关键词

Privacy,Voice Assistants

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要