Towards Speaker Verification for Crowdsourced Speech Collections.

International Conference on Language Resources and Evaluation (LREC) (2022)

Abstract
Crowdsourcing the collection of speech provides a scalable setting to access a customisable demographic according to each dataset's needs. The correctness of speaker metadata is especially relevant for speaker-centred collections - ones that require the collection of a fixed amount of data per speaker. This paper identifies two different types of misalignment present in these collections: Multiple Accounts misalignment (different contributors map to the same speaker), and Multiple Speakers misalignment (multiple speakers map to the same contributor). Based on state-of-the-art approaches to Speaker Verification, this paper proposes an unsupervised method for measuring speaker metadata plausibility of a collection, i.e., evaluating the match (or lack thereof) between contributors and speakers. The solution presented is composed of an embedding extractor and a clustering module. Results indicate high precision in automatically classifying contributor alignment (> 0.94).
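
The two misalignment types can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the embedding extractor or the clustering algorithm, so hand-written toy vectors stand in for speaker embeddings, and a simple cosine-threshold single-link clustering stands in for the clustering module. The function names, the threshold of 0.8, and the contributor IDs are all hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster(embeddings, threshold=0.8):
    """Single-link clustering (assumed stand-in for the clustering module):
    two utterances share a cluster if their similarity reaches the threshold,
    directly or through a chain of similar utterances."""
    parent = list(range(len(embeddings)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(embeddings))]


def find_misalignments(contributors, labels):
    """Compare contributor IDs with cluster labels (one pair per utterance)."""
    clusters_of = defaultdict(set)      # contributor -> clusters it appears in
    contributors_in = defaultdict(set)  # cluster -> contributors it contains
    for contributor, label in zip(contributors, labels):
        clusters_of[contributor].add(label)
        contributors_in[label].add(contributor)
    # Multiple Accounts: one inferred speaker, several contributor IDs.
    multiple_accounts = sorted(
        tuple(sorted(ids)) for ids in contributors_in.values() if len(ids) > 1
    )
    # Multiple Speakers: one contributor ID, several inferred speakers.
    multiple_speakers = sorted(
        c for c, found in clusters_of.items() if len(found) > 1
    )
    return multiple_accounts, multiple_speakers


# Toy data: one (contributor, embedding) pair per utterance. In practice the
# embeddings would come from a speaker verification model.
contributors = ["a", "a", "b", "c", "c"]
embeddings = [
    (1.0, 0.0, 0.0),
    (0.9, 0.1, 0.0),
    (0.95, 0.0, 0.05),
    (0.0, 1.0, 0.0),
    (0.0, 0.1, 0.9),
]

labels = cluster(embeddings)
accounts, speakers = find_misalignments(contributors, labels)
print("Multiple Accounts:", accounts)
print("Multiple Speakers:", speakers)
```

In this toy collection, contributors "a" and "b" fall into the same cluster (Multiple Accounts), while contributor "c" is split across two clusters (Multiple Speakers). A contributor that appears in neither list would count as aligned under this sketch.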
Key words
Crowdsourcing, Speaker Verification, Datasets