A crowd-sourcing approach for translations of minority language user-generated content (UGC)

Prague Bulletin of Mathematical Linguistics (2017)

Abstract
Data sparsity is a common problem for machine translation of minority and less-resourced languages. While data collection for standard, grammatical text can be challenging enough, efforts for collection of parallel user-generated content can be even more challenging. In this paper we describe an approach to collecting English↔Irish translations of user-generated content (tweets) that overcomes some of these hurdles. We show how a crowd-sourced data collection campaign, which was tailored to our target audience (the Irish language community), proved successful in gathering data for a niche domain. We also discuss the reliability of crowd-sourcing English↔Irish tweet translations in terms of quality by reporting on a self-rating approach along with qualified reviewer ratings.