Towards the Automatic Detection of the Source Language of a Literary Translation.

COLING (Posters) (2012)

Abstract
Experiments on the detection of the source language of literary translations are described. Two feature types are exploited: n-gram-based features and document-level statistics. Cross-validation results on a corpus of twenty 19th-century texts, comprising translations from Russian, French and German as well as texts originally written in English, are promising: single-feature classifiers yield significant gains over the baseline, while classifiers that combine feature types outperform these, bringing L1 detection accuracy to ~80% under ten-fold cross-validation on the training set. Average test-set results are slightly lower but still comparable to the cross-validation results. Relative frequencies of a number of salient features are studied, including several English contractions (I’ll, that’s, etc.) and uncontracted forms; we articulate hypotheses, anchored in the source languages, towards explaining the differences.
Keywords
stylistics, computational linguistics, stylometry, translation
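
The abstract mentions comparing relative frequencies of contractions (I’ll, that’s) with uncontracted forms. A minimal sketch of one way such a comparison might be computed is given below. It is not the authors' implementation: the function name `relative_frequencies`, the `PAIRS` table, the regular-expression tokenizer, and the per-1,000-token normalization are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical feature pairs (contraction -> uncontracted form); the two
# contractions come from the abstract, the pairing itself is an assumption.
PAIRS = {"i'll": "i will", "that's": "that is"}


def relative_frequencies(text, pairs=PAIRS):
    """Return occurrences per 1,000 tokens for each form in `pairs`."""
    # Lowercase and map the typographic apostrophe to the ASCII one.
    normalized = text.lower().replace("\u2019", "'")
    # Assumed tokenizer: alphabetic words, optionally with one apostrophe part.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)
    total = len(tokens)
    if total == 0:
        return {}
    unigrams = Counter(tokens)
    # Adjacent-token pairs; punctuation is ignored, so a bigram may span
    # a sentence boundary in this simplified sketch.
    bigrams = Counter(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    freqs = {}
    for contracted, full in pairs.items():
        freqs[contracted] = 1000 * unigrams[contracted] / total
        freqs[full] = 1000 * bigrams[full] / total
    return freqs


if __name__ == "__main__":
    sample = "I\u2019ll go. That is fine. I will stay; that's all."
    for form, rate in relative_frequencies(sample).items():
        print(f"{form}: {rate:.1f} per 1,000 tokens")
```

Computed per text, such rates could be compared across source-language groups or used as classifier features; how the paper actually normalizes and selects its features is not stated in the abstract.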