Finnish Paraphrase Corpus

Jenna Kanerva, Filip Ginter, Li-Hsin Chang, Iiro Rastas, Valtteri Skantsi, Jemina Kilpeläinen, Hanna-Mari Kupari, Jenna Saarni, Maija Sevón, Otto Tarkka

NoDaLiDa (2021)

Abstract

In this paper, we introduce the first fully manually annotated paraphrase corpus for Finnish, containing 53,572 paraphrase pairs harvested from alternative subtitles and news headings. Of all paraphrase pairs in our corpus, 98% are manually classified as paraphrases at least in their given context, if not in all contexts. Additionally, we establish a manual candidate selection method and demonstrate its feasibility for high-quality paraphrase selection in terms of both cost and quality.
