Exploring The Potentiality Of Semantic Features For Paraphrase Detection
Computational Processing of the Portuguese Language, PROPOR 2020 (2020)
Abstract
A paraphrase is defined as the restatement of something written or spoken using different words. In this paper, we adopt a feature engineering strategy to perform paraphrase detection at the sentence level. In particular, we explore the potential of semantic features, such as the similarity between two semantic graphs, a distance function between sentences, and the cosine similarity between embedded sentences, and use them as input to several machine learning classifiers. We evaluate our approach on the ASSIN benchmark corpus and achieve an F-score of 80.5%, outperforming some other detection methods for Portuguese.
Keywords
Paraphrase detection, Semantics, Machine learning
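One of the features named in the abstract, the cosine similarity between embedded sentences, can be illustrated with a minimal sketch. The abstract does not say how sentences are embedded, so the averaging of word vectors below is an assumption, and the helper names (`cosine_similarity`, `embed_sentence`) and the toy two-dimensional vectors are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def embed_sentence(tokens, word_vectors, dim):
    """Average the vectors of known tokens (an assumed, simple embedding)."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy 2-dimensional word vectors, invented for illustration only.
TOY_VECTORS = {
    "o": [0.1, 0.1],
    "gato": [0.9, 0.2],
    "felino": [0.8, 0.3],
    "dorme": [0.2, 0.9],
    "descansa": [0.4, 0.7],
}

s1 = embed_sentence(["o", "gato", "dorme"], TOY_VECTORS, 2)
s2 = embed_sentence(["o", "felino", "descansa"], TOY_VECTORS, 2)
similarity = cosine_similarity(s1, s2)  # one candidate feature for a classifier
```

In a feature engineering setup of the kind the abstract describes, a value like `similarity` would be one column in the feature vector passed to a classifier, alongside the graph-similarity and sentence-distance features.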