Identification of Parallel Passages Across a Large Hebrew/Aramaic Corpus
Journal of Data Mining & Digital Humanities, 2018.
Abstract:
We propose a method for efficiently finding all parallel passages in a large corpus, even if the passages are not quite identical due to rephrasing and orthographic variation. The key ideas are the representation of each word in the corpus by its two most infrequent letters, finding matched pairs of strings of four or five words that diff...
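The first idea in the abstract, reducing each word to its two most infrequent letters, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`letter_frequencies`, `reduce_word`, `index_windows`, `candidate_pairs`), the choice to measure letter frequency over the whole corpus, the decision to keep the two letters in their original order, and the exact-match lookup on four-word windows are all assumptions made for illustration. The abstract is truncated before it explains how matched strings are compared, so that step is not modeled here. The toy words are Latin-letter stand-ins, not real corpus data.

```python
from collections import Counter, defaultdict


def letter_frequencies(words):
    """Count how often each letter occurs across all words (assumed corpus-wide)."""
    return Counter(ch for w in words for ch in w)


def reduce_word(word, freq):
    """Keep the two rarest letters of a word, in their original order (an assumption)."""
    if len(word) <= 2:
        return word
    # Rank letter positions by corpus frequency; ties go to the earlier position.
    rarest = sorted(range(len(word)), key=lambda i: (freq[word[i]], i))[:2]
    return "".join(word[i] for i in sorted(rarest))


def index_windows(docs, n=4):
    """Map each reduced n-word window to the (doc_id, position) pairs where it occurs."""
    freq = letter_frequencies(w for words in docs.values() for w in words)
    index = defaultdict(list)
    for doc_id, words in docs.items():
        reduced = [reduce_word(w, freq) for w in words]
        for i in range(len(reduced) - n + 1):
            index[tuple(reduced[i:i + n])].append((doc_id, i))
    return index


def candidate_pairs(index):
    """Keep only the reduced windows that occur in more than one place."""
    return {key: hits for key, hits in index.items() if len(hits) > 1}


# Toy example: document "y" spells every word with one frequent letter dropped.
docs = {
    "x": ["kataba", "malaka", "dabara", "salama"],
    "y": ["katba", "malka", "dabra", "salma"],
}
print(candidate_pairs(index_windows(docs)))
```

In this toy corpus the letter `a` is by far the most frequent, so it never survives the reduction, and the two differently spelled four-word sequences collapse to the same key. That is the intuition behind using rare letters: frequent letters are the ones most likely to vary between spellings, so discarding them makes the representation tolerant of orthographic variation.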