Naive terminological annotation of legal texts in Slovak - can it be useful?

RASPRAVE (2022)

Abstract
Correct automatic terminological annotation of texts in a corpus can sometimes be a challenging task, especially for moderately or heavily inflected languages with relatively free word order. We explore the possibility of simple annotation based on sequence matching of lemmatized texts to annotate a Slovak-language corpus with IATE terminological entries. The accuracy of annotating legal language is very good for multiword terms, while the accuracy for single-word terms can be increased by applying simple filters based on word length and by blacklisting the most frequent false positives.
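
A minimal Python sketch of the kind of lemma-sequence matching with single-word filters that the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the function name `annotate`, the parameters `min_single_len` and `blacklist`, the greedy longest-match strategy, the threshold value, the hand-written lemmas and the term identifiers `T1` to `T4` are all hypothetical and are not taken from the paper or from IATE.

```python
from typing import Dict, FrozenSet, List, Tuple

Term = Tuple[str, ...]  # a term is a tuple of lemmas


def annotate(
    lemmas: List[str],
    terms: Dict[Term, str],
    min_single_len: int = 4,              # assumed threshold, not from the paper
    blacklist: FrozenSet[str] = frozenset(),
) -> List[Tuple[int, int, str]]:
    """Return (start, end, term_id) spans, preferring the longest match."""
    max_len = max((len(t) for t in terms), default=0)
    spans = []
    i = 0
    while i < len(lemmas):
        matched = False
        # Try the longest candidate first so multiword terms win.
        for n in range(min(max_len, len(lemmas) - i), 0, -1):
            candidate = tuple(lemmas[i:i + n])
            if candidate not in terms:
                continue
            # Filters apply to single-word terms only.
            if n == 1 and (
                len(candidate[0]) < min_single_len or candidate[0] in blacklist
            ):
                continue
            spans.append((i, i + n, terms[candidate]))
            i += n
            matched = True
            break
        if not matched:
            i += 1
    return spans


# Hypothetical lemmatized sentence and term list (IDs are placeholders).
lemmas = ["európsky", "parlament", "prijať", "nový", "zákon", "a", "plán"]
terms = {
    ("európsky", "parlament"): "T1",
    ("zákon",): "T2",
    ("a",): "T3",
    ("plán",): "T4",
}
spans = annotate(lemmas, terms, min_single_len=3, blacklist=frozenset({"plán"}))
print(spans)
```

In this sketch the multiword term is matched as a whole, the one-letter lemma is rejected by the length filter, and the blacklisted lemma is skipped even though it appears in the term list.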
Keywords
terminology, corpus, Slovak language, corpus annotation, IATE