
Native Language Identification for Russian.

ICDM Workshops (2019)

Abstract
Native Language Identification (NLI) is the task of automatically recognizing an author's native language (L1) from texts written in a language that is not native to the author. NLI has been studied in detail for English, and two shared tasks were held in 2013 [1] and 2017 [2], using TOEFL English essays and essay samples as data. A small number of works address NLI for other languages, but Russian has not yet been studied. This paper applies approaches that are well established from the NLI Shared Task 2013 and 2017 competitions to recognizing the author's native language, as well as the type of speaker: learner of Russian or heritage Russian speaker. The classifier presented in this paper is a support vector machine (SVM) over TF-IDF features. The study is data-driven and is made possible by the Russian Learner Corpus, developed by the HSE Learner Russian Research Group [3], on which the experiments are conducted.
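As an illustration of the general technique named in the abstract (TF-IDF features fed to a linear SVM), here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state the tokenization, the TF-IDF variant, or the SVM settings, so the whitespace tokenizer, the smoothed IDF formula ln((1 + N) / (1 + df)) + 1 with L2 normalization, the subgradient training loop, and all function names are assumptions. The toy sentences and the two labels are hypothetical placeholders, not data from the Russian Learner Corpus, and the sketch is binary, whereas an NLI system would need a multi-class scheme such as one-vs-rest.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens; real systems often use n-grams.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(doc, idf):
    # Raw term counts times IDF, then L2-normalized; unseen terms are dropped.
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def score(x, w, b):
    return sum(w.get(t, 0.0) * v for t, v in x.items()) + b


def train_linear_svm(vectors, labels, epochs=200, lr=0.1, lam=0.01):
    # Subgradient descent on the L2-regularized hinge loss; labels are +1 / -1.
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * score(x, w, b)
            for t in w:
                w[t] *= 1.0 - lr * lam  # shrinkage from the L2 penalty
            if margin < 1.0:  # hinge loss is active: step toward the example
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
                b += lr * y
    return w, b


def predict(x, w, b):
    return 1 if score(x, w, b) >= 0 else -1


# Hypothetical toy data: +1 and -1 stand in for two L1 groups.
train_texts = [
    "i go to school yesterday",
    "she go to home yesterday",
    "i have went to the school",
    "she has went to the home",
]
train_labels = [1, 1, -1, -1]

idf = fit_idf(train_texts)
train_vectors = [tfidf(text, idf) for text in train_texts]
w, b = train_linear_svm(train_vectors, train_labels)
train_predictions = [predict(x, w, b) for x in train_vectors]
```

After training, `train_predictions` can be compared with `train_labels`, and a new text is classified with `predict(tfidf(text, idf), w, b)`. In practice a library implementation of both steps would normally replace the hand-written loops.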
Key words
native language identification, NLI, support vector machine, SVM, term frequency, inverse term frequency, TF-IDF, Russian Learner Corpus