TArC: Incrementally and Semi-Automatically Collecting a Tunisian Arabish Corpus
LREC, pp. 6279-6286, 2020.
This article describes the construction of the first morpho-syntactically annotated Tunisian Arabish Corpus (TArC). Arabish, also known as Arabizi, is a spontaneous coding of Arabic dialects in Latin characters and arithmographs (numbers used as letters). This code-system was developed by Arabic-speaking users of social media in...