$\texttt{ArCOV-19}$: The First Arabic COVID-19 Twitter Dataset with Propagation Networks
arXiv (2020)
Abstract
In this paper, we present $\texttt{ArCOV-19}$, an Arabic COVID-19 Twitter dataset that covers the period from the 27$^{th}$ of January to the 31$^{st}$ of March 2020. $\texttt{ArCOV-19}$ is the $\textit{first}$ publicly available Arabic Twitter dataset covering the COVID-19 pandemic. It includes about 748k $\textit{popular}$ tweets (according to Twitter's search criterion) alongside the $\textit{propagation networks}$ of the most popular subset of them. The propagation networks include both retweets and conversational threads (i.e., threads of replies). $\texttt{ArCOV-19}$ is designed to enable research in several domains, including natural language processing, information retrieval, and social computing, among others. Preliminary analysis shows that $\texttt{ArCOV-19}$ captures rising discussions associated with the first reported cases of the disease as they appeared in the Arab world. In addition to the source tweets and the propagation networks, we also release the search queries and the language-independent crawler used to collect the tweets, to encourage the curation of similar datasets.