PESTD: a large-scale Persian-English scene text dataset

Multimedia Tools and Applications (2023)

Abstract
Extracting text from natural scene images has become a vital issue. The uncertainty of the size, color, background, and alignment of the characters makes text recognition in natural scene images a demanding challenge. Another recent challenge has been the development and expansion of intelligent systems in the field of transportation, especially the recognition of traffic signs, which helps ensure safer and easier driving. A scene-text dataset that serves as a benchmark on which researchers can test how well their algorithms generalize is therefore critical. This study, one of the first in the field of text-based traffic signs, prepares a Persian-English multilingual dataset (PESTD) comprising 5832 instances of letters, digits, and symbols in three categories: Persian, English, and Persian-English. Because numerals and letters are written similarly in Persian (Farsi), Arabic, and Urdu, PESTD can be used in all countries where these languages are used. To prepare the PESTD instances, text detection was performed on traffic signs in Iran. The CRAFT feature extraction algorithm was combined with YOLO and the Tesseract engine to recognize cursive and multilingual text despite the specific challenges such text poses. Experimental results show that YOLOv5 scores better on the evaluation criteria than earlier YOLO versions. The accuracy and F1-score attained on PESTD are 95.3% and 92.3%, respectively.
Keywords
Cursive script, Deep learning, Farsi, Arabic, Urdu, Farsi-English, Multilingual, Persian-English, Scene text dataset