Text Detection and Language Identification in Natural Scene Images using YOLOv5

R.S. Latha, G.R. Sreekanth, R.C. Suganthe, R. Rajadevi, V.V. Jagadeeswaran, R. Logesh, A. Maheshvar

2023 International Conference on Computer Communication and Informatics (ICCCI) (2023)


Abstract

Deep learning has evolved immensely since the start of the digital era, and feature extraction is one of its core facets. Extracting text from an image is difficult because the text can vary in size, style, orientation, and alignment, and may appear with low contrast, noise, and a complicated background structure. Transforming an image into different perspectives for feature identification is the first step towards text recognition. Scene text provides rich contextual information that can be applied to several types of vision-based applications, so interest in scene text detection and recognition has grown over the last few years. To address language detection in multilingual scene text images, this paper proposes a deep learning-based solution. A convolutional neural network serves as the underlying model and detects objects in real time with high accuracy. The study employs a single neural network, "You Only Look Once" (YOLO), because it produces predictions for the full image with just a single forward pass through the network. We used COCO (Common Objects in Context), a large-scale object detection, segmentation, and captioning dataset. To evaluate an image, YOLO divides it into smaller regions and predicts bounding boxes and probabilities for each region. The bounding boxes are weighted by the predicted probabilities. Identified objects are then returned after non-maximum suppression. We evaluated with the F1-score, which combines precision and recall into a single metric by computing their harmonic mean.
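
The last two steps of the abstract, suppression of overlapping detections and the F1-score, can be illustrated with a minimal sketch. It is not the authors' implementation: the suppression step is read here as standard greedy non-maximum suppression, and the names `iou`, `non_max_suppression`, and `f1_score`, the `(x1, y1, x2, y2)` box layout, and the 0.5 IoU threshold are assumptions made for illustration. The F1-score is computed as the harmonic mean of precision and recall.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 > x1, y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: List[Box], scores: List[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Greedy, class-agnostic NMS; returns indices of the boxes that are kept.

    The 0.5 threshold is an illustrative default, not a value from the paper.
    """
    # Visit boxes from highest to lowest confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(boxes, scores))  # indices of surviving boxes
    print(f1_score(tp=8, fp=2, fn=2))
```

In the usage example, the second box overlaps the first heavily and has a lower score, so it is suppressed, while the third box is far from both and survives.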


Keywords

complicated backdrop structure, Convolutional neural network, deep learning-based solution, digital era, feature extraction, feature identification, image YOLO, language detection, language identification, large-scale object detection, multilingual scene text photos, natural scene images, rich contextual information, scene texts, single forward propagation trip, single neural network, text detection, text recognition, vision-based applications, YOLOv5
