Document Layout Analysis: A Maximum Homogeneous Region Approach

2018 1st International Conference on Multimedia Analysis and Pattern Recognition (MAPR), 2018

Abstract
This paper presents a method for document layout analysis that applies whitespace analysis to maximum homogeneous regions and aims to balance processing time against performance. The method consists of two main stages: classification and segmentation. First, text and non-text elements are classified by whitespace analysis on maximum multi-layer horizontal homogeneous regions. Text regions are then extracted using mathematical morphology, and non-text elements are further classified into separators, tables, and images via a machine learning approach. The method's effectiveness is demonstrated by tests on the UW-III (A1) datasets.
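The whitespace-analysis idea can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a plain horizontal projection profile that splits a binary page image into horizontal regions wherever enough consecutive blank rows occur. The function names `row_profile` and `split_on_whitespace` and the parameter `min_gap` are hypothetical, chosen only for this illustration.

```python
from typing import List, Tuple


def row_profile(image: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each row of a binary image."""
    return [sum(row) for row in image]


def split_on_whitespace(image: List[List[int]], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Split a binary image into horizontal regions.

    Returns (start_row, end_row) pairs, end exclusive. A region ends when
    at least `min_gap` consecutive blank rows follow it (assumed threshold,
    not a value from the paper).
    """
    profile = row_profile(image)
    regions: List[Tuple[int, int]] = []
    start = None  # first row of the region being built, if any
    gap = 0       # number of blank rows seen since the last inked row
    for y, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = y
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # Close the region at the last inked row.
                regions.append((start, y - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Close a region that runs to the bottom, dropping trailing blanks.
        regions.append((start, len(profile) - gap))
    return regions


page = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
]
regions = split_on_whitespace(page, min_gap=2)
```

A larger `min_gap` merges regions separated by narrow gaps, which is one simple way to trade segmentation granularity against processing effort.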
Keywords
machine learning approach, document layout analysis, maximum homogeneous region approach, whitespace analysis, maximum multilayer horizontal homogeneous regions, nontext elements, text regions extraction, mathematical morphology, UW-III (A1) datasets