Layout analysis of historical Tibetan documents

2019 2nd International Conference on Artificial Intelligence and Big Data (ICAIBD), 2019

Abstract
Layout analysis of historical Tibetan documents is an important task in document reconstruction, and its result can be used to segment text regions. The task includes skew correction, region segmentation, and description of the page layout and structure. The skew correction step uses the Tibetan baseline feature and the Hough transform to find the baseline, from which the skew angle of the document is obtained. After skew correction, region segmentation is achieved by locating the border. To obtain the border position, we apply a series of steps: median filtering, Gaussian smoothing, Sobel edge detection, edge smoothing, and removal of small-area regions. We then determine the location of the other regions, e.g. the text region, left comments, and right comments, from their positional relationship to the border. Finally, we save the page information, such as author, date and time, and region positions, in XML format. Our method is simple and supports batch layout analysis of images.
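
The skew-correction idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the baseline pixels have already been extracted as (x, y) coordinates, and it uses a simplified slope-intercept form of the Hough transform restricted to near-horizontal lines. The function name `estimate_skew_angle` and the parameters `max_skew_deg` and `step_deg` are hypothetical, chosen only for this example.

```python
import math


def estimate_skew_angle(points, max_skew_deg=10.0, step_deg=0.5):
    """Estimate the skew angle (in degrees) of the dominant near-horizontal line.

    points: (x, y) coordinates of candidate baseline pixels (an assumption;
    how those pixels are found is outside this sketch).
    For each candidate angle, every point votes for the intercept b of the
    line y = tan(angle) * x + b passing through it. The angle whose best
    intercept collects the most votes is returned.
    """
    points = list(points)
    if not points:
        return 0.0  # nothing to vote with: report no skew

    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * max_skew_deg / step_deg))
    for i in range(steps + 1):
        angle = -max_skew_deg + i * step_deg
        slope = math.tan(math.radians(angle))
        votes = {}
        for x, y in points:
            intercept = round(y - slope * x)  # accumulator bin, 1 pixel wide
            votes[intercept] = votes.get(intercept, 0) + 1
        top = max(votes.values())
        if top > best_votes:
            best_angle, best_votes = angle, top
    return best_angle


# Usage: synthetic baseline pixels lying on a line tilted by 3 degrees.
slope = math.tan(math.radians(3.0))
baseline = [(x, round(50 + x * slope)) for x in range(200)]
skew = estimate_skew_angle(baseline)
```

Rotating the page by the negative of the estimated angle would then align the baseline horizontally. A real pipeline would more likely rely on an image library's Hough implementation than on this pure-Python loop.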
Keywords
page analysis, historical Tibetan document, binarization, automatic layout analysis