OPMP: An Omnidirectional Pyramid Mask Proposal Network for Arbitrary-Shape Scene Text Detection

IEEE Transactions on Multimedia (2021)

Abstract
Scene text detection methods have made significant progress. However, the stack-omnidirectional text dilemma, under-segmentation of words that lie very close together, and over-segmentation of long, arbitrary-shape text lines remain major challenges. Motivated by these problems, we propose a two-stage method called the omnidirectional pyramid mask proposal text detector (OPMP). OPMP removes the anchor mechanism, which requires heuristic non-maximum suppression. Instead, it uses an effective pyramid lengthwise and sidewise residual sequence modeling method to produce arbitrary-shape proposals. To accurately extract text-shape features, OPMP enhances the backbone layers with a multiple arbitrary-shape fitting mechanism. Finally, a multi-grain text classification module is proposed to robustly reclassify each text region. Comprehensive ablation studies demonstrate the effectiveness of each proposed component. In addition, experiments on various benchmarks, including ICDAR2015, MLT, MSRA-TD500, CTW1500, and Total-Text, show that our method outperforms previous state-of-the-art methods.
Keywords
Text detection, pyramid sequence modeling, omnidirectional pyramid mask proposal