A Clique-Based Method Using Dynamic Programming For Computing Edit Distance Between Unordered Trees

JOURNAL OF COMPUTATIONAL BIOLOGY(2012)

引用 19|浏览13
暂无评分
摘要
Many kinds of tree-structured data, such as RNA secondary structures, have become available due to the progress of techniques in the field of molecular biology. To analyze the tree-structured data, various measures for computing the similarity between them have been developed and applied. Among them, tree edit distance is one of the most widely used measures. However, the tree edit distance problem for unordered trees is NP-hard. Therefore, it is required to develop efficient algorithms for the problem. Recently, a practical method called clique-based algorithm has been proposed, but it is not fast for large trees.This article presents an improved clique-based method for the tree edit distance problem for unordered trees. The improved method is obtained by introducing a dynamic programming scheme and heuristic techniques to the previous clique-based method. To evaluate the efficiency of the improved method, we applied the method to comparison of real tree structured data such as glycan structures. For large tree-structures, the improved method is much faster than the previous method. In particular, for hard instances, the improved method achieved more than 100 times speed-up.
更多
查看译文
关键词
dynamic programming,glycan,maximum clique,tree edit distance,unordered trees
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要