Sparse Dynamic Programming on DAGs with Small Width

ACM Transactions on Algorithms(2019)

引用 12|浏览42
暂无评分
摘要
The minimum path cover problem asks us to find a minimum-cardinality set of paths that cover all the nodes of a directed acyclic graph (DAG). We study the case when the size k of a minimum path cover is small, that is, when the DAG has a small width. This case is motivated by applications in pan-genomics, where the genomic variation of a population is expressed as a DAG. We observe that classical alignment algorithms exploiting sparse dynamic programming can be extended to the sequence-against-DAG case by mimicking the algorithm for sequences on each path of a minimum path cover and handling an evaluation order anomaly with reachability queries. Namely, we introduce a general framework for DAG-extensions of sparse dynamic programming. This framework produces algorithms that are slower than their counterparts on sequences only by a factor k. We illustrate this on two classical problems extended to DAGs: longest increasing subsequence and longest common subsequence. For the former, we obtain an algorithm with running time O(k|E|log |V|). This matches the optimal solution to the classical problem variant when the input sequence is modeled as a path. We obtain an analogous result for the longest common subsequence problem. We then apply this technique to the co-linear chaining problem, which is a generalization of the above two problems. The algorithm for this problem turns out to be more involved, needing further ingredients, such as an FM-index tailored for large alphabets and a two-dimensional range search tree modified to support range maximum queries. We also study a general sequence-to-DAG alignment formulation that allows affine gap costs in the sequence. The main ingredient of the proposed framework is a new algorithm for finding a minimum path cover of a DAG (V,E) in O(k|E|log |V|) time, improving all known time-bounds when k is small and the DAG is not too dense. In addition to boosting the sparse dynamic programming framework, an immediate consequence of this new minimum path cover algorithm is an improved space/time tradeoff for reachability queries in arbitrary directed graphs.
更多
查看译文
关键词
Pattern matching, co-linear chaining, longest common subsequence, pan-genomics
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要