On the Minimum Copy Number Generation Problem in Cancer Genomics.

BCB(2018)

引用 25|浏览9
暂无评分
摘要
In cancer genomics, due to the fast somatic mutations (mainly random segment duplications and deletions), copy number profiles (CNPs), (i.e., a file containing each of the gene numbers) are used more often than genome themselves. On the other hand, algorithms with performance analysis for processing CNPs are lacking. In a recent CPM'16 paper, Shamir et al. studied the copy number transformation problem, which is to use the minimum number of duplications and deletions (on the CNPs) to convert one CNP to another, and gave a linear time algorithm. In this paper, we consider a slightly different problem which is called Minimum Copy Number Generation (MCNG), namely, given a genome G and a specific CNP C, use the minimum number of duplications and deletions on G to obtain some genome H which has a CNP C. We show that the problem is NP-hard if G is generic (i.e., contains duplicated genes) and when the duplications are tandem. On the other hand, when only tandem duplications are allowed, if G is exemplar (or is a permutation) and all components in C are power of two's, then the problem can be solved in time linear in the length of the input (or $|C|$) plus $O(|G|log |G]|)$ (the cost for sorting $|G|$ elements). That naturally extends to a practical heuristic algorithm for the problem (when G is exemplar and the components in C are arbitrary). We also show that two variations of the MCNG problem are at least as hard as Set Cover in terms of approximability and FPT tractability. For the general Minimum Copy Number Generation problem, i.e., when both (arbitrary) segment duplications and deletions are allowed, we also design a practical greedy algorithm, present some non-trivial cases and discuss the directions for future research.
更多
查看译文
关键词
Cancer genomics, copy number profile, complexity, algorithms
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要