On Computing Centroids According to the p-Norms of Hamming Distance Vectors

Leibniz International Proceedings in Informatics (2018)

Abstract
In this paper we consider the p-Norm Hamming Centroid problem, which asks whether a given set of binary strings has a centroid with a bound on the p-norm of its Hamming distances to the strings. Specifically, given a set of strings $S$ and a real $k$, we consider the problem of determining whether there exists a string $s^*$ with $\left(\sum_{s \in S} d^p(s^*, s)\right)^{1/p} \le k$, where $d(\cdot,\cdot)$ denotes the Hamming distance metric. This problem has important applications in data clustering, and is a generalization of the well-known polynomial-time solvable Consensus String ($p = 1$) problem, as well as the NP-hard Closest String ($p = \infty$) problem. Our main result shows that the problem is NP-hard for all fixed rational $p > 1$, closing the gap for all rational values of $p$ between $1$ and $\infty$. Under standard complexity assumptions the reduction also implies that the problem has no $2^{o(n+m)}$-time or $2^{o(k^{p/(p+1)})}$-time algorithm, where $m$ denotes the number of input strings and $n$ denotes the length of each string, for any fixed $p > 1$. Both running time lower bounds are tight. In particular, we provide a $2^{k^{p/(p+1)+\varepsilon}}$-time algorithm for each fixed $\varepsilon > 0$. In the last part of the paper, we complement our hardness result by presenting a fixed-parameter algorithm and a factor-2 approximation algorithm for the problem.
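To make the objective concrete, the following is a minimal Python sketch that evaluates the p-norm of the Hamming distance vector for a candidate string and finds an optimal centroid by exhaustive search over all binary strings of length n. It is not any of the algorithms from the paper; the function names (`hamming`, `p_norm_cost`, `brute_force_centroid`) are hypothetical, and the search takes time exponential in n, so it is only suitable for tiny instances.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def p_norm_cost(candidate: str, strings: list[str], p: float) -> float:
    """p-norm of the vector of Hamming distances from candidate to each string.

    p = 1 gives the Consensus String objective (sum of distances);
    p = float("inf") gives the Closest String objective (maximum distance).
    """
    dists = [hamming(candidate, s) for s in strings]
    if p == float("inf"):
        return float(max(dists))
    return sum(d ** p for d in dists) ** (1.0 / p)


def brute_force_centroid(strings: list[str], p: float) -> tuple[str, float]:
    """Illustrative exhaustive search over all 2^n binary strings.

    Assumes a non-empty list of equal-length strings over the alphabet {0, 1}.
    """
    n = len(strings[0])
    candidates = ("".join(bits) for bits in product("01", repeat=n))
    best = min(candidates, key=lambda c: p_norm_cost(c, strings, p))
    return best, p_norm_cost(best, strings, p)
```

For a decision instance with bound k, one would compare the returned cost against k. With p = 1 the optimum coincides with a column-wise majority vote, which is why that special case is solvable in polynomial time.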
Keywords
Strings, Clustering, Multiwinner Election, Hamming Distance