
Correlation Clustering with Asymmetric Classification Errors

ICML, pp. 4641–4650 (2020)


Abstract

In the Correlation Clustering problem, we are given a weighted graph $G$ with its edges labeled as "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of "similar" edges across clusters and "dissimilar" edges within clusters. We study ...

Introduction
  • In the Correlation Clustering problem, the input is a set of objects with pairwise similarity information.
  • The authors study the Correlation Clustering problem on complete graphs with edge weights.
  • The assumptions made by the Correlation Clustering on Complete Graphs model are too strong, since real-world instances rarely have equal edge weights.
Highlights
  • In the Correlation Clustering problem, we are given a set of objects with pairwise similarity information.
  • The pairwise information is represented as a weighted graph G whose edges are labelled as "positive/similar" or "negative/dissimilar" by a noisy binary classifier.
  • The goal is to find a clustering C that minimizes the weight of edges disagreeing with this clustering: a positive edge is in disagreement with C if its endpoints belong to distinct clusters, and a negative edge is in disagreement with C if its endpoints belong to the same cluster.
  • Charikar, Guruswami, and Wirth (2003) and Demaine, Emanuel, Fiat, and Immorlica (2006) gave an O(log n) approximation algorithm for Correlation Clustering on general weighted graphs. They also showed that the problem is as hard as the Multicut problem, so O(log n) is likely to be the best possible approximation for it.
  • We study the Correlation Clustering problem on complete graphs with edge weights.
  • We present an approximation algorithm for Correlation Clustering with Asymmetric Classification Errors. For every pair of vertices u and v, the integer program (IP) has a variable x_uv ∈ {0, 1}, which indicates whether u and v belong to the same cluster.
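The disagreement objective described above can be illustrated with a minimal sketch. The function name `disagreement_weight`, the dictionary-based input format, and the toy instance are all assumptions made for illustration; none of them come from the paper.

```python
def disagreement_weight(cluster_of, positive_edges, negative_edges):
    """Total weight of edges that disagree with a clustering.

    cluster_of:      dict mapping each vertex to a cluster id (hypothetical format)
    positive_edges:  dict mapping (u, v) pairs to the weight of a "similar" edge
    negative_edges:  dict mapping (u, v) pairs to the weight of a "dissimilar" edge
    """
    cost = 0.0
    # A positive edge disagrees when its endpoints are in distinct clusters.
    for (u, v), weight in positive_edges.items():
        if cluster_of[u] != cluster_of[v]:
            cost += weight
    # A negative edge disagrees when its endpoints are in the same cluster.
    for (u, v), weight in negative_edges.items():
        if cluster_of[u] == cluster_of[v]:
            cost += weight
    return cost


# Toy instance on three vertices (illustrative numbers only).
positive = {("a", "b"): 1.0, ("b", "c"): 0.5}
negative = {("a", "c"): 2.0}

one_cluster = {"a": 0, "b": 0, "c": 0}
split_c = {"a": 0, "b": 0, "c": 1}

cost_one_cluster = disagreement_weight(one_cluster, positive, negative)
cost_split_c = disagreement_weight(split_c, positive, negative)
```

On this toy instance, placing all three vertices together pays for the negative edge (a, c), while separating c pays only for the lighter positive edge (b, c).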
Results
  • The following examples show how the Correlation Clustering with Asymmetric Classification Errors model can help in capturing real world instances.
  • If the state-of-the-art algorithm for Correlation Clustering on Complete Graphs were applied to an instance of Correlation Clustering with Asymmetric Classification Errors, it would give a Θ(max(w+/w−, w−/w+)) approximation to the MinDisagree objective.
  • The authors present an approximation algorithm for Correlation Clustering with Asymmetric Classification Errors.
  • There exists a polynomial-time A = 3 + 2 log_e(1/α) approximation algorithm for Correlation Clustering with Asymmetric Classification Errors.
  • There exists a polynomial-time A = 6 + 2 log_e(1/α) approximation algorithm for Correlation Clustering with Asymmetric Classification Errors on complete bipartite graphs.
  • The authors show a similar integrality gap result for Correlation Clustering with Asymmetric Classification Errors on complete bipartite graphs.
  • The natural Linear Programming relaxation for Correlation Clustering has an integrality gap of Ω(log 1/α) for instances of Correlation Clustering with Asymmetric Classification Errors on complete bipartite graphs.
  • The paper defines a log-likelihood function of the clustering C. Throughout the paper, the authors denote the set of positive edges by E+ and the set of negative edges by E−.
  • For every pair of vertices u and v, the integer program (IP) has a variable x_uv ∈ {0, 1}, which indicates whether u and v belong to the same cluster.
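To get a feel for how the two stated guarantees depend on α, the short sketch below evaluates 3 + 2 log_e(1/α) and 6 + 2 log_e(1/α). The function name `approximation_factor` and the restriction 0 < α ≤ 1 are assumptions made here for illustration; the page does not state the admissible range of α.

```python
import math


def approximation_factor(alpha, bipartite=False):
    """Evaluate the stated factor: 3 + 2 ln(1/alpha), or 6 + 2 ln(1/alpha)
    for complete bipartite graphs.

    Assumption (not stated on this page): 0 < alpha <= 1.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in (0, 1]")
    constant = 6.0 if bipartite else 3.0
    return constant + 2.0 * math.log(1.0 / alpha)


# Illustrative values of alpha only; these are not the entries of Table 1.
for alpha in (1.0, 0.5, 0.1, 0.01):
    print(alpha, approximation_factor(alpha), approximation_factor(alpha, bipartite=True))
```

The factor grows only logarithmically as α shrinks, in contrast with the Θ(max(w+/w−, w−/w+)) ratio quoted above for the complete-graph algorithm.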
Conclusion
  • The authors assign arbitrary lengths x_uv, x_vw, and x_uw satisfying the triangle inequality to the edges uv, vw, and uw and run one iteration of the algorithm on the triangle uvw.
  • The authors point out that Theorem 1.1 has dependence A = 3 + 2 log_e(1/α) because (i) f(x) must be equal to C − e^(−Ax) or a slower-growing function so that Claim 3.4 holds, (ii) Theorem 3.1 requires that f(0) = 0, and (iii) the authors will need below that 1 − f ...
  • Observe that the LP cost of a negative edge (u, v), which is equal to α(1 − x_uv), is positive if and only if d(u, v) < (1/2) log_3 n.
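The two ingredients mentioned above, edge lengths that satisfy the triangle inequality and the LP cost α(1 − x_uv) of a negative edge, can be sketched as follows. Both helper names are hypothetical, and the reading that x_uv = 0 corresponds to "same cluster" is an inference from the cost formula, not something this page states.

```python
def satisfies_triangle_inequality(x_uv, x_vw, x_uw, tol=1e-12):
    """Check that three edge lengths on a triangle uvw obey the triangle inequality.

    Hypothetical helper; tol absorbs floating-point rounding.
    """
    return (
        x_uv <= x_vw + x_uw + tol
        and x_vw <= x_uv + x_uw + tol
        and x_uw <= x_uv + x_vw + tol
    )


def negative_edge_lp_cost(x_uv, alpha):
    """LP cost of a negative edge of weight alpha, as quoted: alpha * (1 - x_uv).

    Assumption: x_uv lies in [0, 1], with 0 read as "same cluster".
    """
    return alpha * (1.0 - x_uv)


# Illustrative lengths only.
feasible = satisfies_triangle_inequality(0.5, 0.5, 1.0)
infeasible = satisfies_triangle_inequality(0.1, 0.1, 0.9)
cost_when_separated = negative_edge_lp_cost(1.0, alpha=0.25)
cost_when_together = negative_edge_lp_cost(0.0, alpha=0.25)
```

Under this reading, a negative edge contributes nothing to the LP cost once its endpoints are fully separated (x_uv = 1) and contributes its full weight α when they are merged (x_uv = 0).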
Tables
  • Table 1: Approximation factors A_thm and A_opt for different values of α
Funding
  • Jafar Jafarov and Yury Makarychev were supported in part by NSF CCF-1718820 and NSF TRIPODS CCF-1934843
  • Sanchit Kalhan and Konstantin Makarychev were supported in part by NSF TRIPODS CCF-1934931
References
  • Ailon, N., Charikar, M., and Newman, A. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008.
  • Ailon, N., Chen, Y., and Xu, H. Breaking the small cluster barrier of graph clustering. In International Conference on Machine Learning, pp. 995–1003, 2013.
  • Bansal, N., Blum, A., and Chawla, S. Correlation clustering. Machine Learning, 56(1-3):89–113, 2004.
  • Boldi, P. and Vigna, S. The WebGraph framework I: Compression techniques. In Proc. of the Thirteenth International World Wide Web Conference, pp. 595–601, 2004.
  • Boldi, P., Codenotti, B., Santini, M., and Vigna, S. Ubicrawler: A scalable fully distributed web crawler. Software: Practice & Experience, 34(8):711–726, 2004.
  • Boldi, P., Rosa, M., Santini, M., and Vigna, S. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In Proceedings of the International Conference on World Wide Web, pp. 587–596, 2011.
  • Boldi, P., Marino, A., Santini, M., and Vigna, S. BUbiNG: Massive crawling for the masses. In Proceedings of the Companion Publication of the International Conference on World Wide Web, pp. 227–228, 2014.
  • Charikar, M., Guruswami, V., and Wirth, A. Clustering with qualitative information. In IEEE Symposium on Foundations of Computer Science, 2003.
  • Chawla, S., Makarychev, K., Schramm, T., and Yaroslavtsev, G. Near optimal LP rounding algorithm for correlation clustering on complete and complete k-partite graphs. In Proceedings of the Symposium on Theory of Computing, pp. 219–228, 2015.
  • Demaine, E. D., Emanuel, D., Fiat, A., and Immorlica, N. Correlation clustering in general weighted graphs. Theoretical Computer Science, 361(2-3):172–187, 2006.
  • Elsner, M. and Schudy, W. Bounding and comparing methods for correlation clustering beyond ILP. In Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing, pp. 19–27. Association for Computational Linguistics, 2009.
  • Garg, N., Vazirani, V. V., and Yannakakis, M. Approximate max-flow min-(multi) cut theorems and their applications. SIAM Journal on Computing, 25(2):235–251, 1996.
  • Makarychev, K., Makarychev, Y., and Vijayaraghavan, A. Correlation clustering with noisy partial information. In Conference on Learning Theory, pp. 1321–1342, 2015.
  • Mathieu, C. and Schudy, W. Correlation clustering with noisy input. In Proceedings of the Symposium on Discrete Algorithms, pp. 712–728, 2010.
  • Pan, X., Papailiopoulos, D., Oymak, S., Recht, B., Ramchandran, K., and Jordan, M. I. Parallel correlation clustering on big graphs. In Advances in Neural Information Processing Systems, pp. 82–90, 2015.
  • Tang, S., Andres, B., Andriluka, M., and Schiele, B. Multiperson tracking by multicut and deep matching. In European Conference on Computer Vision, pp. 100–111, 2016.
  • Tang, S., Andriluka, M., Andres, B., and Schiele, B. Multiple people tracking by lifted multicut and person reidentification. In Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 3539–3548, 2017.
Authors
Jafar Jafarov
Sanchit Kalhan