# Correlation Clustering with Asymmetric Classification Errors

ICML 2020, pp. 4641-4650

Abstract

In the Correlation Clustering problem, we are given a weighted graph $G$ with its edges labeled as "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of "similar" edges across clusters and "dissimilar" edges within clusters. We study ...
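The disagreement objective described in the abstract can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the edge dictionary format, the `"+"`/`"-"` labels, and the name `disagreement_weight` are hypothetical and do not come from the paper.

```python
from typing import Dict, Hashable, Tuple

# Hypothetical input format (not from the paper): each edge maps to (label, weight),
# where label is "+" for "similar" and "-" for "dissimilar".
Edge = Tuple[Hashable, Hashable]


def disagreement_weight(
    edges: Dict[Edge, Tuple[str, float]],
    cluster_of: Dict[Hashable, int],
) -> float:
    """Total weight of edges that disagree with the clustering `cluster_of`."""
    total = 0.0
    for (u, v), (label, weight) in edges.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if label == "+" and not same_cluster:
            total += weight  # "similar" edge cut across clusters
        elif label == "-" and same_cluster:
            total += weight  # "dissimilar" edge kept inside one cluster
    return total


# Toy instance: two "similar" edges and one heavy "dissimilar" edge.
edges = {
    ("a", "b"): ("+", 1.0),
    ("b", "c"): ("+", 0.5),
    ("a", "c"): ("-", 2.0),
}
split = {"a": 0, "b": 0, "c": 1}   # separates c, cutting the b-c edge
merged = {"a": 0, "b": 0, "c": 0}  # one cluster, violating the a-c edge
print(disagreement_weight(edges, split))
print(disagreement_weight(edges, merged))
```

On this toy instance, separating `c` costs less than merging everything, because the only violated edge is the light "similar" edge between `b` and `c`.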

Introduction

- In the Correlation Clustering problem, the input is a set of objects with pairwise similarity information.
- The authors study the Correlation Clustering problem on complete graphs with edge weights.
- The assumptions made by the Correlation Clustering on Complete Graphs model are too strong, since real-world instances rarely have equal edge weights.

Highlights

- In the Correlation Clustering problem, we are given a set of objects with pairwise similarity information.
- The pairwise information is represented as a weighted graph $G$ whose edges are labelled "positive/similar" or "negative/dissimilar" by a noisy binary classifier.
- The goal is to find a clustering $C$ that minimizes the weight of edges disagreeing with it: a positive edge disagrees with $C$ if its endpoints belong to distinct clusters, and a negative edge disagrees with $C$ if its endpoints belong to the same cluster.
- Charikar, Guruswami, and Wirth (2003) and Demaine, Emanuel, Fiat, and Immorlica (2006) gave an $O(\log n)$-approximation algorithm. They also showed that Correlation Clustering on general weighted graphs is as hard as the Multicut problem, so $O(\log n)$ is likely to be the best possible approximation for this problem.
- We study the Correlation Clustering problem on complete graphs with edge weights.
- For every pair of vertices $u$ and $v$, the integer program (IP) has a variable $x_{uv} \in \{0, 1\}$ that indicates whether $u$ and $v$ belong to the same cluster. We present an approximation algorithm for Correlation Clustering with Asymmetric Classification Errors.
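For orientation, the integer program mentioned above can be written in the standard form used for correlation clustering. The sketch below assumes the convention $x_{uv} = 0$ when $u$ and $v$ share a cluster and $x_{uv} = 1$ otherwise, which matches the LP cost $\alpha(1 - x_{uv})$ quoted for negative edges elsewhere in this summary. The weight symbol $w_{uv}$ is introduced here for illustration, and $E^+$, $E^-$ denote the positive and negative edge sets.

```latex
\begin{align*}
\min \quad & \sum_{uv \in E^+} w_{uv}\, x_{uv} \;+\; \sum_{uv \in E^-} w_{uv}\,(1 - x_{uv}) \\
\text{s.t.} \quad & x_{uw} \le x_{uv} + x_{vw} && \text{for all vertices } u, v, w, \\
& x_{uv} = x_{vu},\quad x_{uu} = 0 && \text{for all vertices } u, v, \\
& x_{uv} \in \{0, 1\} && \text{for all vertices } u, v.
\end{align*}
```

Relaxing the last constraint to $x_{uv} \in [0, 1]$ gives the natural linear programming relaxation, in which the values $x_{uv}$ behave like lengths satisfying the triangle inequality.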

Results

- The paper gives examples showing how the Correlation Clustering with Asymmetric Classification Errors model can help capture real-world instances.
- If the state-of-the-art algorithm for Correlation Clustering on Complete Graphs were applied to an instance of Correlation Clustering with Asymmetric Classification Errors, it would give a $\Theta(\max(w^+/w^-, w^-/w^+))$ approximation to the MinDisagree objective.
- The authors present an approximation algorithm for Correlation Clustering with Asymmetric Classification Errors.
- There exists a polynomial-time $A = 3 + 2\ln(1/\alpha)$ approximation algorithm for Correlation Clustering with Asymmetric Classification Errors.
- There exists a polynomial-time $A = 6 + 2\ln(1/\alpha)$ approximation algorithm for Correlation Clustering with Asymmetric Classification Errors on complete bipartite graphs.
- The authors show a similar integrality gap result for Correlation Clustering with Asymmetric Classification Errors on complete bipartite graphs.
- The natural Linear Programming relaxation for Correlation Clustering has an integrality gap of $\Omega(\log 1/\alpha)$ for instances of Correlation Clustering with Asymmetric Classification Errors on complete bipartite graphs.
- The paper defines a log-likelihood function for a clustering $C$. Throughout the paper, the authors denote the set of positive edges by $E^+$ and the set of negative edges by $E^-$.
- For every pair of vertices $u$ and $v$, the integer program (IP) has a variable $x_{uv} \in \{0, 1\}$ that indicates whether $u$ and $v$ belong to the same cluster.
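To get a feel for how the stated guarantees grow as $\alpha$ shrinks, the two formulas above can be evaluated directly. This is a minimal sketch: the function name `approx_factor` is hypothetical, the constants 3 and 6 are taken from the bullets above, and the restriction of $\alpha$ to $(0, 1]$ is an assumption made for the illustration.

```python
import math


def approx_factor(alpha: float, bipartite: bool = False) -> float:
    """Evaluate A = 3 + 2 ln(1/alpha), or 6 + 2 ln(1/alpha) for complete bipartite graphs.

    The constants come from the summary above; alpha in (0, 1] is an assumption.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in (0, 1]")
    base = 6.0 if bipartite else 3.0
    return base + 2.0 * math.log(1.0 / alpha)


for alpha in (1.0, 0.5, 0.1, 0.01):
    complete = approx_factor(alpha)
    bip = approx_factor(alpha, bipartite=True)
    print(f"alpha={alpha:<5} complete: {complete:.3f}  bipartite: {bip:.3f}")
```

The factor depends only logarithmically on $1/\alpha$, in contrast to the $\Theta(\max(w^+/w^-, w^-/w^+))$ guarantee quoted above for the complete-graph algorithm applied to such instances.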

Conclusion

- Assign arbitrary lengths $x_{uv}$, $x_{vw}$, and $x_{uw}$ satisfying the triangle inequality to the edges $uv$, $vw$, and $uw$, and run one iteration of the algorithm on the triangle $uvw$.
- Theorem 1.1 has dependence $A = 3 + 2\ln(1/\alpha)$ because (i) $f(x)$ must be equal to $C - e^{-Ax}$ or a slower-growing function so that Claim 3.4 holds; Theorem 3.1 requires that $f(0) = 0$; and the authors will need below that $1 - f$ ...
- Observe that the LP cost of a negative edge $(u, v)$, which equals $\alpha(1 - x_{uv})$, is positive if and only if $d(u, v) < \frac{1}{2}\log_3 n$.
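As a small worked step, consider the boundary case in which $f$ has exactly the form $C - e^{-Ax}$ (an assumption made here; the text also allows slower-growing functions). The requirement $f(0) = 0$ then fixes the constant $C$:

```latex
\begin{align*}
f(0) = C - e^{-A \cdot 0} = C - 1 = 0
\;\Longrightarrow\; C = 1,
\qquad\text{so}\qquad f(x) = 1 - e^{-Ax}.
\end{align*}
```

In this case $1 - f(x) = e^{-Ax}$, which decays exponentially in $x$ at rate $A$.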

- Table 1: Approximation factors $A_{thm}$ and $A_{opt}$ for different values of $\alpha$

Funding

- Jafar Jafarov and Yury Makarychev were supported in part by NSF CCF-1718820 and NSF TRIPODS CCF-1934843
- Sanchit Kalhan and Konstantin Makarychev were supported in part by NSF TRIPODS CCF-1934931

References

- Ailon, N., Charikar, M., and Newman, A. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008.
- Ailon, N., Chen, Y., and Xu, H. Breaking the small cluster barrier of graph clustering. In International Conference on Machine Learning, pp. 995–1003, 2013.
- Bansal, N., Blum, A., and Chawla, S. Correlation clustering. Machine Learning, 56(1-3):89–113, 2004.
- Boldi, P. and Vigna, S. The WebGraph framework I: Compression techniques. In Proc. of the Thirteenth International World Wide Web Conference, pp. 595–601, 2004.
- Boldi, P., Codenotti, B., Santini, M., and Vigna, S. Ubicrawler: A scalable fully distributed web crawler. Software: Practice & Experience, 34(8):711–726, 2004.
- Boldi, P., Rosa, M., Santini, M., and Vigna, S. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In Proceedings of the International Conference on World Wide Web, pp. 587–596, 2011.
- Boldi, P., Marino, A., Santini, M., and Vigna, S. BUbiNG: Massive crawling for the masses. In Proceedings of the Companion Publication of the International Conference on World Wide Web, pp. 227–228, 2014.
- Charikar, M., Guruswami, V., and Wirth, A. Clustering with qualitative information. In IEEE Symposium on Foundations of Computer Science, 2003.
- Chawla, S., Makarychev, K., Schramm, T., and Yaroslavtsev, G. Near optimal LP rounding algorithm for correlation clustering on complete and complete k-partite graphs. In Proceedings of the Symposium on Theory of Computing, pp. 219–228, 2015.
- Demaine, E. D., Emanuel, D., Fiat, A., and Immorlica, N. Correlation clustering in general weighted graphs. Theoretical Computer Science, 361(2-3):172–187, 2006.
- Elsner, M. and Schudy, W. Bounding and comparing methods for correlation clustering beyond ILP. In Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing, pp. 19–27. Association for Computational Linguistics, 2009.
- Garg, N., Vazirani, V. V., and Yannakakis, M. Approximate max-flow min-(multi) cut theorems and their applications. SIAM Journal on Computing, 25(2):235–251, 1996.
- Makarychev, K., Makarychev, Y., and Vijayaraghavan, A. Correlation clustering with noisy partial information. In Conference on Learning Theory, pp. 1321–1342, 2015.
- Mathieu, C. and Schudy, W. Correlation clustering with noisy input. In Proceedings of the Symposium on Discrete Algorithms, pp. 712–728, 2010.
- Pan, X., Papailiopoulos, D., Oymak, S., Recht, B., Ramchandran, K., and Jordan, M. I. Parallel correlation clustering on big graphs. In Advances in Neural Information Processing Systems, pp. 82–90, 2015.
- Tang, S., Andres, B., Andriluka, M., and Schiele, B. Multiperson tracking by multicut and deep matching. In European Conference on Computer Vision, pp. 100–111, 2016.
- Tang, S., Andriluka, M., Andres, B., and Schiele, B. Multiple people tracking by lifted multicut and person reidentification. In Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 3539–3548, 2017.
