
node2vec: Scalable Feature Learning for Networks

KDD, (2016): 855-864

Cited by 4703 | Views 649
Indexed in: EI, WOS

Abstract

Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks…

Introduction
  • Many important tasks in network analysis involve predictions over nodes and edges.
  • In a typical node classification task, the authors are interested in predicting the most probable labels of nodes in a network [33].
  • In a social network, the authors might be interested in predicting interests of users, or in a protein-protein interaction network the authors might be interested in predicting functional labels of proteins [25, 37].
  • In link prediction, the authors wish to predict whether a pair of nodes in a network should have an edge connecting them.
Highlights
  • Many important tasks in network analysis involve predictions over nodes and edges
  • We extend node2vec and other feature learning methods based on neighborhood preserving objectives, from nodes to pairs of nodes for edge-based prediction tasks
  • Since our random walks are naturally based on the connectivity structure between nodes in the underlying network, we extend them to pairs of nodes using a bootstrapping approach over the feature representations of the individual nodes
  • Since none of the feature learning algorithms has previously been used for link prediction, we evaluate node2vec against some popular heuristic scores that achieve good performance in link prediction
  • We studied feature learning in networks as a search-based optimization problem
  • Depth-first Sampling can freely explore network neighborhoods which is important in discovering homophilous communities at the cost of high variance
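The search bias in the highlights above comes from node2vec's second-order random walk, which weights each candidate step by a return parameter p and an in-out parameter q (the unnormalized weights 1/p, 1, and 1/q correspond to the three possible distances from the previously visited node, as in the paper). The sketch below is illustrative, not the paper's reference implementation; the dict-of-sets graph encoding and function names are assumptions:

```python
import random

def step_weight(graph, t, v, x, p=1.0, q=1.0):
    """Unnormalized weight for stepping from v to x, given the walk
    previously visited t (node2vec's second-order bias)."""
    if x == t:             # returning to the previous node (distance 0)
        return 1.0 / p
    if t in graph[x]:      # x is also a neighbor of t (distance 1)
        return 1.0
    return 1.0 / q         # moving further away from t (distance 2)

def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=random):
    """Simulate one biased walk; graph maps node -> set of neighbors."""
    walk = [start]
    # the first step is uniform: there is no previous node yet
    walk.append(rng.choice(sorted(graph[start])))
    while len(walk) < length:
        t, v = walk[-2], walk[-1]
        nbrs = sorted(graph[v])
        weights = [step_weight(graph, t, v, x, p, q) for x in nbrs]
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk
```

Setting q > 1 keeps the walk near the previous node (BFS-like, local views), while q < 1 pushes it outward (DFS-like exploration); a large p discourages immediately revisiting a node.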
Methods
  • The objective in Eq 2 is independent of any downstream task, and the flexibility in exploration offered by node2vec lends the learned feature representations to a wide variety of network analysis settings discussed below.

    4.1 Case Study: Les Misérables network

    In Section 3.1 the authors observed that BFS and DFS strategies represent extreme ends on the spectrum of embedding nodes based on the principles of homophily and structural equivalence.
  • The network has 77 nodes and 254 edges.
  • The authors set d = 16 and run node2vec to learn a feature representation for every node in the network.
  • The authors visualize the original network in two dimensions, with nodes assigned colors based on their clusters.
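The objective referenced above (Eq 2) maximizes, over all nodes, the log-probability of observing each node's sampled neighborhood given its embedding, with the conditional modeled as a softmax over embedding dot products. A toy evaluation of that likelihood (the embeddings and helper names are made up for illustration; the actual optimization uses negative sampling and stochastic gradient descent rather than the full softmax):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_prob(emb, u, n):
    """log Pr(n | u): softmax over all nodes of exp(f(u) . f(v))."""
    scores = {v: math.exp(dot(emb[u], emb[v])) for v in emb}
    return math.log(scores[n] / sum(scores.values()))

def objective(emb, neighborhoods):
    """Sum over nodes u of the log-likelihood of u's sampled neighborhood."""
    return sum(log_prob(emb, u, n)
               for u, ns in neighborhoods.items() for n in ns)
```

Maximizing this objective pushes embeddings of co-occurring walk nodes together, which is what makes the learned features task-independent.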
Results
  • The node feature representations are input to a one-vs-rest logistic regression classifier with L2 regularization.
  • A general observation the authors can draw from the results is that the learned feature representations for node pairs significantly outperform the heuristic benchmark scores, with node2vec achieving the best AUC improvement of 12.6% on the arXiv dataset over the best-performing baseline (Adamic-Adar [1]).
  • Amongst the feature learning algorithms, node2vec outperforms both DeepWalk and LINE in all networks, with gains of up to 3.8% and 6.5% respectively in the AUC scores for the best possible choices of the binary operator for each algorithm.
  • The Hadamard operator when used with node2vec is highly stable and gives the best performance on average across all networks
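The binary operators compared in these results (defined in Table 1) combine two node embeddings componentwise into a single edge representation. A minimal sketch, assuming embeddings are plain Python lists; the function name is hypothetical:

```python
def edge_features(fu, fv, op="hadamard"):
    """Componentwise binary operators from Table 1 for g(u, v)."""
    ops = {
        "average":     lambda a, b: (a + b) / 2.0,   # (f_i(u) + f_i(v)) / 2
        "hadamard":    lambda a, b: a * b,           # f_i(u) * f_i(v)
        "weighted_l1": lambda a, b: abs(a - b),      # |f_i(u) - f_i(v)|
        "weighted_l2": lambda a, b: (a - b) ** 2,    # (f_i(u) - f_i(v))^2
    }
    f = ops[op]
    return [f(a, b) for a, b in zip(fu, fv)]
```

The resulting edge vectors are what feed the link-prediction classifier; per the results above, Hadamard is the most stable choice on average.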
Conclusion
  • The authors studied feature learning in networks as a search-based optimization problem.
  • This perspective gives them multiple advantages.
  • The authors observed that BFS can explore only limited neighborhoods.
  • This makes BFS suitable for characterizing structural equivalences in networks that rely on the immediate local structure of nodes.
  • DFS can freely explore network neighborhoods which is important in discovering homophilous communities at the cost of high variance
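The BFS/DFS contrast in the conclusion can be made concrete: sampling k neighborhood nodes breadth-first stays close to the source, while depth-first sampling follows a single path outward. An illustrative sketch under the same dict-of-sets graph assumption (helper names are hypothetical; ties are broken by sorting for determinism):

```python
from collections import deque

def bfs_sample(graph, source, k):
    """Breadth-first: the k nodes nearest the source (local, low-variance view)."""
    seen, order, queue = {source}, [], deque([source])
    while queue and len(order) < k:
        v = queue.popleft()
        for x in sorted(graph[v]):
            if x not in seen:
                seen.add(x)
                order.append(x)
                queue.append(x)
    return order[:k]

def dfs_sample(graph, source, k):
    """Depth-first: follow a path outward, reaching nodes far from the source."""
    seen, order, stack = set(), [], [source]
    while stack and len(order) < k:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        if v != source:
            order.append(v)
        for x in sorted(graph[v], reverse=True):
            if x not in seen:
                stack.append(x)
    return order
```

On a path-like graph the two strategies diverge immediately: BFS collects all distance-1 nodes first, while DFS reaches distance-3 nodes with the same budget, which is the trade-off between structural equivalence and homophily discussed above.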
Tables
  • Table1: Choice of binary operators ◦ for learning edge features. The definitions correspond to the ith component of g(u, v)
  • Table2: Macro-F1 scores for multilabel classification on BlogCatalog, PPI (Homo sapiens) and Wikipedia word cooccurrence networks with 50% of the nodes labeled for training
  • Table3: Link prediction heuristic scores for node pair (u, v) with immediate neighbor sets N (u) and N (v) respectively
  • Table4: Area Under Curve (AUC) scores for link prediction. Comparison with popular baselines and embedding-based methods bootstrapped using binary operators: (a) Average, (b) Hadamard, (c) Weighted-L1, and (d) Weighted-L2 (See Table 1 for definitions)
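The heuristic baselines of Table 3 can be computed directly from the immediate neighbor sets N(u) and N(v). A sketch, again assuming a dict-of-sets adjacency representation (a common neighbor of u and v always has degree at least 2, so the Adamic-Adar log term cannot be zero):

```python
import math

def heuristic_scores(graph, u, v):
    """Link prediction heuristic scores (Table 3) for node pair (u, v)."""
    nu, nv = set(graph[u]), set(graph[v])
    common = nu & nv
    return {
        "common_neighbors": len(common),                       # |N(u) ∩ N(v)|
        "jaccard": len(common) / len(nu | nv) if nu | nv else 0.0,
        "adamic_adar": sum(1.0 / math.log(len(graph[t]))       # Σ 1/log|N(t)|
                           for t in common),
        "pref_attachment": len(nu) * len(nv),                  # |N(u)| · |N(v)|
    }
```

These are the scores node2vec's learned edge features are benchmarked against in Table 4.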
Related Work
  • Feature engineering has been extensively studied by the machine learning community under various headings. In networks, the conventional paradigm for generating features for nodes is based on feature extraction techniques which typically involve some seed hand-crafted features based on network properties [8, 11]. In contrast, our goal is to automate the whole process by casting feature extraction as a representation learning problem in which case we do not require any hand-engineered features.

    Unsupervised feature learning approaches typically exploit the spectral properties of various matrix representations of graphs, especially the Laplacian and the adjacency matrices. Under this linear algebra perspective, these methods can be viewed as dimensionality reduction techniques. Several linear (e.g., PCA) and non-linear (e.g., IsoMap) dimensionality reduction techniques have been proposed [3, 27, 30, 35]. These methods suffer from both computational and statistical performance drawbacks. In terms of computational efficiency, eigendecomposition of a data matrix is expensive unless the solution quality is significantly compromised with approximations, and hence, these methods are hard to scale to large networks. Secondly, these methods optimize for objectives that are not robust to the diverse patterns observed in networks (such as homophily and structural equivalence) and make assumptions about the relationship between the underlying network structure and the prediction task. For instance, spectral clustering makes a strong homophily assumption that graph cuts will be useful for classification [29]. Such assumptions are reasonable in many scenarios, but unsatisfactory in effectively generalizing across diverse networks.
Funding
  • This research has been supported in part by NSF CNS-1010921, IIS-1149837, NIH BD2K, ARO MURI, DARPA XDATA, DARPA SIMPLEX, Stanford Data Science Initiative, Boeing, Lightspeed, SAP, and Volkswagen
References
  • [1] L. A. Adamic and E. Adar. Friends and neighbors on the web. Social Networks, 25(3):211–230, 2003.
  • [2] L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. In WSDM, 2011.
  • [3] M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In NIPS, 2001.
  • [4] Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE TPAMI, 35(8):1798–1828, 2013.
  • [5] B.-J. Breitkreutz, C. Stark, T. Reguly, L. Boucher, A. Breitkreutz, M. Livstone, R. Oughtred, D. H. Lackner, J. Bähler, V. Wood, et al. The BioGRID interaction database. Nucleic Acids Research, 36:D637–D640, 2008.
  • [6] S. Cao, W. Lu, and Q. Xu. GraRep: Learning graph representations with global structural information. In CIKM, 2015.
  • [7] S. Fortunato. Community detection in graphs. Physics Reports, 486(3-5):75–174, 2010.
  • [8] B. Gallagher and T. Eliassi-Rad. Leveraging label-independent features for classification in sparsely labeled networks: An empirical study. In Lecture Notes in Computer Science: Advances in Social Network Mining and Analysis. Springer, 2009.
  • [9] Z. S. Harris. Distributional structure. Word, 10(2-3):146–162, 1954.
  • [10] K. Henderson, B. Gallagher, T. Eliassi-Rad, H. Tong, S. Basu, L. Akoglu, D. Koutra, C. Faloutsos, and L. Li. RolX: structural role extraction & mining in large graphs. In KDD, 2012.
  • [11] K. Henderson, B. Gallagher, L. Li, L. Akoglu, T. Eliassi-Rad, H. Tong, and C. Faloutsos. It's who you know: graph mining using recursive structural features. In KDD, 2011.
  • [12] P. D. Hoff, A. E. Raftery, and M. S. Handcock. Latent space approaches to social network analysis. J. of the American Statistical Association, 2002.
  • [13] D. E. Knuth. The Stanford GraphBase: a platform for combinatorial computing, volume 37. Addison-Wesley Reading, 1993.
  • [14] J. Leskovec and A. Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.
  • [15] K. Li, J. Gao, S. Guo, N. Du, X. Li, and A. Zhang. LRBM: A restricted Boltzmann machine based approach for representation learning on linked data. In ICDM, 2014.
  • [16] X. Li, N. Du, H. Li, K. Li, J. Gao, and A. Zhang. A deep learning approach to link prediction in dynamic networks. In ICDM, 2014.
  • [17] Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel. Gated graph sequence neural networks. In ICLR, 2016.
  • [18] D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. J. of the American Society for Information Science and Technology, 58(7):1019–1031, 2007.
  • [19] A. Liberzon, A. Subramanian, R. Pinchback, H. Thorvaldsdóttir, P. Tamayo, and J. P. Mesirov. Molecular signatures database (MSigDB) 3.0. Bioinformatics, 27(12):1739–1740, 2011.
  • [20] M. Mahoney. Large text compression benchmark. www.mattmahoney.net/dc/textdata, 2011.
  • [21] T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. In ICLR, 2013.
  • [22] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013.
  • [23] J. Pennington, R. Socher, and C. D. Manning. GloVe: Global vectors for word representation. In EMNLP, 2014.
  • [24] B. Perozzi, R. Al-Rfou, and S. Skiena. DeepWalk: Online learning of social representations. In KDD, 2014.
  • [25] P. Radivojac, W. T. Clark, T. R. Oron, A. M. Schnoes, T. Wittkop, A. Sokolov, K. Graim, C. Funk, K. Verspoor, et al. A large-scale evaluation of computational protein function prediction. Nature Methods, 10(3):221–227, 2013.
  • [26] B. Recht, C. Re, S. Wright, and F. Niu. Hogwild!: A lock-free approach to parallelizing stochastic gradient descent. In NIPS, 2011.
  • [27] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.
  • [28] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei. LINE: Large-scale information network embedding. In WWW, 2015.
  • [29] L. Tang and H. Liu. Leveraging social media networks for classification. Data Mining and Knowledge Discovery, 23(3):447–478, 2011.
  • [30] J. B. Tenenbaum, V. De Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319–2323, 2000.
  • [31] F. Tian, B. Gao, Q. Cui, E. Chen, and T.-Y. Liu. Learning deep representations for graph clustering. In AAAI, 2014.
  • [32] K. Toutanova, D. Klein, C. D. Manning, and Y. Singer. Feature-rich part-of-speech tagging with a cyclic dependency network. In NAACL, 2003.
  • [33] G. Tsoumakas and I. Katakis. Multi-label classification: An overview. Dept. of Informatics, Aristotle University of Thessaloniki, Greece, 2006.
  • [34] A. Vazquez, A. Flammini, A. Maritan, and A. Vespignani. Global protein function prediction from protein-protein interaction networks. Nature Biotechnology, 21(6):697–700, 2003.
  • [35] S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang, and S. Lin. Graph embedding and extensions: a general framework for dimensionality reduction. IEEE TPAMI, 29(1):40–51, 2007.
  • [36] J. Yang and J. Leskovec. Overlapping communities explain core-periphery organization of networks. Proceedings of the IEEE, 102(12):1892–1902, 2014.
  • [37] S.-H. Yang, B. Long, A. Smola, N. Sadagopan, Z. Zheng, and H. Zha. Like like alike: joint friendship and interest propagation in social networks. In WWW, 2011.
  • [38] R. Zafarani and H. Liu. Social computing data repository at ASU, 2009.
  • [39] S. Zhai and Z. Zhang. Dropout training of matrix factorization and autoencoder for link prediction in sparse graphs. In SDM, 2015.