Learning latent representations of nodes for classifying in heterogeneous social networks

WSDM, pp. 373–382 (2014)

Citations: 111 | Views: 149

Abstract

Social networks are heterogeneous systems composed of different types of nodes (e.g. users, content, groups, etc.) and relations (e.g. social or similarity relations). While learning and performing inference on homogeneous networks have motivated a large amount of research, little work exists on heterogeneous networks and there are open and ...

Introduction
  • Social Media on the Web are most often complex heterogeneous networks with nodes and relations between nodes of different types, corresponding to different objects, concepts and relationships.
  • Many existing approaches rely on mapping a heterogeneous network onto a homogeneous network so that classical relational techniques can be used [8, 10, 14, 2].
  • They do not fully exploit the correlations between the different node labels or characteristics.
  • The authors consider a transductive context where the network is composed of $\ell < N$ labeled nodes $(x_1, \dots, x_\ell)$ with $\forall i \in \{1, \dots, \ell\}$, $y_i \in \mathbb{R}^{C_{t_i}}$, and $N - \ell$ unlabeled nodes.
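The transductive setting above can be made concrete with a small sketch. It reads $C_{t_i}$ as the number of categories attached to the type $t_i$ of node $i$, which is an interpretation of the notation rather than something stated on this page. The node types, category counts, and observed categories below are invented toy values, not data from the paper.

```python
import numpy as np

# Hypothetical node types and the number of categories C_t for each type.
num_categories = {"user": 2, "photo": 3}

# Type t_i of each of the N nodes (toy values).
node_types = ["user", "photo", "photo", "user", "photo"]
num_nodes = len(node_types)   # N
num_labeled = 3               # the first `num_labeled` nodes carry labels


def one_hot(category, size):
    """Encode a category index as a label vector of the given size."""
    vec = np.zeros(size)
    vec[category] = 1.0
    return vec


# Observed category index for each labeled node (toy values).
observed_categories = [1, 0, 2]

# Each label vector y_i has as many entries as its node type has categories.
labels = [
    one_hot(category, num_categories[node_types[i]])
    for i, category in enumerate(observed_categories)
]

# The remaining N - num_labeled nodes are the ones to classify.
unlabeled_nodes = list(range(num_labeled, num_nodes))
```

The point of the sketch is only that label vectors have different lengths for different node types, so a single shared label matrix does not fit the problem.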
Highlights
  • Social Media on the Web are most often complex heterogeneous networks with nodes and relations between nodes of different types, corresponding to different objects, concepts and relationships
  • We consider the task of node classification in heterogeneous networks composed of different types of nodes, each node type being associated with its own set of labels
  • Much work has been devoted to classification for homogeneous networks composed of a single node type, e.g. [3, 17, 19, 27] to cite a few
  • We have proposed a new model able to label nodes of heterogeneous networks where the nodes are of different types, each type corresponding to a particular set of possible categories
  • Our experiments on three datasets show that the proposed model outperforms classical approaches, and qualitative analysis shows that the proposed method effectively captures the inter-dependencies between the labels of different types of nodes
  • Different extensions of this model are currently being investigated: the first one is to deal with multi-relational heterogeneous networks – i.e. heterogeneous networks where two nodes can be connected with more than one relation – and dynamic heterogeneous networks
Conclusion
  • The authors have proposed a new model able to label nodes of heterogeneous networks where the nodes are of different types, each type corresponding to a particular set of possible categories.
  • The authors' experiments on three datasets show that the proposed model outperforms classical approaches, and qualitative analysis shows that the proposed method effectively captures the inter-dependencies between the labels of different types of nodes.
  • Different extensions of this model are currently being investigated: the first one is to deal with multi-relational heterogeneous networks – i.e. heterogeneous networks where two nodes can be connected with more than one relation – and dynamic heterogeneous networks.
  • The authors are working on an extension that allows one to learn sparse representations of the nodes in the graph.
Tables
  • Table 1: Statistics on the three datasets
  • Table 2: Accuracy over the DBLP datasets with a latent space of size 30
  • Table 3: P@1 over the Flickr datasets with a latent space of size 200. The Mapping to Homogeneous Model (MTH) uses multiple homogeneous graphs, one for each node type, to represent the heterogeneous network. For each homogeneous problem, we use the model proposed in [6] that minimizes the loss in Equation 1. It does not make use of the correlations between the labels of different node types.
  • Table 4: (a) Accuracy on DBLP No Content depending on the representation size Z
  • Table 5: P@1 and P@k on Flickr depending on the representation size Z w.r.t. MTH and ULS
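The Flickr tables report P@1 and P@k. As a reminder of what such a metric measures, here is a minimal sketch of the usual precision-at-k definition: the fraction of the k highest-scored labels that are correct. The page does not say how k is chosen or how ties are handled in the paper, so the function name `precision_at_k`, the tie-breaking rule, and the toy scores are all assumptions.

```python
import numpy as np


def precision_at_k(scores, relevant, k):
    """Fraction of the k highest-scored label indices that are in `relevant`.

    scores:   one score per candidate label (higher means more confident)
    relevant: set of indices of the true labels
    k:        number of top-ranked labels to inspect
    """
    scores = np.asarray(scores, dtype=float)
    # A stable sort on negated scores ranks labels from highest to lowest score
    # and breaks ties by the original label order (an arbitrary choice).
    top_k = np.argsort(-scores, kind="stable")[:k]
    hits = sum(1 for idx in top_k if int(idx) in relevant)
    return hits / k


# Toy example: four candidate labels, of which labels 1 and 2 are correct.
scores = [0.1, 0.7, 0.2, 0.9]
relevant = {1, 2}
p_at_1 = precision_at_k(scores, relevant, 1)
p_at_2 = precision_at_k(scores, relevant, 2)
```

With these toy values the top-ranked label (index 3) is wrong, so P@1 is 0, while one of the top two labels is correct, so P@2 is 0.5.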
Related Work
  • Graph node classification has motivated a lot of work during the last decade, and different models have been proposed. Two main families of models can be distinguished.
  • (i) Collective classification techniques are extensions of inductive learning to relational data. They consider node attributes, labels, and their dependencies. The classification problem is formulated as an optimal assignment of labels to the vertices of a graph. Since exact algorithms cannot be used in general for this combinatorial problem, approximate iterative algorithms have been developed. Sen et al. [19] provide a general introduction and a comparison of these models. They distinguish between local and global models. The former, such as Iterative Classification [19] and its variants like SICA [18], Gibbs Sampling [17], or Stacked Learning [15], make use of local classifiers that take as input the node attributes and statistics on the neighbors' labels. The latter attempt to optimize a global function using graphical models, e.g. Markov Random Fields trained, for example, using loopy belief propagation. In practice, Sen et al. advocate the use of simple local models, which offer performance similar to that of more complex graphical models and do not suffer from the convergence problems exhibited by the latter. All these methods have been proposed for homogeneous networks, whereas heterogeneous classification was explicitly mentioned as an open problem.
  • (ii) The second family of models consists of semi-supervised and transductive regularized models ([28], [3] and [27]), which are based on the minimization of an objective function that encourages connected nodes to have the same labels. This family of models was initially proposed for pure relational classification (no content associated with nodes) and for homogeneous networks. It has been used extensively in many different contexts.
  • Extensions for handling node content information have been developed, with applications to social network labeling [6] and Web-spam detection [1]. Several extensions have been proposed for dealing with more complex networks. For example, algorithms have been developed for multi-relational graphs, where nodes are all of the same type but can be connected through multiple relations that have different influences on the label propagation. These methods can learn the weights of the different relations from the data, and the multi-graph is then reduced to a simple graph by combining the different types of relations [25, 12, 9]. Note that besides node classification, methods have been proposed for other tasks such as link prediction [7].
  • Mining heterogeneous networks is a much more recent domain, and different tasks have been addressed: classification of nodes [11, 10, 8, 2], link prediction [5], [24], influence analysis [16], clustering [20], and entity similarity search [26], [21]. We will concentrate here on classification, which has been addressed in a few papers.
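The second family above minimizes an objective that pushes connected nodes toward the same labels. One well-known instance is the closed-form label spreading of Zhou et al. [27], which computes $F = (I - \alpha S)^{-1} Y$ with $S = D^{-1/2} W D^{-1/2}$. The sketch below illustrates that homogeneous-network technique only; it is not the model proposed in this paper, and the function name `propagate_labels`, the value of `alpha`, and the toy chain graph are assumptions.

```python
import numpy as np


def propagate_labels(W, Y, alpha=0.9):
    """Closed-form label spreading: F = (I - alpha * S)^{-1} Y.

    W:     symmetric (n, n) adjacency matrix with non-negative weights
    Y:     (n, c) initial labels, one-hot rows for labeled nodes, zeros otherwise
    alpha: trade-off in (0, 1) between graph smoothness and the initial labels
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}; isolated nodes get zero rows.
    inv_sqrt = np.zeros_like(degrees)
    connected = degrees > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degrees[connected])
    S = inv_sqrt[:, None] * W * inv_sqrt[None, :]
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(len(W)) - alpha * S, np.asarray(Y, dtype=float))


# Toy chain 0 - 1 - 2 - 3: node 0 is labeled class 0, node 3 is labeled class 1.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
Y = np.array([[1, 0], [0, 0], [0, 0], [0, 1]], dtype=float)

F = propagate_labels(W, Y)
predicted = F.argmax(axis=1)
```

On this chain, each unlabeled node takes the class of the labeled endpoint it is closer to. Note that every node shares one label set here, which is exactly the homogeneous assumption the paper sets out to relax.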
Funding
  • This work has been partially supported by the REMI FUI project and the ANR (French National Research Agency) MLVIS project.
References
  • [1] Jacob Abernethy, Olivier Chapelle, and Carlos Castillo. Witch: A new approach to web spam detection. In Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web (AIRWeb), 2008.
  • [2] Ralitsa Angelova, Gjergji Kasneci, and Gerhard Weikum. Graffiti: graph-based classification in heterogeneous networks. World Wide Web, 15(2):139–170, 2012.
  • [3] Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res., 7:2399–2434, December 2006.
  • [4] A. Bordes, J. Weston, R. Collobert, and Y. Bengio. Learning structured embeddings of knowledge bases. In AAAI, 2011.
  • [5] Darcy Davis, Ryan Lichtenwalter, and N.V. Chawla. Multi-Relational Link Prediction in Heterogeneous Information Networks. In ASONAM, 2011.
  • [6] Ludovic Denoyer and Patrick Gallinari. A ranking based model for automatic image annotation in a social network. In ICWSM, 2010.
  • [7] Sheng Gao, Ludovic Denoyer, and Patrick Gallinari. Link pattern prediction with tensor decomposition in multi-relational networks. In CIDM, 2011.
  • [8] Taehyun Hwang and Rui Kuang. A heterogeneous label propagation algorithm for disease gene discovery. In SDM, page 12, 2010.
  • [9] Yann Jacob, Ludovic Denoyer, and Patrick Gallinari. Classification and annotation in social corpora using multiple relations. In CIKM, pages 1215–1220, 2011.
  • [10] Ming Ji, Jiawei Han, and Marina Danilevsky. Ranking-based classification of heterogeneous information networks. In KDD, pages 1298–1306. ACM, 2011.
  • [11] Ming Ji, Yizhou Sun, Marina Danilevsky, Jiawei Han, and Jing Gao. Graph regularized transductive classification on heterogeneous information networks. In ECML PKDD, volume 53, pages 570–586, 2010.
  • [12] T. Kato, H. Kashima, and M. Sugiyama. Integration of multiple networks for robust label propagation. In SIAM Conf. on Data Mining, pages 716–726, 2008.
  • [13] Xiangnan Kong, Bokai Cao, and Philip S. Yu. Multi-label classification by mining label and instance correlations from heterogeneous information networks. KDD ’13, pages 614–622, 2013.
  • [14] Xiangnan Kong, Philip S. Yu, Ying Ding, and David J. Wild. Meta path-based collective classification in heterogeneous information networks. In CIKM, pages 1567–1571, 2012.
  • [15] Zhenzhen Kou. Stacked graphical models for efficient inference in markov random fields. In Proc. of the 2007 SIAM International Conf. on Data Mining, 2007.
  • [16] Lu Liu, Jie Tang, Jiawei Han, and Meng Jiang. Mining topic-level influence in heterogeneous networks. In CIKM, 2010.
  • [17] Sofus A. Macskassy and Foster Provost. A simple relational classifier. In Proceedings of the Second Workshop on Multi-Relational Data Mining (MRDM-2003) at KDD-2003, pages 64–76, 2003.
  • [18] Francis Maes, Stephane Peters, Ludovic Denoyer, and Patrick Gallinari. Simulated iterative classification: a new learning procedure for graph labeling. Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases Part II, II:47–62, 2009.
  • [19] Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Gallagher, and Tina Eliassi-Rad. Collective classification in network data. AI Magazine, 29(3):93–106, 2008.
  • [20] Yizhou Sun, Charu C. Aggarwal, and Jiawei Han. Relation strength-aware clustering of heterogeneous information networks with incomplete attributes. PVLDB, 5(5):394–405, 2012.
  • [21] Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, and Tianyi Wu. Pathsim: Meta path-based top-k similarity search in heterogeneous information networks. PVLDB, 4(11):992–1003, 2011.
  • [22] Yizhou Sun, Yintao Yu, and Jiawei Han. Ranking-based clustering of heterogeneous information networks with star network schema. In KDD, pages 797–806, 2009.
  • [23] Lei Tang, Xufei Wang, and Huan Liu. Community detection via heterogeneous interaction analysis. Data Min. Knowl. Discov., 25(1):1–33, 2012.
  • [24] Chi Wang, Rajat Raina, David Fong, Ding Zhou, Jiawei Han, and Greg J. Badros. Learning relevance from heterogeneous social network and its application in online targeting. In SIGIR, pages 655–664, 2011.
  • [25] Meng Wang, Xian-Sheng Hua, Richang Hong, Jinhui Tang, Guo-Jun Qi, and Yan Song. Unified video annotation via multigraph learning. Circuits and Systems for Video Technology, IEEE Transactions on, 19(5):733–746, May 2009.
  • [26] Xiao Yu, Yizhou Sun, Brandon Norick, Tiancheng Mao, and Jiawei Han. User guided entity similarity search using meta-path selection in heterogeneous information networks. CIKM ’12, pages 2025–2029, New York, NY, USA, 2012. ACM.
  • [27] Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, and Bernhard Schölkopf. Learning with local and global consistency. In Sebastian Thrun, Lawrence Saul, and Bernhard Schölkopf, editors, Advances in Neural Inform. Process. Systems 16. 2004.
  • [28] Dengyong Zhou, Jiayuan Huang, and Bernhard Schölkopf. Learning from labeled and unlabeled data on a directed graph. In Proc. of the 22nd intern. conf. on Mach. learn., ICML ’05, pages 1036–1043, 2005.