Distant Meta-Path Similarities for Text-Based Heterogeneous Information Networks
Measuring network similarity is a fundamental data mining problem. Mainstream similarity measures mainly leverage structural information about the entities in the network without considering the network semantics. In the real world, heterogeneous information networks (HINs) with rich semantics are ubiquitous. However, existing network similarity measures don't generalize well to HINs because they fail to capture the HIN semantics. The meta-path has been proposed and demonstrated as a suitable way to represent semantic...
CIKM, pp. 1629-1638, 2017.