Clustering Embedded Approaches for Efficient Information Network Inference

Data Science and Engineering(2015)

引用 11|浏览45
暂无评分
摘要
Nowadays, the message diffusion links among users or Web sites drive the development of countless innovative applications. However, in reality, it is easier for us to observe the time stamps when different nodes in the network react on a message, while the connections empowering the diffusion of the message remain hidden. This motivates recent extensive studies on the network inference problem : unveiling the edges from the records of messages disseminated through them. Existing solutions are computationally expensive, which motivates us to develop an efficient two-step general framework, Clustering Embedded Network Inference (CENI). CENI integrates clustering strategies to improve the efficiency of network inference. By clustering nodes directly on the time lines of messages, we propose two naive implementations of CENI: Infection-centric CENI and Cascade-centric CENI . Additionally, we point out the critical dimension problem of CENI: Instead of one-dimensional time lines, we need to first project the nodes to an Euclidean space of certain dimension before clustering. A CENI adopting clustering method on the projected space can better preserve the structure hidden in the cascades and generate more accurately inferred links. By addressing the critical dimension problem, we propose the third implementation of the CENI framework: Projection-based CENI . Through extensive experiments on two real datasets, we show that the three CENI models only need around 20–50 % of the running time of state-of-the-art methods. Moreover, the inferred edges of Projection-based CENI preserve or even outperform the effectiveness of state-of-the-art methods.
更多
查看译文
关键词
Network inference,Information networks,Information diffusion,Survival models,Clustering,Embedding
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要