Topological Graph Sketching for Incremental and Scalable Analytics
ACM International Conference on Information and Knowledge Management (2016)
Abstract
We propose a novel, scalable, and principled graph sketching
technique based on minwise hashing of local neighborhoods.
For an n-node graph with e edges (e >> n), we incrementally
maintain in real time a minwise neighbor-sampled subgraph
using k hash functions in O(n * k) memory, with the limit
configurable by the user through the parameter k. Symmetrization
and similarity based techniques can recover from these data
structures a significant portion of the original graph. We
present theoretical analysis of the minwise sampling strategy
and also derive unbiased estimators for important graph
properties such as triangle count and neighborhood overlap.
We perform an extensive empirical evaluation of our framework
on a wide variety of real-world graph data sets drawn
from different application domains using three fundamental
large network analysis algorithms: local and global clustering
coefficient, PageRank, and local graph sparsification. With bounded memory,
the quality of results using the sketch representation is
competitive against baselines which use the full graph,
and the computational performance is significantly better.
Our framework is flexible and configurable, so it can be leveraged
by numerous other graph analytics algorithms,
potentially reducing the information mining time on large
streamed graphs for a variety of applications.
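
The minwise neighbor sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `MinwiseNeighborSketch`, the linear hash family, integer node ids, and the undirected edge stream are all assumptions made for illustration. Each node keeps, for each of the k hash functions, the neighbor with the smallest hash value seen so far, so memory stays at O(n * k) regardless of how many edges arrive. The overlap estimate is the standard MinHash Jaccard estimator, and `sampled_edges` is one simple reading of symmetrization; the paper's own estimators and recovery techniques may differ.

```python
import random


class MinwiseNeighborSketch:
    """Hypothetical per-node minwise neighbor sketch (illustrative only)."""

    def __init__(self, k, seed=0):
        rng = random.Random(seed)
        self.k = k
        self.prime = (1 << 61) - 1  # Mersenne prime used as the hash modulus
        # k linear hash functions h_i(x) = (a_i * x + b_i) mod prime (assumed family)
        self.params = [
            (rng.randrange(1, self.prime), rng.randrange(self.prime))
            for _ in range(k)
        ]
        # node -> list of k (hash_value, neighbor) pairs: O(n * k) memory
        self.slots = {}

    def _hash(self, i, node):
        a, b = self.params[i]
        return (a * node + b) % self.prime

    def _update(self, u, v):
        # Keep, per hash function, the neighbor of u with the minimum hash value.
        slots = self.slots.setdefault(u, [None] * self.k)
        for i in range(self.k):
            h = self._hash(i, v)
            if slots[i] is None or h < slots[i][0]:
                slots[i] = (h, v)

    def add_edge(self, u, v):
        # Incremental update for one streamed undirected edge (assumption).
        self._update(u, v)
        self._update(v, u)

    def sampled_neighbors(self, u):
        return {entry[1] for entry in self.slots.get(u, [])}

    def sampled_edges(self):
        # Simple symmetrization: an edge is kept if either endpoint sampled it.
        return {
            (min(u, v), max(u, v))
            for u in self.slots
            for v in self.sampled_neighbors(u)
        }

    def jaccard_estimate(self, u, v):
        # Standard MinHash estimate of neighborhood overlap:
        # the fraction of hash functions whose minimum neighbor agrees.
        su, sv = self.slots.get(u), self.slots.get(v)
        if su is None or sv is None:
            return 0.0
        matches = sum(1 for a, b in zip(su, sv) if a[1] == b[1])
        return matches / self.k
```

A caller would stream edges through `add_edge` and query `jaccard_estimate` or `sampled_edges` at any time; a larger k trades memory for a denser sample and a lower-variance estimate.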
Keywords
Graph Sampling, Min-wise Hashing, Scalable Analysis Algorithms