I/O-Efficient Butterfly Counting at Scale.

Zhibin Wang, Longbin Lai, Yixue Liu, Bing Shui, Chen Tian, Sheng Zhong

Proc. ACM Manag. Data (2023)

Abstract
Butterfly counting is a fundamental task with many applications in graph analysis: it computes the number of butterflies (a cyclic graph motif) in a large graph. With the rapid growth of graph data, butterfly counting is increasingly challenging because of its super-linear time complexity and large memory consumption. In this paper, we study I/O-efficient algorithms for butterfly counting on hierarchical memory. Existing algorithms of this kind cannot guarantee I/O optimality. Observing that, in order to count butterflies, it suffices to "witness" a subgraph instead of the whole structure, we propose a new class of algorithms called semi-witnessing algorithms. We prove that a semi-witnessing algorithm is not restricted by the lower bound Ω(|E|^2/(MB)) of a witnessing algorithm, and give a new bound of Ω(min(|E|^2/(MB), |E|/|V| √M B)). We further develop the IOBufs algorithm, which approaches the I/O lower bound, and thus claim its optimality. Finally, we parallelize IOBufs to further improve its performance and scalability. Our experiments show that IOBufs significantly outperforms the state-of-the-art algorithms EMRC and BFC-EM. In addition, IOBufs scales to butterfly counting on the Clueweb graph, which has 37 billion edges and quintillions (10^18) of butterflies.
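To make the counting task concrete, here is a minimal in-memory sketch in Python. It is not the IOBufs algorithm or any algorithm from the paper. It assumes the standard definition of a butterfly as a 2x2 biclique in a bipartite graph and uses the textbook wedge-based approach; the function name `count_butterflies` and the edge-list input format are hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import comb


def count_butterflies(edges):
    """Count butterflies (2x2 bicliques) in a bipartite graph.

    `edges` is an iterable of (left_vertex, right_vertex) pairs.
    This is an illustrative in-memory baseline, not an I/O-efficient method.
    """
    # Map each left vertex to the set of its right neighbours.
    neighbours = defaultdict(set)
    for left, right in edges:
        neighbours[left].add(right)

    # A wedge is a path right_a - left - right_b. Count, for every pair of
    # right vertices, how many left vertices are adjacent to both.
    wedges = defaultdict(int)
    for right_set in neighbours.values():
        for a, b in combinations(sorted(right_set), 2):
            wedges[(a, b)] += 1

    # Any two wedges with the same endpoint pair close into one butterfly.
    return sum(comb(w, 2) for w in wedges.values())
```

Enumerating wedges costs time proportional to the sum of squared degrees on the left side, and the wedge table must fit in memory. That illustrates the super-linear time and large memory consumption the abstract refers to.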
Keywords
scale, I/O-efficient