Faster Algorithms for Mining Shortest-Path Distances from Massive Time-Evolving Graphs


Computing shortest-path distances is a fundamental primitive in graph data mining, since this information is essential in a broad range of prominent applications, including social network analysis, data routing, web search optimization, database design, and route planning. Standard algorithms for shortest paths (e.g., Dijkstra's) do not scale well with the graph size: on large-scale graph datasets, they either take more than a second or incur huge memory overheads to answer a single distance query. Hence, they are not suited to mining distances from big graphs, which are becoming the norm in most modern application contexts. To achieve faster query answering, smarter and more scalable methods have therefore been designed, the most effective of which are based on precomputing and querying a compact representation of the transitive closure of the input graph, called the 2-hop-cover labeling. To use such approaches in realistic time-evolving scenarios, where the managed graph undergoes topological modifications over time, specific dynamic algorithms, which carefully update the labeling as the graph evolves, have been introduced. In fact, recomputing the 2-hop-cover structure from scratch every time the graph changes is not an option, as it induces unsustainable time overheads. While the state-of-the-art dynamic algorithm for updating a 2-hop-cover labeling against incremental modifications (insertions of arcs/vertices, arc weight decreases) offers very fast update times, the only known solution for decremental modifications (deletions of arcs/vertices, arc weight increases) is still far from practical: experimentation shows that it requires up to tens of seconds of processing per update on several prominent classes of real-world inputs. In this paper, we introduce a new dynamic algorithm to update 2-hop-cover labelings against decremental changes.
We prove its correctness, formally analyze its worst-case performance, and assess its effectiveness through an experimental evaluation employing both real-world and synthetic inputs. Our results show that it improves, by up to several orders of magnitude, upon average update times of the only existing decremental algorithm, thus representing a step forward towards real-time distance mining in general, massive time-evolving graphs.
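To make the notion of a 2-hop-cover labeling concrete, the following is a minimal static sketch, not the dynamic algorithm introduced in the paper. It assumes an undirected graph with positive arc weights stored as a dictionary of adjacency dictionaries, and it builds the labeling with a pruned Dijkstra search from each vertex in decreasing-degree order, which is one common construction chosen here for illustration. The names `build_labels` and `query` are hypothetical. Each vertex stores distances to a small set of hub vertices, and a distance query is answered by taking the minimum over the hubs that two labels share.

```python
import heapq
from math import inf


def query(labels, u, v):
    """Distance between u and v: minimum over hubs common to both labels."""
    lu, lv = labels[u], labels[v]
    if len(lu) > len(lv):
        lu, lv = lv, lu  # iterate over the smaller label
    best = inf
    for hub, d in lu.items():
        if hub in lv:
            best = min(best, d + lv[hub])
    return best


def build_labels(graph, order):
    """Build a 2-hop-cover labeling with one pruned Dijkstra per hub.

    Assumption: graph is undirected with positive weights,
    given as {vertex: {neighbor: weight}}.
    """
    labels = {v: {} for v in graph}
    for hub in order:
        dist = {hub: 0}
        heap = [(0, hub)]
        while heap:
            d, v = heapq.heappop(heap)
            if d > dist.get(v, inf):
                continue  # stale heap entry
            if query(labels, hub, v) <= d:
                continue  # earlier hubs already cover this pair: prune
            labels[v][hub] = d
            for w, weight in graph[v].items():
                nd = d + weight
                if nd < dist.get(w, inf):
                    dist[w] = nd
                    heapq.heappush(heap, (nd, w))
    return labels


# Small example graph (illustrative data only).
graph = {
    "a": {"b": 1, "c": 5},
    "b": {"a": 1, "c": 2},
    "c": {"a": 5, "b": 2, "d": 1},
    "d": {"c": 1},
}
order = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
labels = build_labels(graph, order)
print(query(labels, "a", "d"))  # shortest path a-b-c-d has length 4
```

In this sketch a query touches only two labels rather than the whole graph, which is what makes the approach fast. It also shows why decremental changes are harder: deleting an arc or increasing its weight can invalidate stored hub distances, and those entries must be found and repaired.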
large graph mining, algorithm engineering, experimental algorithmics, time-evolving data, big data processing