Query-driven parallel exploration of large datasets

Atanas Atanasov, Madhusudhanan Srinivasan, Tobias Weinzierl

LDAV (2012)

Abstract

Recent advances in supercomputing capabilities pose a multi-faceted data retrieval challenge to the exploration and visualisation of the obtained results: the bandwidth between visualisation devices and the high-performance computing (HPC) clusters scales neither with the simulation data nor with the compute power, the total memory footprint of the data on the supercomputer often exceeds the aggregate memory of the visualisation cluster, and the data has to be distributed among several visualisation nodes working in parallel to render an image. In the present paper, we introduce an on-demand data exploration paradigm that leverages HPC capabilities and distributed visualisation without requiring a large memory footprint on the visualisation cluster. Regions of interest within the data are specified by the user in the form of queries. These queries, augmented by node identifiers on the visualisation cluster, are automatically distributed among multiple compute nodes of the HPC cluster. The compute nodes work in parallel to assemble and merge data in response to the user query until the data distribution matches the visualisation cluster's topology. Query results are then streamed simultaneously to the corresponding visualisation nodes. Our approach allows for interactive exploration of data residing on HPC resources, irrespective of memory footprint. The streaming of data to the visualisation nodes scales with the bandwidth of the interconnecting network and with the HPC cluster's domain decomposition. The domain decomposition itself is hidden from the visualisation and can change dynamically. We demonstrate the capability of our query-driven approach with a turbulent mixing dataset, and show that it supports interactive data exploration on HPC systems.
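
The query flow described in the abstract can be illustrated with a small, single-process sketch. It is not the authors' implementation: the names `Query`, `ComputeRank`, `run_queries` and `demo` are hypothetical, the domain is one-dimensional for brevity, and the compute ranks are plain Python objects rather than processes on an HPC cluster. Each query carries the identifier of the visualisation node that issued it, every rank answers from its own subdomain, and the partial answers are merged per visualisation node.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A region of interest tagged with a visualisation node id (hypothetical type)."""
    vis_node: int  # which visualisation node should receive the result
    lo: float      # lower bound of the region (inclusive)
    hi: float      # upper bound of the region (exclusive)


@dataclass
class ComputeRank:
    """Stand-in for one HPC compute node owning part of the domain (assumption)."""
    rank: int
    samples: list  # (position, value) pairs held by this rank

    def answer(self, query):
        # Return only the locally owned samples that fall inside the query region.
        return [(x, v) for (x, v) in self.samples if query.lo <= x < query.hi]


def run_queries(ranks, queries):
    """Collect partial answers from every rank and merge them per visualisation node."""
    merged = defaultdict(list)
    for q in queries:
        for r in ranks:
            merged[q.vis_node].extend(r.answer(q))
    # Sorting by position makes the merged result independent of how the
    # domain happens to be decomposed across ranks.
    return {node: sorted(part) for node, part in merged.items()}


def demo():
    # Two ranks, each owning half of the domain [0, 8); the value is x squared.
    ranks = [
        ComputeRank(0, [(float(x), x * x) for x in range(0, 4)]),
        ComputeRank(1, [(float(x), x * x) for x in range(4, 8)]),
    ]
    queries = [
        Query(vis_node=0, lo=2.0, hi=5.0),  # spans both ranks
        Query(vis_node=1, lo=5.0, hi=8.0),  # served by rank 1 only
    ]
    return run_queries(ranks, queries)


if __name__ == "__main__":
    for node, data in demo().items():
        print(node, data)
```

In this sketch the visualisation side only sees results keyed by its own node identifiers, so the two ranks could be re-partitioned without changing what each visualisation node receives. That mirrors, in a very reduced form, the abstract's statement that the domain decomposition is hidden from the visualisation.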

Keywords

large-scale data, parallel processing, large datasets, interactive data exploration, supercomputer, high-performance computing, on-demand data exploration paradigm, turbulent mixing dataset, rendering (computer graphics), multifaceted data retrieval, data visualisation, visualisation devices, domain decomposition, parallel visualisation nodes, simulation data, distributed visualisation, user query response, computational steering, rendering, hpc clusters, query-driven approach, interconnecting network bandwidth, interactive systems, on-demand data exploration, data distribution, query-driven parallel exploration, query processing
