Contributions to High-Performance Big Data Computing

Geoffrey Fox, Judy Qiu, David Crandall, Gregor Von Laszewski, Oliver Beckstein, John Paden, Ioannis Paraskevakos, Shantenu Jha, Fusheng Wang, Madhav Marathe, Anil Vullikanti, Thomas Cheatham

Advances in Parallel Computing (2019)

Abstract

Our project sits at the interface of Big Data and HPC (High-Performance Big Data computing), and this paper describes a collaboration among seven universities: Arizona State, Indiana (lead), Kansas, Rutgers, Stony Brook, Virginia Tech, and Utah. Several application areas and communities drive the requirements for software systems and algorithms. We describe the base architecture, including HPC-ABDS (the High-Performance Computing enhanced Apache Big Data Stack), and an application use case study that identifies the key features determining software and algorithm requirements. We summarize middleware, including the Harp-DAAL collective communication layer, the Twister2 Big Data toolkit, and pilot jobs. We then present SPIDAL, the Scalable Parallel Interoperable Data Analytics Library, and our work on it in core machine learning, image processing, and the application communities: Network Science, Polar Science, Biomolecular Simulations, Pathology, and Spatial Systems. We describe basic algorithms and their integration in end-to-end use cases.

Keywords

HPC, Big Data, Clouds, Graph Analytics, Polar Science, Pathology, Biomolecular Simulations, Network Science, MIDAS, SPIDAL
